hanaml.CoxProportionalHazard is an R wrapper for the SAP HANA PAL Cox proportional hazard model.

hanaml.CoxProportionalHazard(
  data = NULL,
  key = NULL,
  label = NULL,
  status = NULL,
  features = NULL,
  formula = NULL,
  tie.method = NULL,
  max.iter = NULL,
  tol = NULL,
  significance.level = NULL,
  calculate.hazard = NULL,
  output.fitted = NULL
)

Arguments

data

DataFrame
DataFrame containing the data.

key

character, optional
Name of the ID column.
Defaults to the first column if not provided.

label

character, optional
Name of the target column of the Cox proportional hazard model, which holds the time until a failure/death event occurs or until the observation is right-censored.
If not provided, it defaults to the first non-ID column of data.

status

character, optional
Name of the status column that indicates whether each observation is an event or right-censored.
If not specified, data is assumed to have no status column, and all times in the label column are assumed to be associated with death/failure.

features

list/vector of character, optional
Names of the covariate columns.
Defaults to all non-ID, non-label, non-status columns if not specified.

formula

formula type, optional
Formula to be used for model generation, in the form label ~ <feature_list>, e.g. formula = CATEGORY ~ V1 + V2 + V3.
Provide either the formula or a label and features combination, but not both.
Defaults to NULL.

tie.method

c("breslow", "efron"), optional
Specifies the method for dealing with tied events.
Defaults to "efron".

max.iter

integer, optional
Maximum number of iterations for numeric optimization in maximum log-likelihood estimation.
Defaults to 100.

tol

numeric, optional
Convergence (stopping) criterion for numeric optimization.
Defaults to 1e-8.

significance.level

numeric, optional
Significance level for the confidence interval of estimated coefficients.
Defaults to 0.05.

calculate.hazard

logical, optional
Specifies whether or not to calculate the hazard function as well as the survival function.
Defaults to TRUE.

output.fitted

logical, optional
Specifies whether or not to output the fitted response.
Defaults to FALSE.

Value

Returns a "CoxProportionalHazard" object with the following values:

  • statistics: DataFrame
    Regression-related statistics, like mean square error, F-statistics, etc.

  • coefficients: DataFrame
    Fitted regression coefficients for the Cox PH model.

  • covariance: DataFrame
    Covariance values between features.

  • hazard: DataFrame
    Calculated cumulative baseline hazard function and survival function.

  • fitted: DataFrame
    Predicted linear predictors and exponential responses in risk space.

Details

The Cox proportional hazard model is a special generalized linear model. It is a well-known realization of the survival model that describes failure or death at a certain time.
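
As a rough illustration of what the tie.method argument controls, the base-R sketch below evaluates the textbook Cox partial log-likelihood under the Breslow and Efron conventions, using the example data from this page. It is not PAL's implementation: cox_loglik is a hypothetical helper introduced here, and the use of optim is an assumption made only for illustration.

```r
# Example data copied from the Examples section of this page.
time   <- c(4, 3, 1, 1, 2, 2, 3)
status <- c(1, 1, 1, 0, 1, 1, 0)
x <- cbind(X1 = c(0, 2, 1, 1, 1, 0, 0),
           X2 = c(0, 0, 0, 0, 1, 1, 1))

# Hypothetical helper: textbook Cox partial log-likelihood.
cox_loglik <- function(beta, time, status, x, ties = c("efron", "breslow")) {
  ties <- match.arg(ties)
  risk <- exp(drop(x %*% beta))            # relative risk exp(x'beta) per row
  ll <- 0
  for (event_time in unique(time[status == 1])) {
    at_risk <- time >= event_time          # rows still under observation
    events  <- time == event_time & status == 1
    d <- sum(events)                       # number of tied events
    # Efron removes a fraction (k - 1) / d of the tied events' risk from the
    # denominator for the k-th tied event; Breslow removes nothing.
    frac  <- if (ties == "efron") (seq_len(d) - 1) / d else rep(0, d)
    denom <- sum(risk[at_risk]) - frac * sum(risk[events])
    ll <- ll + sum(log(risk[events])) - sum(log(denom))
  }
  ll
}

# Maximize the Efron version (fnscale = -1 turns optim into a maximizer).
fit <- optim(c(0, 0), cox_loglik,
             time = time, status = status, x = x, ties = "efron",
             method = "BFGS", control = list(fnscale = -1))
print(fit$par)
```

With no tied event times the two conventions coincide; here they differ only at TIME = 2, where two events are tied. The maximizer of the Efron version should land close to the coefficients shown in the Examples section.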

Examples

Input DataFrame data:


> data$Collect()
  ID TIME STATUS X1 X2
1  1    4      1  0  0
2  2    3      1  2  0
3  3    1      1  1  0
4  4    1      0  1  0
5  5    2      1  1  1
6  6    2      1  0  1
7  7    3      0  0  1

Call the function:

cph <- hanaml.CoxProportionalHazard(data = data,
                                    key = "ID",
                                    label = "TIME",
                                    status = "STATUS",
                                    tie.method = "efron",
                                    calculate.hazard = TRUE,
                                    output.fitted = TRUE)
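
The formula argument is described as an alternative to giving label and features. How hanaml resolves a formula internally is not documented here, so the base-R sketch below only illustrates the correspondence between the two forms for this example; the variable names are chosen for illustration.

```r
# A formula naming the same label and covariates as the example data.
f <- TIME ~ X1 + X2

# Left-hand side: the label (time) column.
label <- all.vars(f[[2]])

# Right-hand side: the covariate (feature) columns.
features <- attr(terms(f), "term.labels")

print(label)
print(features)
```

Under that reading, formula = TIME ~ X1 + X2 would name the same columns as label = "TIME" with features = c("X1", "X2"); note that the status column is not part of the formula and is still passed separately.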

Output:


> cph$coefficients
  VARIABLE_NAME      MEAN COEFFICIENT        SE     SCORE PROBABILITY   CI_LOWER CI_UPPER
1            X1 0.7142857   0.7811819 0.7975689 0.9794538   0.3273558 -0.7820244 2.344388
2            X2 0.4285714   0.9337832 1.4081100 0.6631465   0.5072367 -1.8260617 3.693628
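
The columns of this table can be cross-checked in base R. The sketch below assumes that SCORE is a Wald z-statistic (COEFFICIENT / SE), that PROBABILITY is its two-sided normal p-value, and that the interval uses the default significance.level of 0.05; these are assumptions, but the numbers in the output above are consistent with them. The hazard ratio is added for interpretation and is not a PAL output column.

```r
# Values copied from the cph$coefficients output above.
coef_tab <- data.frame(
  VARIABLE_NAME = c("X1", "X2"),
  COEFFICIENT   = c(0.7811819, 0.9337832),
  SE            = c(0.7975689, 1.4081100)
)
alpha <- 0.05  # default significance.level

z      <- coef_tab$COEFFICIENT / coef_tab$SE   # compare with SCORE
p      <- 2 * pnorm(-abs(z))                   # compare with PROBABILITY
half   <- qnorm(1 - alpha / 2) * coef_tab$SE   # half-width of the interval
ci_low <- coef_tab$COEFFICIENT - half          # compare with CI_LOWER
ci_up  <- coef_tab$COEFFICIENT + half          # compare with CI_UPPER

# Hazard ratio per unit increase of each covariate (not returned by PAL here).
hazard_ratio <- exp(coef_tab$COEFFICIENT)

print(cbind(coef_tab, SCORE = z, PROBABILITY = p,
            CI_LOWER = ci_low, CI_UPPER = ci_up,
            HAZARD_RATIO = hazard_ratio))
```

Both intervals contain zero, which matches the PROBABILITY values being well above 0.05: with only seven observations, neither covariate is significant at the default level.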