Kaplan-Meier Survival Analysis — hanaml.kaplan.meier.survival.analysis • hana.ml.r

hanaml.kaplan.meier.survival.analysis is an R wrapper for SAP HANA PAL Kaplan-Meier survival analysis.

hanaml.kaplan.meier.survival.analysis(
  data = NULL,
  event.indicator = NULL,
  confidence.level = NULL
)

Arguments

data

DataFrame
DataFrame containing the sampled data points, with columns structured as follows:

Follow-up time: INTEGER or DOUBLE
Status indicator: INTEGER
Number of occurrences at the follow-up time (multiple rows per follow-up time are possible): INTEGER
Group: INTEGER or VARCHAR

event.indicator

integer, optional
Specifies the value of the status indicator that marks an event as having occurred. Defaults to 1.

confidence.level

double, optional
Specifies the confidence level for a two-sided confidence interval on the survival estimate.
Defaults to 0.95.

Value

Returns a list of DataFrames.

DataFrame 1
Estimation results after every event occurrence, structured as follows:
- GROUP: group.
- TIME: event occurrence time.
- RISK_NUMBER: number of individuals in the group still at risk just before the event occurrence.
- EVENT_NUMBER: number of event occurrences.
- PROBABILITY: probability of surviving beyond event occurrence time.
- SE: standard error.
- CI_LOWER: lower bound of confidence interval.
- CI_UPPER: upper bound of confidence interval.

DataFrame 2
Log-rank survival statistics of each group. Only valid for multiple groups.
- GROUP: group.
- TOTAL_RISK: all individuals in the lifetime study.
- OBSERVED: number of observed events.
- LOGRANK_STAT: log-rank test statistic.

DataFrame 3
Further statistics. Only valid for multiple groups.
- STAT_NAME: statistic name, e.g. chi-square, df, p-value.
- STAT_VALUE: statistic value.
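
To make the RISK_NUMBER, EVENT_NUMBER and PROBABILITY columns of DataFrame 1 concrete, here is a minimal base-R sketch of a product-limit estimate for a single group. It is an illustration only, not the PAL implementation: `km_sketch` and `toy` are hypothetical names, the data are made up, and the sketch assumes that the occurrence count of a censored row gives the number of individuals censored at that time.

```r
# Illustrative only: a plain-R product-limit estimate, not the PAL implementation.
km_sketch <- function(df, event.indicator = 1) {
  # Distinct times at which at least one event occurred, in ascending order.
  event_times <- sort(unique(df$TIME[df$STATUS == event.indicator]))
  # Individuals still under observation just before each event time.
  risk <- vapply(event_times, function(tm) {
    sum(df$OCCURRENCES[df$TIME >= tm])
  }, numeric(1))
  # Events recorded exactly at each event time.
  events <- vapply(event_times, function(tm) {
    sum(df$OCCURRENCES[df$TIME == tm & df$STATUS == event.indicator])
  }, numeric(1))
  data.frame(
    TIME = event_times,
    RISK_NUMBER = risk,
    EVENT_NUMBER = events,
    # Running product of the conditional survival fractions.
    PROBABILITY = cumprod(1 - events / risk)
  )
}

# Hypothetical single-group data in the documented input layout.
toy <- data.frame(
  TIME = c(1, 2, 3, 5),
  STATUS = c(1, 0, 1, 1),
  OCCURRENCES = c(2, 1, 1, 1),
  GROUP = c(0, 0, 0, 0)
)
km <- km_sketch(toy)
print(km)
```

For the toy data, five individuals are at risk at time 1, two at time 3 and one at time 5, giving survival probabilities of 0.6, 0.3 and 0. The censored row at time 2 reduces the risk set without counting as an event.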

Details

The Kaplan-Meier estimator is a widely used non-parametric estimate of the survival function for long-term studies in which a series of possibly censored failure times is observed. It is often used to measure the time-to-death of patients after treatment or the time-to-failure of machine parts.
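
For reference, the textbook product-limit form of the estimator is given below. The symbols are introduced here for illustration: t_i are the distinct event times, d_i is the number of events at t_i (EVENT_NUMBER) and n_i is the number at risk just before t_i (RISK_NUMBER). The variance shown is Greenwood's formula, the usual choice for the standard error. This page does not state which variance or confidence-interval formula PAL applies, so treat that part as an assumption.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right),
\qquad
\widehat{\operatorname{Var}}\bigl[\hat{S}(t)\bigr] \approx
\hat{S}(t)^{2} \sum_{i:\, t_i \le t} \frac{d_i}{n_i\,(n_i - d_i)}
```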

Examples

Input DataFrame data:


> data$Head(5)$Collect()
  TIME STATUS OCCURRENCES GROUP
1    9      1           1     2
2   10      1           1     1
3    1      1           2     0
4   31      0           1     1
5    2      1           1     0

Call the function:


> result <- hanaml.kaplan.meier.survival.analysis(data=data)

Results:


> result[[3]]$Collect()
  STAT_NAME STAT_VALUE
1    chiSqr  0.3279707
2        df  2.0000000
3   p-value  0.8487545
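
The p-value in the third DataFrame can be cross-checked locally from the other two rows. The sketch below assumes the statistic is referred to a chi-square distribution with df degrees of freedom. That is the usual convention for the log-rank test, but this page does not state it. The values are copied by hand from the output above.

```r
# Values copied from the STAT_VALUE column shown above.
chi_sqr <- 0.3279707
df <- 2

# Upper-tail probability of the chi-square distribution (base R, stats package).
p_value <- pchisq(chi_sqr, df = df, lower.tail = FALSE)
print(p_value)
```

The result agrees with the reported p-value of roughly 0.849. A value this large gives no evidence of a difference in survival between the groups in this example.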