Distribution Fitting — hanaml.DistributionFit • hana.ml.r

hanaml.DistributionFit is an R wrapper for SAP HANA PAL Distribution Fitting.

hanaml.DistributionFit(
  data,
  distr.type,
  optimal.method = NULL,
  censored = FALSE
)

Arguments

data	`DataFrame` DataFrame containing the data.
distr.type	`{'exp', 'gamma', 'normal', 'poisson', 'uniform', 'weibull'}` Probability distribution to fit:
- `'exp'`: Exponential distribution
- `'gamma'`: Gamma distribution
- `'normal'`: Normal distribution
- `'poisson'`: Poisson distribution
- `'uniform'`: Uniform distribution
- `'weibull'`: Weibull distribution
optimal.method	`{'maximum.likelihood', 'median.rank'}, optional` Estimation method:
- `'maximum.likelihood'`: maximum likelihood estimation
- `'median.rank'`: median rank (valid only when distr.type is 'weibull')
Defaults to 'maximum.likelihood'.
censored	`logical, optional` Specifies whether the data is censored. TRUE is valid only when distr.type is 'weibull'. Defaults to FALSE.
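
The argument constraints above can be checked locally before calling the wrapper. The following is a minimal sketch in base R; `check_distfit_args` is a hypothetical helper and is not part of hana.ml.r, and the wrapper itself may validate its arguments differently.

```r
# Hypothetical helper, not part of hana.ml.r: checks the argument
# combinations described in the Arguments section.
check_distfit_args <- function(distr.type,
                               optimal.method = NULL,
                               censored = FALSE) {
  distr.type <- match.arg(distr.type,
                          c("exp", "gamma", "normal",
                            "poisson", "uniform", "weibull"))
  # NULL falls back to the documented default.
  if (is.null(optimal.method)) optimal.method <- "maximum.likelihood"
  optimal.method <- match.arg(optimal.method,
                              c("maximum.likelihood", "median.rank"))
  # 'median.rank' and censored data are documented as Weibull-only.
  if (optimal.method == "median.rank" && distr.type != "weibull") {
    stop("'median.rank' is valid only when distr.type is 'weibull'")
  }
  if (isTRUE(censored) && distr.type != "weibull") {
    stop("censored = TRUE is valid only when distr.type is 'weibull'")
  }
  invisible(list(distr.type = distr.type,
                 optimal.method = optimal.method,
                 censored = censored))
}
```

Calling `check_distfit_args("normal", "median.rank")` stops with an error, while `check_distfit_args("weibull", "median.rank", censored = TRUE)` passes.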

Value

Returns a list of DataFrames:

DataFrame
The estimated parameter values.
- NAME Name of the distribution parameter.
- VALUE Value of the parameter.
DataFrame
Statistics of the fit.
- STAT_NAME Name of the statistic.
- STAT_VALUE Value of the statistic.

Details

This algorithm estimates the parameters of the chosen probability distribution so that the resulting distribution fits the data. PAL supports distribution fitting with the Normal, Gamma, Weibull, Exponential, Poisson, and Uniform distributions.
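
To make the idea of fitting by maximum likelihood concrete, here is a minimal local sketch in base R for the Weibull case. It does not call SAP HANA PAL and is not necessarily how PAL implements the estimation. The synthetic sample, the starting values, and the use of `optim` are all assumptions made for illustration.

```r
# Local illustration only: runs in base R and does not call SAP HANA PAL.
set.seed(42)
x <- rweibull(200, shape = 2, scale = 250)  # synthetic sample, not the page's data

# Negative log-likelihood of a Weibull distribution.
# Parameters are optimised on the log scale so shape and scale stay positive.
neg_loglik <- function(log_par, x) {
  -sum(dweibull(x, shape = exp(log_par[1]), scale = exp(log_par[2]), log = TRUE))
}

# Start from shape = 1 and scale = sample mean (an arbitrary but finite start).
fit <- optim(par = c(0, log(mean(x))), fn = neg_loglik, x = x)

# Transform back to the original parameter scale.
estimates <- c(shape = exp(fit$par[1]), scale = exp(fit$par[2]))
loglik <- -fit$value  # maximised log-likelihood

print(estimates)
print(loglik)
```

The maximised log-likelihood computed here plays the same role as the LOGLIKELIHOOD statistic that the wrapper reports.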

Examples

Input DataFrame data:

> data$Head(5)$Collect()
    X
1  71
2  83
3  92
4 104
5 120

Call the function:

> result <- hanaml.DistributionFit(data=data,
                                   distr.type="weibull",
                                   optimal.method='maximum.likelihood')

Results:

> result[[1]]$Collect()
               NAME  VALUE
1 DISTRIBUTIONNAME WEIBULL
2            SCALE   244.4
3            SHAPE 2.06698
> result[[2]]$Collect()
      STAT_NAME STAT_VALUE
1 LOGLIKELIHOOD  -115.1138
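
Once collected, the parameter table is an ordinary R data frame, so individual estimates can be looked up by name. The sketch below uses a local stand-in copied from the output above instead of a live HANA connection. `get_param` is a hypothetical helper, and treating VALUE as a character column is an assumption based on the mixed text and numeric entries shown.

```r
# Local stand-in for result[[1]]$Collect(), copied from the output above.
# Assumption: VALUE is a character column, since it mixes text and numbers.
params <- data.frame(
  NAME  = c("DISTRIBUTIONNAME", "SCALE", "SHAPE"),
  VALUE = c("WEIBULL", "244.4", "2.06698"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up a parameter by name and convert it to numeric.
get_param <- function(df, name) as.numeric(df$VALUE[df$NAME == name])

scale_hat <- get_param(params, "SCALE")
shape_hat <- get_param(params, "SHAPE")

# Example use: fitted probability that an observation is at most 120.
p120 <- pweibull(120, shape = shape_hat, scale = scale_hat)
print(p120)
```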