Distribution Fitting — hanaml.DistributionFit • hana.ml.r

hanaml.DistributionFit is an R wrapper for SAP HANA PAL Distribution Fitting.

hanaml.DistributionFit(
  data,
  distr.type,
  optimal.method = NULL,
  censored = FALSE
)

Arguments

data

DataFrame
DataFrame containing the data.

distr.type

{"exp", "gamma", "normal", "poisson", "uniform", "weibull"}
Specifies the probability distribution to fit:

"exp" : Exponential distribution.
"gamma": Gamma distribution.
"normal": Normal distribution.
"poisson": Poisson distribution.
"uniform": Uniform distribution.
"weibull": Weibull distribution.

optimal.method

{"maximum.likelihood", "median.rank"}, optional
Specifies the estimation method.

"maximum.likelihood": maximum likelihood estimation.
"median.rank": median rank estimation (valid only when distr.type is "weibull").

Defaults to "maximum.likelihood".
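
To give a feel for what a median rank estimate involves, here is a minimal base R sketch of median rank regression for a two-parameter Weibull distribution. It is an illustration under stated assumptions, not PAL's implementation: the function name weibull_median_rank is hypothetical, Benard's approximation is assumed for the median ranks, and the regression direction (linearised CDF on log(x)) is one of several conventions. The input is only the five rows printed in the Examples section, so the estimates will not match the results shown there.

```r
# Median rank regression for a two-parameter Weibull distribution.
# Illustrative sketch only; PAL may use a different variant.
weibull_median_rank <- function(x) {
  x <- sort(x)
  n <- length(x)
  # Benard's approximation to the median rank of the i-th ordered value
  f <- (seq_len(n) - 0.3) / (n + 0.4)
  # Linearised Weibull CDF:
  #   log(-log(1 - F)) = shape * log(x) - shape * log(scale)
  fit <- lm(log(-log(1 - f)) ~ log(x))
  shape <- unname(coef(fit)[2])
  scale <- exp(-unname(coef(fit)[1]) / shape)
  c(shape = shape, scale = scale)
}

# Only the head of the example data, used here as a stand-in
x <- c(71, 83, 92, 104, 120)
est <- weibull_median_rank(x)
print(est)
```

Because the method is a straight-line fit on transformed order statistics, it needs no iterative optimisation, which is one reason it is commonly used for small Weibull samples.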

censored

logical, optional
Specifies whether the data is censored. TRUE is valid only when distr.type is "weibull".
Defaults to FALSE.
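
As background on what fitting censored data means, the following base R sketch maximises a Weibull likelihood with right-censored observations. Everything in it is an assumption made for illustration: the time and event vectors are hypothetical, right-censoring is assumed as the censoring type, and the layout PAL expects for censored input is not described on this page.

```r
# Hypothetical data: observation time and an event flag
# (1 = observed failure, 0 = right-censored at that time).
time  <- c(71, 83, 92, 104, 120)
event <- c(1, 1, 0, 1, 0)

# Negative log-likelihood for right-censored Weibull data.
# Observed failures contribute the log density; censored
# observations contribute the log survival probability.
censored_nll <- function(log_par, time, event) {
  shape <- exp(log_par[1])
  scale <- exp(log_par[2])
  ll_obs  <- dweibull(time, shape = shape, scale = scale, log = TRUE)
  ll_cens <- pweibull(time, shape = shape, scale = scale,
                      lower.tail = FALSE, log.p = TRUE)
  -sum(ifelse(event == 1, ll_obs, ll_cens))
}

# Optimise on the log scale so both parameters stay positive
opt <- optim(c(0, log(mean(time))), censored_nll,
             time = time, event = event)
fit <- c(shape = exp(opt$par[1]), scale = exp(opt$par[2]))
print(fit)
```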

Value

Returns a list of DataFrames:

DataFrame
The estimated parameter values.
- NAME: name of distribution parameters.
- VALUE: corresponding value.
DataFrame
The fit statistics.
- STAT_NAME: name of the statistic.
- STAT_VALUE: value of the statistic.
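
Once collected, the parameter table is an ordinary R data.frame, so individual estimates can be pulled out by name. The sketch below uses a local stand-in for result[[1]]$Collect() with the values from the Examples section. It assumes the VALUE column arrives as character, since it holds the distribution name alongside the numbers, and get_param is a hypothetical helper, not part of hana.ml.r.

```r
# Stand-in for result[[1]]$Collect(); values copied from the Examples section.
# Assumption: VALUE is a character column because it mixes text and numbers.
params <- data.frame(
  NAME  = c("DISTRIBUTIONNAME", "SCALE", "SHAPE"),
  VALUE = c("WEIBULL", "244.4", "2.06698"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up one parameter by name and convert it
get_param <- function(params, name) {
  as.numeric(params$VALUE[params$NAME == name])
}

scale_hat <- get_param(params, "SCALE")
shape_hat <- get_param(params, "SHAPE")

# The estimates plug straight into base R's Weibull functions,
# for example to obtain the median of the fitted distribution.
median_fit <- qweibull(0.5, shape = shape_hat, scale = scale_hat)
print(median_fit)
```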

Details

This algorithm estimates the parameters of the chosen probability distribution so that the resulting distribution fits the data. PAL supports distribution fitting with the Normal, Gamma, Weibull, Exponential, Poisson, and Uniform distributions.
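
For intuition, maximum likelihood fitting can be sketched in a few lines of base R. This is a local illustration rather than a description of how PAL computes its estimates: the data are synthetic, the optimiser is R's general-purpose optim, and the names nll, mle, and loglik are chosen for this example. The maximised log-likelihood it produces is the same kind of quantity as the LOGLIKELIHOOD statistic in the second returned DataFrame.

```r
set.seed(1)
# Synthetic sample, not taken from the documentation
x <- rweibull(200, shape = 2, scale = 100)

# Negative log-likelihood of a two-parameter Weibull distribution.
# Parameters are optimised on the log scale so they stay positive.
nll <- function(log_par, x) {
  -sum(dweibull(x, shape = exp(log_par[1]),
                scale = exp(log_par[2]), log = TRUE))
}

# Start from shape = 1 and scale = sample mean
opt <- optim(c(0, log(mean(x))), nll, x = x)

mle    <- c(shape = exp(opt$par[1]), scale = exp(opt$par[2]))
loglik <- -opt$value
print(mle)
print(loglik)
```

For distributions with closed-form estimators, such as the normal or exponential, no numerical optimisation is needed; the Weibull case is shown because it is the one used in the example on this page.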

Examples

Input DataFrame data:


> data$Head(5)$Collect()
   X
1 71
2 83
3 92
4 104
5 120

Call the function:


> result <- hanaml.DistributionFit(data=data,
                                   distr.type="weibull",
                                   optimal.method="maximum.likelihood")

Results:


> result[[1]]$Collect()
               NAME  VALUE
1 DISTRIBUTIONNAME WEIBULL
2            SCALE   244.4
3            SHAPE 2.06698
> result[[2]]$Collect()
      STAT_NAME STAT_VALUE
1 LOGLIKELIHOOD  -115.1138