hanaml.DistributionFit is an R wrapper for SAP HANA PAL Distribution Fitting.

hanaml.DistributionFit(
  data,
  distr.type,
  optimal.method = NULL,
  censored = FALSE
)

Arguments

data

DataFrame
DataFrame containing the data.

distr.type

{'exp', 'gamma', 'normal', 'poisson', 'uniform', 'weibull'}
Specifies the probability distribution to fit. Choose from:

  • 'exp' : Exponential distribution

  • 'gamma': Gamma distribution

  • 'normal': Normal distribution

  • 'poisson': Poisson distribution

  • 'uniform': Uniform distribution

  • 'weibull': Weibull distribution


optimal.method

{'maximum.likelihood', 'median.rank'}, optional
Specifies the estimation method.

  • 'maximum.likelihood': Maximum likelihood estimation

  • 'median.rank': Median rank estimation (valid only when distr.type is 'weibull')

Defaults to 'maximum.likelihood'.
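
To make the 'median.rank' option more concrete, the following is a minimal base R sketch of textbook median rank regression for a Weibull sample. It runs locally, does not call SAP HANA, and is not claimed to match PAL's internal procedure. The sample values, the use of Benard's approximation, and all variable names are assumptions made for illustration.

```r
# Illustrative only: textbook median rank regression for a Weibull sample.
# Hypothetical, uncensored observations (assumption, not PAL input data).
x <- sort(c(71, 83, 92, 104, 120))
n <- length(x)

# Benard's approximation to the median rank of the i-th ordered observation.
median_rank <- (seq_len(n) - 0.3) / (n + 0.4)

# Weibull CDF: F(x) = 1 - exp(-(x / scale)^shape), which linearises to
#   log(-log(1 - F)) = shape * log(x) - shape * log(scale)
y <- log(-log(1 - median_rank))
line_fit <- lm(y ~ log(x))

# Slope gives the shape; the intercept gives the scale.
shape_est <- unname(coef(line_fit)[2])
scale_est <- exp(-unname(coef(line_fit)[1]) / shape_est)

print(c(shape = shape_est, scale = scale_est))
```

The regression only needs the ordered observations and their approximate plotting positions, which is why this kind of estimator is tied to distributions, such as the Weibull, whose CDF can be linearised in this way.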

censored

logical, optional
Specifies whether the data is censored. TRUE is valid only when distr.type is 'weibull'.
Defaults to FALSE.

Value

Returns a list of two DataFrames:

  • DataFrame
    The estimated parameter values.

    • NAME Name of the distribution parameter.

    • VALUE Estimated value of the parameter.

  • DataFrame
    Fit statistics.

    • STAT_NAME Name of the statistic.

    • STAT_VALUE Value of the statistic.

Details

This algorithm estimates the parameters of the chosen probability distribution so that the resulting distribution fits the data. PAL supports distribution fitting with the Normal, Gamma, Weibull, Exponential, Poisson, and Uniform distributions.
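
As a rough picture of what fitting by maximum likelihood means, the sketch below estimates Weibull parameters in base R by minimising the negative log-likelihood with optim(). It runs locally on synthetic data, does not call SAP HANA, and is not PAL's implementation. The sample, the starting values, and the helper name neg_loglik are assumptions made for illustration.

```r
# Illustrative only: Weibull maximum likelihood fit in base R.
set.seed(42)
x <- rweibull(200, shape = 2, scale = 250)  # synthetic sample (assumption)

# Negative log-likelihood. Parameters are optimised on the log scale so that
# shape and scale remain strictly positive.
neg_loglik <- function(log_par, x) {
  shape <- exp(log_par[1])
  scale <- exp(log_par[2])
  -sum(dweibull(x, shape = shape, scale = scale, log = TRUE))
}

# Nelder-Mead (optim's default) started at shape = 1, scale = mean(x).
fit <- optim(par = c(0, log(mean(x))), fn = neg_loglik, x = x)

estimates <- c(shape = exp(fit$par[1]), scale = exp(fit$par[2]))
loglik <- -fit$value  # maximised log-likelihood

print(estimates)
print(loglik)
```

The two quantities printed here play the same role as the parameter estimates and the log-likelihood statistic that the function returns, although the numerical values depend on the data and on the optimiser used.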

Examples

Input DataFrame data:

> data$Head(5)$Collect()
   X
1 71
2 83
3 92
4 104
5 120

Call the function:

> result <- hanaml.DistributionFit(data=data,
                                   distr.type="weibull",
                                   optimal.method='maximum.likelihood')

Results:

> result[[1]]$Collect()
               NAME  VALUE
1 DISTRIBUTIONNAME WEIBULL
2            SCALE   244.4
3            SHAPE 2.06698
> result[[2]]$Collect()
      STAT_NAME STAT_VALUE
1 LOGLIKELIHOOD  -115.1138
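
Once collected, the parameter table can be used with R's own Weibull functions. The sketch below is a hedged illustration: it builds a local data.frame that stands in for result[[1]]$Collect(), with the values copied from the output above, and it assumes that VALUE is returned as a character column because it holds both the distribution name and the numeric estimates. The helper value_of and the 300-unit threshold are hypothetical.

```r
# Local stand-in for result[[1]]$Collect(); values copied from the output above.
# Assumption: VALUE is a character column, since it mixes text and numbers.
params <- data.frame(
  NAME  = c("DISTRIBUTIONNAME", "SCALE", "SHAPE"),
  VALUE = c("WEIBULL", "244.4", "2.06698"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up one parameter value by name.
value_of <- function(df, name) df$VALUE[df$NAME == name]

scale_est <- as.numeric(value_of(params, "SCALE"))
shape_est <- as.numeric(value_of(params, "SHAPE"))

# Median of the fitted Weibull distribution.
median_fit <- qweibull(0.5, shape = shape_est, scale = scale_est)

# Probability that an observation exceeds 300 under the fitted distribution.
surv_300 <- pweibull(300, shape = shape_est, scale = scale_est,
                     lower.tail = FALSE)

print(median_fit)
print(surv_300)
```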