distribution_fit
- hana_ml.algorithms.pal.stats.distribution_fit(data, distr_type, optimal_method=None, censored=False)
Fits a probability distribution to a variable, based on a series of measurements of that variable. Many probability distributions exist, and some fit the observed data more closely than others.
- Parameters:
- data : DataFrame
DataFrame containing the data.
- distr_type : {'exponential', 'gamma', 'normal', 'poisson', 'uniform', 'weibull'}
Specifies the type of distribution to fit.
- optimal_method : {'maximum_likelihood', 'median_rank'}, optional
Specifies the estimation method.
Defaults to 'median_rank' when distr_type is 'weibull', 'maximum_likelihood' otherwise.
- censored : bool, optional
Specifies whether data is censored or not. Only valid when distr_type is 'weibull'.
Defaults to False.
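To give a feel for what a median-rank style estimate involves, here is a minimal local sketch of median rank regression for a Weibull distribution on uncensored data. It is not the PAL implementation: the function name weibull_median_rank_fit, the use of Benard's approximation for the median ranks, and the choice to regress the transformed ranks on log(x) are all assumptions made for illustration.

```python
import math


def weibull_median_rank_fit(samples):
    """Estimate Weibull (shape, scale) by median rank regression.

    Illustrative helper only; the name and the details below are
    assumptions, not part of hana_ml or PAL.
    """
    xs = sorted(samples)
    n = len(xs)
    if n < 2 or xs[0] <= 0:
        raise ValueError("need at least two strictly positive samples")

    # Benard's approximation to the median rank of the i-th ordered sample.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

    # Linearise the Weibull CDF:
    #   ln(-ln(1 - F)) = shape * ln(x) - shape * ln(scale)
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in ranks]

    # Ordinary least squares of v on u.
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    if sxx == 0:
        raise ValueError("samples must not all be identical")
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

    shape = sxy / sxx
    intercept = mean_v - shape * mean_u
    scale = math.exp(-intercept / shape)
    return shape, scale


if __name__ == "__main__":
    # Hypothetical failure times, for demonstration only.
    print(weibull_median_rank_fit([16.0, 34.0, 53.0, 75.0, 93.0, 120.0]))
```

A local estimate like this can serve as a rough cross-check on uncensored data, but it should not be expected to match the PAL output exactly.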
- Returns:
- DataFrames
DataFrame 1 : fitting results, structured as follows:
NAME: name of distribution parameters.
VALUE: value of distribution parameters.
DataFrame 2 : fitting statistics.
Examples
>>> from hana_ml.algorithms.pal.stats import distribution_fit
>>> res, stats = distribution_fit(data=df, distr_type='weibull', optimal_method='maximum_likelihood')
>>> res.collect()
>>> stats.collect()
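For the simpler distribution types, a maximum likelihood fit can be sanity-checked locally, because the estimates have closed forms. The sketch below uses only the Python standard library. The helper names exponential_mle and normal_mle are hypothetical, and how PAL parameterises its results (for example rate versus scale, or standard deviation versus variance) is an assumption not confirmed here.

```python
import math


def exponential_mle(samples):
    """Closed-form maximum likelihood estimate of the exponential rate.

    Illustrative helper only; not part of hana_ml.
    """
    if not samples or min(samples) < 0:
        raise ValueError("need non-empty, non-negative samples")
    total = sum(samples)
    if total == 0:
        raise ValueError("samples must not all be zero")
    # The MLE of the rate is the reciprocal of the sample mean.
    return len(samples) / total


def normal_mle(samples):
    """Closed-form maximum likelihood estimates (mean, standard deviation).

    Illustrative helper only; not part of hana_ml.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    # The MLE of the variance divides by n, not n - 1.
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical measurements, for demonstration only.
    data = [0.8, 1.9, 0.4, 2.7, 1.2]
    print("exponential rate:", exponential_mle(data))
    print("normal mean, sd:", normal_mle(data))
```

Comparing these values with the collected NAME/VALUE rows can help catch data-loading mistakes, as long as the parameterisation is matched first.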