quantile
- hana_ml.algorithms.pal.stats.quantile(data, distr_info, col=None, complementary=False)
Evaluates the inverse of the cumulative distribution function (CDF) or the inverse of the complementary cumulative distribution function (CCDF) for a given probability p and probability distribution.
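To make the two modes concrete, here is a minimal local sketch that uses Python's standard-library `statistics.NormalDist` instead of PAL. It assumes the usual definitions: the inverse CDF returns the x with P(X <= x) = p, and the inverse CCDF returns the x with P(X > x) = p, which is the inverse CDF evaluated at 1 - p. The helper name `normal_quantile` is hypothetical and is not part of hana_ml.

```python
from statistics import NormalDist


def normal_quantile(p, mean=0.0, variance=1.0, complementary=False):
    """Local illustration only; not the hana_ml or PAL implementation."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # NormalDist takes a standard deviation, so convert from the variance.
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    # The inverse CCDF at p equals the inverse CDF at 1 - p.
    return dist.inv_cdf(1.0 - p if complementary else p)


lower = normal_quantile(0.8)                      # x with P(X <= x) = 0.8
upper = normal_quantile(0.8, complementary=True)  # x with P(X > x) = 0.8
print(lower, upper)
```

For a distribution symmetric about zero, such as the standard normal used here, the two results differ only in sign.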
- Parameters:
- data : DataFrame
DataFrame containing the data.
- distr_info : dict
A Python dictionary that contains the distribution name and its parameters. Supported distributions are uniform, normal, weibull and gamma. For example:
{'name':'normal', 'mean':0, 'variance':1.0}.
{'name':'uniform', 'min':0.0, 'max':1.0}.
{'name':'weibull', 'shape':1.0, 'scale':1.0}.
{'name':'gamma', 'shape':1.0, 'scale':1.0}.
The parameter values may be changed for any of the supported distributions listed above.
- col : str, optional
Name of the column in the data frame that needs to be processed.
If not given, it defaults to the first column.
- complementary : bool, optional
False: evaluate the inverse of the CDF ('cdf').
True: evaluate the inverse of the CCDF ('ccdf').
Defaults to False.
- Returns:
- DataFrame
Quantile results.
Examples
Original data:
>>> df.collect()
   DATACOL
0      0.3
1      0.5
2    0.632
3      0.8
>>> distr_info = {'name':'weibull', 'shape':2.11995, 'scale':277.698}

Apply the quantile function:

>>> res = quantile(data=df, distr_info=distr_info)
>>> res.collect()
   DATACOL    QUANTILE
0      0.3  170.755854
1      0.5  233.608506
2    0.632  277.655075
3      0.8  347.586495
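The QUANTILE values in this example can be cross-checked locally without a HANA connection. The sketch below assumes the standard two-parameter Weibull CDF, F(x) = 1 - exp(-(x/scale)^shape), whose inverse is x = scale * (-ln(1 - p))^(1/shape). The helper `weibull_quantile` is hypothetical and not part of hana_ml; under that assumption its output should agree with the QUANTILE column to roughly the displayed precision.

```python
import math


def weibull_quantile(p, shape, scale, complementary=False):
    """Closed-form Weibull quantile; a local check, not the PAL code path."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Probability mass to the right of the quantile:
    # 1 - p when inverting the CDF, p when inverting the CCDF.
    tail = p if complementary else 1.0 - p
    return scale * (-math.log(tail)) ** (1.0 / shape)


probabilities = [0.3, 0.5, 0.632, 0.8]
quantiles = [weibull_quantile(p, shape=2.11995, scale=277.698) for p in probabilities]
for p, q in zip(probabilities, quantiles):
    print(f"{p:<6} {q:.6f}")
```

The 0.632 row lands close to the scale parameter, which is expected: for a Weibull distribution the CDF at x = scale is 1 - exp(-1), about 0.632, regardless of the shape.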