wilcoxon
- hana_ml.algorithms.pal.stats.wilcoxon(data, col=None, mu=None, test_type=None, correction=None)
Performs a one-sample or paired two-sample non-parametric test to check whether the median of the data is different from a specific value.
- Parameters:
- data : DataFrame
DataFrame containing the data.
- col : str or a list of str, optional
Name(s) of the data column(s) to be tested.
If not given, the input DataFrame must only have one or two columns.
- mu : float, optional
The location mu0 for the one-sample test. It does not affect the two-sample test.
Defaults to 0.
- test_type : {'two_sides', 'less', 'greater'}, optional
Specifies the alternative hypothesis type.
Defaults to 'two_sides'.
- correction : bool, optional
Controls whether or not to include the continuity correction in the p-value calculation.
Defaults to True.
- Returns:
- DataFrame
Test results, structured as follows:
STAT_NAME column, name of statistics.
STAT_VALUE column, value of statistics.
Examples
Original data:
>>> df.collect()
     X
0   85
1   65
...
18  64
19   8
Perform the Wilcoxon signed-rank test:
>>> res = wilcoxon(data=df, mu=40, test_type='two_sides', correction=True)
Result:
>>> res.collect()
   STAT_NAME            STAT_VALUE
0  statistic                 158.5
1    p-value  0.011228240845317039
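To make the roles of mu, test_type and correction more concrete, the following is a minimal pure-Python sketch of a one-sample Wilcoxon signed-rank test based on the normal approximation. It is an illustration only: the function name signed_rank_sketch is hypothetical, the code does not call hana_ml or PAL, and PAL's tie handling, reported statistic and p-value calculation may differ, so its output should not be expected to reproduce the result above.

```python
import math


def signed_rank_sketch(x, mu=0.0, test_type="two_sides", correction=True):
    """Illustrative one-sample signed-rank test (normal approximation).

    Hypothetical helper, not part of hana_ml. Returns (W+, p-value), where
    W+ is the sum of the ranks of the positive differences x_i - mu.
    """
    if test_type not in ("two_sides", "less", "greater"):
        raise ValueError("test_type must be 'two_sides', 'less' or 'greater'")

    # Differences from the hypothesized location; zero differences are dropped.
    diffs = [v - mu for v in x if v != mu]
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations are equal to mu")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t  # used for the tie correction of the variance
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Mean and tie-corrected variance of W+ under the null hypothesis.
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0

    # Optional continuity correction: move W+ half a unit toward its mean
    # (two-sided) or in the direction that makes the test more conservative.
    shift = w_plus - mean
    cc = 0.0
    if correction:
        if test_type == "two_sides":
            cc = 0.5 * ((shift > 0) - (shift < 0))
        elif test_type == "greater":
            cc = 0.5
        else:
            cc = -0.5

    z = (shift - cc) / math.sqrt(var)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

    if test_type == "two_sides":
        p_value = 2.0 * min(cdf, 1.0 - cdf)
    elif test_type == "greater":
        p_value = 1.0 - cdf
    else:
        p_value = cdf
    return w_plus, p_value


if __name__ == "__main__":
    # Small made-up sample, used only to exercise the sketch.
    stat, p = signed_rank_sketch([1, 2, 3, 4, 5], mu=0, test_type="two_sides")
    print("W+ =", stat, "p-value =", p)
```

In this sketch, mu only shifts the data before ranking, test_type selects which tail of the normal approximation is used, and correction moves the statistic by half a unit before standardizing. For small samples an exact distribution is often preferred over the normal approximation used here.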