wilcoxon
- hana_ml.algorithms.pal.stats.wilcoxon(data, col=None, mu=None, test_type=None, correction=None)
Perform a one-sample or paired two-sample non-parametric test to check whether the median of the data is different from a specific value.
- Parameters
- data : DataFrame
DataFrame containing the data.
- col : str or list of str, optional
Name(s) of the data column(s) to be tested.
If not given, the input DataFrame must have exactly one or two columns.
- mu : float, optional
The location mu0 for the one-sample test. It does not affect the two-sample test.
Defaults to 0.
- test_type : {'two_sides', 'less', 'greater'}, optional
Specifies the type of the alternative hypothesis.
Defaults to 'two_sides'.
- correction : bool, optional
Controls whether to include the continuity correction in the p-value calculation.
Defaults to True.
- Returns
- DataFrame
Test results, structured as follows:
STAT_NAME column, name of the statistic.
STAT_VALUE column, value of the statistic.
Examples
Original data:
>>> df.collect()
     X
0   85
1   65
2   20
3   56
4   30
5   46
6   83
7   33
8   89
9   72
10  51
11  76
12  68
13  82
14  27
15  59
16  69
17  40
18  64
19   8
Perform the Wilcoxon signed-rank test:
>>> res = wilcoxon(df, mu=40, test_type='two_sides', correction=True)
Result:
>>> res.collect()
   STAT_NAME            STAT_VALUE
0  statistic                 158.5
1    p-value  0.011228240845317039
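
To make the numbers above easier to interpret, the following is a minimal pure-Python sketch of the textbook one-sample signed-rank test using the normal approximation. It is not the PAL implementation: the helper name signed_rank_test is hypothetical, and the choices made here (dropping zero differences, average ranks for ties, a tie-corrected variance, and a 0.5 continuity shift) are assumptions that happen to be consistent with the example result.

```python
import math


def signed_rank_test(values, mu=0.0, test_type="two_sides", correction=True):
    """Illustrative signed-rank test (normal approximation); hypothetical helper."""
    # Differences from the hypothesised location; zero differences are dropped.
    diffs = [v - mu for v in values if v != mu]
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations are equal to mu")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Test statistic: sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Mean and tie-corrected variance of the statistic under the null hypothesis.
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    sd = math.sqrt(var)

    def upper_tail(z):
        # P(Z >= z) for a standard normal variable.
        return 0.5 * math.erfc(z / math.sqrt(2.0))

    shift = 0.5 if correction else 0.0
    dev = w_plus - mean
    if test_type == "two_sides":
        p = min(1.0, 2.0 * upper_tail((abs(dev) - shift) / sd))
    elif test_type == "greater":
        p = upper_tail((dev - shift) / sd)
    elif test_type == "less":
        p = 1.0 - upper_tail((dev + shift) / sd)
    else:
        raise ValueError("test_type must be 'two_sides', 'less' or 'greater'")
    return w_plus, p


# Same data as the example above, tested against mu = 40.
data = [85, 65, 20, 56, 30, 46, 83, 33, 89, 72,
        51, 76, 68, 82, 27, 59, 69, 40, 64, 8]
stat, p_value = signed_rank_test(data, mu=40)
print(stat, p_value)
```

With this data the sketch yields a statistic of 158.5 and a p-value of roughly 0.0112, in line with the result table above. The observation equal to 40 contributes a zero difference and is excluded, so 19 values are ranked.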