median_test_1samp
- hana_ml.algorithms.pal.stats.median_test_1samp(data, col=None, mu=None, test_type=None, confidence_interval=None, thread_ratio=None)
Performs a one-sample non-parametric test to check whether the median of the data differs from a user-specified value.
- Parameters:
- data : DataFrame
DataFrame containing the data.
- col : str, optional
Name of the data column that needs to be tested.
If not given, it defaults to the first column.
- mu : float, optional
The hypothesized median against which the data is tested.
Defaults to 0.
- test_type : {'two_sides', 'less', 'greater'}, optional
Specifies the type of the alternative hypothesis.
Defaults to 'two_sides'.
- confidence_interval : float, optional
Confidence level of the interval computed for the estimated median.
Defaults to 0.95.
- thread_ratio : float, optional
Adjusts the percentage of available threads to use, from 0 to 1. A value of 0 means a single thread is used, while 1 means all currently available threads are used. Values outside this range are ignored, and the function determines the number of threads heuristically.
Defaults to 0.
- Returns:
- DataFrame
Test results, structured as follows:
STAT_NAME column: name of the statistic.
STAT_VALUE column: value of the statistic.
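Because the result is a two-column name/value table, it is often convenient to turn it into a dictionary keyed by statistic name. The sketch below assumes the rows are already available as (STAT_NAME, STAT_VALUE) pairs; `stats_to_dict` and `alpha` are hypothetical names introduced here for illustration, and the three rows are the ones visible in the example output of this section.

```python
def stats_to_dict(rows):
    """Map STAT_NAME -> STAT_VALUE for an iterable of (name, value) pairs."""
    return {str(name): float(value) for name, value in rows}


# Rows copied from the example output in this section (elided rows omitted).
rows = [
    ("total number", 20.0),
    ("CI for estimated median, upper bound", 83.0),
    ("sign test p-value", 0.066457),
]

stats = stats_to_dict(rows)

alpha = 0.05  # hypothetical significance level chosen by the user
reject_null = stats["sign test p-value"] < alpha
print(reject_null)
```

If the collected result is a pandas DataFrame, the pairs can presumably be obtained with `res.collect().itertuples(index=False)`.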
Examples
Original data:
>>> df.collect()
     X
0   85
1   65
...
18  64
19   8
Perform the one-sample median test:
>>> from hana_ml.algorithms.pal.stats import median_test_1samp
>>> res = median_test_1samp(data=df, mu=40, test_type='two_sides')
Result:
>>> res.collect()
                              STAT_NAME  STAT_VALUE
0                          total number   20.000000
...
5  CI for estimated median, upper bound   83.000000
6                     sign test p-value    0.066457
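The result reports a sign test p-value. To illustrate what such a test computes, here is a minimal pure-Python sketch of an exact one-sample sign test. It is not the PAL implementation: the helper name `sign_test_1samp`, the handling of ties (observations equal to `mu` are dropped), and the exact binomial computation are assumptions made for this example, so its p-values need not match the values returned by `median_test_1samp`.

```python
from math import comb


def sign_test_1samp(values, mu=0.0, test_type="two_sides"):
    """Exact one-sample sign test for the median (illustrative sketch)."""
    # Observations equal to mu carry no sign information and are dropped
    # (an assumption of this sketch).
    n_pos = sum(1 for v in values if v > mu)
    n_neg = sum(1 for v in values if v < mu)
    n = n_pos + n_neg
    if n == 0:
        raise ValueError("all observations equal mu; the test is undefined")

    def binom_cdf(k):
        # P(X <= k) for X ~ Binomial(n, 0.5); returns 0.0 when k < 0.
        return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

    if test_type == "greater":      # alternative: median > mu
        p_value = 1.0 - binom_cdf(n_pos - 1)
    elif test_type == "less":       # alternative: median < mu
        p_value = binom_cdf(n_pos)
    elif test_type == "two_sides":  # alternative: median != mu
        p_value = min(1.0, 2.0 * binom_cdf(min(n_pos, n_neg)))
    else:
        raise ValueError(f"unknown test_type: {test_type!r}")

    return {"n_used": n, "n_pos": n_pos, "n_neg": n_neg, "p_value": p_value}


# Toy input for illustration only (not the data set used above).
toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = sign_test_1samp(toy, mu=2.5, test_type="two_sides")
print(result)
```

Under the null hypothesis each observation falls above `mu` with probability 0.5, so the count of positive signs follows a Binomial(n, 0.5) distribution. The three `test_type` values select which tail of that distribution is used.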