ttest_paired
- hana_ml.algorithms.pal.stats.ttest_paired(data, col1=None, col2=None, mu=0, test_type='two_sides', conf_level=0.95)
Performs the t-test for the mean difference of two sets of paired samples.
- Parameters:
- data : DataFrame
DataFrame containing the data.
- col1 : str, optional
Name of the column for sample1.
If not given, defaults to the first column.
- col2 : str, optional
Name of the column for sample2.
If not given, defaults to the first non-col1 column.
- mu : float, optional
Hypothesized difference between the two underlying population means.
Defaults to 0.
- test_type : {'two_sides', 'less', 'greater'}, optional
The alternative hypothesis type.
Defaults to 'two_sides'.
- conf_level : float, optional
Confidence level for the confidence interval of the alternative hypothesis.
Defaults to 0.95.
- Returns:
- DataFrame
Statistics results, returned as name/value pairs in the columns STAT_NAME and STAT_VALUE.
Examples
Original data:
>>> df.collect()
    X1    X2
0  1.0  10.0
1  2.0  12.0
2  4.0  11.0
3  7.0  15.0
4  3.0  10.0
Perform the paired-sample t-test:
>>> ttest_paired(data=df).collect()
                STAT_NAME  STAT_VALUE
0                 t-value  -14.062884
1       degree of freedom    4.000000
2                 p-value    0.000148
3  _PAL_MEAN_DIFFERENCES_   -8.200000
4        confidence level    0.950000
5              lowerLimit   -9.818932
6              upperLimit   -6.581068
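To make the reported statistics easier to interpret, the following sketch recomputes the t-value, degrees of freedom, and mean difference for the example data using only the Python standard library. It uses the textbook paired t-test formula and is not PAL's implementation. The helper name `paired_t_statistic` is hypothetical, and the sign convention (sample1 minus sample2) is an assumption inferred from the example output above. The p-value and confidence limits are omitted because they require the t-distribution, which the standard library does not provide.

```python
import math
import statistics


def paired_t_statistic(sample1, sample2, mu=0.0):
    """Textbook paired t-statistic (illustrative sketch, not PAL's code)."""
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    # Assumed convention: differences are sample1 - sample2.
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation, n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    t_value = (mean_diff - mu) / std_err
    return {
        "t-value": t_value,
        "degree of freedom": n - 1,
        "mean difference": mean_diff,
    }


# Same values as the X1 and X2 columns in the example.
x1 = [1.0, 2.0, 4.0, 7.0, 3.0]
x2 = [10.0, 12.0, 11.0, 15.0, 10.0]
result = paired_t_statistic(x1, x2)
print(result)
```

For this data the sketch reproduces the t-value, degree of freedom, and _PAL_MEAN_DIFFERENCES_ rows of the output above to the displayed precision.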