ttest_ind
- hana_ml.algorithms.pal.stats.ttest_ind(data, col1=None, col2=None, mu=0, test_type='two_sides', var_equal=False, conf_level=0.95)
Performs the T-test for the mean difference of two independent samples.
- Parameters:
- data : DataFrame
DataFrame containing the data.
- col1 : str, optional
Name of the column for sample1.
If not given, it defaults to the first column.
- col2 : str, optional
Name of the column for sample2.
If not given, it defaults to the first non-col1 column.
- mu : float, optional
Hypothesized difference between the two underlying population means.
Defaults to 0.
- test_type : {'two_sides', 'less', 'greater'}, optional
The alternative hypothesis type.
Defaults to 'two_sides'.
- var_equal : bool, optional
Controls whether to assume that the two samples have equal variance.
Defaults to False.
- conf_level : float, optional
Confidence level for the confidence interval of the alternative hypothesis.
Defaults to 0.95.
- Returns:
- DataFrame
Statistics results.
Examples
Original data:
>>> df.collect()
    X1    X2
0  1.0  10.0
1  2.0  12.0
2  4.0  11.0
3  7.0  15.0
4  NaN  10.0
Perform Independent Sample T-Test:
>>> ttest_ind(data=df).collect()
           STAT_NAME  STAT_VALUE
0            t-value   -5.013774
1  degree of freedom    5.649757
...
7         upperLimit   -4.086722
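The t-value and degree of freedom shown above can be cross-checked locally without a HANA connection. The following is a minimal sketch, not the PAL implementation: it assumes the textbook two-sample formulas (Welch's test when var_equal is False, the pooled-variance test when var_equal is True) and assumes that missing values are dropped per column. The helper name two_sample_t is hypothetical and is not part of hana_ml.

```python
import math
from statistics import mean, variance


def two_sample_t(sample1, sample2, mu=0.0, var_equal=False):
    """Return (t-value, degrees of freedom) using textbook formulas.

    Hypothetical helper for illustration only; not part of hana_ml.
    """
    # Assumption: missing values (None/NaN) are ignored per column.
    x = [v for v in sample1 if v is not None and not math.isnan(v)]
    y = [v for v in sample2 if v is not None and not math.isnan(v)]
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x), variance(y)  # sample variances (n - 1 denominator)

    if var_equal:
        # Pooled-variance (Student) test.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        dof = n1 + n2 - 2
    else:
        # Welch test with Welch-Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        dof = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    # mu is the hypothesized difference between the population means.
    t = (mean(x) - mean(y) - mu) / se
    return t, dof


# Same values as the X1 and X2 columns in the example data.
x1 = [1.0, 2.0, 4.0, 7.0, float("nan")]
x2 = [10.0, 12.0, 11.0, 15.0, 10.0]

t_value, dof = two_sample_t(x1, x2)
print(round(t_value, 6), round(dof, 6))
```

With the default var_equal=False, the two printed values should agree closely with the t-value and degree of freedom rows in the output above, which suggests the default corresponds to Welch's test applied to the non-missing values.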