hanaml.ChisqIndependence
Performs the chi-squared test of independence to determine whether two variables are independent of each other.
hanaml.ChisqIndependence(data, key, observed.data = NULL, correction = NULL)
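| Argument | Description |
|---|---|
| data | Input DataFrame. |
| key | Name of the ID column in data. |
| observed.data | Names of the columns in data that hold the observed counts. Defaults to NULL; the example below omits it and gets expected counts for every non-ID column. |
| correction | Defaults to NULL. |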
Returns a list of two DataFrames:
DataFrame 1
The expected count table, structured as follows:
ID column, with same name and type as data's ID column.
Expected count columns, named by prepending Expected_ to each observed.data column name, type DOUBLE (the example output below shows the prefix in upper case, EXPECTED_). There is one expected count column per observed.data column.
DataFrame 2
Statistical outputs including the calculated chi-squared value,
degrees of freedom and p-value, structured as follows:
STAT_NAME: type NVARCHAR(100), name of the statistic
STAT_VALUE: type DOUBLE, value of the statistic
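For orientation, the quantities in both DataFrames can be related to the textbook definitions of the chi-squared test of independence. The notation below is an assumption introduced for illustration and does not come from this page: O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, N is the grand total, and r and c are the numbers of rows and columns. The sketch also assumes no continuity correction is applied.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\mathrm{df} = (r - 1)(c - 1)
```

With the example data further down, the male row total is 73, the X1 column total is 66 and the grand total is 158, so the expected count for male and X1 is 73 × 66 / 158 ≈ 30.4937, which agrees with the EXPECTED_X1 value shown in the example output.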
Input DataFrame data:
> data$Collect()
ID X1 X2 X3 X4
1 male 25 23.0 11 14.0
2 female 41 20.0 18 6.0
Call the function:
> result <- hanaml.ChisqIndependence(data, key="ID")
Expected output:
> result[[1]]$Collect()
ID EXPECTED_X1 EXPECTED_X2 EXPECTED_X3 EXPECTED_X4
1 male 30.493671 19.867089 13.398734 9.240506
2 female 35.506329 23.132911 15.601266 10.759494
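The expected counts above can be cross-checked locally without a HANA connection. The following is a minimal sketch that uses base R's chisq.test() on the same observed counts; it is only a sanity check under the assumption that the function follows the textbook definition, and the variable names (observed, local_test, expected) are illustrative rather than part of the hana.ml.r API.

```r
# Observed counts from the example input (rows: male, female; columns: X1..X4).
observed <- matrix(
  c(25, 23, 11, 14,
    41, 20, 18,  6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("male", "female"), c("X1", "X2", "X3", "X4"))
)

# Base R's chisq.test() serves here only as a local cross-check;
# it is not part of hanaml.ChisqIndependence.
local_test <- chisq.test(observed, correct = FALSE)

# Expected counts under independence: (row total * column total) / grand total.
expected <- local_test$expected
print(round(expected, 6))

# Textbook statistics, for comparison with the second returned DataFrame.
print(local_test$statistic)  # chi-squared value
print(local_test$parameter)  # degrees of freedom, (2 - 1) * (4 - 1) = 3
print(local_test$p.value)    # p-value
```

The rounded expected matrix should match the EXPECTED_X1 to EXPECTED_X4 columns of result[[1]]. The three printed statistics correspond to the chi-squared value, degrees of freedom and p-value that result[[2]] is described as containing.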