hanaml.ChisqGoF
Performs the chi-squared goodness-of-fit (GoF) test to tell whether an observed distribution differs from an expected distribution.
hanaml.ChisqGoF(data, key, observed.data = NULL, expected.freq = NULL)
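| Argument | Description |
|---|---|
| data | Input DataFrame. |
| key | Name of the ID column in data. |
| observed.data | Name of the column in data that holds the observed counts. Defaults to NULL. |
| expected.freq | Name of the column in data that holds the expected frequencies (the P column in the example below). Defaults to NULL. |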
Returns a list of two DataFrames:
Comparison between the actual counts and the expected counts: DataFrame structured as follows:
ID column, with the same name and type as data's ID column.
Observed data column, with the same name as data's observed.data column, but always of type DOUBLE.
EXPECTED column, type DOUBLE, the expected count in each category.
RESIDUAL column, type DOUBLE, the difference between the observed count and the expected count (observed minus expected).
Statistical outputs: DataFrame including the calculated chi-squared value, degrees of freedom and p-value, structured as follows:
STAT_NAME: type NVARCHAR(100), name of the statistic.
STAT_VALUE: type DOUBLE, value of the statistic.
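The two outputs described above can be mimicked locally in base R, which may help when checking what each column means. This is only a sketch of the textbook Pearson goodness-of-fit computation, not the implementation behind hanaml.ChisqGoF. The function name chisq_gof_sketch, its arguments, and the STAT_NAME labels are hypothetical, and it assumes the expected frequencies are proportions that sum to 1.

```r
# Local base-R sketch of the two documented outputs; NOT the HANA implementation.
# chisq_gof_sketch and the STAT_NAME labels below are hypothetical names.
# Assumption: `freq` holds expected proportions that sum to 1.
chisq_gof_sketch <- function(id, observed, freq) {
  observed <- as.numeric(observed)        # observed column is reported as DOUBLE
  expected <- freq * sum(observed)        # expected count in each category
  residual <- observed - expected         # observed minus expected
  stat <- sum(residual^2 / expected)      # Pearson chi-squared statistic
  df <- length(observed) - 1              # number of categories minus one
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  list(
    data.frame(ID = id, OBSERVED = observed,
               EXPECTED = expected, RESIDUAL = residual),
    data.frame(STAT_NAME = c("chi-squared value", "degrees of freedom", "p-value"),
               STAT_VALUE = c(stat, df, p_value))
  )
}

# Toy data (made up for illustration): 60 die rolls against a uniform expectation.
res <- chisq_gof_sketch(id = 1:6,
                        observed = c(8, 12, 9, 11, 10, 10),
                        freq = rep(1 / 6, 6))
res[[1]]  # comparison of observed and expected counts
res[[2]]  # test statistics as name/value pairs
```

Returning the statistics as name/value rows mirrors the STAT_NAME / STAT_VALUE layout, so a single statistic can be picked out by filtering on STAT_NAME.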
Input DataFrame data:
> data$Collect()
ID OBSERVED P
1 0 519.0 0.3
2 1 364.0 0.2
3 2 363.0 0.2
4 3 200.0 0.1
5 4 212.0 0.1
6 5 193.0 0.1
Run the chi-squared goodness-of-fit test, leaving observed.data and expected.freq at their defaults:
> result <- hanaml.ChisqGoF(data, key = "ID")
Output (first DataFrame in the returned list):
> result[[1]]$Collect()
ID OBSERVED EXPECTED RESIDUAL
1 0 519.0 555.3 -36.3
2 1 364.0 370.2 -6.2
3 2 363.0 370.2 -7.2
4 3 200.0 185.1 14.9
5 4 212.0 185.1 26.9
6 5 193.0 185.1 7.9
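The EXPECTED and RESIDUAL columns above can be reproduced without a HANA connection. The sketch below copies the example values into local vectors (the names observed, p, expected, residual and fit are chosen here for illustration) and cross-checks the statistic with chisq.test from base R's stats package. It assumes P holds proportions summing to 1, and it is a local check, not a description of how hanaml.ChisqGoF computes its result.

```r
# Example values copied from the input DataFrame above, held in local vectors.
observed <- c(519, 364, 363, 200, 212, 193)
p <- c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1)   # assumed to be proportions summing to 1

expected <- p * sum(observed)     # 555.3 370.2 370.2 185.1 185.1 185.1
residual <- observed - expected   # -36.3 -6.2 -7.2 14.9 26.9 7.9

# Cross-check with base R: chisq.test() takes the counts and the proportions.
fit <- chisq.test(x = observed, p = p)
stat <- unname(fit$statistic)     # Pearson chi-squared statistic
df <- unname(fit$parameter)       # degrees of freedom (categories minus one)
p_value <- fit$p.value

# Same layout as the first output DataFrame, rounded to one decimal place.
round(data.frame(ID = 0:5, OBSERVED = observed,
                 EXPECTED = expected, RESIDUAL = residual), 1)
```

Note that fit$residuals from chisq.test holds Pearson residuals, (observed - expected) / sqrt(expected), which is why the raw difference is computed by hand to match the RESIDUAL column.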