hanaml.ConditionIndex.Rd
hanaml.ConditionIndex is an R wrapper for SAP HANA PAL Condition Index.
hanaml.ConditionIndex(
data,
key,
features = NULL,
scaling = NULL,
intercept = NULL,
thread.ratio = NULL
)
DataFrame
DataFrame containing the data.
character
Name of the ID column.
character or list of characters, optional
Names of the feature columns.
If not provided, it defaults to all non-key, non-label columns of data.
logical, optional
Specifies whether or not to scale the input data to have unit variance
before the analysis.
Defaults to TRUE.
logical, optional
Specifies whether or not to include the intercept term in the calculation.
Defaults to TRUE.
double, optional
Controls the proportion of available threads that can be used by this
function.
The value range is from 0 to 1, where 0 indicates a single thread,
and 1 indicates all available threads.
Values between 0 and 1 will use up to
that percentage of available threads. Values outside this
range are ignored.
Defaults to 0.
Returns a list of two DataFrames:
DataFrame 1
Condition index results, structured as follows:
COMPONENT_ID, principal component ID.
EIGENVALUE, eigenvalue.
CONDITION_INDEX, condition index.
FEATURES, variance decomposition proportion for each variable (one column per feature).
INTERCEPT, variance decomposition proportion for the intercept term.
DataFrame 2
This table is empty if no collinearity problem has been detected.
Distinct values results, structured as follows:
STAT_NAME: name of the value, including the condition number
and the names of the variables involved in the collinearity problem.
STAT_VALUE: value corresponding to STAT_NAME.
The condition index is used to detect collinearity among the independent variables that are later used as predictors in a multiple linear regression model.
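For reference, the condition index of a component is conventionally defined from the eigenvalues of the scaled cross-product matrix of the predictors. The symbols below (\lambda_k for the eigenvalue of component k, \eta_k for its condition index, p for the number of components) are introduced here for illustration only and are not taken from the PAL documentation:

```latex
\eta_k = \sqrt{\frac{\lambda_{\max}}{\lambda_k}}, \qquad
\lambda_{\max} = \max_{1 \le j \le p} \lambda_j , \qquad k = 1, \dots, p
```

Under this definition the component with the largest eigenvalue always has a condition index of 1, and the largest condition index is usually called the condition number.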
Input DataFrame data:
> data$Collect()
ID X1 X2 X3 X4
1 1 12 52 20 44
2 2 12 57 25 45
3 3 12 54 21 45
4 4 13 52 21 46
5 5 14 54 24 46
Call ConditionIndex function:
> ci <- hanaml.ConditionIndex(data, key="ID", thread.ratio=0.1)
Output:
> ci[[1]]$Collect()
COMPONENT_ID EIGENVALUE CONDITION_INDEX X1 X2
1 Comp_1 1.996669e+01 1.00000 1.185761e-05 1.556872e-06
2 Comp_2 2.073585e-02 31.03074 8.776374e-03 2.098206e-04
3 Comp_3 1.226013e-02 40.35575 5.347198e-02 2.570866e-03
4 Comp_4 2.295285e-04 294.94070 2.056656e-01 1.522431e-02
5 Comp_5 8.639595e-05 480.73565 7.320742e-01 9.819934e-01
X3 X4 INTERCEPT
1 9.911148e-06 3.175778e-06 2.173805e-06
2 3.106275e-02 1.251087e-03 9.070816e-04
3 5.314573e-03 6.389341e-04 2.710487e-03
4 6.578588e-03 9.311208e-01 2.468621e-01
5 9.570342e-01 6.698598e-02 7.495182e-01
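The output above can be checked and screened locally in base R. The sketch below is only an illustration: it copies the printed values into a plain data.frame (named res here, a hypothetical name) instead of collecting them from HANA, recomputes the condition index as the square root of the largest eigenvalue divided by each eigenvalue, and then applies a common rule of thumb (condition index above 30 together with at least two variance decomposition proportions above 0.5). Both thresholds are assumptions chosen for illustration, not values documented for PAL, and this screening is not claimed to reproduce the second returned DataFrame.

```r
# Values copied from the example output above (ci[[1]]$Collect()).
res <- data.frame(
  COMPONENT_ID    = paste0("Comp_", 1:5),
  EIGENVALUE      = c(1.996669e+01, 2.073585e-02, 1.226013e-02,
                      2.295285e-04, 8.639595e-05),
  CONDITION_INDEX = c(1.00000, 31.03074, 40.35575, 294.94070, 480.73565),
  X1        = c(1.185761e-05, 8.776374e-03, 5.347198e-02, 2.056656e-01, 7.320742e-01),
  X2        = c(1.556872e-06, 2.098206e-04, 2.570866e-03, 1.522431e-02, 9.819934e-01),
  X3        = c(9.911148e-06, 3.106275e-02, 5.314573e-03, 6.578588e-03, 9.570342e-01),
  X4        = c(3.175778e-06, 1.251087e-03, 6.389341e-04, 9.311208e-01, 6.698598e-02),
  INTERCEPT = c(2.173805e-06, 9.070816e-04, 2.710487e-03, 2.468621e-01, 7.495182e-01),
  stringsAsFactors = FALSE
)

# Recompute the condition index from the eigenvalues:
# sqrt(largest eigenvalue / eigenvalue of the component).
recomputed <- sqrt(max(res$EIGENVALUE) / res$EIGENVALUE)

# Largest relative difference between the recomputed and the reported values;
# it should be tiny, limited only by the rounding of the printed output.
max_rel_diff <- max(abs(recomputed - res$CONDITION_INDEX) / res$CONDITION_INDEX)

# Rule-of-thumb screening (assumed thresholds, not PAL settings):
# a component is suspicious if its condition index exceeds 30 and at least
# two variance decomposition proportions exceed 0.5.
vars    <- c("X1", "X2", "X3", "X4", "INTERCEPT")
high_ci <- res$CONDITION_INDEX > 30
n_large <- rowSums(res[, vars] > 0.5)
flagged <- res$COMPONENT_ID[high_ci & n_large >= 2]

print(max_rel_diff)
print(flagged)
```

With these thresholds only Comp_5 is flagged: it has the largest condition index and carries large variance proportions for X1, X2, X3 and the intercept, which suggests that these terms are involved in the same near-linear dependency.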