hanaml.PCA is an R wrapper for SAP HANA PAL PCA.
hanaml.PCA( data = NULL, key = NULL, features = NULL, formula = NULL, scaling = NULL, thread.ratio = NULL, scores.output = NULL )
| data | |
|---|---|
| key | |
| features | |
| formula | |
| scaling | |
| thread.ratio | |
| scores.output | |
Returns a "PCA" object with the following values:
loadings : DataFrame
The weights by which each standardized original variable should be
multiplied when computing component scores.
loadings.stat : DataFrame
Loading statistics for each component.
scores : DataFrame
The transformed variable values corresponding to each data point.
Set to NULL if scores.output is FALSE.
scaling.stat : DataFrame
Mean and scale values of each variable.
model : list of DataFrame
The fitted model.
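The relationship between loadings, scaling.stat and scores can be sketched locally in base R. The numbers are copied from the example output on this page (first component, row with ID 1). The standardization formula (x - MEAN) / SCALE and the weighted-sum form of the score are assumptions drawn from the descriptions above, not a confirmed statement of PAL's internal convention, so the result may differ in sign or scale from pca$scores.

```r
# Values copied from the example output on this page.
loadings.comp1 <- c(X1 = 0.541547, X2 = 0.321424, X3 = 0.511941, X4 = 0.584235)
center    <- c(X1 = 17.000000, X2 = 53.636364, X3 = 23.000000, X4 = 48.454545)
scale.val <- c(X1 = 5.039841, X2 = 1.689540, X3 = 2.000000, X4 = 4.655398)

# Row with ID 1 from the example input.
x <- c(X1 = 12, X2 = 52, X3 = 20, X4 = 44)

# Assumed convention: standardize each variable, then weight by the loadings.
z <- (x - center) / scale.val
score.comp1 <- sum(z * loadings.comp1)
print(score.comp1)
```

All four standardized values are negative and all Comp1 loadings are positive, so under these assumptions the first component score for this row is negative.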
Principal component analysis (PCA) reduces the dimensionality of multivariate data using Singular Value Decomposition (SVD).
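For readers who want to see the SVD route in isolation, here is a minimal local sketch in base R. It does not call SAP HANA or PAL. It uses the built-in USArrests dataset purely as stand-in data, and the variable names are illustrative. PAL's conventions (for example the denominator used for the scale, or the sign of each component) may differ from what base R does.

```r
# Local illustration with base R only; no HANA connection is involved.
x <- scale(as.matrix(USArrests), center = TRUE, scale = TRUE)

# SVD of the standardized data matrix: x = U D V'
s <- svd(x)

# Standard deviation of each component, and the variance it explains.
sdev         <- s$d / sqrt(nrow(x) - 1)
var.prop     <- sdev^2 / sum(sdev^2)
cum.var.prop <- cumsum(var.prop)

# Loadings with one row per component, and the component scores.
loadings <- t(s$v)
dimnames(loadings) <- list(paste0("Comp", seq_along(sdev)), colnames(x))
scores <- x %*% s$v

# Cross-check against base R's prcomp, which also works via SVD.
ref <- prcomp(USArrests, center = TRUE, scale. = TRUE)
print(all.equal(sdev, ref$sdev))
```

Component signs are not unique in an SVD, so any comparison of loadings or scores between implementations should be made up to sign.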
Input DataFrame data:
> data$Head(4)$Collect()
  ID   X1   X2   X3   X4
1  1 12.0 52.0 20.0 44.0
2  2 12.0 57.0 25.0 45.0
3  3 12.0 54.0 21.0 45.0
4  4 13.0 52.0 21.0 46.0
Call the function:
> pca <- hanaml.PCA(data = data,
                    key = "ID",
                    scaling = TRUE,
                    thread.ratio = 0.5,
                    scores.output = TRUE)
Output:
> pca$loadings$Collect()
  COMPONENT_ID LOADINGS_X1 LOADINGS_X2 LOADINGS_X3 LOADINGS_X4
1        Comp1    0.541547    0.321424    0.511941    0.584235
2        Comp2   -0.454280    0.728287    0.395819   -0.326429
3        Comp3   -0.171426   -0.600095    0.760875   -0.177673
4        Comp4   -0.686273   -0.078552   -0.048095    0.721489

> pca$loadings.stat$Collect()
  COMPONENT_ID       SD VAR_PROP CUM_VAR_PROP
1        Comp1 1.566624 0.613577     0.613577
2        Comp2 1.100453 0.302749     0.916327
3        Comp3 0.536973 0.072085     0.988412
4        Comp4 0.215297 0.011588     1.000000

> pca$scaling.stat$Collect()
  VARIABLE_ID      MEAN    SCALE
1           1 17.000000 5.039841
2           2 53.636364 1.689540
3           3 23.000000 2.000000
4           4 48.454545 4.655398
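The columns of loadings.stat can be checked against each other with a few lines of base R. The sketch below assumes that VAR_PROP is each component's variance (SD squared) divided by the total variance, and that CUM_VAR_PROP is its running sum. This reading is consistent with the printed numbers but is not taken from the PAL documentation. The values are copied from the output above, and the variable names are illustrative.

```r
# Values copied from pca$loadings.stat$Collect() above.
sd.reported       <- c(1.566624, 1.100453, 0.536973, 0.215297)
var.prop.reported <- c(0.613577, 0.302749, 0.072085, 0.011588)
cum.prop.reported <- c(0.613577, 0.916327, 0.988412, 1.000000)

# Assumed relationship: proportion of variance = SD^2 / sum(SD^2).
var.prop <- sd.reported^2 / sum(sd.reported^2)
cum.prop <- cumsum(var.prop)

# Largest absolute deviation from the reported columns (rounding only).
print(max(abs(var.prop - var.prop.reported)))
print(max(abs(cum.prop - cum.prop.reported)))

# Loadings copied from pca$loadings$Collect() above, one row per component.
loadings <- matrix(c( 0.541547,  0.321424,  0.511941,  0.584235,
                     -0.454280,  0.728287,  0.395819, -0.326429,
                     -0.171426, -0.600095,  0.760875, -0.177673,
                     -0.686273, -0.078552, -0.048095,  0.721489),
                   nrow = 4, byrow = TRUE)

# Each row should have squared length close to 1.
row.norm.sq <- rowSums(loadings^2)
print(row.norm.sq)
```

With scaling = TRUE and four variables, the squared SD values sum to approximately 4, which is what a total variance of one per standardized variable would give.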