hanaml.FactorAnalysis is an R wrapper for SAP HANA PAL Factor Analysis.
hanaml.FactorAnalysis(
  data,
  key,
  factor.num,
  cols = NULL,
  method = NULL,
  rotation = NULL,
  score = NULL,
  matrix = NULL,
  kappa = NULL
)
| data | |
|---|---|
| key | |
| factor.num | |
| cols | |
| method | |
| rotation | Defaults to 'varimax'. |
| score | Defaults to 'regression'. |
| matrix | Defaults to 'correlation'. |
| kappa | |
Returns a list of DataFrames:
DataFrame 1
Eigenvalues, structured as follows:
FACTOR_ID: factor id
EIGENVALUE: Eigenvalue (i.e. variance explained)
VAR_PROP: Variance proportion to the total variance explained.
CUM_VAR_PROP: Cumulative variance proportion to the total variance explained.
DataFrame 2
Variance explanation, structured as follows:
FACTOR_ID: factor id
VAR: Variance explained without rotation
VAR_PROP: Variance proportion to the total variance explained without rotation.
CUM_VAR_PROP: Cumulative variance proportion to the total variance explained without rotation.
ROT_VAR: Variance explained with rotation
ROT_VAR_PROP: Variance proportion to the total variance explained with rotation. Note that there is no rotated variance proportion when performing oblique rotation since the rotated factors are correlated.
ROT_CUM_VAR_PROP: Cumulative variance proportion to the total variance explained with rotation.
DataFrame 3
NAME: Variable name.
OBSERVED_VARS: Communalities of the observed variables.
DataFrame 4
FACTOR_ID: Factor id
LOADINGS_+OBSERVED_VARS: Loadings.
DataFrame 5
FACTOR_ID: Factor id
ROT_LOADINGS_+OBSERVED_VARS: Rotated loadings.
DataFrame 6
FACTOR_ID: Factor id
STRUCTURE+OBSERVED_VARS: Structure matrix. It is empty when rotation is not oblique.
DataFrame 7
ROTATION: rotation
ROTATION_ + i (i runs from 1 to the number of columns in OBSERVED_VARS (in input table)): Rotation matrix
DataFrame 8
FACTOR_ID: Factor id
FACTOR_ + i (i runs from 1 to the number of columns in OBSERVED_VARS (in input table)): Factor correlation matrix. It is empty when rotation is not oblique.
DataFrame 9
NAME: Factor id, MEAN, SD
OBSERVED_VARS (in input table) column name: Score coefficients, means and standard deviations of observed variables.
DataFrame 10
FACTOR_ID: Factor id
FACTOR_ + i (i runs from 1 to the number of columns in OBSERVED_VARS (in input table)): Scores
DataFrame 11
Placeholder for future features:
STAT_NAME: statistic name.
STAT_VALUE: statistic value.
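To make the relationship between the rotated loadings, rotation matrix, factor correlation matrix and structure matrix listed above more concrete, here is a minimal base-R sketch. It uses `stats::promax` on a made-up loading matrix; the values and the names `unrotated`, `pattern`, `phi` and `structure_mat` are assumptions for illustration, and PAL's own rotation may differ in normalization details.

```r
# Made-up unrotated loadings for six variables on two factors (illustrative only).
unrotated <- matrix(
  c(0.8, 0.8, 0.7,  0.6,  0.6,  0.5,   # factor 1
    0.3, 0.4, 0.3, -0.5, -0.6, -0.5),  # factor 2
  ncol = 2,
  dimnames = list(paste0("X", 1:6), c("F1", "F2"))
)

# Oblique promax rotation from the stats package; m is the promax power.
pm <- promax(unrotated, m = 4)

pattern       <- unclass(pm$loadings)      # rotated loadings (pattern matrix)
rotmat        <- pm$rotmat                 # pattern equals unrotated %*% rotmat
phi           <- solve(crossprod(rotmat))  # factor correlation matrix
structure_mat <- pattern %*% phi           # variable-factor correlations

print(round(phi, 3))
print(round(structure_mat, 3))
```

With an orthogonal rotation such as varimax, `phi` would be the identity and the structure matrix would coincide with the rotated loadings, which is consistent with those outputs being empty when rotation is not oblique.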
Factor Analysis is a statistical method used to extract a small number of latent factors that best describe the correlations among a large set of observed variables.
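As a rough local illustration of the idea, the base-R sketch below extracts two factors from the correlation matrix of a toy data set using the principal component method. The data and the names `obs`, `k`, `loadings` and `communalities` are assumptions made up for this example; it is not PAL's implementation and does not require a HANA connection.

```r
# Toy data: six observed variables driven by two hidden factors (made-up values).
set.seed(1)
n  <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
obs <- data.frame(
  X1 = f1 + rnorm(n, sd = 0.5),
  X2 = f1 + rnorm(n, sd = 0.5),
  X3 = f1 + rnorm(n, sd = 0.5),
  X4 = f2 + rnorm(n, sd = 0.5),
  X5 = f2 + rnorm(n, sd = 0.5),
  X6 = f2 + rnorm(n, sd = 0.5)
)

k <- 2                 # number of factors to keep
e <- eigen(cor(obs))   # eigen-decomposition of the correlation matrix

# Unrotated loadings: first k eigenvectors scaled by sqrt(eigenvalue).
loadings <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(loadings) <- names(obs)

# Communality of each variable: share of its variance captured by the k factors.
communalities <- rowSums(loadings^2)

# Variance explained per factor equals the retained eigenvalues.
var_explained <- colSums(loadings^2)

print(round(loadings, 3))
print(round(communalities, 3))
```

Because the correlation matrix has ones on its diagonal, the eigenvalues sum to the number of observed variables, so each communality lies between 0 and 1.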
Input DataFrame data:
> data$Head(6)$Collect()
  ID X1 X2 X3 X4 X5 X6
1  1  1  1  3  3  1  1
2  2  1  2  3  3  1  1
3  3  1  1  3  4  1  1
4  4  1  1  3  3  1  2
Call the function:
> fa <- hanaml.FactorAnalysis(data = data,
                              factor.num = 2,
                              method = "pcm",
                              rotation = "promax",
                              score = "regression",
                              matrix = "correlation",
                              kappa = 4)
Output:
> fa[[1]]$Collect()
  FACTOR_ID EIGENVALUE    VAR_PROP CUM_VAR_PROP
1  FACTOR_1 3.69603077 0.616005129    0.6160051
2  FACTOR_2 1.07311448 0.178852413    0.7948575
3  FACTOR_3 1.00077409 0.166795682    0.9616532
4  FACTOR_4 0.16100348 0.026833913    0.9884871
5  FACTOR_5 0.04096116 0.006826860    0.9953140
6  FACTOR_6 0.02811601 0.004686002    1.0000000
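The VAR_PROP and CUM_VAR_PROP columns in this output can be cross-checked locally. The base-R sketch below copies the printed eigenvalues by hand (so it needs no HANA connection) and assumes that VAR_PROP is each eigenvalue divided by the sum of all eigenvalues, an assumption the printed numbers bear out.

```r
# Eigenvalues copied from the fa[[1]] output above.
eigenvalue <- c(3.69603077, 1.07311448, 1.00077409,
                0.16100348, 0.04096116, 0.02811601)

var_prop     <- eigenvalue / sum(eigenvalue)  # share of the total variance
cum_var_prop <- cumsum(var_prop)              # running total of the shares

print(round(data.frame(eigenvalue, var_prop, cum_var_prop), 7))
```

The eigenvalues sum to roughly 6, the number of observed variables, as expected when the analysis is run on a correlation matrix.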