hanaml.FactorAnalysis.Rd
hanaml.FactorAnalysis is an R wrapper for SAP HANA PAL Factor Analysis.
hanaml.FactorAnalysis(
data,
key,
factor.num,
cols = NULL,
method = NULL,
rotation = NULL,
score = NULL,
matrix = NULL,
kappa = NULL
)
data: DataFrame
DataFrame containing the data.
key: character
Name of the ID column.
factor.num: integer
Number of factors used to explain the covariance structure of
the dataset. It should be chosen between 1
and the number of variables.
cols: list/vector of characters, optional
Names of the data columns to be analyzed.
If not provided, it defaults to all non-key columns of data.
method: {"pcm"}, optional
Specifies the method used for factor analysis.
Currently, PAL supports only the principal component method.
Defaults to "pcm".
rotation: {"none", "varimax", "promax"}, optional
Specifies the method used to rotate the loadings:
"none"
"varimax"
"promax"
Defaults to "varimax".
score: {"none", "regression"}, optional
Specifies the method used to compute factor scores:
"none"
"regression"
Defaults to "regression".
matrix: {"covariance", "correlation"}, optional
Specifies the matrix used to perform factor analysis:
"covariance"
Use the covariance matrix to perform factor analysis.
"correlation"
Use the correlation matrix to perform factor analysis.
Defaults to "correlation".
kappa: double, optional
Specifies the power of the promax rotation.
Only valid when rotation = "promax".
Defaults to 4.
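The rotation options can be tried out locally with base R before running the analysis in HANA. The following is a minimal sketch, not the PAL implementation: it uses varimax() and promax() from the stats package on loadings computed from the built-in mtcars dataset. The choice of columns and of two factors is arbitrary, and treating the m argument of promax() as the counterpart of kappa is an assumption.

```r
# Illustrative sketch with base R (stats package); not the PAL implementation.
# mtcars ships with R and is used here only as sample input.
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# Unrotated loadings from the principal component method with k = 2 factors.
e <- eigen(cor(X))
k <- 2
L <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(L) <- colnames(X)

# Orthogonal rotation.
vm <- varimax(L)

# Oblique rotation; m is the promax power (assumed analogous to kappa).
pm <- promax(L, m = 4)

# An orthogonal rotation leaves the communalities unchanged.
h2_unrotated <- rowSums(L^2)
h2_varimax <- rowSums(unclass(vm$loadings)^2)
print(all.equal(h2_unrotated, h2_varimax))
```

With promax the rotated factors are allowed to correlate, which is why the structure matrix and the factor correlation matrix are only populated for that rotation.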
Returns a list of DataFrames:
DataFrame 1
Eigenvalues, structured as follows:
FACTOR_ID: factor id.
EIGENVALUE: Eigenvalue (i.e. variance explained).
VAR_PROP: Variance proportion to the total variance explained.
CUM_VAR_PROP: Cumulative variance proportion to the total variance explained.
DataFrame 2
Variance explanation, structured as follows:
FACTOR_ID: factor id.
VAR: Variance explained without rotation.
VAR_PROP: Variance proportion to the total variance explained without rotation.
CUM_VAR_PROP: Cumulative variance proportion to the total variance explained without rotation.
ROT_VAR: Variance explained with rotation.
ROT_VAR_PROP: Variance proportion to the total variance explained with rotation. Note that there is no rotated variance proportion when performing oblique rotation, since the rotated factors are correlated.
ROT_CUM_VAR_PROP: Cumulative variance proportion to the total variance explained with rotation.
DataFrame 3
NAME: Variable name.
OBSERVED_VARS: Communalities of the observed variables.
DataFrame 4
FACTOR_ID: Factor id.
LOADINGS_ + OBSERVED_VARS: Loadings.
DataFrame 5
FACTOR_ID: Factor id.
ROT_LOADINGS_ + OBSERVED_VARS: Rotated loadings.
DataFrame 6
FACTOR_ID: Factor id.
STRUCTURE+OBSERVED_VARS: Structure matrix. It is empty when rotation is not oblique.
DataFrame 7
ROTATION: rotation
ROTATION_ + i (i ranges from 1 to the number of columns in OBSERVED_VARS (in input table)): Rotation matrix.
DataFrame 8
FACTOR_ID: Factor id
FACTOR_ + i (i ranges from 1 to the number of columns in OBSERVED_VARS (in input table)): Factor correlation matrix. It is empty when rotation is not oblique.
DataFrame 9
NAME: Factor id, MEAN, SD
OBSERVED_VARS (in input table) column name: Score coefficients, means and standard deviations of observed variables.
DataFrame 10
FACTOR_ID: Factor id.
FACTOR_ + i (i ranges from 1 to the number of columns in OBSERVED_VARS (in input table)): Scores.
DataFrame 11
Placeholder for future features:
STAT_NAME: statistic name.
STAT_VALUE: statistic value.
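To make the relationship between several of these outputs concrete, the sketch below derives communalities, per-factor variance, regression score coefficients and scores from one loadings matrix in base R. It is a textbook-style illustration on the built-in mtcars dataset, with an arbitrary choice of columns and of two factors. Whether PAL computes its tables exactly this way is an assumption, so treat the mapping to DataFrames 2, 3, 9 and 10 in the comments as a reading aid only.

```r
# Illustrative sketch in base R; not the PAL implementation.
X <- as.matrix(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])
Z <- scale(X)   # standardized observations (mean 0, sd 1)
R <- cor(X)     # correlation matrix

# Unrotated loadings for k = 2 factors (principal component method).
e <- eigen(R)
k <- 2
L <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(L) <- colnames(X)

# Communality of each observed variable: row sums of squared loadings
# (compare DataFrame 3).
communalities <- rowSums(L^2)

# Variance explained by each factor: column sums of squared loadings
# (compare VAR in DataFrame 2).
var_explained <- colSums(L^2)

# Regression method: score coefficients solve R %*% B = L, and scores are
# the standardized data times B (compare DataFrames 9 and 10).
score_coef <- solve(R, L)
scores <- Z %*% score_coef
```

For unrotated loadings from the principal component method, var_explained matches the first k eigenvalues and the scores have unit variance and are uncorrelated.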
Factor Analysis is a statistical method used to extract a small number of latent factors that best describe the correlations among a large set of observed variables.
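As a rough illustration of the principal component method with matrix = "correlation", the extraction step can be sketched in base R as an eigendecomposition of the correlation matrix, keeping the leading factor.num eigenpairs. This is a generic sketch on the built-in mtcars dataset, not the PAL code; the column selection, k = 2 and the FACTOR_ column labels are chosen here for illustration.

```r
# Illustrative sketch in base R; not the PAL implementation.
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# Eigendecomposition of the correlation matrix. For a symmetric matrix,
# eigen() returns real eigenvalues in decreasing order.
R <- cor(X)
e <- eigen(R)

# Keep the first k factors; each loading column is an eigenvector scaled
# by the square root of its eigenvalue.
k <- 2
loadings <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
dimnames(loadings) <- list(colnames(X), paste0("FACTOR_", 1:k))

# Eigenvalues of a correlation matrix sum to the number of variables,
# so each eigenvalue divided by that sum is a variance proportion.
total_variance <- sum(e$values)
print(loadings)
```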
Input DataFrame data:
> data$Head(6)$Collect()
ID X1 X2 X3 X4 X5 X6
1 1 1 1 3 3 1 1
2 2 1 2 3 3 1 1
3 3 1 1 3 4 1 1
4 4 1 1 3 3 1 2
Call the function:
> fa <- hanaml.FactorAnalysis(data = data,
                              key = "ID",
                              factor.num = 2,
                              method = "pcm",
                              rotation = "promax",
                              score = "regression",
                              matrix = "correlation",
                              kappa = 4)
Output:
> fa[[1]]$Collect()
FACTOR_ID EIGENVALUE VAR_PROP CUM_VAR_PROP
1 FACTOR_1 3.69603077 0.616005129 0.6160051
2 FACTOR_2 1.07311448 0.178852413 0.7948575
3 FACTOR_3 1.00077409 0.166795682 0.9616532
4 FACTOR_4 0.16100348 0.026833913 0.9884871
5 FACTOR_5 0.04096116 0.006826860 0.9953140
6 FACTOR_6 0.02811601 0.004686002 1.0000000
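The VAR_PROP and CUM_VAR_PROP columns follow directly from the EIGENVALUE column, which can be checked locally. The sketch below copies the eigenvalues printed above into a plain R vector (no HANA connection involved) and recomputes both columns; the variable names are chosen here for illustration.

```r
# Eigenvalues copied from the EIGENVALUE column of the output above.
eigenvalue <- c(3.69603077, 1.07311448, 1.00077409,
                0.16100348, 0.04096116, 0.02811601)

# Each proportion is the eigenvalue divided by the total variance.
# With matrix = "correlation" the total equals the number of variables (6).
var_prop <- eigenvalue / sum(eigenvalue)
cum_var_prop <- cumsum(var_prop)

check <- data.frame(
  FACTOR_ID = paste0("FACTOR_", seq_along(eigenvalue)),
  EIGENVALUE = eigenvalue,
  VAR_PROP = var_prop,
  CUM_VAR_PROP = cum_var_prop
)
print(check)
```

Reading the cumulative column, the two factors requested with factor.num = 2 account for roughly 79% of the total variance in this example.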