hanaml.DiscriminantAnalysis {hana.ml.r}    R Documentation

Linear Discriminant Analysis

Description

hanaml.DiscriminantAnalysis is an R wrapper for the Linear Discriminant Analysis algorithm of the SAP HANA Predictive Analysis Library (PAL).

Usage

hanaml.DiscriminantAnalysis (conn.context,
                             data = NULL,
                             key = NULL,
                             features = NULL,
                             label = NULL,
                             regularization.type = NULL,
                             regularization.amount = NULL,
                             projection = NULL)

Arguments

conn.context

ConnectionContext
Connection to the SAP HANA system.

data

DataFrame
DataFrame containing the data.

key

character, optional
Name of the ID column.
Defaults to the first column.

features

character or list of characters, optional
Names of the feature columns.
If not provided, defaults to all non-ID, non-label columns.

label

character, optional
Name of the column in data that specifies the dependent variable.
Defaults to the last column.

regularization.type

character, optional
The strategy for handling ill-conditioning or rank-deficiency of the empirical covariance matrix.

  • 'mixing': uses regularized covariance estimate.

  • 'diag': uses diagonal covariance estimate.

  • 'pseudo': uses pseudo inverse covariance estimate.

Defaults to 'mixing'.

regularization.amount

float, optional
The convex mixing weight assigned to the diagonal matrix obtained from the diagonal of the empirical covariance matrix.
Valid range for this parameter is (0,1).
Valid only when regularization.type is 'mixing'.
Defaults to the smallest number in (0,1) that makes the regularized empirical covariance matrix invertible.
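
The three regularization.type strategies can be illustrated locally in base R. This is a minimal sketch, not the PAL implementation: it assumes 'mixing' means (1 - alpha) * S + alpha * diag(S), as the description of regularization.amount suggests, and it uses R's built-in iris data instead of a HANA DataFrame. The helper names mix_cov, diag_cov and pseudo_inv are hypothetical.

```r
# Empirical covariance of the four numeric columns of R's built-in iris data.
S <- cov(iris[, 1:4])

# 'mixing': convex combination of S and its diagonal part.
# alpha plays the role of regularization.amount (assumed formula).
mix_cov <- function(S, alpha) {
  stopifnot(alpha > 0, alpha < 1)
  (1 - alpha) * S + alpha * diag(diag(S))
}

# 'diag': keep only the variances, drop all covariances.
diag_cov <- function(S) diag(diag(S))

# 'pseudo': Moore-Penrose pseudo-inverse computed from the SVD,
# discarding singular values that are numerically zero.
pseudo_inv <- function(S, tol = sqrt(.Machine$double.eps)) {
  sv <- svd(S)
  keep <- sv$d > tol * max(sv$d)
  sv$v[, keep, drop = FALSE] %*% (t(sv$u[, keep, drop = FALSE]) / sv$d[keep])
}

# A duplicated feature makes the covariance matrix rank-deficient.
X_dup <- cbind(iris[, 1:4], DUP = iris[, 1])
S_sing <- cov(X_dup)

# The mixed estimate is invertible, so solve() succeeds.
S_inv <- solve(mix_cov(S_sing, 0.5))
```

Mixing leaves the variances on the diagonal unchanged and shrinks only the off-diagonal covariances, which is why any weight in (0,1) restores invertibility in this example.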

projection

logical, optional
Whether or not to compute the projection model.
Defaults to TRUE.

Format

R6Class object.

Details

Linear discriminant analysis (LDA) for classification and for dimensionality reduction by projecting the features onto the discriminant directions.
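
For readers who want to see the classification side of the method without a HANA connection, the following is a textbook LDA sketch in base R on the built-in iris data. It is an assumption-laden illustration: it uses the unregularized pooled covariance and class-frequency priors, so its coefficients will not match the values PAL returns, and all variable names are hypothetical.

```r
X <- as.matrix(iris[, 1:4])
y <- iris$Species
classes <- levels(y)
n <- nrow(X)
K <- length(classes)

# Class means (K x p) and class-frequency priors.
mu <- t(sapply(classes, function(k) colMeans(X[y == k, , drop = FALSE])))
prior <- as.numeric(table(y)) / n

# Pooled within-class covariance matrix.
W <- Reduce(`+`, lapply(classes, function(k) {
  Xk <- X[y == k, , drop = FALSE]
  crossprod(sweep(Xk, 2, colMeans(Xk)))
})) / (n - K)

# Linear score for class k:
#   x' W^-1 mu_k - 0.5 * mu_k' W^-1 mu_k + log(prior_k)
coef_mat <- mu %*% solve(W)                               # K x p coefficients
intercept <- -0.5 * rowSums(coef_mat * mu) + log(prior)   # one per class

# Score every observation against every class and pick the best class.
scores <- X %*% t(coef_mat) + matrix(intercept, n, K, byrow = TRUE)
pred <- classes[max.col(scores, ties.method = "first")]

# Share of training rows assigned to their true class.
accuracy <- mean(pred == y)
```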

Value

An R6 object of class 'DiscriminantAnalysis'. The example below reads its coef and proj.model members.

See Also

predict.DiscriminantAnalysis

transform.DiscriminantAnalysis

Examples

## Not run: 
  The training DataFrame data:
 > data

   ID   X1   X2   X3   X4            CLASS
   0   5.1  3.5  1.4  0.2      Iris-setosa
   1   4.9  3.0  1.4  0.2      Iris-setosa
   2   4.7  3.2  1.3  0.2      Iris-setosa
   3   4.6  3.1  1.5  0.2      Iris-setosa
   4   5.0  3.6  1.4  0.2      Iris-setosa
   5   5.4  3.9  1.7  0.4      Iris-setosa
   ......
   24  6.5  3.0  5.8  2.2   Iris-virginica
   25  7.6  3.0  6.6  2.1   Iris-virginica
   26  4.9  2.5  4.5  1.7   Iris-virginica
   27  7.3  2.9  6.3  1.8   Iris-virginica
   28  6.7  2.5  5.8  1.8   Iris-virginica
   29  7.2  3.6  6.1  2.5   Iris-virginica

   Set up a 'DiscriminantAnalysis' object lda:

  > lda <- hanaml.DiscriminantAnalysis(conn.context,
                                       data,
                                       key = 'ID',
                                       label = 'CLASS',
                                       regularization.type = "mixing",
                                       regularization.amount = 0.5,
                                       projection = TRUE)

  Expected output:

  > lda$coef$Collect()
                CLASS   COEFF_X1   COEFF_X2   COEFF_X3   COEFF_X4   INTERCEPT
   0      Iris-setosa  23.907391  51.754001 -34.641902 -49.063407 -113.235478
   1  Iris-versicolor   0.511034  15.652078  15.209568  -4.861018  -53.898190
   2   Iris-virginica -14.729636   4.981955  42.511486  12.315007  -94.143564

  > lda$proj.model$Collect()
                NAME        X1        X2        X3        X4
   0  DISCRIMINANT_1  1.907978  2.399516 -3.846154 -3.112216
   1  DISCRIMINANT_2  3.046794 -4.575496 -2.757271  2.633037
   2    OVERALL_MEAN  5.843333  3.040000  3.863333  1.213333


## End(Not run)
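
The coefficient table in the example can be read as one linear score function per class. The sketch below copies the printed values into a local data.frame and scores the first training row (ID 0). It assumes the usual LDA convention that the predicted class is the one with the largest score; it does not call predict.DiscriminantAnalysis, and the names coef_tbl, x and scores are hypothetical.

```r
# Coefficients copied from the lda$coef output shown in the example.
coef_tbl <- data.frame(
  CLASS     = c("Iris-setosa", "Iris-versicolor", "Iris-virginica"),
  COEFF_X1  = c(23.907391, 0.511034, -14.729636),
  COEFF_X2  = c(51.754001, 15.652078, 4.981955),
  COEFF_X3  = c(-34.641902, 15.209568, 42.511486),
  COEFF_X4  = c(-49.063407, -4.861018, 12.315007),
  INTERCEPT = c(-113.235478, -53.898190, -94.143564),
  stringsAsFactors = FALSE
)

# Feature values of the row with ID 0 in the training data.
x <- c(X1 = 5.1, X2 = 3.5, X3 = 1.4, X4 = 0.2)

# Score per class: sum(COEFF_Xi * Xi) + INTERCEPT.
scores <- as.matrix(coef_tbl[, 2:5]) %*% x + coef_tbl$INTERCEPT

# Assumed decision rule: the class with the largest score wins.
predicted <- coef_tbl$CLASS[which.max(scores)]
```

Under that assumption the row is assigned to Iris-setosa, which matches its label in the training data.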


[Package hana.ml.r version 1.0.8 Index]