hanaml.LogarithmicRegression is an R wrapper for the SAP HANA PAL bi-variate natural logarithmic regression algorithm.

hanaml.LogarithmicRegression(
  data = NULL,
  key = NULL,
  features = NULL,
  label = NULL,
  formula = NULL,
  decomposition = NULL,
  adjusted.r2 = NULL,
  pmml.export = NULL
)

Arguments

data

DataFrame
DataFrame containing the data.

key

character, optional
Name of the ID column. If not provided, the data is assumed to have no ID column.
No default value.

features

character, optional
Name of the feature column.
If not provided, it defaults to the first non-key, non-label column of data.

label

character, optional
Name of the column which specifies the dependent variable.
Defaults to the last column of data if not provided.

formula

formula type, optional
Formula to be used for model generation, in the form label ~ <feature_list>, e.g. formula = CATEGORY ~ V1 + V2 + V3.
Provide either a formula or a features and label combination, but not both.
Defaults to NULL.

decomposition

c("LU", "QR", "SVD", "Cholesky"), optional
Specifies the decomposition method (case-insensitive).

  • "LU": Doolittle decomposition.

  • "QR": QR decomposition.

  • "SVD": singular value decomposition.

  • "Cholesky": Cholesky decomposition.

Defaults to "QR".

adjusted.r2

logical, optional
If TRUE, include the adjusted R^2 value in the statistics table.
Defaults to FALSE.

pmml.export

c("no", "single-row", "multi-row"), optional
Controls whether to output a PMML representation of the model, and how to format the PMML.

  • "no": No PMML model.

  • "single-row": Exports a PMML model in a maximum of one row. Fails if the model doesn't fit in one row.

  • "multi-row": Exports a PMML model, splitting it across multiple rows if it doesn't fit in one.

Defaults to "no".

Value

Returns a "LogarithmicRegression" object with the following values:

  • coefficients: DataFrame
    Fitted regression coefficients.

  • pmml: DataFrame
    Regression model content in PMML format. Set to NULL if no PMML model was requested.

  • model: DataFrame
    Stored model content. If a PMML model was requested, this is the PMML model; otherwise, it is the coefficients.

  • fitted: DataFrame
    Predicted dependent variable values for training data. Set to NULL if the training data has no row IDs.

  • statistics: DataFrame
    Regression-related statistics, such as mean squared error and F-statistics.

Details

Bi-variate natural logarithmic regression models the relationship between a scalar dependent variable y and a single explanatory variable x as y = β0 + β1·ln(x). The unknown parameters β0 (intercept) and β1 (slope) are estimated from the data. Such models are called natural logarithmic models.
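
As a rough local cross-check of this model form, the sketch below fits Y = b0 + b1 * ln(X1) by ordinary least squares with base R's lm(), using a copy of the data from the Examples section. It does not connect to SAP HANA or call PAL. The name local_df is introduced here purely for illustration, and agreement with PAL's result is expected only up to rounding.

```r
# Local copy of the example data (no HANA connection involved).
local_df <- data.frame(
  ID = 0:6,
  Y  = c(10, 80, 130, 160, 180, 190, 192),
  X1 = 1:7
)

# Ordinary least squares on the log-transformed feature:
# Y = b0 + b1 * log(X1), where log() is the natural logarithm.
fit <- lm(Y ~ log(X1), data = local_df)

# Intercept and slope; these should be close to the coefficients
# shown in the Examples section (about 14.86 and 98.29).
print(coef(fit))
```

Because the model is linear in ln(X1), it reduces to a simple linear regression on the transformed feature, which is why a plain lm() call is enough for this comparison.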

Examples

Input DataFrame data:


 > data$Collect()
   ID   Y X1
 1  0  10  1
 2  1  80  2
 3  2 130  3
 4  3 160  4
 5  4 180  5
 6  5 190  6
 7  6 192  7

Call the function:


> nlr <- hanaml.LogarithmicRegression(data = data,
                                      key = "ID",
                                      label = "Y",
                                      pmml.export = "multi-row")

Output:


 > nlr$coefficients$Collect()

       VARIABLE_NAME COEFFICIENT_VALUE
 1 __PAL_INTERCEPT__           14.8616
 2                X1           98.2936
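
To illustrate how these coefficients can be read, the following sketch evaluates the fitted curve Y = intercept + slope * ln(X1) in plain R. The helper predict_log is hypothetical and is not part of the hanaml API; it only applies the two values printed above.

```r
# Coefficients copied from the output above.
intercept <- 14.8616
slope     <- 98.2936

# Hypothetical helper (not a hanaml function): evaluate
# Y = intercept + slope * log(x) for positive x.
predict_log <- function(x) {
  stopifnot(all(x > 0))  # the natural logarithm requires x > 0
  intercept + slope * log(x)
}

# Fitted values at the smallest and largest X1 in the example data.
print(predict_log(c(1, 7)))
```

At X1 = 1 the logarithm is zero, so the fitted value equals the intercept.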