# ExponentialRegression

Exponential regression models the relationship between a scalar dependent variable y and one or more explanatory variables X. The data is modeled with exponential functions, and the unknown model parameters are estimated from the data. Such models are called exponential models.
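To make the idea concrete, here is a minimal pure-Python sketch. It assumes the common functional form y = b0 * exp(b1*x1 + ... + bm*xm) and fits it by ordinary least squares on ln(y). Both the functional form and the fitting strategy are assumptions made for illustration; the helper names `solve_linear_system`, `fit_exponential` and `predict_exponential` are hypothetical and are not part of this class or of PAL.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_exponential(xs, ys):
    """Fit y = b0 * exp(b1*x1 + ... + bm*xm) by least squares on ln(y)."""
    if any(y <= 0 for y in ys):
        raise ValueError("log-linear fitting needs strictly positive y")
    design = [[1.0] + list(row) for row in xs]  # leading 1 models ln(b0)
    log_y = [math.log(y) for y in ys]
    k = len(design[0])
    # Normal equations: (A^T A) w = A^T ln(y)
    ata = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(design, log_y)) for i in range(k)]
    w = solve_linear_system(ata, aty)
    return math.exp(w[0]), w[1:]


def predict_exponential(b0, betas, row):
    """Evaluate the fitted model for one row of feature values."""
    return b0 * math.exp(sum(b * x for b, x in zip(betas, row)))


# Synthetic check: data generated from y = 2 * exp(0.5*x1 - x2).
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [2.0 * math.exp(0.5 * a - 1.0 * b) for a, b in xs]
b0, betas = fit_exponential(xs, ys)
print(b0, betas)  # close to 2.0 and [0.5, -1.0], up to rounding
```

The sketch solves the normal equations directly for brevity. A QR or SVD factorization of the design matrix is generally more robust numerically, and this is presumably the reason for exposing a choice of factorization.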

Parameters
decomposition : {'LU', 'QR', 'SVD', 'Cholesky'}, optional

Matrix factorization type to use. Case-insensitive.

• 'LU': LU decomposition.

• 'QR': QR decomposition.

• 'SVD': singular value decomposition.

• 'Cholesky': Cholesky (LDLT) decomposition.

Defaults to QR decomposition.

adjusted_r2 : bool, optional

If True, include the adjusted R2 value in the statistics table.

Defaults to False.

pmml_export : {'no', 'single-row', 'multi-row'}, optional

Controls whether to output a PMML representation of the model, and how to format the PMML. Case-insensitive.

• 'no' or not provided: No PMML model.

• 'single-row': Exports a PMML model in a maximum of one row. Fails if the model doesn't fit in one row.

• 'multi-row': Exports a PMML model, splitting it across multiple rows if it doesn't fit in one.

Prediction does not require a PMML model.

thread_ratio : float, optional

Controls the proportion of available threads to use for fitting.

The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads.

Values between 0 and 1 will use that percentage of available threads.

Values outside this range tell PAL to heuristically determine the number of threads to use.

Defaults to 0.
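The thread-ratio rules above can be sketched as a small helper. This is only an illustration: the name `threads_for_ratio`, the round-up behaviour and the use of `None` to mean "let the engine decide" are assumptions, and PAL's actual heuristics are not described here.

```python
import math
import os


def threads_for_ratio(thread_ratio, available=None):
    """Map a thread ratio to a thread count; None means a heuristic choice."""
    if available is None:
        available = os.cpu_count() or 1
    if not 0.0 <= thread_ratio <= 1.0:
        return None  # out of range: the engine picks the count itself
    # Assumption: round up, and never drop below a single thread.
    return max(1, min(available, math.ceil(thread_ratio * available)))
```

With 8 available threads, a ratio of 0 maps to one thread, 0.5 to four, and 1 to all eight under these assumptions.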

Examples

```
>>> df.collect()
   ID     Y    X1    X2
0   0  0.50  0.13  0.33
1   1  0.15  0.14  0.34
2   2  0.25  0.15  0.36
3   3  0.35  0.16  0.35
4   4  0.45  0.17  0.37
```

Training the model:

```
>>> er = ExponentialRegression(pmml_export='multi-row')
>>> er.fit(data=df, key='ID')
```

Prediction:

```
>>> df2.collect()
   ID   X1    X2
0   0  0.5  0.30
1   1  4.0  0.40
2   2  0.0  1.60
3   3  0.3  0.45
4   4  0.4  1.70
```
```
>>> er.predict(data=df2, key='ID').collect()
   ID                      VALUE
0   0         0.6900598931338715
1   1         1.2341502316656843
2   2       0.006630664136180741
3   3         0.3887970208571841
4   4      0.0052106543571450266
```
Attributes
coefficients_ : DataFrame

Fitted regression coefficients.

pmml_ : DataFrame

PMML model. Set to None if no PMML model was requested.

fitted_ : DataFrame

Predicted dependent variable values for training data. Set to None if the training data has no row IDs.

statistics_ : DataFrame

Regression-related statistics, such as mean squared error.

Methods

• `fit`(data[, key, features, label]): Fit regression model based on training data.

• `predict`(data[, key, features, model_format, ...]): Predict dependent variable values based on fitted model.

• `score`(data[, key, features, label]): Returns the coefficient of determination R2 of the prediction.

fit(data, key=None, features=None, label=None)

Fit regression model based on training data.

Parameters

data : DataFrame

Training data.

key : str, optional

Name of the ID column.

If `key` is not provided, then:

• if `data` is indexed by a single column, then `key` defaults to that index column;

• otherwise, it is assumed that `data` contains no ID column.

features : list of str, optional

Names of the feature columns.

label : str, optional

Name of the dependent variable.

Defaults to the last non-ID column (this is not the PAL default).

Returns
Fitted object.
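The defaulting rules for `key` and `label` can be sketched as follows. The helper `resolve_columns` and its `index` argument (standing in for the single index column of `data`, if any) are hypothetical, and the default for `features` (all remaining non-ID, non-label columns) is an assumption that this page does not state.

```python
def resolve_columns(columns, key=None, features=None, label=None, index=None):
    """Resolve key, features and label from a list of column names."""
    if key is None:
        key = index  # may remain None: data then has no ID column
    non_id = [c for c in columns if c != key]
    if label is None:
        label = non_id[-1]  # last non-ID column
    if features is None:
        # Assumption: every remaining non-ID, non-label column is a feature.
        features = [c for c in non_id if c != label]
    return key, features, label


print(resolve_columns(["ID", "Y", "X1", "X2"], key="ID"))
print(resolve_columns(["ID", "Y", "X1", "X2"], key="ID", label="Y"))
```

Under these rules, a table with columns ID, Y, X1, X2 and `key='ID'` resolves to `X2` as the label, so passing `label='Y'` explicitly is the safer choice when Y is the intended target.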
predict(data, key=None, features=None, model_format=None, thread_ratio=0.0)

Predict dependent variable values based on fitted model.

Parameters

data : DataFrame

Independent variable values used for prediction.

key : str, optional

Name of the ID column.

Mandatory if `data` is not indexed, or the index of `data` contains multiple columns.

Defaults to the single index column of `data` if not provided.

features : list of str, optional

Names of the feature columns.

model_format : int or str, optional (deprecated)

• 0 or 'coefficient': use the coefficient table as the model for prediction.

• 1 or 'pmml': use the PMML table as the model for prediction.

Defaults to 'coefficient'.

Deprecated; this parameter no longer has any effect.

thread_ratio : float, optional

Controls the proportion of available threads to use for prediction.

The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads.

Values between 0 and 1 will use that percentage of available threads.

Values outside this range tell PAL to heuristically determine the number of threads to use.

Returns
DataFrame

Predicted values, structured as follows:

• ID column, with the same name and type as `data`'s ID column.

• VALUE, type DOUBLE, representing predicted values.

Note

predict() will pass the `pmml_` table to PAL as the model representation if there is a `pmml_` table, or the `coefficients_` table otherwise.

score(data, key=None, features=None, label=None)

Returns the coefficient of determination R2 of the prediction.

Parameters

data : DataFrame

Data on which to assess model performance.

key : str, optional

Name of the ID column.

Mandatory if `data` is not indexed, or the index of `data` contains multiple columns.

Defaults to the single index column of `data` if not provided.

features : list of str, optional

Names of the feature columns.

label : str, optional

Name of the dependent variable.

Defaults to the last non-ID column (this is not the PAL default).

Returns
float

The coefficient of determination R2 of the prediction on the given data.
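For reference, the textbook definitions of R2 and adjusted R2 can be written out as below. Whether PAL computes them in exactly this way (for example, on the original scale of the label rather than on its logarithm) is an assumption; the function names are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n_samples, n_features):
    """Penalize R2 for the number of features (intercept not counted)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

A perfect prediction gives an R2 of 1, while always predicting the mean of the observed values gives 0. The adjusted value never exceeds the plain R2 when at least one feature is used.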

property fit_hdbprocedure

Returns the generated hdbprocedure for fit.

property predict_hdbprocedure

Returns the generated hdbprocedure for predict.

## Inherited Methods from PALBase

In addition to the methods described above, the ExponentialRegression class inherits methods from the PALBase class. Refer to PAL Base for more details.