QuantileTransform
- class hana_ml.algorithms.pal.preprocessing.QuantileTransform(num_quantiles=None, output_distribution=None)
Python wrapper for PAL Quantile Transformer.
- Parameters:
- num_quantiles : int, optional
Specifies the number of quantiles to be computed.
Defaults to 100.
- output_distribution : {'uniform', 'normal'}, optional
Specifies the marginal distribution of the quantile-transformed data.
'uniform': Uniform distribution.
'normal': Normal distribution.
Defaults to 'uniform'.
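To illustrate what the two values of output_distribution mean, the following is a minimal pure-Python sketch of a rank-based quantile transformation. It is not PAL's implementation: the function name quantile_transform, the mid-rank handling of ties, and the clipping constant are assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def quantile_transform(values, output_distribution="uniform"):
    """Map values to their empirical CDF positions, or to normal scores."""
    ordered = sorted(values)
    n = len(ordered)
    transformed = []
    for v in values:
        # Mid-rank of v among the sorted values, scaled to [0, 1];
        # tied values share the average of their positions.
        first = bisect_left(ordered, v)
        last = bisect_right(ordered, v) - 1
        u = (first + last) / (2 * (n - 1)) if n > 1 else 0.5
        if output_distribution == "normal":
            # Clip away from 0 and 1, where the inverse normal CDF is infinite
            # (the clipping constant is an arbitrary choice for this sketch).
            u = min(max(u, 1e-7), 1 - 1e-7)
            transformed.append(NormalDist().inv_cdf(u))
        else:
            transformed.append(u)
    return transformed


skewed = [1.0, 2.0, 4.0, 8.0, 1000.0]
uniform_out = quantile_transform(skewed, "uniform")  # [0.0, 0.25, 0.5, 0.75, 1.0]
normal_out = quantile_transform(skewed, "normal")    # symmetric around 0
```

In both cases the outlier 1000.0 ends up no further from its neighbour than any other value, because only the rank of each value matters, not its magnitude.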
Examples
>>> from hana_ml.algorithms.pal.preprocessing import QuantileTransform
>>> qt = QuantileTransform(num_quantiles=200, output_distribution='uniform')
>>> qt.fit(data=df, key='ID', features=['X2', 'X6'], categorical_variable='X5')
>>> qt.result_.collect()
- Attributes:
- result_ : DataFrame
Training data with selected features quantile-transformed.
- model_ : list of DataFrames
The model for transforming subsequent data, consisting of two DataFrames:
DataFrame 1: Quantiles for the output distribution.
DataFrame 2: Other model info for the Quantile Transformer.
Methods
fit(data[, key, features, categorical_variable]): Apply quantile transformation to numerical features.
fit_transform(data[, key, features, ...]): Fit a Quantile Transformer, transform the training data at the same time, and return the result.
get_model_metrics(): Get the model metrics.
get_score_metrics(): Get the score metrics.
transform(data[, key]): Transform the test data using a fitted Quantile Transformer.
- fit(data, key=None, features=None, categorical_variable=None)
Apply quantile transformation to numerical features.
- Parameters:
- data : DataFrame
Input data for fitting a quantile-transformation model (Quantile Transformer).
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- features : str or list of str, optional
Specifies the names of the columns in data to which quantile transformation should be applied. Categorical columns in features are ignored, since only numerical columns can be quantile-transformed.
Defaults to all numerical columns in data (except key).
- categorical_variable : str or list of str, optional
Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.
No default value.
- Returns:
- A fitted object of class "QuantileTransform".
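The interaction between key, features and categorical_variable described above can be sketched in plain Python. This is only an illustration of the documented rules, not code from hana_ml: the helper columns_to_transform, the dictionary of column types, and the reduced set of type labels ('INTEGER', 'DOUBLE', 'NVARCHAR') are all hypothetical.

```python
def columns_to_transform(col_types, key, features=None, categorical_variable=None):
    """Return the columns that would be quantile-transformed.

    col_types maps column name -> simplified SQL type label (hypothetical).
    """
    # Both arguments accept either a single name or a list of names.
    if isinstance(features, str):
        features = [features]
    if isinstance(categorical_variable, str):
        categorical_variable = [categorical_variable]
    categorical = set(categorical_variable or [])

    # Without explicit features, every column except the key is a candidate.
    candidates = features if features is not None else [c for c in col_types if c != key]

    selected = []
    for col in candidates:
        col_type = col_types[col]
        is_numeric = col_type in ("INTEGER", "DOUBLE")
        # INTEGER columns flagged as categorical are skipped, as are
        # non-numeric columns such as strings.
        flagged = col_type == "INTEGER" and col in categorical
        if is_numeric and not flagged:
            selected.append(col)
    return selected


col_types = {"ID": "INTEGER", "X2": "DOUBLE", "X5": "INTEGER",
             "X6": "INTEGER", "X7": "NVARCHAR"}
default_cols = columns_to_transform(col_types, key="ID", categorical_variable="X5")
explicit_cols = columns_to_transform(col_types, key="ID",
                                     features=["X2", "X5", "X7"],
                                     categorical_variable=["X5"])
```

Here default_cols is ['X2', 'X6'] and explicit_cols is ['X2']: X5 is dropped because it is declared categorical, and X7 because it is not numerical.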
- fit_transform(data, key=None, features=None, categorical_variable=None)
Fit a Quantile Transformer, transform the training data at the same time, and return the result.
- Parameters:
- data : DataFrame
Input data for fitting a quantile-transformation model (Quantile Transformer).
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- features : str or list of str, optional
Specifies the names of the columns in data to which quantile transformation should be applied. Categorical columns in features are ignored, since only numerical columns can be quantile-transformed.
Defaults to all numerical columns in data (except key).
- categorical_variable : str or list of str, optional
Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.
No default value.
- Returns:
- DataFrame
The data with selected features being quantile-transformed.
- transform(data, key=None)
Transform the test data using a fitted Quantile Transformer.
- Parameters:
- data : DataFrame
Input data for applying a trained quantile-transformation model (Quantile Transformer).
Should be structured the same as the data used in the model training phase.
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- Returns:
- DataFrame
The data with the selected (numerical) features quantile-transformed.
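The split between fitting and transforming can be pictured as follows: fitting stores a fixed number of quantile landmarks from the training data, and transforming locates new values between those landmarks. The sketch below is an assumption-laden illustration in pure Python, not PAL's algorithm. The names fit_quantiles and apply_quantiles are hypothetical, only the uniform output is shown, and clamping out-of-range values to 0 or 1 is a choice made for this sketch.

```python
from bisect import bisect_right


def fit_quantiles(values, num_quantiles=100):
    """Return num_quantiles landmarks at evenly spaced probabilities (>= 2)."""
    ordered = sorted(values)
    n = len(ordered)
    landmarks = []
    for i in range(num_quantiles):
        # Fractional position of the i-th probability in the sorted data.
        pos = i / (num_quantiles - 1) * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        # Linear interpolation between the two neighbouring observations.
        landmarks.append(ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo]))
    return landmarks


def apply_quantiles(x, landmarks):
    """Map x to [0, 1] by interpolating between the fitted landmarks."""
    k = len(landmarks)
    # Values outside the training range are clamped (sketch assumption).
    if x <= landmarks[0]:
        return 0.0
    if x >= landmarks[-1]:
        return 1.0
    # Find j such that landmarks[j] <= x < landmarks[j + 1].
    j = bisect_right(landmarks, x) - 1
    frac = (x - landmarks[j]) / (landmarks[j + 1] - landmarks[j])
    return (j + frac) / (k - 1)


train = [10.0, 20.0, 30.0, 40.0, 50.0]
landmarks = fit_quantiles(train, num_quantiles=5)
new_values = [5.0, 25.0, 50.0, 99.0]
scores = [apply_quantiles(x, landmarks) for x in new_values]
# scores == [0.0, 0.375, 1.0, 1.0]
```

The landmarks play the role of the fitted model: new data is transformed from them alone, without access to the original training values.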
- get_model_metrics()
Get the model metrics.
- Returns:
- DataFrame
The model metrics.
- get_score_metrics()
Get the score metrics.
- Returns:
- DataFrame
The score metrics.
Inherited Methods from PALBase
Besides the methods mentioned above, the QuantileTransform class also inherits methods from the PALBase class; please refer to PAL Base for more details.