QuantileTransform
- class hana_ml.algorithms.pal.preprocessing.QuantileTransform(num_quantiles=None, output_distribution=None)
Python wrapper for PAL Quantile Transformer.
- Parameters:
- num_quantiles : int, optional
Specifies the number of quantiles to be computed.
Defaults to 100.
- output_distribution : {'uniform', 'normal'}, optional
Specifies the marginal distribution of the quantile-transformed data.
'uniform': Uniform distribution
'normal': Normal distribution
Defaults to 'uniform'.
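To make the two parameters concrete, the following is a minimal pure-Python sketch of the general quantile-transformation idea. It is not the PAL implementation: the helper names fit_quantiles and transform_value are hypothetical, and the linear interpolation and the clipping constant used for the 'normal' case are assumptions chosen for illustration.

```python
from bisect import bisect_left
from statistics import NormalDist


def fit_quantiles(values, num_quantiles=100):
    """Return `num_quantiles` evenly spaced empirical quantiles of `values`."""
    if num_quantiles < 2:
        raise ValueError("num_quantiles must be at least 2")
    xs = sorted(values)
    n = len(xs)
    quantiles = []
    for i in range(num_quantiles):
        # Fractional position of the i-th reference probability in the sorted sample.
        pos = i / (num_quantiles - 1) * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Linear interpolation between neighbouring order statistics (an assumption).
        quantiles.append(xs[lo] + frac * (xs[hi] - xs[lo]))
    return quantiles


def transform_value(x, quantiles, output_distribution="uniform", eps=1e-7):
    """Map `x` to the chosen output distribution using fitted quantiles."""
    k = len(quantiles)
    if x <= quantiles[0]:
        p = 0.0
    elif x >= quantiles[-1]:
        p = 1.0
    else:
        # First index j with quantiles[j] >= x; quantiles[j - 1] < x is guaranteed here.
        j = bisect_left(quantiles, x)
        left, right = quantiles[j - 1], quantiles[j]
        p = (j - 1 + (x - left) / (right - left)) / (k - 1)
    if output_distribution == "uniform":
        return p
    if output_distribution == "normal":
        # Clip away from 0 and 1 so the inverse normal CDF stays finite.
        p = min(max(p, eps), 1.0 - eps)
        return NormalDist().inv_cdf(p)
    raise ValueError("output_distribution must be 'uniform' or 'normal'")


quantiles = fit_quantiles([1, 2, 3, 4, 5], num_quantiles=5)
median_uniform = transform_value(3, quantiles, "uniform")
median_normal = transform_value(3, quantiles, "normal")
```

In this sketch a larger num_quantiles gives a finer approximation of the empirical distribution, while output_distribution only changes the final mapping from the estimated probability to the output value.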
Examples
>>> qt = QuantileTransform(num_quantiles=200, output_distribution='uniform')
>>> qt.fit(data=df, key='ID', features=['X2', 'X6'], categorical_variable='X5')
>>> qt.result_.collect()
- Attributes:
- result_ : DataFrame
Training data with selected features quantile-transformed.
- model_ : list of DataFrames
The model for transforming subsequent data, consisting of 2 DataFrames:
DataFrame 1: Quantiles for the output distribution.
DataFrame 2: Other model info for the Quantile Transformer.
Methods
fit(data[, key, features, categorical_variable]): Applies quantile transformation to numerical features.
fit_transform(data[, key, features, ...]): Fits a Quantile Transformer, transforms the training data at the same time, and returns the result.
transform(data[, key]): Transforms the test data using a fitted QuantileTransformer.
- fit(data, key=None, features=None, categorical_variable=None)
Applies quantile transformation to numerical features.
- Parameters:
- data : DataFrame
Input data for fitting a quantile-transformation model (Quantile Transformer).
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- features : str or list of str, optional
Specifies the names of the columns in data to which quantile transformation should be applied. Categorical columns in features are ignored, since only numerical columns can be quantile-transformed.
Defaults to all numerical columns in data (except key).
- categorical_variable : str or list of str, optional
Specifies which INTEGER columns should be treated as categorical; all other INTEGER columns are treated as continuous.
No default value.
- Returns:
- A fitted object of class "QuantileTransform".
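The interaction between key, features, and categorical_variable described above can be sketched as a small selection rule. This is an illustrative assumption, not the library's code: the helper effective_features is hypothetical, it takes a plain dict of column names to SQL type names rather than a hana_ml DataFrame, and the type groupings are guesses made for the example.

```python
# Assumed type groupings, for illustration only; the actual rules in PAL may differ.
INTEGER_TYPES = {"TINYINT", "SMALLINT", "INT", "INTEGER", "BIGINT"}
FLOAT_TYPES = {"DOUBLE", "REAL", "DECIMAL", "FLOAT"}


def effective_features(column_types, key, features=None, categorical_variable=None):
    """Return the columns that would be quantile-transformed under the documented rules."""
    if isinstance(features, str):
        features = [features]
    if isinstance(categorical_variable, str):
        categorical_variable = [categorical_variable]
    categorical = set(categorical_variable or [])

    # Default: consider every column except the key.
    if features is None:
        candidates = [col for col in column_types if col != key]
    else:
        candidates = list(features)

    selected = []
    for col in candidates:
        sql_type = column_types[col].upper()
        is_integer = sql_type in INTEGER_TYPES
        is_numeric = is_integer or sql_type in FLOAT_TYPES
        if not is_numeric:
            continue  # non-numerical columns are ignored
        if is_integer and col in categorical:
            continue  # INTEGER column explicitly marked as categorical
        selected.append(col)
    return selected


# Hypothetical schema loosely mirroring the column names in the example above.
column_types = {
    "ID": "INTEGER",
    "X2": "DOUBLE",
    "X5": "INTEGER",
    "X6": "INTEGER",
    "X7": "NVARCHAR",
}
default_selection = effective_features(column_types, key="ID", categorical_variable="X5")
```

With this hypothetical schema, marking X5 as categorical removes it from the default selection, while X6 remains because unlisted INTEGER columns are treated as continuous.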
- fit_transform(data, key=None, features=None, categorical_variable=None)
Fits a Quantile Transformer, transforms the training data at the same time, and returns the result.
- Parameters:
- data : DataFrame
Input data for fitting a quantile-transformation model (Quantile Transformer).
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- features : str or list of str, optional
Specifies the names of the columns in data to which quantile transformation should be applied. Categorical columns in features are ignored, since only numerical columns can be quantile-transformed.
Defaults to all numerical columns in data (except key).
- categorical_variable : str or list of str, optional
Specifies which INTEGER columns should be treated as categorical; all other INTEGER columns are treated as continuous.
No default value.
- Returns:
- DataFrame
The data with selected features being quantile-transformed.
- transform(data, key=None)
Transform the test data using a fitted QuantileTransformer.
- Parameters:
- data : DataFrame
Input data for applying a trained quantile-transformation model (Quantile Transformer).
Should be structured the same as the data used in the model training phase.
- key : str, optional
Specifies the name of the ID column in data.
Mandatory if data is not indexed by a single column; otherwise defaults to the index column of data.
- Returns:
- DataFrame
Quantile-transformed data with respect to the selected (numerical) features.
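The fit and transform methods documented above are typically used together: fit on training data, then transform held-out data with the same model. A minimal sketch of that workflow follows. The helper fit_then_transform is hypothetical and only calls fit and transform with the signatures given in this page; the connection details and table names in the commented usage are placeholders, not values from this documentation.

```python
def fit_then_transform(qt, train_df, test_df, key="ID", features=None):
    """Fit `qt` on `train_df`, then apply the fitted model to `test_df`.

    `qt` is expected to be a QuantileTransform instance (or any object
    exposing compatible `fit` and `transform` methods).
    """
    qt.fit(data=train_df, key=key, features=features)
    return qt.transform(data=test_df, key=key)


# Hypothetical usage against a live SAP HANA system (not executed here):
#
#   from hana_ml.dataframe import ConnectionContext
#   from hana_ml.algorithms.pal.preprocessing import QuantileTransform
#
#   conn = ConnectionContext(address="<host>", port=<port>,
#                            user="<user>", password="<password>")
#   train_df = conn.table("TRAIN_TBL")  # placeholder table names
#   test_df = conn.table("TEST_TBL")
#   qt = QuantileTransform(num_quantiles=200, output_distribution="normal")
#   transformed = fit_then_transform(qt, train_df, test_df,
#                                    key="ID", features=["X2", "X6"])
#   print(transformed.collect())
```

Passing the transformer in as an argument keeps the helper independent of a database connection, so the call sequence can be checked with a stand-in object before running it against real data.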
Inherited Methods from PALBase
Besides the methods mentioned above, the QuantileTransform class also inherits methods from the PALBase class. Please refer to PAL Base for more details.