hanaml.FeatureNormalizer {hana.ml.r}    R Documentation

Feature Normalizer

Description

hanaml.FeatureNormalizer is an R wrapper for the PAL scale algorithm.

Usage

hanaml.FeatureNormalizer(conn.context, method = NULL, data = NULL,
                         features = NULL, key = NULL,
                         z.score.method = NULL, new.max = NULL,
                         new.min = NULL, thread.ratio = NULL,
                         division.by.zero.handler = NULL)

Arguments

conn.context

ConnectionContext
The connection to the SAP HANA system.

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column of data.

features

list of character, optional
Names of the feature columns. If features is not provided, it defaults to all non-ID, non-label columns.

method

{'min.max', 'z.score', 'decimal'}, optional
Invokes one of the following scaling methods:

  • 'min.max' - Min-max normalization.

  • 'z.score' - Z-score normalization.

  • 'decimal' - Decimal scaling normalization.
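
As a rough illustration of what the three methods compute, the following base R sketch applies the textbook formulas to a single numeric vector. It does not call PAL. The helper names (min_max, z_score, decimal_scale) are hypothetical, and details such as the use of the sample standard deviation are assumptions that may differ from the PAL implementation.

```r
# Base R sketch of the three scaling methods (illustrative, not the PAL code).
x <- c(6.0, 12.1, 13.5, 15.4, 10.2)  # the X1 values listed under Examples

# Min-max: map [min(x), max(x)] linearly onto [0, 1].
min_max <- function(x) (x - min(x)) / (max(x) - min(x))

# Z-score: center on the mean, divide by the standard deviation.
# Assumption: sd() is the sample standard deviation; PAL may differ.
z_score <- function(x) (x - mean(x)) / sd(x)

# Decimal scaling: divide by 10^j, where j is the smallest integer
# that brings every absolute value below 1.
decimal_scale <- function(x) {
  j <- floor(log10(max(abs(x)))) + 1
  x / 10^j
}

mm <- min_max(x)
zs <- z_score(x)
ds <- decimal_scale(x)
```

The results differ from the output under Examples, which is computed over the full input table rather than this five-value vector.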

z.score.method

{'mean.standard', 'mean.mean', 'median.median'}, optional
Selects the z-score variant. Only valid when method is 'z.score'.

  • 'mean.standard' - Mean-Standard deviation.

  • 'mean.mean' - Mean-Mean deviation.

  • 'median.median' - Median-Median absolute deviation.
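
The three variants differ in the center and the spread used for standardization. The base R sketch below shows one plausible reading: it assumes that the 'mean.mean' variant divides by the mean absolute deviation and that the 'median.median' variant divides by the unscaled median absolute deviation. The variable names are hypothetical and the code does not call PAL.

```r
# Base R sketch of the three z-score variants (illustrative, not the PAL code).
x <- c(9.0, 8.3, 15.3, 18.7, 19.8)  # the X2 values listed under Examples

# 'mean.standard': center on the mean, divide by the standard deviation.
# Assumption: sample standard deviation, as returned by sd().
z_mean_standard <- (x - mean(x)) / sd(x)

# 'mean.mean': center on the mean, divide by the mean absolute deviation.
z_mean_mean <- (x - mean(x)) / mean(abs(x - mean(x)))

# 'median.median': center on the median, divide by the median absolute
# deviation (no consistency constant, unlike the default of stats::mad()).
z_median_median <- (x - median(x)) / median(abs(x - median(x)))
```

The median-based variant is the least sensitive to outliers, since neither its center nor its spread is pulled by extreme values.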

new.max

double, optional
The new maximum value for min-max normalization. Only valid when method is 'min.max'.

new.min

double, optional
The new minimum value for min-max normalization. Only valid when method is 'min.max'.
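
To make the role of new.min and new.max concrete, the base R sketch below rescales a vector onto an arbitrary target range with the usual min-max formula. The helper name rescale is hypothetical and the code does not call PAL.

```r
# Min-max scaling onto an arbitrary target range (illustrative only).
rescale <- function(x, new.min, new.max) {
  # Map onto [0, 1] first, then stretch and shift onto [new.min, new.max].
  (x - min(x)) / (max(x) - min(x)) * (new.max - new.min) + new.min
}

x <- c(9.0, 8.3, 15.3, 18.7, 19.8)
unit_range <- rescale(x, new.min = 0, new.max = 1)   # smallest value -> 0, largest -> 1
symmetric  <- rescale(x, new.min = -1, new.max = 1)  # smallest value -> -1, largest -> 1
```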

thread.ratio

double, optional
Controls the proportion of available threads to use. The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads. Values between 0 and 1 will use that percentage of available threads.
Defaults to 0.

division.by.zero.handler

{'ignore', 'throw.error'}, optional
Specifies how to handle a division by zero.

  • 'ignore' - Ignores the column when a division by zero occurs.

  • 'throw.error' - Throws an error when a division by zero occurs.
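
A division by zero typically arises for a constant column, where the range (or the deviation) is zero. The base R sketch below mimics the two handler modes for min-max scaling. The helper min_max_safe is hypothetical, and treating "ignore" as "leave the column unchanged" is an assumption about the PAL behavior, not something this page states.

```r
# Sketch of the two division-by-zero policies for min-max scaling
# (illustrative only; not the PAL implementation).
min_max_safe <- function(x, handler) {
  handler <- match.arg(handler, c("ignore", "throw.error"))
  span <- max(x) - min(x)
  if (span == 0) {
    if (handler == "throw.error") {
      stop("division by zero: column is constant")
    }
    return(x)  # assumption: "ignore" leaves the column as it is
  }
  (x - min(x)) / span
}

constant <- c(3, 3, 3)
ignored <- min_max_safe(constant, "ignore")
# try() captures the error so the script keeps running.
failed <- inherits(try(min_max_safe(constant, "throw.error"), silent = TRUE),
                   "try-error")
```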

Format

R6Class object.

Details

Normalizes the input data and generates a scaling model using one of three scaling methods: min-max normalization, z-score normalization, or decimal scaling normalization. The transform function (see transform.FeatureNormalizer) can then apply the scaling model to a given DataFrame.

Value

Returns a "FeatureNormalizer" object. Its result field, used under Examples, holds the normalized data as a DataFrame.

See Also

transform.FeatureNormalizer

Examples

## Not run: 
# Input DataFrame data for training (only the first five rows are listed;
# the result below covers 12 rows):
data$Collect()
#   ID   X1   X2
# 1  0  6.0  9.0
# 2  1 12.1  8.3
# 3  2 13.5 15.3
# 4  3 15.4 18.7
# 5  4 10.2 19.8

# Generating a feature normalizer model:
fn <- hanaml.FeatureNormalizer(conn, data = data, key = "ID",
                               method = "min.max", new.max = 1.0, new.min = 0.0)
fn$result$Collect()
#    ID        X1         X2
# 1   0 0.0000000 0.03317536
# 2   1 0.1865443 0.00000000
# 3   2 0.2293578 0.33175355
# 4   3 0.2874618 0.49289100
# 5   4 0.1284404 0.54502370
# 6   5 0.5290520 0.58293839
# 7   6 0.5626911 0.75829384
# 8   7 0.7522936 0.80568720
# 9   8 0.8103976 0.91469194
# 10  9 0.5993884 0.95734597
# 11 10 1.0000000 1.00000000
# 12 11 1.0000000 1.00000000


## End(Not run)

[Package hana.ml.r version 1.0.8 Index]