Normalization Component

You can configure properties for the Normalization Preparation Component in HANA and non-HANA scenarios.

Use this component to normalize attribute data. HANA Normalization scales attribute values so that they fall within a specific range, such as -1.0 to 1.0 or 0.0 to 1.0. You can use this component for In-Database analysis. Normalization is useful for classification algorithms involving neural networks, and for distance-based methods such as nearest-neighbor classification and clustering.
Note If you want the processed data to replace the existing column, select Replace column.

The normalization component supports the following normalization methods:

  • Min-Max normalization: Performs a linear transformation on the original data values and scales each value to fit within a specific range. When you use Min-Max normalization, you specify the New Maximum and New Minimum values that define this range. This method is helpful for ensuring that extreme values are constrained within a fixed range.
    Note
    • New Maximum value must be greater than New Minimum value.
  • Z-score normalization: Rescales each value using the mean and standard deviation of its attribute. This method is useful for determining whether a specific value is above or below average, and by how much.
  • Decimal scaling normalization: Moves the decimal point of each attribute's values according to the attribute's maximum absolute value.
Note To overwrite the column on which normalization is performed with the normalized data, select Replace column.
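
As a rough illustration of the three methods listed above, the following Python sketch implements their textbook formulas. It is not the component's implementation: the function names, the use of the population standard deviation, and the handling of constant columns are assumptions made for this example, so the component's own results may differ.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    if new_max <= new_min:
        raise ValueError("New Maximum must be greater than New Minimum")
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Express each value as a number of standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # Assumption: population standard deviation.
    if sigma == 0:
        # Assumption: a constant column maps to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


if __name__ == "__main__":
    sample = [12.0, 250.0, 47.5, 980.0]  # hypothetical attribute values
    print(min_max(sample, new_min=-1.0, new_max=1.0))
    print(z_score(sample))
    print(decimal_scaling(sample))
```

Min-Max is the only method of the three that takes extra parameters, which is why the other two functions accept nothing besides the values.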

Example:

Normalizing the time taken to cover a certain distance.
Table:
Name   Distance (in meters)  Time (in seconds)
Laura  500                   66
Desy   500                   360
Alex   500                   201
John   500                   78
Ted    500                   504
To normalize the time column using Min-Max normalization, perform the following steps:
  1. In the Predict view, choose the Data Preparation tab from the Component List.
  2. Drag the HANA Normalization component onto the analysis editor, or double-click HANA Normalization.
  3. Double-click HANA Normalization, or hover the mouse pointer over HANA Normalization and choose Configure Properties.
  4. Select the columns you want to normalize.
    Note You can only select columns with numerical values.

    For example, Time (in seconds).

  5. From the Normalization Type drop-down list, choose Min-Max.
  6. Enter values for New Maximum and New Minimum.
  7. Choose Done, and then choose Run.
Output table:
Name   Distance (in meters)  Time (in seconds)  Time (in seconds)_Normalized
Laura  500                   66                 0.05
Desy   500                   360                0.30
Alex   500                   201                0.17
John   500                   78                 0.06
Ted    500                   504                0.42
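
The two constraints mentioned in the steps above (only numerical columns can be selected, and New Maximum must be greater than New Minimum) can be checked before a run. The sketch below is only an illustration in Python; `validate_min_max_config` and the dictionary-based row layout are hypothetical and are not part of the product.

```python
from numbers import Real


def validate_min_max_config(rows, columns, new_min, new_max):
    """Return a list of problems found; an empty list means the settings look valid."""
    problems = []
    if not new_max > new_min:
        problems.append("New Maximum must be greater than New Minimum")
    for column in columns:
        for row in rows:
            value = row.get(column)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(
                    f"Column {column!r} contains a non-numerical value: {value!r}"
                )
                break  # one message per column is enough
    return problems


# Input table from the example above.
rows = [
    {"Name": "Laura", "Distance (in meters)": 500, "Time (in seconds)": 66},
    {"Name": "Desy", "Distance (in meters)": 500, "Time (in seconds)": 360},
    {"Name": "Alex", "Distance (in meters)": 500, "Time (in seconds)": 201},
    {"Name": "John", "Distance (in meters)": 500, "Time (in seconds)": 78},
    {"Name": "Ted", "Distance (in meters)": 500, "Time (in seconds)": 504},
]

if __name__ == "__main__":
    # Bounds of 0.0 and 1.0 are assumed here purely for illustration.
    print(validate_min_max_config(rows, ["Time (in seconds)"], 0.0, 1.0))
    print(validate_min_max_config(rows, ["Name"], 1.0, 0.0))
```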
Perform the same steps for Z-score normalization and Decimal Scaling normalization. However, for these two methods, you do not have to enter the New Maximum and New Minimum values.
Z-score normalization output:
Output table:
Name   Distance (in meters)  Time (in seconds)
Laura  500                   -0.49
Desy   500                   1.77
Alex   500                   0.55
John   500                   -0.40
Ted    500                   2.88
Decimal Scaling normalization output:
Output table:
Name   Distance (in meters)  Time (in seconds)
Laura  500                   0.01
Desy   500                   0.04
Alex   500                   0.02
John   500                   0.01
Ted    500                   0.05
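
The output tables above show two layouts: the normalized values either appear in an additional column with the suffix _Normalized or overwrite the original column, which is what Replace column selects. The Python sketch below mimics that choice. The helper `apply_normalization`, the rounding to two decimals, and the use of the population standard deviation are assumptions for illustration, so the numbers it produces are not expected to match the tables.

```python
from statistics import mean, pstdev


def z_score(values):
    """Textbook z-score; population standard deviation is an assumption."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def apply_normalization(rows, column, normalize, replace_column=False):
    """Return new rows with the normalized values written to the target column.

    replace_column=True overwrites `column`; otherwise the values go to a new
    column named "<column>_Normalized". The input rows are left unchanged.
    """
    normalized = normalize([row[column] for row in rows])
    target = column if replace_column else f"{column}_Normalized"
    return [
        {**row, target: round(value, 2)}
        for row, value in zip(rows, normalized)
    ]


# Input table from the example above.
rows = [
    {"Name": "Laura", "Distance (in meters)": 500, "Time (in seconds)": 66},
    {"Name": "Desy", "Distance (in meters)": 500, "Time (in seconds)": 360},
    {"Name": "Alex", "Distance (in meters)": 500, "Time (in seconds)": 201},
    {"Name": "John", "Distance (in meters)": 500, "Time (in seconds)": 78},
    {"Name": "Ted", "Distance (in meters)": 500, "Time (in seconds)": 504},
]

if __name__ == "__main__":
    for row in apply_normalization(rows, "Time (in seconds)", z_score):
        print(row)  # original column kept, "_Normalized" column added
    for row in apply_normalization(rows, "Time (in seconds)", z_score, replace_column=True):
        print(row)  # original column overwritten
```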