Auto Clustering

Properties that can be configured for the Automated (Auto) Clustering algorithm in HANA and non-HANA scenarios.

What is Auto Clustering?

The Auto Clustering algorithm discovers segments in the data with reference to a target variable. This is done by automatically selecting a clustering algorithm and key input variables to generate the best model.

The target variable is optional: you can train Auto Clustering without one. If a target variable is provided, it is used internally to verify the performance of the clustering and to fine-tune the model automatically.
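The role of the target variable can be illustrated with a minimal sketch. It assumes a numeric (here binary) target and scores each candidate clustering by the share of target variance that cluster membership explains. The function name `target_separation` and the metric itself are illustrative assumptions, not the measure that Auto Clustering uses internally.

```python
from collections import defaultdict


def target_separation(labels, target):
    """Return the share of target variance explained by cluster membership.

    Hypothetical quality measure for illustration only: 0.0 means the
    clusters say nothing about the target, 1.0 means every cluster is
    perfectly homogeneous with respect to the target.
    """
    if not labels or len(labels) != len(target):
        raise ValueError("labels and target must be non-empty and of equal length")

    n = len(target)
    overall_mean = sum(target) / n
    total_ss = sum((t - overall_mean) ** 2 for t in target)
    if total_ss == 0:
        # A constant target cannot be used to compare clusterings.
        return 0.0

    # Group the target values by cluster label.
    groups = defaultdict(list)
    for label, t in zip(labels, target):
        groups[label].append(t)

    # Between-cluster sum of squares, weighted by cluster size.
    between_ss = sum(
        len(values) * (sum(values) / len(values) - overall_mean) ** 2
        for values in groups.values()
    )
    return between_ss / total_ss


# Toy data: a binary target and two candidate cluster assignments.
target = [1, 1, 1, 0, 0, 0]
candidates = {
    "aligned": [0, 0, 0, 1, 1, 1],  # clusters match the target exactly
    "mixed": [0, 1, 0, 1, 0, 1],    # clusters cut across the target
}

# Keep the candidate whose clusters separate the target best.
best_name = max(candidates, key=lambda name: target_separation(candidates[name], target))
```

In this toy comparison the "aligned" assignment is selected because its clusters are homogeneous with respect to the target. When no target variable is supplied, a measure of this kind is unavailable and model selection would have to rely on the input variables alone.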

Note: You can see the results of an analysis that uses the Auto Clustering algorithm displayed in chart format. You can also display the summary view of the analysis results.

Automated Clustering is a semi-supervised or targeted clustering algorithm designed and optimized to reveal segments that are related to a specific business question. It discovers natural segments or common behaviors in a dataset and provides the description for each of the segments.

Note: When using the Automated Clustering algorithm, we recommend that you trim the values before acquiring the dataset. You can find the Trim Values option in the Advanced Options section of the "New Dataset" dialog.

For more information about the functions used in online Automated algorithms, see the SAP Automated Predictive Library (APL) Reference Guide at http://help.sap.com/pa

HANA Automated Clustering Properties
Table 1: Algorithm Properties

Property                    Description
--------------------------  ------------------------------------------------------------------------------------
Features                    Select the input columns with which you want to perform the analysis.
Target Variable             Select an optional target column for which you want to perform the analysis.
Minimum Number of Clusters  Enter the minimum number of clusters that you want to use for clustering.
Maximum Number of Clusters  Enter the maximum number of clusters that you want to use for clustering.
Predicted Column Name       Enter a name for the newly-created column that contains predicted values.
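
The properties in Table 1 can be pictured as a small configuration object. The following is a minimal sketch only: the class `AutoClusteringConfig`, its field names, and the validation rules are hypothetical and do not correspond to an SAP API. The constraints shown (at least one feature, minimum not greater than maximum, no column name collisions) are assumptions about what a sensible configuration looks like.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AutoClusteringConfig:
    """Hypothetical container mirroring the properties in Table 1."""

    features: List[str]                    # Features
    min_clusters: int                      # Minimum Number of Clusters
    max_clusters: int                      # Maximum Number of Clusters
    predicted_column_name: str             # Predicted Column Name
    target_variable: Optional[str] = None  # Target Variable (optional)

    def validate(self) -> None:
        """Raise ValueError if the settings are inconsistent (assumed rules)."""
        if not self.features:
            raise ValueError("select at least one feature column")
        if self.min_clusters < 1:
            raise ValueError("minimum number of clusters must be at least 1")
        if self.max_clusters < self.min_clusters:
            raise ValueError("maximum number of clusters is below the minimum")
        if not self.predicted_column_name.strip():
            raise ValueError("predicted column name must not be empty")
        if self.predicted_column_name in self.features:
            raise ValueError("predicted column name collides with a feature")
        if self.target_variable is not None and self.target_variable in self.features:
            raise ValueError("target variable must not also be a feature")


# Example: a configuration with an optional target column.
config = AutoClusteringConfig(
    features=["age", "income", "tenure_months"],
    min_clusters=2,
    max_clusters=8,
    predicted_column_name="cluster_id",
    target_variable="churned",
)
config.validate()  # passes silently for a consistent configuration
```

Leaving `target_variable` as `None` corresponds to training without a target, which the algorithm permits.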