Clustering

Use

Clustering allows you to segment data automatically into clusters. In a subordinate dataset, the system groups associated data together by establishing previously unknown links. In doing so, it determines both the criteria for clustering and the mappings between datasets.

You execute clustering by training a model on historical data. You can then use a prediction to apply the same segmentation to another dataset.
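The prediction step can be pictured with a minimal Python sketch. It is not SAP BW code: it assumes that training has produced one centroid per cluster and that a new record is assigned to the nearest centroid. The names trained_centroids and predict_cluster, the segment names, and all values are hypothetical.

```python
from math import dist

# Hypothetical result of a training run: one centroid per cluster,
# here over two numeric attributes (age, income in thousands).
trained_centroids = {
    "segment_a": (35.0, 90.0),
    "segment_b": (28.0, 30.0),
}

def predict_cluster(record, centroids):
    """Return the name of the cluster whose centroid is closest to the record."""
    return min(centroids, key=lambda name: dist(record, centroids[name]))

# Made-up records from another dataset that should receive the same segmentation.
new_customers = [(33.0, 85.0), (25.0, 28.0)]
assignments = [predict_cluster(c, trained_centroids) for c in new_customers]
print(assignments)  # ['segment_a', 'segment_b']
```

The trained model is left unchanged here; only the new records receive cluster assignments.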

Example

The customer data for a fruit juice outlet contains attributes such as gender, age, income, region, occupation, and most frequently bought product. During clustering, the system determines which combinations of attribute values frequently occur together and uses this information to build clusters, that is, customer segments. One customer segment could consist of male customers aged between 30 and 40, with high incomes, whose most frequent purchase is orange juice. Another customer segment could consist of female customers aged between 20 and 40, without an occupation, whose most frequent purchase is apple juice.

Integration

The data that you use to train the model can come from any other system, provided that data can be extracted from that system into SAP BW. Likewise, you can apply the same segmentation to any data that has been extracted into SAP BW.

Prerequisites

Queries are available in SAP BW that give you access to data for which the statements are already known, and which you can use to derive similar statements about other data.

Features

You can make the following settings in a model for the Clustering method:

You use the model fields to specify which characteristic is to be considered with which attributes (such as the characteristic Customer with the attributes Occupation, Gender, Age, and so on). In the field parameters, you can specify different weightings for the individual attributes. The system then establishes previously unknown associations between the attribute values.

You can use the model parameters to specify, for example, how many clusters the system should create during training. By specifying conditions for stopping the segmentation, you can improve the quality and performance of the segmentation.

During training, the system determines not only the clusters but also which cluster each characteristic value (such as a customer) belongs to and what distance separates the clusters. You can display the result in graphical format and export it to an Excel workbook.
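The following Python sketch shows how the settings described above could interact. The source does not name the algorithm that SAP BW uses, so plain k-means stands in here as an assumption. The parameters num_clusters, weights, max_iterations, and tolerance are hypothetical counterparts of the number of clusters, the attribute weightings, and the stopping conditions. The customer values are made up.

```python
from math import sqrt

def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute contributes according to its weight."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def train_clusters(records, num_clusters, weights, max_iterations=20, tolerance=1e-6):
    """Plain k-means (an assumed stand-in): returns (centroids, cluster index per record)."""
    # Deterministic start: use the first num_clusters records as initial centroids.
    centroids = [tuple(r) for r in records[:num_clusters]]
    assignments = []
    for _ in range(max_iterations):  # stopping condition 1: iteration cap
        # Assign every record to its nearest centroid.
        assignments = [
            min(range(num_clusters),
                key=lambda c: weighted_distance(r, centroids[c], weights))
            for r in records
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(num_clusters):
            members = [r for r, a in zip(records, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep centroid of an empty cluster
        shift = max(weighted_distance(old, new, weights)
                    for old, new in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tolerance:  # stopping condition 2: centroids no longer move
            break
    return centroids, assignments

# Made-up customers: (age, income in thousands)
customers = [(32, 95), (24, 30), (38, 88), (22, 27), (35, 100), (27, 33)]
centroids, assignments = train_clusters(customers, num_clusters=2, weights=(1.0, 1.0))

# Distance that separates the two resulting clusters.
separation = weighted_distance(centroids[0], centroids[1], (1.0, 1.0))
print(centroids, assignments, round(separation, 2))
```

In this sketch, raising the weight of one attribute makes differences in that attribute count more when records are assigned to clusters, and a lower iteration cap trades segmentation quality for runtime.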

See Also

Creating, Changing and Activating a Model

Creating Analysis Process for Training

Analysis Process for Executing the Prediction