Properties that can be configured for the HANA Self-Organizing Maps algorithm.
A self-organizing map (SOM) or self-organizing feature map (SOFM) is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. Self-organizing maps are different from other artificial neural networks in that they use a neighborhood function to preserve the topological properties of the input space.
This makes SOMs useful for visualizing low-dimensional views of high-dimensional data, akin to multi-dimensional scaling. The model was first described as an artificial neural network by the Finnish professor Teuvo Kohonen, and is sometimes called a Kohonen map. Like most artificial neural networks, SOMs operate in two modes: training and mapping. Training builds the map using input examples. It is a competitive process, also called vector quantization. Mapping automatically classifies a new input vector.
The SOM approach has many applications, such as visualization, web document clustering, and speech recognition.
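
The two modes described above, training and mapping, can be illustrated with a minimal pure-Python sketch. It is not the HANA implementation: the rectangular grid, the Gaussian neighborhood function, and the linear decay of the learning rate and radius are assumptions made for illustration, and the names `train_som` and `best_matching_unit` are hypothetical. The parameters `height`, `width`, `alpha`, `max_iter`, and `seed` only mirror the Map Height, Map Width, Alpha, Maximum Iterations, and Random Seed properties listed below.

```python
import math
import random


def best_matching_unit(weights, x):
    """Mapping step: return the grid coordinates of the node closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(rows, height=5, width=5, alpha=0.5, max_iter=100, seed=1):
    """Training step: return {(row, col): weight vector} for a rectangular map."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # Initialise every map node with a random weight vector in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(height)
        for c in range(width)
    }
    radius0 = max(height, width) / 2.0
    for t in range(max_iter):
        frac = t / max_iter
        lr = alpha * (1.0 - frac)                   # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
        x = rows[rng.randrange(len(rows))]          # pick one training sample
        bmu = best_matching_unit(weights, x)        # competitive step
        for node, w in weights.items():
            grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
            for i in range(dim):
                # Pull the node towards the sample, weighted by grid proximity.
                w[i] += lr * h * (x[i] - w[i])
    return weights


# Usage: train on a few 2-D samples, then map a new input vector to a node.
samples = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(samples, height=3, width=3, alpha=0.5, max_iter=200, seed=42)
cluster_node = best_matching_unit(som, [0.95, 0.95])
```

In this sketch the grid coordinates returned by `best_matching_unit` play the role of the cluster assignment for a new input vector. Because neighboring nodes are updated together, nearby nodes on the grid end up with similar weight vectors, which is how the neighborhood function preserves topology.
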
| Property | Description |
|---|---|
| Map Height | Enter the map height. The default value is 5. |
| Map Width | Enter the map width. The default value is 5. |
| Alpha | Enter a value for the learning rate. The default value is 0.5. |
| Map Shape | Select the map shape. |
| Features | Select input columns with which you want to perform the analysis. |
| Calculate Silhouette | Select this option to calculate silhouette values. The silhouette value indicates the quality of the clustering: a value close to 1 means the clustering is good, and a value close to 0 means the clustering is poor. |
| Cluster Name | Enter a name for the new column that contains the cluster numbers for the given dataset. |
| Missing Values | Select the method for handling missing values. |
| Normalization Type | Select the type of normalization. |
| Random Seed | Enter the seed value that you want the algorithm to use for random number generation. If you enter -1, the algorithm selects a random seed by itself. The default value is -1. |
| Maximum Iterations | Enter the maximum number of iterations that the algorithm may run to find clusters. The default value is 100. |
| Number of Threads | Enter the number of threads that the algorithm should use during execution. The default value is 2. |
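
To make the Calculate Silhouette option more concrete, the following sketch computes the mean silhouette coefficient in its standard textbook form, where each point's value is `(b - a) / max(a, b)` and the result lies between -1 and 1. This document does not state how HANA computes or scales the value, so treat the function as an illustration under that assumption; the name `silhouette_score` and the use of Euclidean distance are choices made for this example.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (standard definition)."""
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Usage: two well-separated groups, scored with a sensible and a poor labelling.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(pts, [0, 0, 1, 1])
poor = silhouette_score(pts, [0, 1, 0, 1])
```

With the sensible labelling the score is close to 1, while the labelling that splits each group across clusters yields a negative score.
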