hanaml.SOM {hana.ml.r} | R Documentation |
hanaml.SOM is an R wrapper for the PAL Self-Organizing Maps algorithm.
hanaml.SOM(conn.context, data = NULL, key = NULL, features = NULL, tol = NULL, normalization = NULL, random.state = NULL, height.of.map = NULL, width.of.map = NULL, kernel = NULL, alpha = NULL, learning.rate = NULL, grid = NULL, radius = NULL, batch.som = NULL, max.iter = NULL)
conn.context |
|
data |
|
key |
|
features |
|
tol |
|
normalization |
Defaults to "no". |
random.state |
Defaults to -1. |
height.of.map |
|
width.of.map |
|
kernel |
Defaults to "gaussian". |
alpha |
|
learning.rate |
Defaults to "exponential". |
grid |
Defaults to "hexagon". |
radius |
|
batch.som |
|
max.iter |
|
R6Class object.
A "SOM" object with the following attributes:
map : DataFrame
The map after training. The structure is as follows:
- 1st column: CLUSTER_ID, int. Unit cell ID.
- Other columns except the last one: one column per feature in the
input data, named with the prefix "WEIGHT_", float. Weight vectors
used to simulate the original tuples.
- Last column: COUNT, int. Number of original tuples that
each unit cell contains.
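The map structure described above can be sanity-checked locally. The following base R sketch rebuilds the map from the example on this page as a plain data.frame (the values are copied from the expected output, not fetched from HANA) and checks two things: that there is one row per unit cell, i.e. height.of.map * width.of.map rows, and that the COUNT column sums to the number of input tuples. The variable names are illustrative only.

```r
# Local copy of the expected som$map$Collect() output from the example
# on this page (4 x 4 map trained on 20 input tuples).
map <- data.frame(
  CLUSTER_ID  = 0:15,
  WEIGHT_V000 = c(52.8376884, 50.1502513, 18.5976067, 1.2676711,
                  49.0000211, 33.4309941, 3.4999807, 2.2000010,
                  24.8149919, 20.6962530, 9.8170045, 1.1444060,
                  16.4690145, 12.7482412, 3.8868516, 0.3059696),
  WEIGHT_V001 = c(53.4653266, 49.2452261, 27.1745897, 15.2676711,
                  40.1000986, 34.4504915, 15.8999720, 11.2000508,
                  11.7383922, 8.3151278, 6.4531495, 4.2462051,
                  1.5680112, 1.7532663, 0.8463565, 0.4737271),
  COUNT       = c(2L, 2L, 0L, 3L, 1L, 0L, 1L, 1L,
                  0L, 0L, 0L, 0L, 3L, 2L, 0L, 5L)
)

n.cells     <- nrow(map)                        # one row per unit cell
n.tuples    <- sum(map$COUNT)                   # tuples assigned over all cells
empty.cells <- map$CLUSTER_ID[map$COUNT == 0]   # cells that received no tuple

stopifnot(n.cells == 4 * 4)   # height.of.map * width.of.map in the example
stopifnot(n.tuples == 20)     # number of rows in the example input
```

Cells with COUNT equal to 0 are still part of the map: they have weight vectors but no input tuple was assigned to them.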
labels : DataFrame
The labels of the input data. The structure is as follows:
- 1st column: ID column, with the same name and data type as the
ID column of the input table. ID of the original tuples.
- 2nd column: BMU, int. Best match unit (BMU).
- 3rd column: DISTANCE, float. The distance between the tuple and its BMU.
- 4th column: SECOND_BMU, int. Second BMU.
- 5th column: IS_ADJACENT, int.
Indicates whether the BMU and the second BMU are adjacent.
  - 0: Not adjacent
  - 1: Adjacent
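To make the BMU, DISTANCE and SECOND_BMU columns concrete, here is a minimal base R sketch that computes them for a single tuple from the weight vectors in the example map on this page. It assumes Euclidean distance, which this page does not confirm as the metric PAL uses, so the values returned by HANA may differ. IS_ADJACENT is not reproduced because it depends on the grid layout. All variable names are illustrative.

```r
# Weight vectors copied from the expected som$map$Collect() output
# in the example on this page (CLUSTER_ID 0..15).
weights <- cbind(
  V000 = c(52.8376884, 50.1502513, 18.5976067, 1.2676711,
           49.0000211, 33.4309941, 3.4999807, 2.2000010,
           24.8149919, 20.6962530, 9.8170045, 1.1444060,
           16.4690145, 12.7482412, 3.8868516, 0.3059696),
  V001 = c(53.4653266, 49.2452261, 27.1745897, 15.2676711,
           40.1000986, 34.4504915, 15.8999720, 11.2000508,
           11.7383922, 8.3151278, 6.4531495, 4.2462051,
           1.5680112, 1.7532663, 0.8463565, 0.4737271)
)
cluster.id <- 0:15

# First input tuple of the example (TRANS_ID 0).
x <- c(V000 = 0.10, V001 = 0.20)

# Distance from the tuple to every weight vector.
# ASSUMPTION: Euclidean distance; not stated in this documentation.
d <- sqrt(rowSums(sweep(weights, 2, x)^2))

ord        <- order(d)            # cells sorted from nearest to farthest
bmu        <- cluster.id[ord[1]]  # best match unit
second.bmu <- cluster.id[ord[2]]  # second best match unit
distance   <- d[ord[1]]           # distance between the tuple and its BMU
```

Under the Euclidean assumption, this tuple maps to unit cell 15, with unit cell 14 as its second BMU.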
model : DataFrame
The SOM model.
## Not run:
Input DataFrame for clustering:

> data$Collect()
   TRANS_ID  V000  V001
1         0  0.10  0.20
2         1  0.22  0.25
3         2  0.30  0.40
4         3  0.40  0.50
5         4  0.50  1.00
6         5  1.10 15.10
7         6  2.20 11.20
8         7  1.30 15.30
9         8  1.40 15.40
10        9  3.50 15.90
11       10 13.10  1.10
12       11 16.20  1.50
13       12 16.30  1.30
14       13 12.40  2.40
15       14 16.90  1.90
16       15 49.00 40.10
17       16 50.10 50.20
18       17 50.20 48.30
19       18 55.30 50.40
20       19 50.40 56.50

> som <- hanaml.SOM(conn, data, tol = 1.0e-6, normalization = "no",
                    random.state = 1, height.of.map = 4, width.of.map = 4,
                    kernel = "gaussian", learning.rate = "exponential",
                    grid = "hexagon", batch.som = FALSE, max.iter = 4000)

Expected output:

> som$map$Collect()
   CLUSTER_ID WEIGHT_V000 WEIGHT_V001 COUNT
1           0  52.8376884  53.4653266     2
2           1  50.1502513  49.2452261     2
3           2  18.5976067  27.1745897     0
4           3   1.2676711  15.2676711     3
5           4  49.0000211  40.1000986     1
6           5  33.4309941  34.4504915     0
7           6   3.4999807  15.8999720     1
8           7   2.2000010  11.2000508     1
9           8  24.8149919  11.7383922     0
10          9  20.6962530   8.3151278     0
11         10   9.8170045   6.4531495     0
12         11   1.1444060   4.2462051     0
13         12  16.4690145   1.5680112     3
14         13  12.7482412   1.7532663     2
15         14   3.8868516   0.8463565     0
16         15   0.3059696   0.4737271     5
## End(Not run)