hanaml.SOM.Rd
hanaml.SOM is an R wrapper for the SAP HANA PAL Self-Organizing Maps algorithm.
hanaml.SOM(
data = NULL,
key = NULL,
features = NULL,
tol = NULL,
normalization = NULL,
random.state = NULL,
height.of.map = NULL,
width.of.map = NULL,
kernel = NULL,
alpha = NULL,
learning.rate = NULL,
grid = NULL,
radius = NULL,
batch.som = NULL,
max.iter = NULL,
decay = NULL
)
DataFrame
DataFrame containing the data.
character
Name of ID column.
character or list of characters, optional
Names of the feature columns.
double, optional
If the largest difference between successive maps is less than this value,
the calculation is regarded as converged and SOM training stops.
Defaults to 1.0e-6.
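The stopping rule above can be sketched in base R. This is only an illustration: has_converged is a hypothetical helper, and it assumes that "largest difference" means the maximum absolute element-wise difference between two successive weight maps, which the PAL documentation quoted here does not spell out.

```r
# Hypothetical helper (not part of hana.ml.r): TRUE when the largest absolute
# element-wise difference between two successive weight maps is below tol.
has_converged <- function(old_map, new_map, tol = 1.0e-6) {
  max(abs(new_map - old_map)) < tol
}

# Toy 2 x 2 weight map and two candidate updates
old_map <- matrix(c(0.10, 0.20, 0.30, 0.40), nrow = 2)

small_step <- has_converged(old_map, old_map + 5e-7)  # change below tol
large_step <- has_converged(old_map, old_map + 1e-3)  # change above tol

print(small_step)  # TRUE
print(large_step)  # FALSE
```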
character, optional
Normalization type:
"no"
: no normalization.
"min.max"
: Transform to new range: 0 to 1
"z.score"
: Z-score normalization
Defaults to "no".
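As a rough illustration of the two normalization types, the following base R sketch applies them to a single numeric column. The helpers min_max and z_score are hypothetical names, and the z-score version assumes the sample standard deviation (R's sd); PAL may use a different convention.

```r
# Hypothetical helpers illustrating the two normalization types on one column.
min_max <- function(x) (x - min(x)) / (max(x) - min(x))  # rescale to [0, 1]
z_score <- function(x) (x - mean(x)) / sd(x)             # assumes sample sd

v <- c(0.10, 0.22, 0.30, 0.40)

mm <- min_max(v)
zs <- z_score(v)

print(range(mm))  # 0 1
print(zs)         # centered on 0 with unit (sample) standard deviation
```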
integer, optional
-1
: Random
0
: Sets every weight to zero
Other value
: Uses this value as seed
Defaults to -1.
integer, optional
Indicates the height of the map.
Defaults to 10.
integer, optional
Indicates the width of the map.
Defaults to 10.
character, optional
Represents the neighborhood kernel function.
"gaussian"
: Gaussian kernel function
"bubble"
: Bubble/Flat kernel function.
Defaults to "gaussian".
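The difference between the two kernels can be sketched with the textbook forms of the neighborhood functions. These are assumptions for illustration only; the exact formulas used inside PAL are not given here, and gaussian_kernel and bubble_kernel are hypothetical names.

```r
# Textbook neighborhood functions (assumed forms, not taken from PAL).
# d is the grid distance between a unit and the best match unit (BMU).
gaussian_kernel <- function(d, sigma) exp(-d^2 / (2 * sigma^2))  # smooth falloff
bubble_kernel   <- function(d, radius) as.numeric(d <= radius)   # flat, hard cutoff

d <- c(0, 1, 2, 3)

g <- gaussian_kernel(d, sigma = 2)
b <- bubble_kernel(d, radius = 2)

print(g)  # starts at 1 for the BMU itself and decreases with distance
print(b)  # 1 1 1 0
```

With the Gaussian kernel every unit receives some update, weighted by distance; with the bubble kernel only units inside the radius are updated, all equally.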
double, optional
Specifies the learning rate.
Defaults to 0.5.
character, optional (deprecated)
Indicates the decay function for learning rate:
"exponential"
"linear"
Will be replaced by decay in a future release.
Defaults to "exponential".
character, optional
Indicates the shape of the grid.
"rectangle"
"hexagon"
Defaults to "hexagon".
double, optional
Specifies the scan radius.
Defaults to the bigger value of height.of.map and width.of.map.
logical, optional
Indicates whether batch SOM is carried out.
Defaults to FALSE.
Note that for batch SOM, the kernel is always Gaussian,
and the learning-rate parameters have no effect.
integer, optional
Maximum number of iterations.
Note that the training might not converge if this value is too small,
for example, less than 1000.
Defaults to 1000 plus 500 times the number of neurons in the lattice.
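The default can be worked out by hand. The sketch below assumes that the number of neurons in the lattice equals height.of.map times width.of.map; default_max_iter is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: default max.iter as described above, assuming the
# lattice contains height * width neurons.
default_max_iter <- function(height = 10, width = 10) {
  1000 + 500 * height * width
}

iter_default <- default_max_iter()      # default 10 x 10 map
iter_small   <- default_max_iter(4, 4)  # 4 x 4 map

print(iter_default)  # 51000
print(iter_small)    # 9000
```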
character, optional
Indicates the decay function for learning rate:
"exponential"
"linear"
If both learning.rate and decay are set, decay takes precedence.
Defaults to "exponential".
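To show how the two decay options differ in shape, here is a sketch using common textbook schedules. The formulas, the helper names, and the choice of max_iter as the time constant are all assumptions; the schedules PAL actually applies are not documented here.

```r
# Textbook learning-rate schedules (assumed forms, not taken from PAL).
# alpha0: initial learning rate, t: current iteration, max_iter: total iterations.
linear_decay      <- function(alpha0, t, max_iter) alpha0 * (1 - t / max_iter)
exponential_decay <- function(alpha0, t, max_iter) alpha0 * exp(-t / max_iter)

t <- c(0, 250, 500, 750, 1000)

lin <- linear_decay(0.5, t, max_iter = 1000)
expo <- exponential_decay(0.5, t, max_iter = 1000)

print(lin)   # 0.500 0.375 0.250 0.125 0.000
print(expo)  # starts at 0.5 and decreases, but never reaches 0
```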
An R6 object of class "SOM", with the following attributes and methods:
Attributes
map : DataFrame
The map after training. The structure is as follows:
1st column: CLUSTER_ID, int. Unit cell ID.
Other columns except the last one: FEATURE (in input data) column with prefix "WEIGHT_", float. Weight vectors used to simulate the original tuples.
Last column: COUNT, int. Number of original tuples that every unit cell contains.
labels : DataFrame
The label of input data, the structure is as follows:
1st column: ID column of the input data, with the same name and data type as in the input. ID of the original tuples.
2nd column: BMU, int. Best match unit (BMU).
3rd column: DISTANCE, float. The distance between the tuple and its BMU.
4th column: SECOND_BMU, int. Second BMU.
5th column: IS_ADJACENT, int. Indicates whether the BMU and the second BMU are adjacent.
0: Not adjacent
1: Adjacent
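One possible use of the IS_ADJACENT column is to estimate the topographic error of the map, that is, the share of tuples whose BMU and second BMU are not neighbors. The sketch below works on a local data.frame with made-up values that only mimic the structure above; it does not come from a real training run.

```r
# Hypothetical local data.frame shaped like the labels attribute after it has
# been collected; all values are invented for illustration.
labels_local <- data.frame(
  TRANS_ID    = c(0, 1, 2, 3),
  BMU         = c(15, 15, 14, 3),
  DISTANCE    = c(0.12, 0.08, 0.40, 0.25),
  SECOND_BMU  = c(14, 14, 3, 7),
  IS_ADJACENT = c(1, 1, 0, 1)
)

# Topographic error: fraction of tuples with non-adjacent BMU and second BMU
topographic_error <- mean(labels_local$IS_ADJACENT == 0)

print(topographic_error)  # 0.25
```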
model : DataFrame
The SOM model.
Methods
CreateModelState(model=NULL, algorithm=NULL, func=NULL, state.description="ModelState", force=FALSE)
Usage:
> som <- hanaml.SOM(data=df, key="ID")
> som$CreateModelState()
Arguments:
model: DataFrame
DataFrame containing the model for parsing.
Defaults to self$model.
algorithm: character
Specifies the PAL algorithm associated with model.
Defaults to self$pal.algorithm.
func: character
Specifies the functionality for Unified Classification/Regression.
Valid only for object instance of R6Class "UnifiedClassification" or "UnifiedRegression".
Defaults to self$func.
state.description: character
A summary string for the generated model state.
Defaults to "ModelState".
force: logical
Specifies whether or not to replace the existing state for model.
Defaults to FALSE.
After calling this method, an attribute state that contains the parsed info for model
is assigned to the corresponding R6 object.
DeleteModelState(state=NULL)
Usage:
Assuming we have trained a hanaml model and created its model state, like the following:
> som <- hanaml.SOM(data=df, key="ID")
> som$CreateModelState()
After using the model state for real-time scoring, we can delete the state by calling:
> som$DeleteModelState()
Arguments:
state: DataFrame
DataFrame containing the state info.
Defaults to self$state.
After calling this method, the specified model state is cleaned up and the associated memory is released.
Input DataFrame data:
> data$Collect()
TRANS_ID V000 V001
1 0 0.10 0.20
2 1 0.22 0.25
3 2 0.30 0.40
4 3 0.40 0.50
......
16 15 49.00 40.10
17 16 50.10 50.20
18 17 50.20 48.30
19 18 55.30 50.40
20 19 50.40 56.50
Call the function:
> som <- hanaml.SOM(data,
                    key = "TRANS_ID",
                    tol = 1.0e-6,
                    normalization = "no",
                    random.state = 1,
                    height.of.map = 4,
                    width.of.map = 4,
                    kernel = "gaussian",
                    learning.rate = "exponential",
                    grid = "hexagon",
                    batch.som = FALSE,
                    max.iter = 4000)
Output:
> som$map$Collect()
CLUSTER_ID WEIGHT_V000 WEIGHT_V001 COUNT
1 0 52.8376884 53.4653266 2
2 1 50.1502513 49.2452261 2
3 2 18.5976067 27.1745897 0
4 3 1.2676711 15.2676711 3
5 4 49.0000211 40.1000986 1
6 5 33.4309941 34.4504915 0
7 6 3.4999807 15.8999720 1
8 7 2.2000010 11.2000508 1
9 8 24.8149919 11.7383922 0
10 9 20.6962530 8.3151278 0
11 10 9.8170045 6.4531495 0
12 11 1.1444060 4.2462051 0
13 12 16.4690145 1.5680112 3
14 13 12.7482412 1.7532663 2
15 14 3.8868516 0.8463565 0
16 15 0.3059696 0.4737271 5
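A quick sanity check on this output can be done locally. The sketch below copies the COUNT column from the map shown above into a plain R vector (no HANA connection involved) and confirms that the counts add up to the 20 input tuples.

```r
# COUNT column copied from the map output above (CLUSTER_ID 0 to 15)
counts <- c(2, 2, 0, 3, 1, 0, 1, 1, 0, 0, 0, 0, 3, 2, 0, 5)

total_tuples   <- sum(counts)            # every input tuple lands in one cell
occupied_cells <- sum(counts > 0)        # cells that contain at least one tuple
largest_cell   <- which.max(counts) - 1  # CLUSTER_ID is zero-based

print(total_tuples)    # 20
print(occupied_cells)  # 9
print(largest_cell)    # 15
```

Seven of the 16 unit cells are empty here, which is common when a map has nearly as many cells as there are input tuples.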