hanaml.Kmedian {hana.ml.r} | R Documentation |
hanaml.Kmedian is an R wrapper for the PAL K-Medians algorithm.
hanaml.Kmedian(conn.context, data, key, features = NULL, n.clusters,
               init = NULL, max.iter = NULL, tol = NULL,
               thread.ratio = NULL, distance.level = NULL,
               minkowski.power = NULL, category.weights = NULL,
               normalization = NULL, categorical.variable = NULL)
conn.context |
|
data |
|
key |
|
features |
|
n.clusters |
|
init |
|
max.iter |
|
tol |
|
thread.ratio |
|
distance.level |
|
minkowski.power |
|
category.weights |
|
normalization |
Defaults to 'no'. |
categorical.variable |
|
An R6Class object.
The K-Medians clustering algorithm partitions n observations into K clusters by assigning each observation to its nearest cluster center. Each cluster center is computed from the median of each feature over the cluster's members.
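The assign-then-update loop described above can be sketched in base R. This is only an illustration for numeric features, not the PAL implementation: simple_kmedians is a hypothetical helper, it assumes Euclidean distance, and it assumes the first k rows serve as initial centers. Categorical features, normalization, and tolerance-based stopping are not modeled.

```r
# Minimal K-medians sketch in base R (numeric features only).
# NOT the PAL algorithm; 'simple_kmedians' is a hypothetical helper.
simple_kmedians <- function(x, k, max.iter = 100) {
  x <- as.matrix(x)
  # Assumption: use the first k rows as the initial centers.
  centers <- x[seq_len(k), , drop = FALSE]
  labels <- integer(nrow(x))
  for (iter in seq_len(max.iter)) {
    # Euclidean distance from every row to every center (n x k matrix).
    d <- matrix(
      vapply(seq_len(k), function(j) {
        sqrt(rowSums(sweep(x, 2, centers[j, ])^2))
      }, numeric(nrow(x))),
      nrow = nrow(x)
    )
    # Assignment step: index of the nearest center for each row.
    new.labels <- max.col(-d, ties.method = "first")
    # Stop once the assignment no longer changes.
    if (identical(new.labels, labels)) break
    labels <- new.labels
    # Update step: per-feature median of each cluster's members.
    for (j in seq_len(k)) {
      members <- x[labels == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- apply(members, 2, median)
    }
  }
  list(labels = labels, centers = centers)
}

# Toy data: two well-separated groups of three points each.
x <- rbind(c(0.5, 0.5), c(15.5, 15.5), c(1.5, 0.5),
           c(1.5, 1.5), c(16.5, 15.5), c(16.5, 16.5))
res <- simple_kmedians(x, k = 2)
res$labels   # 1 2 1 1 2 2
res$centers  # rows: (1.5, 0.5) and (16.5, 15.5)
```

Because the update step takes a median rather than a mean, each center coordinate is an actual feature value (or the midpoint of two) from the cluster, which makes the centers less sensitive to outliers than in K-means.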
labels : DataFrame
Label assigned to each sample.
cluster.centers : DataFrame
Coordinates of cluster centers.
## Not run:
Input DataFrame data:

> data$Collect()
   ID V000 V001 V002
0   0  0.5    A  0.5
1   1  1.5    A  0.5
2   2  1.5    A  1.5
3   3  0.5    A  1.5
4   4  1.1    B  1.2
5   5  0.5    B 15.5
6   6  1.5    B 15.5
7   7  1.5    B 16.5
8   8  0.5    B 16.5
9   9  1.2    C 16.1
10 10 15.5    C 15.5
11 11 16.5    C 15.5
12 12 16.5    C 16.5
13 13 15.5    C 16.5
14 14 15.6    D 16.2
15 15 15.5    D  0.5
16 16 16.5    D  0.5
17 17 16.5    D  1.5
18 18 15.5    D  1.5
19 19 15.7    A  1.6

> kmedian <- hanaml.Kmedian(conn.context = conn,
                            data = data,
                            key = "ID",
                            n.clusters = 4,
                            init = 'first_k',
                            max.iter = 100,
                            tol = 1.0E-6,
                            thread.ratio = 0.3,
                            distance.level = 'euclidean',
                            category.weights = 0.5)

Expected output:

> kmedian$cluster.centers$Collect()
  CLUSTER_ID V000 V001 V002
0          0  1.1    A  1.2
1          1 15.7    D  1.5
2          2 15.6    C 16.2
3          3  1.2    B 16.1

> kmedian$labels$Collect()
   ID CLUSTER_ID  DISTANCE
1   0          0 0.9219544
2   1          0 0.8062258
3   2          0 0.5000000
4   3          0 0.6708204
5   4          0 0.7071068
6   5          3 0.9219544
7   6          3 0.6708204
8   7          3 0.5000000
9   8          3 0.8062258
10  9          3 0.7071068
11 10          2 0.7071068
12 11          2 1.1401754
13 12          2 0.9486833
14 13          2 0.3162278
15 14          2 0.7071068
16 15          1 1.0198039
17 16          1 1.2806248
18 17          1 0.8000000
19 18          1 0.2000000
20 19          1 0.8071068

## End(Not run)
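As a sanity check on the expected output, the DISTANCE values for IDs 0 to 3 can be recomputed locally. In those rows V001 equals the V001 of the cluster 0 center (both 'A'), so only the numeric columns V000 and V002 contribute. The sketch below assumes plain Euclidean distance on those two columns; it does not reproduce how PAL weights categorical mismatches, and it needs no HANA connection.

```r
# Cluster 0 center (numeric columns only), copied from the expected output.
center <- c(V000 = 1.1, V002 = 1.2)

# V000 and V002 for IDs 0, 1, 2, 3, copied from the input data.
rows <- rbind(c(0.5, 0.5),
              c(1.5, 0.5),
              c(1.5, 1.5),
              c(0.5, 1.5))

# Euclidean distance from each row to the center.
d <- sqrt(rowSums(sweep(rows, 2, center)^2))
round(d, 7)  # 0.9219544 0.8062258 0.5000000 0.6708204
```

These four values agree with the DISTANCE column for IDs 0 to 3 in kmedian$labels.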