predict.UnifiedClustering.Rd
Similar to other predict methods, this function provides a unified interface for cluster assignment: it assigns data to clusters previously generated by a clustering method, including K-Means, Accelerated K-Means, K-Medians, K-Medoids, DBSCAN, SOM, and GMM. AgglomerateHierarchicalClustering does not provide a predict function.
# S3 method for UnifiedClustering
predict(
model,
data,
key = NULL,
features = NULL,
func = NULL,
group.key = NULL
)
S3 methods
model
R6Class
A "hanaml.UnifiedClustering" object for prediction.
data
DataFrame
DataFrame containing the data.
key
character, optional
Name of the ID column.
If not provided, the data is assumed to have no ID column.
No default value.
features
character or list of characters, optional
Names of the feature columns for prediction.
If not provided, it defaults to all non-key columns of data.
func
character, optional
The functionality for the unified clustering model.
Mandatory only when the func attribute of model is NULL.
Valid values:
"DBSCAN"
"GaussianMixture"
"AcceleratedKMeans"
"KMeans"
"KMedians"
"KMedoids"
"SOM"
"AffinityPropagation"
group.key
character, optional
The column of group key.
This parameter is only valid when model$massive is TRUE.
Defaults to the first column of data if group.key is not provided.
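The defaulting rules for features and func can be sketched in base R. This is a minimal illustration on plain character vectors, not the package's implementation: the helper names resolve_features and resolve_func are hypothetical, and the choice to prefer the model's own func over the func argument when both are present is an assumption.

```r
# Hypothetical helpers mirroring the documented defaults; not part of the package.
valid.funcs <- c("DBSCAN", "GaussianMixture", "AcceleratedKMeans", "KMeans",
                 "KMedians", "KMedoids", "SOM", "AffinityPropagation")

# features defaults to every column that is not the key.
resolve_features <- function(columns, key = NULL, features = NULL) {
  if (!is.null(features)) return(features)
  setdiff(columns, key)
}

# func is required only when the model does not already carry one.
# Assumption: the model's own func wins when both are supplied.
resolve_func <- function(model.func = NULL, func = NULL) {
  chosen <- if (!is.null(model.func)) model.func else func
  if (is.null(chosen)) stop("func must be supplied when the model has none")
  match.arg(chosen, valid.funcs)
}

cols <- c("ID", "X1", "X2")
default.features <- resolve_features(cols, key = "ID")  # "X1" "X2"
chosen.func <- resolve_func(func = "KMeans")            # "KMeans"
```

Calling resolve_func() with neither value set stops with an error, which corresponds to the case where func is mandatory.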
Predicted values are returned as a list of DataFrames.
DataFrame 1:
ID: column name.
CLUSTER_ID: Assigned cluster ID.
DISTANCE: Distance metric between a given point and the assigned cluster.
DataFrame 2:
Error messages. Only valid if massive is TRUE.
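To make the CLUSTER_ID and DISTANCE columns concrete, the following base R sketch assigns points to the nearest of a set of fixed centers, which is how assignment works for centroid-based models such as K-Means. It runs locally on made-up values and is only an illustration: the function assign_clusters is hypothetical, Euclidean distance and 1-based cluster numbering are assumptions, and other algorithms (for example DBSCAN or GMM) assign clusters differently.

```r
# Hypothetical local stand-in for centroid-based cluster assignment.
# points:  numeric matrix, one row per observation
# centers: numeric matrix, one row per cluster center
assign_clusters <- function(ids, points, centers) {
  # Euclidean distance from every point (rows) to every center (columns).
  d <- apply(centers, 1, function(ctr) sqrt(rowSums(sweep(points, 2, ctr)^2)))
  d <- matrix(d, nrow = nrow(points))
  # Index of the closest center for each point.
  nearest <- apply(d, 1, which.min)
  data.frame(ID = ids,
             CLUSTER_ID = nearest,
             DISTANCE = d[cbind(seq_len(nrow(points)), nearest)])
}

# Made-up centers and points, for illustration only.
centers <- rbind(c(0, 0), c(10, 10))
points <- rbind(c(1, 0), c(9, 10), c(0, 2))
assigned <- assign_clusters(ids = c(1, 2, 3), points = points, centers = centers)
```

Each row of assigned pairs an ID with the index of its closest center and the distance to that center, matching the shape of the first returned DataFrame.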
Input data for prediction is a DataFrame, df.predict, whose ID column is named "ID".
Call the predict() function:
> res <- predict(model = ukmeans,
                 data = df.predict,
                 key = "ID",
                 func = "KMeans")
Check the result:
> res$Collect()
ID CLUSTER_ID DISTANCE
1 88 3 0.981659
2 89 3 0.826454
3 90 2 1.990205
4 91 2 0.325812