Like other predict methods, this function provides a unified interface for cluster assignment: it assigns data to clusters previously generated by a clustering method. Supported methods are K-Means, Accelerated K-Means, K-Medians, K-Medoids, DBSCAN, SOM, and GMM. AgglomerateHierarchicalClustering does not provide a predict function.

# S3 method for UnifiedClustering
predict(model, data, key, features = NULL, func = NULL)

Arguments

model

R6Class
A "UnifiedClustering" object for prediction.

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column.

features

character or list of characters, optional
Names of the feature columns used for prediction.
If not provided, it defaults to all non-key columns of data.

func

character, optional
The functionality of the unified clustering model.
Mandatory only when the func attribute of model is NULL.
Valid values are as follows:
"AgglomerateHierarchicalClustering", "DBSCAN", "GaussianMixture", "AcceleratedKMeans", "KMeans", "KMedians", "KMedoids", "SOM".

Format

S3 methods

Value

Predicted values are returned as a DataFrame, structured as follows.

  • ID: ID column, named after the key column of data.

  • CLUSTER_ID: Assigned cluster ID.

  • DISTANCE: Distance metric between a given point and the assigned cluster.
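
To make the CLUSTER_ID and DISTANCE columns concrete, here is a minimal base R sketch of nearest-centre assignment, the idea behind cluster assignment for a K-Means style model. It is not the library's implementation: assign_clusters() is a hypothetical helper, the centres and points are made up for illustration, Euclidean distance is assumed, and the 1-based cluster numbering may differ from what the library returns.

```r
# Hypothetical cluster centres and new points (one row each).
centers <- rbind(c(0, 0), c(10, 10))
points  <- rbind(c(1, 1), c(9, 8))
ids     <- c(101, 102)

assign_clusters <- function(ids, points, centers) {
  # Euclidean distance from every point (rows) to every centre (columns).
  d <- apply(centers, 1, function(ctr) {
    sqrt(rowSums(sweep(points, 2, ctr)^2))
  })
  d <- matrix(d, nrow = nrow(points))  # keep matrix shape for a single point
  # Index of the closest centre for each point.
  nearest <- apply(d, 1, which.min)
  data.frame(
    ID = ids,
    CLUSTER_ID = nearest,
    DISTANCE = d[cbind(seq_len(nrow(points)), nearest)]
  )
}

res <- assign_clusters(ids, points, centers)
```

Each row of res pairs an ID with the index of its closest centre and the distance to that centre, which is the same three-column shape as the DataFrame described above.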

Examples

Input data for prediction:

> df.predict$Collect()
   ID  CLUSTER_ID  DISTANCE
1  88           3  0.981659
2  89           3  0.826454
3  90           2  1.990205
4  91           2  0.325812

Call the predict() function:

> res <- predict(model = ukmeans,
                 data = df.predict,
                 key = "ID",
                 func = "KMeans")

Check the result:

> res$Collect()
   ID  CLUSTER_ID  DISTANCE
1  88           3  0.981659
2  89           3  0.826454
3  90           2  1.990205
4  91           2  0.325812
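
Once collected, the result can be summarised with ordinary R tools. The sketch below assumes that Collect() returns a regular data.frame, as the printed output suggests, and rebuilds the four rows shown above locally so that it runs without a database connection; the variable names are illustrative.

```r
# Local copy of the collected result shown above.
collected <- data.frame(
  ID = c(88, 89, 90, 91),
  CLUSTER_ID = c(3, 3, 2, 2),
  DISTANCE = c(0.981659, 0.826454, 1.990205, 0.325812)
)

# Number of points assigned to each cluster.
sizes <- table(collected$CLUSTER_ID)

# Mean distance to the assigned cluster, per cluster.
mean_dist <- aggregate(DISTANCE ~ CLUSTER_ID, data = collected, FUN = mean)

# ID of the point that lies farthest from its assigned cluster.
farthest <- collected[which.max(collected$DISTANCE), "ID"]
```

A large DISTANCE relative to the rest of the cluster, as for the point returned in farthest, can be a hint that the point fits its assigned cluster poorly.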

See also