predict.KDE.Rd
Apply Kernel Density Estimation analysis
# S3 method for KDE
predict(
  model,
  data = NULL,
  key = NULL,
  features = NULL,
  thread.ratio = NULL,
  stat.info = NULL
)
| model | |
|---|---|
| data | |
| key | |
| features | |
| thread.ratio | |
| stat.info | |
S3 methods
Returns a list of DataFrames.
DataFrame 1
Evaluated log density values of the data points, structured as follows:
ID: ID of the evaluated data point.
DENSITY_VALUE: log density value.
DataFrame 2
Statistics information, structured as follows:
COMMUNALITIES : DataFrame
TEST_ID: ID of evaluated test data point.
FITTING_IDS: Fitting IDs.
Input DataFrames data.df.fit and data.eval.df.fit:
> data.df.fit$Collect()
  ID         X1         X2
1  0 -2.1029683 -1.4283269
2  1 -2.1029683  0.7197969
3  2 -2.1029683  2.8679208
4  3 -0.6094340 -1.4283269
5  4 -0.6094340  0.7197969
6  5 -0.6094340  2.8679208
7  6  0.8841004 -1.4283269
8  7  0.8841004  0.7197969
9  8  0.8841004  2.8679208
> data.eval.df.fit$Collect()
   ID         X1          X2
1   0 -0.4257698 -1.39613035
2   1  0.8841004  1.38149350
3   2  0.1341262 -0.03222389
4   3  0.8455036  2.86792078
5   4  0.2884408  1.51333705
6   5 -0.6667847  1.24498042
7   6 -2.1029683 -1.42832694
8   7  0.7699024 -0.47300711
9   8  0.2102913  0.32843074
10  9  0.4823225 -0.43796174
Call the function:
estimation <- hanaml.KDE(data = data.df.fit,
                         leaf.size = 10,
                         algorithm = "kd-tree",
                         bandwidth = 0.68129,
                         distance.level = "euclidean",
                         kernel = "gaussian")
eval.result <- predict(estimation,
                       data = data.eval.df.fit,
                       stat.info = TRUE)
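For intuition about what a log density with kernel = "gaussian" and distance.level = "euclidean" represents, the following is a minimal local sketch of the textbook Gaussian kernel density estimate on the log scale. It is not the PAL implementation: the helper name log_kde_gaussian is hypothetical, it works on plain R matrices or data.frames rather than HANA DataFrames, and the server-side procedure may scale or normalize differently, so its values need not match the output shown on this page.

```r
# Textbook Gaussian KDE, evaluated on the log scale.
# Illustration only: `log_kde_gaussian` is a hypothetical helper,
# not part of hana.ml.r, and may not reproduce PAL's results.
log_kde_gaussian <- function(fit, eval, h) {
  fit  <- as.matrix(fit)    # n x d matrix of fitting points
  eval <- as.matrix(eval)   # m x d matrix of evaluation points
  n <- nrow(fit)
  d <- ncol(fit)
  # log of the constant 1 / (n * h^d * (2*pi)^(d/2))
  log_norm <- -log(n) - d * log(h) - (d / 2) * log(2 * pi)
  apply(eval, 1, function(x) {
    sq <- rowSums(sweep(fit, 2, x)^2)   # squared Euclidean distances to x
    a  <- -sq / (2 * h^2)               # log of the unnormalized kernel terms
    m  <- max(a)
    log_norm + m + log(sum(exp(a - m))) # log-sum-exp for numerical stability
  })
}

# Usage on a tiny local example: one fitting point at the origin,
# evaluated at the origin with bandwidth 1.
log_kde_gaussian(matrix(0), matrix(0), h = 1)
```

Working on the log scale with the log-sum-exp step avoids underflow when an evaluation point is far from every fitting point.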
Output:
> eval.result[[1]]$Collect()
  ID DENSITY_VALUE
1  0     -3.852755
2  1     -4.586453
3  2     -6.110158
4  3     -3.275507
5  4     -2.888267
6  5     -4.107246
7  6     -3.387239
8  7     -2.732173
9  8     -3.554738
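Because DENSITY_VALUE is reported on the log scale, it can be back-transformed to obtain densities. The sketch below assumes the values are natural logarithms and that the collected result behaves like an ordinary R data.frame. It uses a local data.frame holding the first three rows of the output above, so it runs without a HANA connection.

```r
# First three rows copied from the output above.
log.density <- data.frame(
  ID = 0:2,
  DENSITY_VALUE = c(-3.852755, -4.586453, -6.110158)
)

# Assumption: DENSITY_VALUE is a natural log, so exp() recovers the density.
log.density$DENSITY <- exp(log.density$DENSITY_VALUE)

print(log.density)
```

With a live connection, the same transformation could be applied to the data.frame returned by eval.result[[1]]$Collect().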