Evaluate the log density of data points using a fitted Kernel Density Estimation (KDE) model

# S3 method for KDE
predict(
  model,
  data = NULL,
  key = NULL,
  features = NULL,
  thread.ratio = NULL,
  stat.info = NULL
)

Format

S3 methods

Arguments

model

R6Class object
A 'KDE' object.

data

DataFrame
DataFrame containing the data points whose density values need to be evaluated.

key

character, optional
Name of the ID column.
Defaults to the first column if not provided.

features

character or list of characters, optional
Names of features columns.
If not provided, it defaults to all non-key columns of data.

thread.ratio

double, optional
Controls the proportion of available threads that can be used by this function.
The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates all available threads.
Values between 0 and 1 will use up to that percentage of available threads. Values outside this range are ignored.
Defaults to 0.

stat.info

logical, optional
If TRUE, return a DataFrame with statistics information.
Defaults to FALSE.
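
The defaults described above for key and features can be illustrated with a minimal base R sketch. The helper resolve.columns is hypothetical and is not part of the package; it only mirrors the documented rules (key falls back to the first column, features fall back to all non-key columns) on a plain vector of column names.

```r
# Hypothetical helper, not part of the package: mirrors the documented
# defaults for `key` and `features` using plain column names.
resolve.columns <- function(columns, key = NULL, features = NULL) {
  # key defaults to the first column
  if (is.null(key)) key <- columns[1]
  # features default to every column except the key
  if (is.null(features)) features <- setdiff(columns, key)
  # features may be given as a list of characters; flatten to a vector
  list(key = key, features = unlist(features))
}

resolved <- resolve.columns(c("ID", "X1", "X2"))
print(resolved)
```

For the example data used below, this resolves to key "ID" and features "X1" and "X2", so passing neither argument is equivalent to naming them explicitly.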

Value

Returns a list of DataFrames.

  • DataFrame 1
    Evaluated log density values of the data points, structured as follows:

    • ID: ID of the evaluated data point.

    • DENSITY_VALUE: Log density value.

  • DataFrame 2
    Statistics information, requested with stat.info = TRUE, structured as follows:

    • TEST_ID: ID of evaluated test data point.

    • FITTING_IDS: Fitting IDs.

Examples

Input DataFrames data.df.fit and data.eval.df.fit:


> data.df.fit$Collect()
  ID         X1         X2
1  0 -2.1029683 -1.4283269
2  1 -2.1029683  0.7197969
3  2 -2.1029683  2.8679208
4  3 -0.6094340 -1.4283269
5  4 -0.6094340  0.7197969
6  5 -0.6094340  2.8679208
7  6  0.8841004 -1.4283269
8  7  0.8841004  0.7197969
9  8  0.8841004  2.8679208
> data.eval.df.fit$Collect()
   ID         X1          X2
1   0 -0.4257698 -1.39613035
2   1  0.8841004  1.38149350
3   2  0.1341262 -0.03222389
4   3  0.8455036  2.86792078
5   4  0.2884408  1.51333705
6   5 -0.6667847  1.24498042
7   6 -2.1029683 -1.42832694
8   7  0.7699024 -0.47300711
9   8  0.2102913  0.32843074
10  9  0.4823225 -0.43796174

Call the function:

estimation <- hanaml.KDE(data = data.df.fit,
                         leaf.size = 10,
                         algorithm = "kd-tree",
                         bandwidth = 0.68129,
                         distance.level = "euclidean",
                         kernel = "gaussian")

eval.result <- predict(estimation,
                       data = data.eval.df.fit,
                       stat.info = TRUE)

Output:


> eval.result[[1]]$Collect()
  ID DENSITY_VALUE
1  0     -3.852755
2  1     -4.586453
3  2     -6.110158
4  3     -3.275507
5  4     -2.888267
6  5     -4.107246
7  6     -3.387239
8  7     -2.732173
9  8     -3.554738
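
Because DENSITY_VALUE is reported on the log scale, it can be convenient to convert it back to a density after collecting the result. The following is a minimal base R sketch, assuming the log is the natural logarithm (the base is not stated here). The data frame is rebuilt by hand from the output shown above so that the snippet runs without a database connection; the names density.df and DENSITY are illustrative only.

```r
# Values copied from the example output above; in practice this data frame
# would come from eval.result[[1]]$Collect().
density.df <- data.frame(
  ID = 0:8,
  DENSITY_VALUE = c(-3.852755, -4.586453, -6.110158, -3.275507, -2.888267,
                    -4.107246, -3.387239, -2.732173, -3.554738)
)

# Assumption: DENSITY_VALUE is a natural log, so exp() recovers the density.
density.df$DENSITY <- exp(density.df$DENSITY_VALUE)

# Order the evaluated points from highest to lowest estimated density.
ranked <- density.df[order(density.df$DENSITY, decreasing = TRUE), ]
print(ranked)
```

Sorting on the log values directly gives the same ordering, since exp() is monotonic; the conversion only matters when the density itself is needed.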

See also