hanaml.GeoDBSCAN {hana.ml.r} | R Documentation |
Geometry DBSCAN
Description
hanaml.GeoDBSCAN is an R wrapper for the PAL GeoDBSCAN algorithm.
Usage
hanaml.GeoDBSCAN(conn.context,
data = NULL,
key = NULL,
features = NULL,
minpts = NULL,
eps = NULL,
thread.ratio = NULL,
metric = NULL,
minkowski.power = NULL,
algorithm = NULL,
save.model = NULL)
Arguments
conn.context |
ConnectionContext
Connection to the SAP HANA System
|
data |
DataFrame
DataFrame containing the data.
|
key |
character
Name of ID column.
|
features |
character, optional
Name of the feature column. GeoDBSCAN supports only one feature.
If not provided, it defaults to the first non-ID column.
|
minpts |
integer, optional
The minimum number of points required to form a cluster.
Note that minpts and eps must either both be provided by the user or
both be left unset, in which case they are determined automatically.
|
eps |
double, optional
The scan radius.
Note that minpts and eps must either both be provided by the user or
both be left unset, in which case they are determined automatically.
|
thread.ratio |
double, optional
Controls the proportion of available threads to use.
The value range is from 0 to 1, where 0 indicates a single thread,
and 1 indicates up to all available threads. Values between 0 and 1
will use that percentage of available threads. Values outside this
range tell PAL to heuristically determine the number of threads to use.
Defaults to -1.
|
metric |
character, optional
The method used to compute the distance between two points. Valid options are:
'manhattan'
'euclidean'
'minkowski'
'chebyshev'
'standardized.euclidean'
'cosine'
Defaults to "euclidean".
|
minkowski.power |
integer, optional
The power used by the Minkowski metric.
Only applicable when metric is 'minkowski'.
Defaults to 3.
|
algorithm |
{"brute.force", "kd.tree"}, optional
The method used to search for neighbours.
Defaults to "kd.tree".
|
save.model |
logical, optional
If TRUE, the generated model will be saved.
save.model must be TRUE to call.
Defaults to TRUE.
|
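The metric options listed above can be made concrete with a small base R sketch. It uses the textbook definitions of four of the listed metrics on plain numeric coordinates. How PAL applies each metric to geometry values is not stated here, so the function names and formulas below are illustrative assumptions, not the PAL implementation. 'standardized.euclidean' and 'cosine' are left out of the sketch.

```r
# Textbook distance definitions on numeric vectors (illustrative only;
# these helper names are hypothetical and not part of hana.ml.r).
manhattan_dist <- function(p, q) sum(abs(p - q))
euclidean_dist <- function(p, q) sqrt(sum((p - q)^2))
chebyshev_dist <- function(p, q) max(abs(p - q))

# power = 3 mirrors the documented default of minkowski.power.
minkowski_dist <- function(p, q, power = 3) {
  sum(abs(p - q)^power)^(1 / power)
}

# Two points taken from the sample data in the Examples section.
p <- c(0.10, 0.10)
q <- c(0.12, 0.11)

distances <- c(
  manhattan = manhattan_dist(p, q),
  euclidean = euclidean_dist(p, q),
  minkowski = minkowski_dist(p, q, power = 3),
  chebyshev = chebyshev_dist(p, q)
)
print(distances)
```

With power = 1 the Minkowski distance reduces to the Manhattan distance, and with power = 2 it reduces to the Euclidean distance, which is why minkowski.power only matters when metric is 'minkowski'.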
Format
R6Class object.
Value
A "GeoDBSCAN" object with the following attributes:
Examples
## Not run:
The input table can be created with the following SQL:
> CREATE COLUMN TABLE PAL_GEO_DBSCAN_DATA_TBL (
"ID" INTEGER,
"POINT" ST_GEOMETRY);
Input DataFrame for clustering:
> data$collect()
ID V1 V2 V3
0 1 0.10 0.10 B
1 2 0.11 0.10 A
2 3 0.10 0.11 C
3 4 0.11 0.11 B
4 5 0.12 0.11 A
5 6 0.11 0.12 E
...
27 28 16.11 16.11 A
28 29 20.11 20.12 C
29 30 15.12 15.11 A
> GeoDBSCAN <- hanaml.GeoDBSCAN(conn,
data,
thread.ratio = 0.2,
metric = "manhattan")
Expected output:
> GeoDBSCAN$labels$Collect()
ID CLUSTER.ID
1 1 0
2 2 0
3 3 0
4 4 0
5 5 0
...
28 28 -1
29 29 -1
30 30 -1
## End(Not run)
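To show how minpts and eps interact, and what a CLUSTER.ID of -1 in the output above may mean, here is a minimal brute-force DBSCAN sketch in base R. It is not the PAL implementation. It assumes Euclidean distance on plain numeric coordinates, counts a point as part of its own neighbourhood, and follows the common DBSCAN convention of labelling noise as -1. The function name simple_dbscan is hypothetical.

```r
# Illustrative brute-force DBSCAN (assumption: NOT how PAL implements it).
# pts:    numeric matrix, one row per point
# eps:    scan radius
# minpts: minimum neighbourhood size, the point itself included
simple_dbscan <- function(pts, eps, minpts) {
  n <- nrow(pts)
  d <- as.matrix(dist(pts, method = "euclidean"))  # all pairwise distances
  labels <- rep(NA_integer_, n)                    # NA means "not visited yet"
  cluster <- -1L
  for (i in seq_len(n)) {
    if (!is.na(labels[i])) next
    neighbours <- which(d[i, ] <= eps)
    if (length(neighbours) < minpts) {
      labels[i] <- -1L                             # provisional noise
      next
    }
    cluster <- cluster + 1L                        # new cluster, numbered from 0
    labels[i] <- cluster
    queue <- setdiff(neighbours, i)
    while (length(queue) > 0) {
      j <- queue[1]
      queue <- queue[-1]
      # A point earlier marked as noise becomes a border point of this cluster.
      if (!is.na(labels[j]) && labels[j] == -1L) labels[j] <- cluster
      if (!is.na(labels[j])) next                  # already assigned, skip
      labels[j] <- cluster
      nb <- which(d[j, ] <= eps)
      if (length(nb) >= minpts) queue <- union(queue, nb)  # j is a core point
    }
  }
  labels
}

# Rows 1-4, 28 and 29 of the sample data (V1 and V2 columns only).
pts <- rbind(c(0.10, 0.10), c(0.11, 0.10), c(0.10, 0.11),
             c(0.11, 0.11), c(16.11, 16.11), c(20.11, 20.12))
labels <- simple_dbscan(pts, eps = 0.05, minpts = 3)
print(labels)  # 0 0 0 0 -1 -1
```

The four nearby points form cluster 0, while the two distant points have no neighbours within eps and stay labelled -1. The eps and minpts values here were chosen by hand for the sketch. hanaml.GeoDBSCAN determines both automatically when they are not supplied.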
[Package hana.ml.r version 1.0.8]