hanaml.AffinityPropagation {hana.ml.r}    R Documentation

Affinity Propagation

Description

hanaml.AffinityPropagation is an R wrapper for the PAL Affinity Propagation algorithm.

Usage

hanaml.AffinityPropagation(conn.context,
                           data,
                           key,
                           features = NULL,
                           affinity,
                           n.clusters,
                           max.iter = NULL,
                           convergence.iter = NULL,
                           damping = NULL,
                           preference = NULL,
                           seed.ratio = NULL,
                           times = NULL,
                           minkowski.power = NULL,
                           thread.ratio = NULL)

Arguments

conn.context

ConnectionContext
Connection to the SAP HANA System

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column.

features

character or list of characters, optional
Names of the feature columns.

affinity

character
Method used to compute the distance between two points. Valid values:

  • 'manhattan'

  • 'euclidean'

  • 'minkowski'

  • 'chebyshev'

  • 'standardized.euclidean'

  • 'cosine'

No default value as it is mandatory.
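
For orientation, the distance options listed above can be sketched locally in base R. This only illustrates the textbook definitions and is not PAL's implementation. The helper name pairDistance and its power and sds arguments are hypothetical. The 'cosine' case assumes distance = 1 - cosine similarity, and 'standardized.euclidean' assumes each feature difference is scaled by that feature's standard deviation.

```r
# Hypothetical helper: textbook distance between two numeric vectors.
# Not part of hana.ml.r; PAL's internal definitions may differ in detail.
pairDistance <- function(x, y, affinity, power = 1, sds = NULL) {
  d <- abs(x - y)
  switch(affinity,
    manhattan = sum(d),
    euclidean = sqrt(sum(d^2)),
    minkowski = sum(d^power)^(1 / power),
    chebyshev = max(d),
    # Assumption: each feature difference is divided by that feature's
    # standard deviation (sds) before the Euclidean distance is taken.
    standardized.euclidean = sqrt(sum((d / sds)^2)),
    # Assumption: cosine distance = 1 - cosine similarity.
    cosine = 1 - sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2))),
    stop("unknown affinity: ", affinity)
  )
}

a <- c(0, 0)
b <- c(3, 4)
manhattanDist <- pairDistance(a, b, "manhattan")
euclideanDist <- pairDistance(a, b, "euclidean")
chebyshevDist <- pairDistance(a, b, "chebyshev")
```

With power = 1 the Minkowski distance coincides with the Manhattan distance, and with power = 2 it coincides with the Euclidean distance.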

n.clusters

integer
Number of clusters.

  • 0: Does not adjust the Affinity Propagation clustering result.

  • Non-zero integer: If the number of clusters found by Affinity Propagation is greater than n.clusters, PAL merges clusters until the number of clusters equals n.clusters.

max.iter

integer, optional
Maximum number of iterations.
Defaults to 500.

convergence.iter

integer, optional
The algorithm stops when the clusters remain unchanged for the specified number of iterations.
Defaults to 100.
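
The interplay between max.iter and convergence.iter can be pictured as the stopping rule below. This is a sketch under assumptions: runUntilSteady, stepFun and toyStep are hypothetical names, and the exact check PAL performs is not described in this page.

```r
# Hypothetical sketch of the stopping rule: iterate until the cluster
# assignment has stayed identical for `convergence.iter` consecutive
# iterations, or until `max.iter` iterations have run.
runUntilSteady <- function(stepFun, max.iter = 500L, convergence.iter = 100L) {
  previous <- NULL
  steady <- 0L
  for (iter in seq_len(max.iter)) {
    current <- stepFun(iter)
    if (!is.null(previous) && identical(current, previous)) {
      steady <- steady + 1L
    } else {
      steady <- 0L
    }
    if (steady >= convergence.iter) {
      return(list(iterations = iter, converged = TRUE))
    }
    previous <- current
  }
  list(iterations = max.iter, converged = FALSE)
}

# Toy step function: assignments change during the first 3 iterations,
# then stay fixed from iteration 4 onwards.
toyStep <- function(iter) if (iter <= 3L) c(iter, 0L) else c(1L, 1L)
res <- runUntilSteady(toyStep, max.iter = 50L, convergence.iter = 5L)
```

In this toy run the assignment first repeats at iteration 5, so the fifth consecutive unchanged iteration is iteration 9, well before max.iter is reached.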

damping

double, optional
Controls the updating velocity. Value range: (0, 1).
Defaults to 0.9.
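
In the usual formulation of affinity propagation, damping blends each newly computed message with its previous value, so a larger damping value means slower but more stable updates. The sketch below assumes PAL follows that formulation; dampedUpdate is a hypothetical helper, not a hana.ml.r function.

```r
# Hypothetical helper: damped update of a message matrix, following the
# usual affinity propagation formulation (an assumption about PAL).
dampedUpdate <- function(old, candidate, damping = 0.9) {
  stopifnot(damping > 0, damping < 1)
  damping * old + (1 - damping) * candidate
}

old <- matrix(0, nrow = 2, ncol = 2)
candidate <- matrix(1, nrow = 2, ncol = 2)
# With damping = 0.9, only 10 percent of the new value is taken over.
updated <- dampedUpdate(old, candidate, damping = 0.9)
```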

preference

double, optional
Determines the preference. Value range: [0,1].
Defaults to 0.5.

seed.ratio

double, optional
Selects a portion (seed.ratio * data_number) of the input data as seeds, where data_number is the number of rows in the input data. Value range: (0,1]. If seed.ratio is 1, all input data are used as seeds.
Defaults to 1.
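
As a rough local illustration of the seed.ratio * data_number rule, the sketch below draws seed rows from an ordinary data.frame. The helper pickSeeds is hypothetical, and two details are assumptions rather than documented PAL behaviour: the seed count is rounded up, and the seeds are drawn uniformly at random.

```r
# Hypothetical sketch: pick seed rows from a local data.frame.
# Assumptions: the seed count is rounded up to a whole number of rows,
# and rows are drawn uniformly at random without replacement.
pickSeeds <- function(df, seed.ratio = 1) {
  stopifnot(seed.ratio > 0, seed.ratio <= 1)
  nSeed <- ceiling(seed.ratio * nrow(df))
  df[sample(nrow(df), nSeed), , drop = FALSE]
}

set.seed(42)
pts <- data.frame(ID = 1:10, V1 = runif(10), V2 = runif(10))
seeds <- pickSeeds(pts, seed.ratio = 0.5)  # half of the 10 rows
allSeeds <- pickSeeds(pts, seed.ratio = 1) # every row is a seed
```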

times

integer, optional
The number of times sampling is performed. Only valid when seed.ratio is less than 1 and affinity is 'minkowski'.
Defaults to 1.

minkowski.power

integer, optional
The power of the Minkowski distance. Only valid when affinity is 'minkowski'.
Defaults to 1.

thread.ratio

numeric, optional
Specifies the ratio of the total number of threads that can be used by this function. The value range is from 0 to 1, where 0 means using only 1 thread, and 1 means using at most all the currently available threads. Values outside the range are ignored, in which case the function heuristically determines the number of threads to use.
Defaults to 0.

Format

R6Class object.

Value

An "AffinityPropagation" object with the following attributes:

labels: DataFrame
Cluster label assigned to each sample, with columns ID and CLUSTER_ID.

Examples

## Not run: 
Input DataFrame data:
> data$collect()
       ID     V1     V2
   0    1   0.10   0.10
   1    2   0.11   0.10
   2    3   0.10   0.11
   3    4   0.11   0.11
   4    5   0.12   0.11
   5    6   0.11   0.12
   ...
   20  21  10.13  10.12
   21  22  10.13  10.13
   22  23  10.13  10.14
   23  24  10.14  10.13

Create an AffinityPropagation instance:
> ap <- hanaml.AffinityPropagation(conn.context = conn,
                                   data = data,
                                   key = "ID",
                                   affinity = 'euclidean',
                                   n.clusters = 0L,
                                   max.iter = 500L,
                                   convergence.iter = 100L,
                                   damping = 0.9,
                                   preference = 0.5,
                                   times = 1L,
                                   seed.ratio = 1,
                                   minkowski.power = 0,
                                   thread.ratio = 0)
Expected output:
> ap$labels$collect()
    ID  CLUSTER_ID
0    1           0
1    2           0
2    3           0
3    4           0
4    5           0
5    6           0
...
21  22           1
22  23           1
23  24           1

## End(Not run)

[Package hana.ml.r version 1.0.8 Index]