hanaml.AprioriLite {hana.ml.r}                R Documentation

Lite Apriori algorithm for association rule mining

Description

Lite Apriori algorithm for association rule mining, based on PAL_LITE_APRIORI.

Usage

hanaml.AprioriLite (conn.context, data, used.cols = NULL,
                   min.support, min.confidence,
                   thread.ratio = NULL, subsample = NULL,
                   recalculate = NULL, timeout = NULL,
                   pmml.export = NULL)

Arguments

conn.context

ConnectionContext
Database connection object.

data

DataFrame
Dataset used for association rule mining.

min.support

numeric
User-specified minimum support value for rule generation.

min.confidence

numeric
User-specified minimum confidence value for rule generation.

used.cols

list of characters, optional
Specifies the columns in data that contain transaction IDs and item IDs. For example, if the transaction ID column of data is "CUSTOMER" and the item ID column is "ITEM", set this parameter as follows:

used.cols = list("transaction" = "CUSTOMER", "item" = "ITEM")

The transaction ID column defaults to the 1st column of data, and the item ID column defaults to the 2nd column of data.

subsample

double, optional
User-specified subsampling rate of the data used for rule mining, ranging from 0 to 1. Set to 1 to use the entire input data. Defaults to 1.

recalculate

logical, optional
If the data is subsampled, this parameter controls whether to use the remaining data to update the computed support, confidence and lift values. Valid only when 'subsample' is not 1. Defaults to TRUE.
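
As a rough illustration of how subsample and recalculate interact, the base R sketch below estimates the statistics of one rule from a random half of the transactions and then recomputes them over all transactions. The helper rule_stats, the seed and the sampling scheme are assumptions made for this sketch, and the transactions are the nine from the Examples section written out by hand. How PAL actually samples and updates the values is not specified here.

```r
# The nine transactions from the Examples section, keyed by transaction ID.
baskets <- list(
  "0" = c("item1", "item2", "item5"),
  "1" = c("item2", "item4"),
  "2" = c("item2", "item3"),
  "3" = c("item1", "item2", "item4"),
  "4" = c("item1", "item3"),
  "5" = c("item2", "item3"),
  "6" = c("item1", "item3"),
  "7" = c("item1", "item2", "item3", "item5"),
  "8" = c("item1", "item2", "item3")
)

# Hypothetical helper: support and confidence of antecedent -> consequent
# over the given set of baskets.
rule_stats <- function(baskets, antecedent, consequent) {
  has_a <- vapply(baskets, function(b) antecedent %in% b, logical(1))
  has_both <- vapply(baskets,
                     function(b) all(c(antecedent, consequent) %in% b),
                     logical(1))
  c(support = mean(has_both), confidence = sum(has_both) / sum(has_a))
}

subsample <- 0.5
set.seed(42)  # arbitrary seed so the sketch is repeatable
keep <- sample(length(baskets), size = ceiling(subsample * length(baskets)))

# Statistics estimated from the sampled transactions only.
estimated <- rule_stats(baskets[keep], "item1", "item2")

# Statistics recomputed over all transactions (assumed reading of
# recalculate = TRUE).
recalculated <- rule_stats(baskets, "item1", "item2")

print(estimated)
print(recalculated)
```

The estimated values depend on which transactions are drawn, while the recalculated values are fixed by the full data.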

timeout

integer, optional
Specifies the maximum run time in seconds for association rule mining. The algorithm will stop running when the specified timeout is reached.

thread.ratio

double, optional
Controls the proportion of available threads to use. The value range is from 0 to 1, where 0 means only using 1 thread, and 1 means using at most all the currently available threads. Values outside this range tell PAL to heuristically determine the number of threads to use.

pmml.export

('no', 'single-row', 'multi-row'), optional
Controls whether to output a PMML representation of the model, and how to format the PMML. Case-insensitive.

  • 'no': No PMML model.

  • 'single-row': Export a PMML model in a single row.

  • 'multi-row': Export a PMML model in multiple rows, each row with a maximum length of 5000 characters.
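
To make the 'multi-row' layout concrete, the base R sketch below splits one long string into consecutive pieces of at most 5000 characters. The function split_into_rows and the placeholder string are hypothetical; the exact way PAL splits and orders the rows is an assumption here.

```r
# Hypothetical helper: split one long string into consecutive pieces of
# at most `width` characters.
split_into_rows <- function(text, width = 5000L) {
  starts <- seq(1L, max(nchar(text), 1L), by = width)
  substring(text, starts, starts + width - 1L)
}

# Placeholder standing in for a PMML document (not real PMML output).
pmml <- paste(rep("x", 12000L), collapse = "")

rows <- split_into_rows(pmml)
print(nchar(rows))
```

Pasting the pieces back together in order with paste(rows, collapse = "") recovers the original string.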

Format

R6Class object.

Value

An "AprioriLite" object with the following attributes:

Examples

## Not run: 
Input data for association rule mining:

> df
   CUSTOMER  ITEM
1         2 item2
2         2 item3
3         3 item1
4         3 item2
5         3 item4
6         4 item1
7         4 item3
8         5 item2
9         5 item3
10        6 item1
11        6 item3
12        0 item1
13        0 item2
14        0 item5
15        1 item2
16        1 item4
17        7 item1
18        7 item2
19        7 item3
20        7 item5
21        8 item1
22        8 item2
23        8 item3

Apply the Lite Apriori algorithm to the input data:

> apl <- hanaml.AprioriLite(conn.context = conn, data = df,
                           used.cols = list("transaction" = "CUSTOMER", "item" = "ITEM"),
                           min.support = 0.1, min.confidence = 0.3,
                           pmml.export = 'single-row')

Check the mined association rules:

> apl$result
   ANTECEDENT CONSEQUENT   SUPPORT CONFIDENCE      LIFT
1       item5      item2 0.2222222  1.0000000 1.2857143
2       item1      item5 0.2222222  0.3333333 1.5000000
3       item5      item1 0.2222222  1.0000000 1.5000000
4       item5      item3 0.1111111  0.5000000 0.7500000
5       item1      item2 0.4444444  0.6666667 0.8571429
6       item2      item1 0.4444444  0.5714286 0.8571429
7       item4      item2 0.2222222  1.0000000 1.2857143
8       item3      item2 0.4444444  0.6666667 0.8571429
9       item2      item3 0.4444444  0.5714286 0.8571429
10      item4      item1 0.1111111  0.5000000 0.7500000
11      item3      item1 0.4444444  0.6666667 1.0000000
12      item1      item3 0.4444444  0.6666667 1.0000000

## End(Not run)
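
The support, confidence and lift values above can be checked without a database connection. The base R sketch below rebuilds the example data locally, enumerates every rule with one antecedent item and one consequent item, and keeps those that meet min.support = 0.1 and min.confidence = 0.3. The helpers support and rule_stats are hypothetical names for this sketch. It illustrates the standard definitions of the three measures, not the PAL implementation, and the row order may differ from the output above.

```r
# Example data from above: one row per (transaction, item) pair.
df <- data.frame(
  CUSTOMER = c(2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 0, 0, 0,
               1, 1, 7, 7, 7, 7, 8, 8, 8),
  ITEM = c("item2", "item3", "item1", "item2", "item4", "item1", "item3",
           "item2", "item3", "item1", "item3", "item1", "item2", "item5",
           "item2", "item4", "item1", "item2", "item3", "item5",
           "item1", "item2", "item3"),
  stringsAsFactors = FALSE
)

# Group items by transaction ID.
baskets <- split(df$ITEM, df$CUSTOMER)
n <- length(baskets)

# Fraction of transactions that contain every item in `items`.
support <- function(items) {
  sum(vapply(baskets, function(b) all(items %in% b), logical(1))) / n
}

# Support, confidence and lift for the rule antecedent -> consequent.
rule_stats <- function(antecedent, consequent) {
  s <- support(c(antecedent, consequent))
  conf <- s / support(antecedent)
  c(SUPPORT = s, CONFIDENCE = conf, LIFT = conf / support(consequent))
}

# Enumerate every ordered pair of distinct items.
items <- sort(unique(df$ITEM))
pairs <- expand.grid(ANTECEDENT = items, CONSEQUENT = items,
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$ANTECEDENT != pairs$CONSEQUENT, ]

# One row of statistics per candidate rule.
stats <- t(mapply(rule_stats, pairs$ANTECEDENT, pairs$CONSEQUENT,
                  USE.NAMES = FALSE))
rules <- data.frame(pairs, stats, row.names = NULL)

# Keep the rules that pass the thresholds used in the example call.
rules <- rules[rules$SUPPORT >= 0.1 & rules$CONFIDENCE >= 0.3, ]
rownames(rules) <- NULL
print(rules)
```

For instance, item5 appears in 2 of the 9 transactions, both of which also contain item2, so the rule item5 -> item2 has support 2/9, confidence 1, and lift 1 / (7/9), since item2 appears in 7 of 9 transactions.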

[Package hana.ml.r version 1.0.8 Index]