hanaml.Apriori.Rd
hanaml.Apriori is an R wrapper for SAP HANA PAL APRIORI and PAL APRIORI_RELATIONAL.
hanaml.Apriori(
data,
used.cols = NULL,
min.support,
min.confidence,
min.lift = NULL,
relational = FALSE,
max.item.length = NULL,
max.consequent = NULL,
use.prefix.tree = NULL,
ubiquitous = NULL,
lhs.restrict = NULL,
rhs.complement.lhs = NULL,
rhs.restrict = NULL,
lhs.complement.rhs = NULL,
timeout = NULL,
thread.ratio = NULL,
pmml.export = NULL
)
DataFrame
DataFrame containing the data.
list of characters, optional
Specifies the columns in data that hold the transaction IDs and item IDs.
For example, if the transaction ID column of data is "CUSTOMER" and the
item ID column is "ITEM", set this parameter as follows:
used.cols = list(transaction = "CUSTOMER", item = "ITEM")
The transaction ID column defaults to the 1st column of data, and the
item ID column defaults to the 2nd column of data.
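As a small illustration of the defaults described above, the following base R sketch resolves the transaction and item columns from a vector of column names. The helper resolve_used_cols is hypothetical and is not part of hanaml; it only mirrors the documented rule (1st column for transactions, 2nd column for items) and does not show how the wrapper handles this internally.

```r
# Hypothetical helper (not part of hanaml): mimics the documented defaults
# for used.cols on a plain character vector of column names.
resolve_used_cols <- function(col.names, used.cols = NULL) {
  # NULL$transaction is NULL in R, so a missing used.cols falls through
  # to the positional defaults.
  transaction <- if (is.null(used.cols$transaction)) col.names[1] else used.cols$transaction
  item <- if (is.null(used.cols$item)) col.names[2] else used.cols$item
  list(transaction = transaction, item = item)
}

cols <- c("CUSTOMER", "ITEM")
explicit <- resolve_used_cols(cols, list(transaction = "CUSTOMER", item = "ITEM"))
defaults <- resolve_used_cols(cols)

# For this column order the explicit setting and the defaults agree.
same <- identical(explicit, defaults)
```

When the columns appear in a different order, for example c("ITEM", "CUSTOMER"), the defaults would pick the wrong roles, so used.cols should be set explicitly.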
numeric
User-specified minimum support value for rule generation.
numeric
User-specified minimum confidence value for rule generation.
numeric, optional
User-specified minimum lift value for rule generation.
Defaults to 0.
logical, optional
Whether or not to apply relational logic for association rule mining.
This affects the format in which the mined association rules are returned.
Defaults to FALSE.
integer, optional
User-specified maximum length of items, inclusive of both antecedent and consequent items
for association rule generation.
Defaults to 5.
double, optional
Maximum length of consequent items for association rule generation.
Defaults to 100.
logical, optional
Indicates whether or not to use a prefix tree data structure when generating association rules
in order to save memory.
Defaults to FALSE.
double, optional
User-specified maximum support value during the frequent items mining phase, i.e.
if an item has a support value above ubiquitous, it is ignored.
Defaults to 1.0.
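The effect of this threshold can be sketched in base R on a local copy of the example transactions listed on this page. This is an illustration of the rule as described above, not PAL's implementation; the local data.frame df and the threshold 0.7 are assumptions chosen for the example.

```r
# Local copy of the example transactions (no HANA connection involved).
df <- data.frame(
  CUSTOMER = c(2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 0, 0, 0, 1, 1, 7, 7, 7, 7, 8, 8, 8),
  ITEM = c("item2", "item3", "item1", "item2", "item4", "item1", "item3",
           "item2", "item3", "item1", "item3", "item1", "item2", "item5",
           "item2", "item4", "item1", "item2", "item3", "item5",
           "item1", "item2", "item3"),
  stringsAsFactors = FALSE
)

# Support of an item = share of transactions that contain it.
n.trans <- length(unique(df$CUSTOMER))
pairs <- unique(df)                          # count each item once per transaction
item.support <- table(pairs$ITEM) / n.trans

# Items whose support exceeds the threshold are dropped before mining.
ubiquitous <- 0.7                            # assumed value for illustration
kept <- names(item.support)[item.support <= ubiquitous]
```

Here item2 appears in 7 of the 9 transactions (support of about 0.78), so with a threshold of 0.7 it would be the only item ignored.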
list of characters, optional
Specifies the items that are only allowed to be antecedent items, i.e. they can only
appear on the left-hand side of association rules.
No default value.
logical, optional
If lhs.restrict is not NULL, you can set this parameter to TRUE to restrict the
remaining items so that they can only appear on the right-hand side of association rules.
Defaults to FALSE.
list of characters, optional
Specifies the items that are only allowed to be consequent items, i.e. they can only
appear on the right-hand side of association rules.
No default value.
logical, optional
If rhs.restrict is not NULL, you can set this parameter to TRUE to restrict the
remaining items so that they can only appear on the left-hand side of association rules.
Defaults to FALSE.
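One way to read the restriction parameters is as a constraint on which rules may appear in the result. The base R sketch below applies that reading to a small, hand-written rule table. The helper filter_lhs is hypothetical and not part of hanaml, and PAL applies these restrictions while mining rather than by filtering afterwards, so this only illustrates the intended meaning of lhs.restrict and rhs.complement.lhs.

```r
# A few rules in the ANTECEDENT/CONSEQUENT layout used by the result table.
rules <- data.frame(
  ANTECEDENT = c("item5", "item1", "item5", "item4"),
  CONSEQUENT = c("item2", "item5", "item1", "item2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rules that respect the restrictions.
filter_lhs <- function(rules, lhs.restrict, rhs.complement.lhs = FALSE) {
  lhs <- strsplit(rules$ANTECEDENT, "&", fixed = TRUE)
  rhs <- strsplit(rules$CONSEQUENT, "&", fixed = TRUE)
  # Restricted items may never appear on the right-hand side.
  ok.rhs <- !vapply(rhs, function(x) any(x %in% lhs.restrict), logical(1))
  # With the complement flag, all other items are barred from the left-hand side.
  ok.lhs <- if (rhs.complement.lhs) {
    vapply(lhs, function(x) all(x %in% lhs.restrict), logical(1))
  } else {
    TRUE
  }
  rules[ok.rhs & ok.lhs, ]
}

only.lhs <- filter_lhs(rules, lhs.restrict = "item5")
strict <- filter_lhs(rules, lhs.restrict = "item5", rhs.complement.lhs = TRUE)
```

With lhs.restrict set to "item5", the rule item1 => item5 is dropped; adding rhs.complement.lhs = TRUE also drops item4 => item2, because item4 may then only appear as a consequent. The rhs.restrict and lhs.complement.rhs parameters mirror this with the two sides swapped.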
integer, optional
Specifies the maximum run time in seconds for association rule mining.
The algorithm will stop running when the specified timeout is reached.
Defaults to 3600.
double, optional
Controls the proportion of available threads that can be used by this
function.
The value range is from 0 to 1, where 0 indicates a single thread,
and 1 indicates all available threads.
Values between 0 and 1 will use up to
that percentage of available threads. Values outside this
range are ignored.
Defaults to 0.
c("no", "single-row", "multi-row"), optional
Controls whether to output a PMML representation of the model,
and how to format the PMML.
"no":
No PMML model.
"single-row":
Exports a PMML model in a maximum of
one row. Fails if the model doesn't fit in one row.
"multi-row":
Exports a PMML model, splitting it
across multiple rows if it doesn't fit in one.
Defaults to "no".
An "Apriori" object with the following attributes:
result: DataFrame
Mined association rules as a whole.
Each rule has its antecedent/consequent items and support/confidence/lift values.
Available only when relational is FALSE.
antecedent: DataFrame
Antecedent item information of mined association rules.
Available only when relational is TRUE.
consequent: DataFrame
Consequent item information of mined association rules.
Available only when relational is TRUE.
statistics: DataFrame
Support/confidence/lift values of mined association rules.
Available only when relational is TRUE.
model: DataFrame
Mined association rules in PMML format.
Available only when pmml.export is "single-row" or "multi-row".
Input DataFrame data:
> data$Collect()
CUSTOMER ITEM
1 2 item2
2 2 item3
3 3 item1
4 3 item2
5 3 item4
6 4 item1
7 4 item3
8 5 item2
9 5 item3
10 6 item1
11 6 item3
12 0 item1
13 0 item2
14 0 item5
15 1 item2
16 1 item4
17 7 item1
18 7 item2
19 7 item3
20 7 item5
21 8 item1
22 8 item2
23 8 item3
Call the function:
> apr <- hanaml.Apriori(data = data, min.support = 0.1, min.confidence = 0.3,
min.lift = 1.10, max.consequent = 1, pmml.export = "single-row")
Output:
> apr$result
          ANTECEDENT CONSEQUENT   SUPPORT CONFIDENCE     LIFT
1              item5      item2 0.2222222  1.0000000 1.285714
2              item1      item5 0.2222222  0.3333333 1.500000
3              item5      item1 0.2222222  1.0000000 1.500000
4              item4      item2 0.2222222  1.0000000 1.285714
5        item2&item1      item5 0.2222222  0.5000000 2.250000
6        item5&item1      item2 0.2222222  1.0000000 1.285714
7        item5&item2      item1 0.2222222  1.0000000 1.500000
8        item5&item3      item2 0.1111111  1.0000000 1.285714
9        item5&item3      item1 0.1111111  1.0000000 1.500000
10       item1&item4      item2 0.1111111  1.0000000 1.285714
11 item2&item1&item3      item5 0.1111111  0.5000000 2.250000
12 item5&item1&item3      item2 0.1111111  1.0000000 1.285714
13 item5&item2&item3      item1 0.1111111  1.0000000 1.500000
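The SUPPORT, CONFIDENCE and LIFT columns can be checked by hand with the standard definitions of these measures. The base R sketch below recomputes them for two of the rules from a local list of the nine example transactions. The helpers support and rule_metrics are hypothetical names introduced for this illustration; the sketch does not call PAL and makes no claim about how PAL computes the values internally.

```r
# The nine example transactions, keyed by CUSTOMER id (local data only).
transactions <- list(
  "0" = c("item1", "item2", "item5"),
  "1" = c("item2", "item4"),
  "2" = c("item2", "item3"),
  "3" = c("item1", "item2", "item4"),
  "4" = c("item1", "item3"),
  "5" = c("item2", "item3"),
  "6" = c("item1", "item3"),
  "7" = c("item1", "item2", "item3", "item5"),
  "8" = c("item1", "item2", "item3")
)

# Support of an itemset = share of transactions containing all of its items.
support <- function(items) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

# Confidence = support(A and C) / support(A); lift = confidence / support(C).
rule_metrics <- function(antecedent, consequent) {
  s <- support(c(antecedent, consequent))
  conf <- s / support(antecedent)
  c(support = s, confidence = conf, lift = conf / support(consequent))
}

m1 <- rule_metrics("item5", "item2")              # rule 1 in the output
m5 <- rule_metrics(c("item2", "item1"), "item5")  # rule 5 in the output
```

For rule 1 this gives a support of 2/9, a confidence of 1 and a lift of 9/7, which agrees with the first row of the output (0.2222222, 1.0000000, 1.285714).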