hanaml.SPM is an R wrapper for the SAP HANA PAL sequential pattern mining (SPM) algorithm.
hanaml.SPM( data, used.cols = NULL, relational = NULL, min.support, ubiquitous = NULL, min.event.size = NULL, max.event.size = NULL, min.event.length = NULL, max.event.length = NULL, item.restrict = NULL, min.gap = NULL, calculate.lift = NULL, timeout = NULL )
| data | |
|---|---|
| used.cols | If not set, customer ID column defaults to the 1st column of data, transaction ID column defaults to the 2nd column of data, and item ID column defaults to the 3rd column of data. |
| relational | |
| min.support | |
| ubiquitous | |
| min.event.size | |
| max.event.size | |
| min.event.length | |
| max.event.length | |
| item.restrict | |
| min.gap | |
| calculate.lift | |
| timeout | |
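The default column mapping described for used.cols can be sketched in plain R. The helper `resolve_used_cols` below is hypothetical and not part of the package; it only illustrates the documented positional defaults and the named-vector form used in the example on this page. It assumes that an explicit used.cols names all three roles.

```r
# Hypothetical helper, not part of hana.ml.r: illustrates the documented defaults.
resolve_used_cols <- function(col_names, used.cols = NULL) {
  roles <- c("customer", "transaction", "item")
  if (is.null(used.cols)) {
    # Documented default: 1st, 2nd and 3rd columns of data.
    stopifnot(length(col_names) >= 3)
    return(setNames(col_names[1:3], roles))
  }
  # Assumption: when given, all three roles are named explicitly.
  stopifnot(setequal(names(used.cols), roles))
  used.cols[roles]
}

# Defaults taken from column order.
resolve_used_cols(c("CUSTID", "TRANSID", "ITEMS"))

# Explicit mapping, independent of column order.
resolve_used_cols(
  c("ITEMS", "CUSTID", "TRANSID"),
  used.cols = c("customer" = "CUSTID", "transaction" = "TRANSID", "item" = "ITEMS")
)
```

Both calls return the same mapping of customer, transaction and item roles to CUSTID, TRANSID and ITEMS.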
A "SPM" object with the following attributes:
result: DataFrame
Mined frequent patterns with transaction IDs and item IDs,
together with their support, confidence and lift values.
Available only when relational is FALSE.
pattern: DataFrame
Mined frequent patterns with transaction IDs and item IDs.
Available only when relational is TRUE.
statistics: DataFrame
Support/confidence/lift values of mined frequent patterns.
Available only when relational is TRUE.
The sequential pattern mining (SPM) algorithm searches for frequent patterns in sequence databases.
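To make the notion of a frequent pattern concrete, the following is a minimal base-R sketch that counts the support of one candidate pattern by brute force. It is not the PAL implementation and ignores options such as min.gap. The names `tx`, `contains_pattern` and `pattern_support` are illustrative, and the data is the example transaction table used on this page. Support is taken here as the fraction of customers whose ordered transactions contain the pattern's events in order, each in a later transaction than the previous one.

```r
# Example transactions, copied from the input table used on this page.
tx <- data.frame(
  CUSTID  = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C"),
  TRANSID = c(1, 1, 2, 2, 3, 1, 1, 1, 2, 3, 1, 2, 3),
  ITEMS   = c("Apple", "Blueberry", "Apple", "Cherry", "Dessert",
              "Cherry", "Blueberry", "Apple", "Dessert", "Blueberry",
              "Apple", "Blueberry", "Dessert"),
  stringsAsFactors = FALSE
)

# TRUE if one customer's transactions contain every event of `pattern`
# (a list of item sets) in order, each in a strictly later transaction.
contains_pattern <- function(cust, pattern) {
  events <- split(cust$ITEMS, cust$TRANSID)            # item sets keyed by TRANSID
  events <- events[order(as.numeric(names(events)))]   # chronological order
  pos <- 0
  for (wanted in pattern) {
    hit <- FALSE
    while (pos < length(events)) {
      pos <- pos + 1
      if (all(wanted %in% events[[pos]])) {
        hit <- TRUE
        break
      }
    }
    if (!hit) return(FALSE)
  }
  TRUE
}

# Fraction of customers whose sequence contains the pattern.
pattern_support <- function(tx, pattern) {
  per_customer <- split(tx, tx$CUSTID)
  mean(vapply(per_customer, contains_pattern, logical(1), pattern = pattern))
}

pattern_support(tx, list("Apple", "Blueberry"))                # 2/3
pattern_support(tx, list(c("Apple", "Blueberry"), "Dessert"))  # 2/3
pattern_support(tx, list("Apple", "Dessert"))                  # 1
```

With min.support = 0.5, all three patterns would count as frequent under this definition, which agrees with the SUPPORT values printed in the example output.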
Input transaction DataFrame data:
> data$Collect()
   CUSTID TRANSID     ITEMS
1       A       1     Apple
2       A       1 Blueberry
3       A       2     Apple
4       A       2    Cherry
5       A       3   Dessert
6       B       1    Cherry
7       B       1 Blueberry
8       B       1     Apple
9       B       2   Dessert
10      B       3 Blueberry
11      C       1     Apple
12      C       2 Blueberry
13      C       3   Dessert
Create an SPM object to mine sequential patterns from the input data:
> sp <- hanaml.SPM(data = data, relational = TRUE,
                   used.cols = c("customer" = "CUSTID",
                                 "transaction" = "TRANSID",
                                 "item" = "ITEMS"),
                   min.support = 0.5, calculate.lift = TRUE)
Check the mined frequent patterns and their statistics through the attributes of the SPM object above:
> sp$pattern$Collect()
   PATTERN_ID EVENT_ID              ITEM
1           1        1           {Apple}
2           2        1           {Apple}
3           2        2       {Blueberry}
4           3        1           {Apple}
5           3        2         {Dessert}
6           4        1 {Apple,Blueberry}
7           5        1 {Apple,Blueberry}
8           5        2         {Dessert}
9           6        1    {Apple,Cherry}
10          7        1    {Apple,Cherry}
11          7        2         {Dessert}
12          8        1       {Blueberry}
13          9        1       {Blueberry}
14          9        2         {Dessert}
15         10        1          {Cherry}
16         11        1          {Cherry}
17         11        2         {Dessert}
18         12        1         {Dessert}
> sp$statistics$Collect()
   PATTERN_ID   SUPPORT CONFIDENCE      LIFT
1           1 1.0000000  0.0000000 0.0000000
2           2 0.6666667  0.6666667 0.6666667
3           3 1.0000000  1.0000000 1.0000000
4           4 0.6666667  0.0000000 0.0000000
5           5 0.6666667  1.0000000 1.0000000
6           6 0.6666667  0.0000000 0.0000000
7           7 0.6666667  1.0000000 1.0000000
8           8 1.0000000  0.0000000 0.0000000
9           9 1.0000000  1.0000000 1.0000000
10         10 0.6666667  0.0000000 0.0000000
11         11 0.6666667  1.0000000 1.0000000
12         12 1.0000000  0.0000000 0.0000000
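The CONFIDENCE and LIFT columns for the two-event patterns are consistent with one possible reading, sketched below in base R: confidence as the support of the whole pattern divided by the support of the pattern without its last event, and lift as that confidence divided by the support of the last event on its own. This reading is an assumption that this page does not confirm. In this data every final event has support 1, so lift equals confidence and the output cannot tell alternative lift definitions apart. Single-event patterns are reported with confidence and lift 0. The name `rule_stats` is illustrative, and the support values are copied from the output above.

```r
# Support values copied from the statistics output above, keyed by PATTERN_ID.
support <- c("1" = 1, "2" = 2 / 3, "4" = 2 / 3, "5" = 2 / 3, "8" = 1, "12" = 1)

# Assumed definitions (not confirmed by this page):
#   confidence = support(pattern) / support(pattern without its last event)
#   lift       = confidence / support(last event alone)
rule_stats <- function(support, pattern_id, prefix_id, last_id) {
  confidence <- support[[pattern_id]] / support[[prefix_id]]
  lift <- confidence / support[[last_id]]
  c(confidence = confidence, lift = lift)
}

# Pattern 2 = {Apple} then {Blueberry}: prefix is pattern 1, last event is pattern 8.
rule_stats(support, "2", "1", "8")    # confidence 2/3, lift 2/3

# Pattern 5 = {Apple,Blueberry} then {Dessert}: prefix is pattern 4, last event is pattern 12.
rule_stats(support, "5", "4", "12")   # confidence 1, lift 1
```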