hanaml.Partition is an R wrapper for the SAP HANA PAL Partition algorithm.
hanaml.Partition( data, key, features = NULL, random.state = NULL, thread.ratio = NULL, method = NULL, stratified.column = NULL, split.ratio = NULL, split.size = NULL )
| data | |
|---|---|
| key | |
| features | |
| random.state | Defaults to 0. |
| thread.ratio | |
| method | Defaults to "random". |
| stratified.column | |
| split.ratio | |
| split.size | |
List of DataFrames
DataFrames for training, testing and validation, arranged in the following order:
DataFrame 1: training,
DataFrame 2: testing,
DataFrame 3: validation.
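Because the result is a plain positional list, it can help to name its elements once and refer to them by name afterwards. The snippet below is a minimal sketch of that idea. It uses a stand-in list of local data.frames (an assumption made only so the code runs without a HANA connection); with a real result you would apply the same `names<-` call to the list returned by hanaml.Partition.

```r
# Stand-in for the list returned by hanaml.Partition(); the local data.frames
# below are hypothetical and only keep the example self-contained.
partition <- list(
  data.frame(ID = 0:2),  # position 1: training
  data.frame(ID = 3L),   # position 2: testing
  data.frame(ID = 4L)    # position 3: validation
)

# Attach names that follow the documented order of the returned list.
names(partition) <- c("training", "testing", "validation")

train <- partition$training
test  <- partition$testing
valid <- partition$validation
```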
Input DataFrame data:
> data$Collect()
   ID HomeOwner MaritalStatus AnnualIncome DefaultedBorrower
1   0       YES        Single          125                NO
2   1        NO       Married          100                NO
3   2        NO        Single           70                NO
4   3       YES       Married          120                NO
5   4        NO      Divorced           95               YES
...
28 27        NO        Single           85               YES
29 28        NO       Married           75               YES
30 29        NO        Single           90               YES
Call the function:
> partition <- hanaml.Partition(data,
                                random.state = 23,
                                method = "random",
                                split.ratio = c(0.6, 0.2, 0.2))
Output:
> partition[[1]]$Collect()
   ID HomeOwner MaritalStatus AnnualIncome DefaultedBorrower
1   0       YES        Single          125                NO
2   1        NO       Married          100                NO
3   3       YES       Married          120                NO
4   5        NO       Married           60                NO
5   7        NO        Single           85               YES
6  10       YES        Single          125                NO
7  12        NO        Single           70                NO
8  13       YES       Married          120                NO
9  17        NO        Single           85               YES
10 18        NO       Married           75                NO
11 21        NO       Married          100                NO
12 22        NO        Single           70                NO
13 23       YES       Married          120                NO
14 24        NO      Divorced           95               YES
15 25        NO       Married           60                NO
16 27        NO        Single           85               YES
17 28        NO       Married           75               YES
18 29        NO        Single           90               YES
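
The training partition above holds 18 of the 30 input rows, which matches the first entry of split.ratio (0.6 × 30 = 18). To make that arithmetic concrete, here is a small base-R sketch of a random three-way split on a local data.frame. It is an illustration only: `split_random` is a hypothetical helper, it does not call PAL, and it will not reproduce the row selection PAL makes for the same seed.

```r
# Illustrative only: a local, base-R random split. This is NOT PAL's
# implementation and `split_random` is a hypothetical helper name.
split_random <- function(df, ratio = c(0.6, 0.2, 0.2), seed = 23) {
  stopifnot(length(ratio) == 3, abs(sum(ratio) - 1) < 1e-8)
  set.seed(seed)                       # make the shuffle reproducible
  n    <- nrow(df)
  idx  <- sample.int(n)                # random permutation of row indices
  cuts <- round(cumsum(ratio) * n)     # cumulative cut points into idx

  n_train <- cuts[1]
  n_test  <- cuts[2] - cuts[1]
  n_valid <- n - cuts[2]

  train_idx <- sort(idx[seq_len(n_train)])
  test_idx  <- sort(idx[n_train + seq_len(n_test)])
  valid_idx <- sort(idx[cuts[2] + seq_len(n_valid)])

  # Same order as the documented return value: training, testing, validation.
  list(
    df[train_idx, , drop = FALSE],
    df[test_idx,  , drop = FALSE],
    df[valid_idx, , drop = FALSE]
  )
}

# A 30-row stand-in with only the ID column from the example data.
df    <- data.frame(ID = 0:29)
parts <- split_random(df)
sapply(parts, nrow)   # 18 6 6
```

Every input row lands in exactly one of the three partitions, so the partition sizes always sum to the number of input rows.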