hanaml.SMOTE is an R wrapper for SAP HANA PAL SMOTE.
hanaml.SMOTE(
  data,
  key = NULL,
  features = NULL,
  label = NULL,
  thread.ratio = NULL,
  random.state = NULL,
  n.neighbors = NULL,
  minority.class = NULL,
  smote.amount = NULL,
  algorithm = NULL
)
| Argument | Description |
|---|---|
| data | |
| key | |
| features | |
| label | |
| thread.ratio | |
| random.state | |
| n.neighbors | |
| minority.class | |
| smote.amount | |
| algorithm | Defaults to "brute-force". |
DataFrame
The dataset returned after sampling.
The output table has the same structure as the input table.
SMOTE is a sampling method that oversamples the minority class to prepare the dataset for further applications. It creates new instances by taking each minority class sample and building convex combinations with the k nearest neighboring samples of the minority class.
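The convex-combination step described above can be sketched in base R. This is only an illustration of the general SMOTE idea, not the PAL implementation: the function name `smote_sketch`, its arguments, and the brute-force neighbor search are all hypothetical, and it assumes that an amount of 200 means two synthetic points per minority sample.

```r
# Minimal SMOTE sketch in base R (illustration only; not the PAL algorithm).
# x:      numeric matrix of minority-class samples, one row per sample
# k:      number of nearest minority neighbors to interpolate with
# amount: oversampling amount in percent (assumed: 200 = 2 new points per sample)
smote_sketch <- function(x, k = 2, amount = 200, seed = 1) {
  stopifnot(is.matrix(x), nrow(x) > k, amount %% 100 == 0)
  set.seed(seed)
  per_sample <- amount / 100
  d <- as.matrix(dist(x))   # pairwise Euclidean distances (brute force)
  diag(d) <- Inf            # a sample is never its own neighbor
  out <- matrix(NA_real_, nrow = nrow(x) * per_sample, ncol = ncol(x),
                dimnames = list(NULL, colnames(x)))
  idx <- 1
  for (i in seq_len(nrow(x))) {
    nn <- order(d[i, ])[seq_len(k)]   # indices of the k nearest neighbors
    for (j in seq_len(per_sample)) {
      nb  <- nn[sample.int(k, 1)]     # pick one neighbor at random
      gap <- runif(1)                 # interpolation weight in [0, 1]
      # Convex combination of the sample and the chosen neighbor
      out[idx, ] <- x[i, ] + gap * (x[nb, ] - x[i, ])
      idx <- idx + 1
    }
  }
  out
}

# The three TYPE == 2 rows from the example data
minority <- matrix(c(3,   10, 5.5,
                     8, 1000, 9.4,
                     8,  999, 7.4),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(NULL, c("X1", "X2", "X3")))
synthetic <- smote_sketch(minority, k = 2, amount = 200)
```

Because every synthetic point lies on the segment between a minority sample and one of its neighbors, each generated value stays within the range spanned by the minority samples in that column. The exact values will differ from those produced by PAL, which uses its own random number generation and neighbor search.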
> data.df$Collect()
   X1   X2   X3 TYPE
1   2    1 3.50    1
2   3   10 7.60    1
3   3   10 5.50    2
4   3   10 4.70    1
5   7 1000 8.50    1
6   8 1000 9.40    2
7   6 1000 0.34    1
8   8  999 7.40    2
9   7  999 3.50    1
10  6 1000 7.00    1
Call the function:
> result <- hanaml.SMOTE(data = data.df, thread.ratio = 1, random.state = 1,
                         label = "TYPE", minority.class = "2",
                         smote.amount = 200, n.neighbors = 2,
                         algorithm = "kd-tree")
Results:
> result$Collect()
   X1        X2       X3 TYPE
1   2    1.0000 3.500000    1
2   3   10.0000 7.600000    1
3   3   10.0000 5.500000    2
4   3   10.0000 4.700000    1
5   7 1000.0000 8.500000    1
6   8 1000.0000 9.400000    2
7   6 1000.0000 0.340000    1
8   8  999.0000 7.400000    2
9   7  999.0000 3.500000    1
10  6 1000.0000 7.000000    1
11  8  973.1091 7.350260    2
12  7  888.0711 8.959068    2
13  8  999.0567 7.513491    2
14  8  999.5123 8.424672    2
15  4  131.5100 5.733437    2
16  5  340.7139 6.135345    2