The following properties can be configured for the HANA R-Boosting Classification algorithm.
The R packages that implement the algorithm are adabag and rpart.
In this component, the decision tree method is selected as the classification algorithm.
If a column name contains a hyphen (-), use the Data Type component to rename the column.
| Property | Description |
|---|---|
| Maximum Depth | Enter the maximum node level in the final tree with the root node counted as level 0. This parameter can be set between 1 and 20 inclusive. |
| Minimum Split | Enter the minimum number of observations required for splitting a node. The default value is 0. The parameter can be set between 0 and 500 inclusive. |
| Complexity Parameter | Enter the complexity parameter, which saves computing time by preventing any split that does not improve the fit. The value must lie in the interval [-1, 1), that is, greater than or equal to -1 and less than 1. |
| Number of Iterations | Enter the number of boosting iterations to run. This parameter can be set between 5 and 500 inclusive. |
| Sample Weights | If TRUE, a bootstrap sample of the training set is drawn by using the weights for each observation on that iteration. If FALSE, every observation is used with its weights. |
| Weight Updating Coefficient | Select how the weight updating coefficient α of the AdaBoost.M1 algorithm is calculated. Three options are available: A) ‘Breiman’: α=(1/2) ln((1-err)/err); B) ‘Freund’: α=ln((1-err)/err); C) ‘Zhu’: α=ln((1-err)/err)+ln(N_classes-1). |
| Features | Select the input columns with which you want to perform the analysis. |
| Target Columns | Select the target column on which you want to perform the analysis. |
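
The three Weight Updating Coefficient options can be compared with a small calculation. The following Python sketch is illustrative only and is not the code used by the component or by the adabag package; the function name `weight_updating_coefficient` and its arguments are hypothetical. It assumes that `err` is the weighted error of the tree built in the current iteration, with 0 < err < 1, and it takes the ‘Breiman’ coefficient as half of ln((1-err)/err), as described in the adabag documentation.

```python
import math


def weight_updating_coefficient(err, n_classes, method="Breiman"):
    """Return alpha for one boosting iteration (illustrative sketch).

    err       -- assumed weighted error of the current tree, 0 < err < 1
    n_classes -- number of classes in the target column
    method    -- 'Breiman', 'Freund', or 'Zhu'
    """
    if not 0.0 < err < 1.0:
        raise ValueError("err must be strictly between 0 and 1")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")

    # Term shared by all three options: ln((1 - err) / err)
    log_odds = math.log((1.0 - err) / err)

    if method == "Breiman":
        return 0.5 * log_odds
    if method == "Freund":
        return log_odds
    if method == "Zhu":
        # Extra term depends on the number of classes; it is 0 for two classes.
        return log_odds + math.log(n_classes - 1)
    raise ValueError("method must be 'Breiman', 'Freund', or 'Zhu'")


if __name__ == "__main__":
    for name in ("Breiman", "Freund", "Zhu"):
        alpha = weight_updating_coefficient(0.2, 3, name)
        print(f"{name}: alpha = {alpha:.4f}")
```

For a two-class target, ‘Zhu’ and ‘Freund’ give the same coefficient because ln(N_classes-1) is 0; the options differ only when the target has three or more classes.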