hana_ml.algorithms.pal.abc_analysis.abc_analysis(data, key=None, percent_A=None, percent_B=None, percent_C=None, revenue=None, thread_ratio=None)

Perform ABC analysis to classify objects based on a particular measure, such as revenue or profit. A typical use is grouping inventory items into three categories: A, B, and C.


data : DataFrame

Input data.

key : str, optional

Name of the ID column.

Defaults to the index column of data (i.e. data.index) if it is set.

revenue : str, optional

Name of column for revenue (or profits).

If not given, the input DataFrame must have exactly two columns.

Defaults to the first non-key column.


percent_A : float

Interval for A class.


percent_B : float

Interval for B class.


percent_C : float

Interval for C class.

thread_ratio : float, optional

Specifies the ratio of total number of threads that can be used by this function.

The value range is from 0 to 1, where 0 means only using 1 thread, and 1 means using at most all the currently available threads.

Values outside the range will be ignored and this function heuristically determines the number of threads to use.

Defaults to 0.


Returns a DataFrame containing the ABC class result of partitioning the data into three categories.


Data to analyze (cc is assumed to be an existing ConnectionContext):

>>> df_train = cc.table('AA_DATA_TBL')
>>> df_train.collect()
      ITEM    VALUE
0    item1     15.4
1    item2    200.4
2    item3    280.4
3    item4    100.9
4    item5     40.4
5    item6     25.6
6    item7     18.4
7    item8     10.5
8    item9    96.15
9    item10     9.4

Perform abc_analysis:

>>> from hana_ml.algorithms.pal.abc_analysis import abc_analysis
>>> res = abc_analysis(data=df_train, key='ITEM', thread_ratio=0.3,
...                    percent_A=0.7, percent_B=0.2, percent_C=0.1)
>>> res.collect()
   ABC_CLASS         ITEM
0          A        item3
1          A        item2
2          A        item4
3          B        item9
4          B        item5
5          B        item6
6          C        item7
7          C        item1
8          C        item8
9          C       item10
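
The grouping shown above can be reproduced locally with a short pure-Python sketch. It rests on an assumption inferred only from this example: items are ranked by value in descending order, and each class is filled until its own share of the total is reached, with the item that crosses the limit staying in that class. The helper abc_classify is hypothetical and not part of hana_ml, and the actual PAL procedure may differ.

```python
def abc_classify(items, percent_a, percent_b):
    """Hypothetical helper: assign 'A', 'B' or 'C' to (name, value) pairs.

    Assumption: each class accumulates value on its own until it reaches
    its share of the grand total; everything left over falls into class C.
    """
    total = sum(value for _, value in items)
    # Rank items from the largest value to the smallest.
    ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
    # Class labels with the amount of value each one should absorb.
    limits = [("A", percent_a * total), ("B", percent_b * total)]
    result = []
    current = 0      # index of the class currently being filled
    running = 0.0    # value accumulated in the current class
    for name, value in ranked:
        if current < len(limits):
            label, limit = limits[current]
            result.append((label, name))
            running += value
            if running >= limit:
                # This class is full; start filling the next one.
                current += 1
                running = 0.0
        else:
            result.append(("C", name))
    return result


# Same data as AA_DATA_TBL in the example above.
items = [
    ("item1", 15.4), ("item2", 200.4), ("item3", 280.4), ("item4", 100.9),
    ("item5", 40.4), ("item6", 25.6), ("item7", 18.4), ("item8", 10.5),
    ("item9", 96.15), ("item10", 9.4),
]
classes = abc_classify(items, percent_a=0.7, percent_b=0.2)
for label, name in classes:
    print(label, name)
```

With percent_a=0.7 and percent_b=0.2, this sketch assigns item3, item2 and item4 to A, item9, item5 and item6 to B, and the remaining items to C, in the same order as the res.collect() output above.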