abc_analysis

hana_ml.algorithms.pal.abc_analysis.abc_analysis(data, key=None, percent_A=None, percent_B=None, percent_C=None, revenue=None, thread_ratio=None)

Perform ABC analysis to classify objects (for example, inventory items) into three categories, A, B, and C, based on a particular measure such as revenue or profit.

Parameters
data : DataFrame

Input data.

key : str, optional

Name of the ID column.

Defaults to the index column of data (i.e. data.index) if it is set.

revenue : str, optional

Name of the column holding revenue (or profits).

If not given, the input DataFrame must have exactly two columns.

Defaults to the first non-key column.

percent_A : float

Interval for A class, given as a fraction (the example below uses 0.7).

percent_B : float

Interval for B class, given as a fraction (the example below uses 0.2).

percent_C : float

Interval for C class, given as a fraction (the example below uses 0.1).

thread_ratio : float, optional

Specifies the ratio of the total number of threads that this function can use.

The value range is from 0 to 1, where 0 means using only 1 thread, and 1 means using at most all the currently available threads.

Values outside this range are ignored, and the function heuristically determines the number of threads to use.

Defaults to 0.

Returns
DataFrame

The ABC class assigned to each item, that is, the result of partitioning the data into three categories.

Examples

Data to analyze:

>>> df_train = cc.table('AA_DATA_TBL')
>>> df_train.collect()
      ITEM    VALUE
0    item1     15.4
1    item2    200.4
2    item3    280.4
3    item4    100.9
4    item5     40.4
5    item6     25.6
6    item7     18.4
7    item8     10.5
8    item9    96.15
9    item10     9.4

Perform abc_analysis:

>>> from hana_ml.algorithms.pal.abc_analysis import abc_analysis
>>> res = abc_analysis(data=df_train, key='ITEM', thread_ratio=0.3,
...                    percent_A=0.7, percent_B=0.2, percent_C=0.1)
>>> res.collect()
   ABC_CLASS         ITEM
0          A        item3
1          A        item2
2          A        item4
3          B        item9
4          B        item5
5          B        item6
6          C        item7
7          C        item1
8          C        item8
9          C       item10
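
To make the grouping above easier to follow, here is a minimal pure-Python sketch of one rule that reproduces this particular result. It is an assumption inferred from the example output, not the documented PAL algorithm: items are ranked by descending value, and each class is filled until its own share of the total reaches the class's percent. The helper name abc_classify is hypothetical and is not part of hana_ml.

```python
def abc_classify(items, percent_a, percent_b):
    """Assign ABC classes to (key, value) pairs.

    Hypothetical helper, not part of hana_ml. Assumed rule: rank items by
    descending value, then fill class A until its share of the total reaches
    percent_a, fill class B until its share reaches percent_b, and put the
    remaining items in class C.
    """
    total = sum(value for _, value in items)
    ranked = sorted(items, key=lambda kv: kv[1], reverse=True)
    thresholds = {"A": percent_a * total, "B": percent_b * total}

    result = []
    label, running = "A", 0.0
    for key, value in ranked:
        result.append((label, key))
        running += value
        # Move to the next class once the current one has reached its share.
        if label != "C" and running >= thresholds[label]:
            label = "B" if label == "A" else "C"
            running = 0.0
    return result


# Same values as AA_DATA_TBL in the example above.
data = [
    ("item1", 15.4), ("item2", 200.4), ("item3", 280.4), ("item4", 100.9),
    ("item5", 40.4), ("item6", 25.6), ("item7", 18.4), ("item8", 10.5),
    ("item9", 96.15), ("item10", 9.4),
]

classes = abc_classify(data, percent_a=0.7, percent_b=0.2)
for abc_class, item in classes:
    print(abc_class, item)
```

With these inputs, class A holds item3, item2 and item4 (about 73% of the total value), class B holds item9, item5 and item6 (about 20%), and the remaining four items fall into class C, in the same order as the result above. Other inputs may expose differences from PAL's actual behavior, for example in how ties or exact threshold hits are handled.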