abc_analysis
- hana_ml.algorithms.pal.abc_analysis.abc_analysis(data, key=None, percent_A=None, percent_B=None, percent_C=None, revenue=None, thread_ratio=None)
This algorithm is used to classify objects (such as customers, employees, or products) based on a particular measure (such as revenue or profit).
ABC analysis suggests that the inventories of an organization are not of equal value and can therefore be grouped into three categories (A, B, and C) by their estimated importance. “A” items are very important for an organization. “B” items are of medium importance, that is, less important than “A” items and more important than “C” items. “C” items are of the least importance.
- Parameters
- data : DataFrame
Input data.
- key : str, optional
Name of the ID column.
Defaults to the index column of data (i.e. data.index) if it is set.
- revenue : str, optional
Name of column for revenue (or profits).
If not given, the input DataFrame must have exactly two columns.
Defaults to the first non-key column.
- percent_A : float
Interval for A class.
- percent_B : float
Interval for B class.
- percent_C : float
Interval for C class.
- thread_ratio : float, optional
Specifies the ratio of the total number of threads that can be used by this function.
The value range is from 0 to 1, where 0 means using only 1 thread, and 1 means using at most all the currently available threads.
Values outside the range are ignored, and the function heuristically determines the number of threads to use.
Defaults to 0.
- Returns
- DataFrame
The ABC class results after partitioning the data into three categories.
Examples
Input DataFrame:
>>> df_train.collect()
     ITEM   VALUE
0   item1    15.4
1   item2   200.4
2   item3   280.4
3   item4   100.9
4   item5    40.4
5   item6    25.6
6   item7    18.4
7   item8    10.5
8   item9   96.15
9  item10     9.4
Perform abc_analysis:
>>> from hana_ml.algorithms.pal.abc_analysis import abc_analysis
>>> res = abc_analysis(data=df_train, key='ITEM', thread_ratio=0.3,
...                    percent_A=0.7, percent_B=0.2, percent_C=0.1)
>>> res.collect()
  ABC_CLASS    ITEM
0         A   item3
1         A   item2
2         A   item4
3         B   item9
4         B   item5
5         B   item6
6         C   item7
7         C   item1
8         C   item8
9         C   item10
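To make the role of percent_A, percent_B, and percent_C more concrete, the following plain-Python sketch reproduces the class assignment shown in the example without a HANA connection. The rule it uses is an assumption inferred from the example output, not the documented PAL algorithm: items are ranked by value in descending order, and each class is filled until its own share of the total reaches its interval, with the item that crosses the threshold staying in that class. The function name abc_classify and the tuple-based input are hypothetical and are not part of hana_ml.

```python
def abc_classify(items, percent_a, percent_b, percent_c):
    """Assign 'A', 'B' or 'C' to (key, value) pairs.

    Assumed rule (inferred from the documented example, not taken from
    the PAL specification): rank items by value in descending order and
    fill each class until its share of the total reaches its interval.
    The item that crosses the threshold stays in the current class.
    """
    total = sum(value for _, value in items)
    ranked = sorted(items, key=lambda kv: kv[1], reverse=True)

    labels = ["A", "B", "C"]
    # percent_c is kept only to mirror the abc_analysis signature:
    # the last class simply receives all remaining items.
    limits = [percent_a, percent_b, percent_c]

    result = []
    current = 0   # index of the class currently being filled
    share = 0.0   # share of the total accumulated in the current class
    for key, value in ranked:
        result.append((labels[current], key))
        share += value / total
        if current < 2 and share >= limits[current]:
            current += 1
            share = 0.0
    return result


# Same data as df_train in the example above.
rows = [
    ("item1", 15.4), ("item2", 200.4), ("item3", 280.4), ("item4", 100.9),
    ("item5", 40.4), ("item6", 25.6), ("item7", 18.4), ("item8", 10.5),
    ("item9", 96.15), ("item10", 9.4),
]

classes = abc_classify(rows, percent_a=0.7, percent_b=0.2, percent_c=0.1)
for label, key in classes:
    print(label, key)
```

With the example data, item3, item2, and item4 together account for about 73% of the total value and land in class A; item9, item5, and item6 add roughly another 20% and land in class B; the remaining four items fall into class C. This matches the output of abc_analysis above, but the actual procedure may handle boundary cases and ties differently.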