Hierarchical Forecast — hanaml.HierarchicalForecast • hana.ml.r

hanaml.HierarchicalForecast is an R wrapper for the SAP HANA PAL hierarchical forecast algorithm.

hanaml.HierarchicalForecast(
  orig.data,
  pred.data,
  stru.data,
  orig.cols = NULL,
  pred.cols = NULL,
  method = NULL,
  weights = NULL
)

Arguments

orig.data

DataFrame
DataFrame of original data.
By default, it is assumed that orig.data is organized as follows:
1st column for names of time-series, 2nd column for time stamp, 3rd column for raw data, 4th column for residuals of base forecasts.

pred.data

DataFrame
DataFrame of predictive data.
By default, it is assumed that pred.data is organized as follows:
1st column for names of time-series, 2nd column for time stamp, 3rd column for predictive raw data.

stru.data

DataFrame
DataFrame of structure data.
It must be structured as follows: 1st column for ID, 2nd column for time-series names, 3rd column for hierarchical tree structure of time-series, which is the number of nodes at each level.

orig.cols

list of characters, optional
If orig.data does not follow the default column layout, use orig.cols to assign its columns explicitly:

orig.cols = list(name = [time-series names column], key = [time stamp column], endog = [raw data column], residual = [residual column])

All four columns must be specified for this parameter to take effect; otherwise an error is raised.

pred.cols

list of characters, optional
If pred.data does not follow the default column layout, use pred.cols to assign its columns explicitly:

pred.cols = list(name = [time-series names column], key = [time stamp column], endog = [predictive raw data column])

All three columns must be specified for this parameter to take effect; otherwise an error is raised.
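As a minimal sketch of the two column mappings described above, the following builds orig.cols and pred.cols as named R lists, using the column names from the example DataFrames on this page. The helper has.all.roles is hypothetical and not part of hana.ml.r; it only illustrates checking that every required role is named before the lists are passed on.

```r
# Column names are taken from the example DataFrames on this page.
orig.cols <- list(name = "Series",
                  key = "TimeStamp",
                  endog = "Original",
                  residual = "Residual")
pred.cols <- list(name = "Series",
                  key = "TimeStamp",
                  endog = "Value")

# Hypothetical helper (NOT part of hana.ml.r): TRUE only if the
# mapping names every required role.
has.all.roles <- function(cols, roles) {
  all(roles %in% names(cols))
}

orig.ok <- has.all.roles(orig.cols, c("name", "key", "endog", "residual"))
pred.ok <- has.all.roles(pred.cols, c("name", "key", "endog"))
```

The order of the entries inside each list does not matter for this check, since the roles are matched by name.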

method

c('optimal.combination', 'bottom.up', 'top.down'), optional
Method for reconciling forecasts across the hierarchy.
Defaults to 'optimal.combination'.

weights

c('ordinary.least.squares', 'minimum.trace', 'weighted.least.squares'), optional
Specifies the method to assign weights to base forecasts in different hierarchies.
Only valid when parameter method is 'optimal.combination'.
Defaults to 'ordinary.least.squares'.
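To illustrate what an ordinary-least-squares weighting means for the optimal combination method, here is a base R sketch of the textbook OLS reconciliation, in which base forecasts are projected onto the space of coherent forecasts. It assumes a hypothetical two-level hierarchy where Total = A + B, and it uses the 1993 base forecasts from the data.pred example on this page. It is not the PAL implementation and is not expected to reproduce the output shown in the Examples section.

```r
# Base forecasts for 1993, copied from the data.pred example on this page.
y.hat <- c(Total = 54.711279, A = 29.912610, B = 23.798669)

# Summing matrix S: maps the bottom-level series (A, B) to every series.
# ASSUMPTION: the hierarchy is Total = A + B, with no other series.
S <- rbind(Total = c(1, 1),
           A     = c(1, 0),
           B     = c(0, 1))

# Textbook OLS reconciliation: y.tilde = S (S'S)^{-1} S' y.hat
P <- solve(t(S) %*% S) %*% t(S)
y.tilde <- drop(S %*% P %*% y.hat)

# Difference between the reconciled total and the sum of its children;
# this is zero up to floating-point error.
gap <- unname(y.tilde["Total"] - (y.tilde["A"] + y.tilde["B"]))
```

With OLS every series is weighted equally, so the incoherence in the base forecasts is spread evenly across Total, A, and B. Other weighting schemes replace the identity weighting with a different matrix.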

Value

Returns a list of two DataFrames:

DataFrame 1 (result): DataFrame containing the forecast result.
DataFrame 2 (stats): DataFrame containing the statistics.

Details

The hierarchical forecast algorithm reconciles forecasts across the hierarchy, that is, it ensures that the forecasts sum appropriately across the levels.
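The idea of forecasts summing appropriately across levels can be sketched in base R with the simplest reconciliation approach, bottom-up: the parent forecast is replaced by the sum of its children. The sketch assumes a hypothetical hierarchy in which Total has exactly two children, A and B, and uses the first four forecast values for A and B from the data.pred example on this page. It runs locally on a plain data.frame and does not call SAP HANA.

```r
# Bottom-level base forecasts, copied from the data.pred example on this page.
pred <- data.frame(
  Series    = rep(c("A", "B"), each = 4),
  TimeStamp = rep(1993:1996, times = 2),
  Value     = c(29.912610, 30.182737, 30.452864, 30.722991,
                23.798669, 24.024861, 24.251054, 24.477247)
)

# Bottom-up: the Total forecast is the sum of its children per time stamp.
# ASSUMPTION: Total = A + B, with no other child series.
total.bu <- aggregate(Value ~ TimeStamp, data = pred, FUN = sum)
total.bu$Series <- "Total"

# Coherent forecast set: aggregated Total rows followed by the A and B rows.
coherent <- rbind(total.bu[, c("Series", "TimeStamp", "Value")], pred)
```

By construction, the Total rows in coherent equal the sum of the A and B rows for each time stamp, which is the coherence property described above.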

Examples

Input DataFrame data.orig:


> data.orig$Collect()
   Series TimeStamp  Original     Residual
1   Total      1992 48.748080  0.058251736
2   Total      1993 49.480469  0.236069215
3   Total      1994 49.932384 -0.044404927
4   Total      1995 50.240702 -0.188001523
5   Total      1996 50.608464 -0.128557779
6   Total      1997 50.848506 -0.256277857
7   Total      1998 51.709220  0.364393685
8   Total      1999 51.943298 -0.262241472
9   Total      2000 52.577956  0.138337958
10  Total      2001 53.214959  0.140682700
11      A      1992 27.211338  0.026941198
12      A      1993 27.838827  0.357361180
13      A      1994 28.145348  0.036394576
14      A      1995 28.277125 -0.138351039
15      A      1996 28.478001 -0.069250754
16      A      1997 28.564466 -0.183661771
17      A      1998 28.907533  0.072939797
18      A      1999 29.021548 -0.156112852
19      A      2000 29.403080  0.111405265
20      A      2001 29.642483 -0.030724402
21      B      1992 21.536742  0.021310538
22      B      1993 21.641642 -0.121291965
......

Input DataFrame data.pred:


> data.pred$Collect()
   Series TimeStamp     Value
1   Total      1993 54.711279
2   Total      1994 54.207598
3   Total      1995 54.703918
4   Total      1996 55.200238
5   Total      1997 55.696558
6   Total      1998 56.192878
7       A      1993 29.912610
8       A      1994 30.182737
9       A      1995 30.452864
10      A      1996 30.722991
11      A      1997 30.993119
12      A      1998 31.263246
13      B      1993 23.798669
14      B      1994 24.024861
15      B      1995 24.251054
16      B      1996 24.477247
......

Invoke the function:


> res <- hanaml.HierarchicalForecast(orig.data = data.orig,
                                     pred.data = data.pred,
                                     orig.cols = list(key = "TimeStamp",
                                                      name = "Series",
                                                      endog = "Original",
                                                      residual = "Residual"),
                                     stru.data = struct.df,
                                     method = "optimal.combination",
                                     weights = "minimum.trace")

Output:


> res[[1]]$Collect()
   Series TimeStamp     Value
1   Total      1993 48.862705
2   Total      1994 54.255631
3   Total      1995 54.663688
4   Total      1996 55.192436
5   Total      1997 55.719965
6   Total      1998 56.434261
7       A      1993 27.204558
8       A      1994 30.215278
9       A      1995 30.424968
10      A      1996 30.718347
11      A      1997 31.007053