Hierarchical Forecast — hanaml.HierarchicalForecast • hana.ml.r

hanaml.HierarchicalForecast is an R wrapper for the SAP HANA PAL hierarchical forecast algorithm.

hanaml.HierarchicalForecast(
  orig.data,
  pred.data,
  stru.data,
  orig.cols = NULL,
  pred.cols = NULL,
  method = NULL,
  weights = NULL
)

Arguments

orig.data	`DataFrame` DataFrame of original data. By default, it is assumed that `orig.data` is organized as follows: 1st column for names of time-series, 2nd column for time stamp, 3rd column for raw data, 4th column for residuals of base forecasts.
pred.data	`DataFrame` DataFrame of predictive data. By default, it is assumed that `pred.data` is organized as follows: 1st column for names of time-series, 2nd column for time stamp, 3rd column for predictive raw data.
stru.data	`DataFrame` DataFrame of structure data. It must be structured as follows: 1st column for ID, 2nd column for time-series names, 3rd column for hierarchical tree structure of time-series, which is the number of nodes at each level.
orig.cols	`list of character, optional` If the input `orig.data` does not follow the default column order, use `orig.cols` to assign its columns as follows: orig.cols = list(name = [time-series names column], key = [time stamp column], endog = [raw data column], residual = [residual column]) Note that you must specify `all four` columns for this parameter to take effect; otherwise an error is raised.
pred.cols	`list of character, optional` If the input `pred.data` does not follow the default column order, use `pred.cols` to assign its columns as follows: pred.cols = list(name = [time-series names column], key = [time stamp column], endog = [predictive raw data column]) Note that you must specify `all three` columns for this parameter to take effect; otherwise an error is raised.
method	`('optimal.combination', 'bottom.up', 'top.down'), optional` Method for reconciling forecasts across the hierarchy. Defaults to 'optimal.combination'.
weights	`('ordinary.least.squares', 'minimum.trace', 'weighted.least.squares'), optional` Specifies the method for assigning weights to base forecasts at different levels of the hierarchy. Only valid when `method` is 'optimal.combination'. Defaults to 'ordinary.least.squares'.
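
As a minimal sketch of the column mappings described above, the lists can be built in plain R before the call. The column names are the ones used in the example tables on this page. `check.cols` is a hypothetical helper, not part of hana.ml.r, that only illustrates the "all four" and "all three" requirement on the client side.

```r
# Column names below are taken from the example tables on this page.
orig.cols <- list(name = "Series", key = "TimeStamp",
                  endog = "Original", residual = "Residual")
pred.cols <- list(name = "Series", key = "TimeStamp", endog = "Value")

# Hypothetical helper (not part of hana.ml.r): stop early if a role is absent.
check.cols <- function(cols, required) {
  absent <- setdiff(required, names(cols))
  if (length(absent) > 0) {
    stop("missing column roles: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

check.cols(orig.cols, c("name", "key", "endog", "residual"))
check.cols(pred.cols, c("name", "key", "endog"))
```

Because the lists are named, the order of the entries does not matter for this check; only the set of role names does.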

Value

Returns a list of two DataFrames:

DataFrame 1 result: DataFrame for Forecast result.
DataFrame 2 stats: DataFrame for Statistics analysis content.

Details

The hierarchical forecast algorithm reconciles forecasts across the hierarchy, that is, it ensures that the forecasts sum appropriately across the levels.
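
To illustrate what "sum appropriately" means, the following base R sketch reconciles a toy set of base forecasts. It assumes a two-level hierarchy with Total = A + B and made-up forecast values, and it uses the textbook summing-matrix formulation of bottom-up and ordinary least squares reconciliation. It is not the PAL implementation and is not expected to reproduce the numbers returned by `hanaml.HierarchicalForecast`.

```r
# Assumed toy hierarchy: Total = A + B.
# The summing matrix S maps the bottom-level series (A, B)
# to every series in the hierarchy (Total, A, B).
S <- rbind(Total = c(1, 1),
           A     = c(1, 0),
           B     = c(0, 1))
colnames(S) <- c("A", "B")

# Illustrative base forecasts that are not coherent: 10 != 6 + 3.
y.hat <- c(Total = 10, A = 6, B = 3)

# Bottom-up: keep the bottom-level forecasts and aggregate them upwards.
bottom.up <- drop(S %*% y.hat[c("A", "B")])

# Optimal combination with ordinary least squares weights:
# regress the base forecasts on S, then map the fitted
# bottom-level values back through S.
P <- solve(crossprod(S)) %*% t(S)
ols <- drop(S %*% P %*% y.hat)

print(bottom.up)
print(ols)

# Both reconciled sets are coherent: Total equals A + B.
stopifnot(isTRUE(all.equal(unname(ols["Total"]),
                           unname(ols["A"] + ols["B"]))))
```

Bottom-up leaves A and B untouched and replaces Total by their sum, whereas the least squares variant adjusts all three series so that the total revision is spread across the hierarchy.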

Examples

Input DataFrame data.orig:

> data.orig$Collect()
   Series TimeStamp  Original     Residual
1   Total      1992 48.748080  0.058251736
2   Total      1993 49.480469  0.236069215
3   Total      1994 49.932384 -0.044404927
4   Total      1995 50.240702 -0.188001523
5   Total      1996 50.608464 -0.128557779
6   Total      1997 50.848506 -0.256277857
7   Total      1998 51.709220  0.364393685
8   Total      1999 51.943298 -0.262241472
9   Total      2000 52.577956  0.138337958
10  Total      2001 53.214959  0.140682700
11      A      1992 27.211338  0.026941198
12      A      1993 27.838827  0.357361180
13      A      1994 28.145348  0.036394576
14      A      1995 28.277125 -0.138351039
15      A      1996 28.478001 -0.069250754
16      A      1997 28.564466 -0.183661771
17      A      1998 28.907533  0.072939797
18      A      1999 29.021548 -0.156112852
19      A      2000 29.403080  0.111405265
20      A      2001 29.642483 -0.030724402
21      B      1992 21.536742  0.021310538
22      B      1993 21.641642 -0.121291965
......

Input DataFrame data.pred:

> data.pred$Collect()
   Series TimeStamp     Value
1   Total      1993 54.711279
2   Total      1994 54.207598
3   Total      1995 54.703918
4   Total      1996 55.200238
5   Total      1997 55.696558
6   Total      1998 56.192878
7       A      1993 29.912610
8       A      1994 30.182737
9       A      1995 30.452864
10      A      1996 30.722991
11      A      1997 30.993119
12      A      1998 31.263246
13      B      1993 23.798669
14      B      1994 24.024861
15      B      1995 24.251054
16      B      1996 24.477247
......

Invoke the function:

> res <- hanaml.HierarchicalForecast(orig.data = data.orig,
                                     pred.data = data.pred,
                                     orig.cols = list(key = "TimeStamp",
                                                      name = "Series",
                                                      endog = "Original",
                                                      residual = "Residual"),
                                     stru.data = struct.df,
                                     method = "optimal.combination",
                                     weights = "minimum.trace")

Output:

> res[[1]]$Collect()
   Series TimeStamp     Value
1   Total      1993 48.862705
2   Total      1994 54.255631
3   Total      1995 54.663688
4   Total      1996 55.192436
5   Total      1997 55.719965
6   Total      1998 56.434261
7       A      1993 27.204558
8       A      1994 30.215278
9       A      1995 30.424968
10      A      1996 30.718347
11      A      1997 31.007053