hanaml.HierarchicalForecast is an R wrapper for the SAP HANA PAL hierarchical forecast algorithm.
hanaml.HierarchicalForecast(
  orig.data,
  pred.data,
  stru.data,
  orig.cols = NULL,
  pred.cols = NULL,
  method = NULL,
  weights = NULL
)
| orig.data | |
|---|---|
| pred.data | |
| stru.data | |
| orig.cols | Note that you need to specify |
| pred.cols | Note that you need to specify |
| method | |
| weights | |
Returns a list of two DataFrames:
DataFrame 1 (result): DataFrame containing the forecast results.
DataFrame 2 (stats): DataFrame containing the statistical analysis content.
The hierarchical forecast algorithm forecasts across the hierarchy, ensuring that the forecasts sum appropriately across the levels.
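To make the summing constraint concrete, the following base-R sketch reconciles one year of forecasts by ordinary least squares, which is one simple form of the optimal-combination idea. It is an illustration only and is not how PAL computes its result: the two-level hierarchy (Total = A + B) is an assumption inferred from the sample data, the 1993 values are copied from the data.pred example on this page, and the names S, y.hat, beta, and y.tilde are hypothetical. The weights = "minimum.trace" option used in the example on this page may weight the series differently.

```r
# Assumed hierarchy for illustration: Total = A + B.
# Rows of S are all series; columns are the bottom-level series.
S <- rbind(Total = c(1, 1),
           A     = c(1, 0),
           B     = c(0, 1))
colnames(S) <- c("A", "B")

# Base forecasts for 1993, copied from the data.pred sample.
y.hat <- c(Total = 54.711279, A = 29.912610, B = 23.798669)

# Incoherence before reconciliation: Total minus the sum of its children.
gap.before <- y.hat[["Total"]] - (y.hat[["A"]] + y.hat[["B"]])

# OLS reconciliation: least-squares fit of the base forecasts on S,
# then map the fitted bottom-level values back to every series.
beta    <- solve(crossprod(S), crossprod(S, y.hat))
y.tilde <- drop(S %*% beta)

# After reconciliation the forecasts sum exactly across the levels.
gap.after <- y.tilde[["Total"]] - (y.tilde[["A"]] + y.tilde[["B"]])

print(round(c(gap.before = gap.before, gap.after = gap.after), 6))
print(round(y.tilde, 6))
```

With equal weights, the gap between Total and A + B is spread evenly over the three series; other weighting schemes distribute it according to how reliable each base forecast is assumed to be.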
Input DataFrame data.orig:
> data.orig$Collect()
   Series TimeStamp  Original     Residual
1   Total      1992 48.748080  0.058251736
2   Total      1993 49.480469  0.236069215
3   Total      1994 49.932384 -0.044404927
4   Total      1995 50.240702 -0.188001523
5   Total      1996 50.608464 -0.128557779
6   Total      1997 50.848506 -0.256277857
7   Total      1998 51.709220  0.364393685
8   Total      1999 51.943298 -0.262241472
9   Total      2000 52.577956  0.138337958
10  Total      2001 53.214959  0.140682700
11      A      1992 27.211338  0.026941198
12      A      1993 27.838827  0.357361180
13      A      1994 28.145348  0.036394576
14      A      1995 28.277125 -0.138351039
15      A      1996 28.478001 -0.069250754
16      A      1997 28.564466 -0.183661771
17      A      1998 28.907533  0.072939797
18      A      1999 29.021548 -0.156112852
19      A      2000 29.403080  0.111405265
20      A      2001 29.642483 -0.030724402
21      B      1992 21.536742  0.021310538
22      B      1993 21.641642 -0.121291965
......
Input DataFrame data.pred:
> data.pred$Collect()
   Series TimeStamp     Value
1   Total      1993 54.711279
2   Total      1994 54.207598
3   Total      1995 54.703918
4   Total      1996 55.200238
5   Total      1997 55.696558
6   Total      1998 56.192878
7       A      1993 29.912610
8       A      1994 30.182737
9       A      1995 30.452864
10      A      1996 30.722991
11      A      1997 30.993119
12      A      1998 31.263246
13      B      1993 23.798669
14      B      1994 24.024861
15      B      1995 24.251054
16      B      1996 24.477247
......
Invoke the function:
> res <- hanaml.HierarchicalForecast(orig.data = data.orig,
                                     pred.data = data.pred,
                                     orig.cols = list(key = "TimeStamp",
                                                      name = "Series",
                                                      endog = "Original",
                                                      residual = "Residual"),
                                     stru.data = struct.df,
                                     method = "optimal.combination",
                                     weights = "minimum.trace")
Output:
> res[[1]]$Collect()
   Series TimeStamp     Value
1   Total      1993 48.862705
2   Total      1994 54.255631
3   Total      1995 54.663688
4   Total      1996 55.192436
5   Total      1997 55.719965
6   Total      1998 56.434261
7       A      1993 27.204558
8       A      1994 30.215278
9       A      1995 30.424968
10      A      1996 30.718347
11      A      1997 31.007053