hanaml.Correlation is an R wrapper for SAP HANA PAL correlation.

hanaml.Correlation(
  data,
  key,
  cols = NULL,
  thread.ratio = NULL,
  method = NULL,
  max.lag = NULL,
  calculate.pacf = NULL,
  calculate.confint = FALSE,
  alpha = NULL,
  bartlett = NULL
)

Arguments

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column.

cols

list of characters, optional
Specifies the columns in data for correlation calculation. If only one column is specified, then the auto-correlation of that column will be calculated.
Defaults to the 1st non-ID column in data.

thread.ratio

double, optional
Controls the proportion of available threads that can be used by this function.
The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates all available threads. Values between 0 and 1 will use up to that percentage of available threads.
Values outside the range from 0 to 1 are ignored, and the actual number of threads used is then determined heuristically.
Defaults to -1.

method

c("auto", "brute_force", "fft"), optional
Indicates the method to be used to calculate the correlation function.
Defaults to 'auto', i.e. automatically determined.
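
The two explicit methods should give the same autocovariance and differ only in cost. The following base R sketch illustrates the idea on the Y series from the Examples section. It is an assumption that PAL's 'brute_force' and 'fft' options correspond to these textbook forms; the variable names are illustrative and not part of hanaml.

```r
# Y series from the Examples section
y <- c(88, 84, 85, 85, 84, 85, 83, 85, 88, 89)
n <- length(y)
d <- y - mean(y)   # demeaned series
max_lag <- 3

# Brute force: direct sum of lagged products, cost grows with n * max_lag
acv_brute <- sapply(0:max_lag, function(k) {
  sum(d[1:(n - k)] * d[(1 + k):n]) / n
})

# FFT: zero-pad to length 2n so the circular correlation does not wrap around,
# then invert the power spectrum. R's inverse fft is unnormalized, hence "/ m".
m <- 2 * n
f <- fft(c(d, rep(0, m - n)))
acv_fft <- Re(fft(Mod(f)^2, inverse = TRUE))[1:(max_lag + 1)] / m / n

print(cbind(lag = 0:max_lag, brute = acv_brute, fft = acv_fft))
```

For short series the direct sum is usually cheap enough; the FFT route tends to pay off for long series with large max.lag.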

max.lag

integer, optional
Maximum lag for the correlation function.
Defaults to sqrt(n), where n is the number of data points.

calculate.pacf

logical, optional
Controls whether to calculate the partial autocorrelation function (PACF). Valid only when a single series is provided.
Defaults to TRUE.

calculate.confint

logical, optional
Controls whether to calculate confidence intervals or not. If it is TRUE, two additional columns of confidence intervals are shown in the result.
Defaults to FALSE.

alpha

double, optional
The confidence bound for the given level is returned. For instance, if alpha = 0.05, the 95% confidence bound is returned. Valid only when calculate.confint is TRUE.
Defaults to 0.05.

bartlett

logical, optional
- FALSE: uses the standard error to calculate the confidence bound.
- TRUE: uses Bartlett's formula to calculate the confidence bound.
Valid only when calculate.confint is TRUE.
Defaults to TRUE.
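
To make the roles of alpha and bartlett concrete, here is a minimal base R sketch of the two textbook ways to bound a sample autocorrelation. The exact formulas used by PAL are not stated on this page, so the function acf_halfwidth and its arguments are hypothetical and only illustrate the general idea.

```r
# Hypothetical helper, not part of hanaml: half-width of a (1 - alpha)
# confidence band for sample autocorrelations r at lags 1..K of a series
# with n observations.
acf_halfwidth <- function(r, n, alpha = 0.05, bartlett = TRUE) {
  z <- qnorm(1 - alpha / 2)   # about 1.96 for alpha = 0.05
  if (bartlett) {
    # Bartlett's approximation: Var(r_k) ~ (1 + 2 * sum_{j < k} r_j^2) / n
    prior <- c(0, cumsum(r^2))[seq_along(r)]
    se <- sqrt((1 + 2 * prior) / n)
  } else {
    # Plain standard error under the white-noise assumption
    se <- rep(1 / sqrt(n), length(r))
  }
  z * se
}

# CF values at lags 1..3 from the Examples section, n = 10
r <- c(0.25384615, -0.08021978, -0.17252747)
hw_bartlett <- acf_halfwidth(r, n = 10, bartlett = TRUE)
hw_plain    <- acf_halfwidth(r, n = 10, bartlett = FALSE)
print(rbind(hw_bartlett, hw_plain))
```

In this sketch the Bartlett band widens with the lag because earlier autocorrelations accumulate in the variance, while the plain band is constant.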

Value

DataFrame

  • LAG: ID column.

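  • CV: auto-covariance (ACV) or cross-covariance (CCV).

  • CF: auto-correlation function (ACF) or cross-correlation function (CCF).

  • PACF: partial autocorrelation function. Null if cross-correlation is calculated.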

Examples

Input DataFrame data:


> data$Collect()
    TIMESTAMP  Y
 1          1 88
 2          2 84
 3          3 85
 4          4 85
 5          5 84
 6          6 85
 7          7 83
 8          8 85
 9          9 88
 10        10 89

Invoke the function:


> cr <- hanaml.Correlation(data,
                           key = "TIMESTAMP",
                           cols = c("Y"),
                           thread.ratio = 0.4,
                           method = "auto",
                           calculate.pacf = TRUE)

Output:


> cr$Collect()
  LAG     CV          CF       PACF
1   0  3.640  1.00000000  1.0000000
2   1  0.924  0.25384615  0.2538462
3   2 -0.292 -0.08021978 -0.1546211
4   3 -0.628 -0.17252747 -0.1201993
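
As a cross-check that needs no HANA connection, the output above can be reproduced with the stats package in base R. This is only a sketch for verification, not how hanaml computes the result; rounding sqrt(n) down to get the default lag is an assumption inferred from the four rows returned for ten data points.

```r
# Y column from the example input
y <- c(88, 84, 85, 85, 84, 85, 83, 85, 88, 89)
max_lag <- floor(sqrt(length(y)))   # assumption: sqrt(n) rounded down

# stats::acf demeans the series and divides by n
cv <- as.numeric(acf(y, lag.max = max_lag, type = "covariance",
                     plot = FALSE)$acf)
cf <- as.numeric(acf(y, lag.max = max_lag, type = "correlation",
                     plot = FALSE)$acf)

# stats::pacf starts at lag 1, so prepend the lag-0 value of 1
pc <- c(1, as.numeric(pacf(y, lag.max = max_lag, plot = FALSE)$acf))

check <- data.frame(LAG = 0:max_lag, CV = cv, CF = cf, PACF = pc)
print(check)
```

The CV, CF and PACF columns of check agree with the values returned by cr$Collect() to the printed precision.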