Anomaly Detection Using Multivariate Autoregression (MAR)

A multivariate autoregressive model can be used to detect anomalies in a univariate or multivariate series of sensor data records varying over time.

What Does the Algorithm Do?

Based on the training data, which in this case is a time series of data records, the algorithm trains a model. If trained on regular data (data without anomalies present), the model is capable of learning the regular behavior of a system. Based on a window of recently observed data records, the model can then predict the data record for one time step into the future. Once the actual values for this point in time are available, the model prediction can be compared to the actual observations. An anomaly score is then assigned based on the distance between the prediction and the observation. If large deviations appear, this can indicate abnormal behavior of the underlying system.
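The prediction and comparison steps described above can be written compactly. The following is one common formulation of a multivariate autoregressive model and is an assumption; the source does not give the exact form used by the product. Here x_t is the vector of sensor values at time step t, w stands for the window size, A_k are coefficient matrices learned during training, c is an intercept vector, and d_t is the distance from which an anomaly score would be derived. All symbols are introduced only for illustration.

```latex
\hat{x}_t = c + \sum_{k=1}^{w} A_k \, x_{t-k},
\qquad
d_t = \lVert x_t - \hat{x}_t \rVert
```

A small d_t means the observation matches what the model learned as regular behavior; a large d_t points to a possible anomaly.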

The following figure illustrates the predictive model for one input variable:

Model Configuration

To configure a model for multivariate autoregression, use the REST APIs or configuration UIs for the machine learning engine. For more information, see the chapters Managing Machine Learning Engine Using Configuration UIs and Managing Machine Learning Engine Using REST APIs in the guide Configuring SAP Predictive Maintenance and Service, on-premise edition 1.0.

Data Preparation for Model Training and Scoring

This algorithm is designed for time series data. Both training data and scoring data should therefore be organized so that one record contains all relevant data for one time stamp.
For training to work correctly, the time series provided for training must contain at least window.size + 1 records.
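The minimum-length requirement follows from how a window and its successor record are paired. The sketch below illustrates this under stated assumptions: make_windows and its window_size argument are hypothetical names (window_size stands in for window.size), records are plain lists of numbers sorted by time stamp, and the sample values are made up. It is not the product's data preparation code.

```python
def make_windows(records, window_size):
    """Split a time-ordered series into (window, next_record) pairs.

    records: list of records (each a list of floats), sorted by time stamp.
    Returns a list of tuples: window_size consecutive records and the
    single record that follows them.
    """
    if len(records) < window_size + 1:
        raise ValueError(
            f"need at least {window_size + 1} records, got {len(records)}"
        )
    pairs = []
    for start in range(len(records) - window_size):
        window = records[start:start + window_size]
        target = records[start + window_size]
        pairs.append((window, target))
    return pairs


# Hypothetical series with two sensor values per record (values are made up).
series = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0]]
pairs = make_windows(series, window_size=3)
```

With exactly window_size + 1 records, a single pair results, which is the smallest amount of data from which anything can be fitted.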

Model Training

To train an MAR model, the provided data is used to fit one autoregressive multivariate linear model for each target variable. By default, each provided input variable is also a target variable, but the model-specific parameter target.columns can be used to select only specific input variables as target variables.
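As a rough illustration of fitting one linear model per target variable, the sketch below uses ordinary least squares from NumPy. Several things here are assumptions and not taken from the product: the function name fit_mar, the use of an intercept term, the choice of least squares as the fitting method, and the random sample data. The target_columns argument mirrors the target.columns parameter only in spirit and takes column indices.

```python
import numpy as np


def fit_mar(data, window_size, target_columns=None):
    """Fit one linear model per target column by ordinary least squares.

    data: array of shape (n_records, n_variables), ordered by time stamp.
    Returns (coef, targets). coef has shape
    (window_size * n_variables + 1, n_targets); its last row is the intercept.
    """
    data = np.asarray(data, dtype=float)
    n_records, n_vars = data.shape
    if n_records < window_size + 1:
        raise ValueError("need at least window_size + 1 records")
    # By default, every input variable is also a target variable.
    targets = list(range(n_vars)) if target_columns is None else list(target_columns)

    # Each design row is one flattened window plus a constant 1 for the intercept.
    rows = [
        np.append(data[t - window_size:t].ravel(), 1.0)
        for t in range(window_size, n_records)
    ]
    X = np.vstack(rows)
    # The record following each window supplies the values to predict.
    Y = data[window_size:][:, targets]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef, targets


# Made-up data: 50 records of 2 sensor variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 2))
coef_all, targets_all = fit_mar(data, window_size=3)
coef_one, targets_one = fit_mar(data, window_size=3, target_columns=[1])
```

Each column of the returned coefficient array is one linear model. Restricting the targets shrinks the number of columns but not the number of inputs, since all input variables still feed every model.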

To train a model for multivariate autoregression, use the REST APIs or configuration UIs for the machine learning engine. For more information, see the chapters Managing Machine Learning Engine Using Configuration UIs and Managing Machine Learning Engine Using REST APIs in the guide Configuring SAP Predictive Maintenance and Service, on-premise edition 1.0.

Model Scoring

Scoring is applied to a series of window.size + 1 consecutive records, ordered by their time stamps. The first window.size of these records are used as input for the linear models established during training, which produce a prediction for each target variable of record number window.size + 1.

Each prediction is compared to the actual value in record number window.size + 1. An anomaly score is derived based on the distance between predictions and observations, and on other influencing factors such as model uncertainty.
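The scoring step can be sketched as follows. This is a simplified illustration, not the product's scoring logic: score_window, weights, and intercepts are hypothetical names, the score is a plain Euclidean distance, and the additional factors mentioned above (such as model uncertainty) are left out. The example model and values are made up.

```python
import math


def score_window(records, weights, intercepts, window_size):
    """Predict record number window_size + 1 and return a distance-based score.

    records: window_size + 1 consecutive records, ordered by time stamp.
    weights: one weight list per target, each of length
             window_size * n_variables (oldest record first).
    intercepts: one intercept per target.
    Targets are assumed to be all variables, in column order.
    """
    if len(records) != window_size + 1:
        raise ValueError("scoring needs exactly window_size + 1 records")
    # Flatten the first window_size records into one input vector.
    inputs = [value for record in records[:window_size] for value in record]
    actual = records[window_size]
    predicted = [
        sum(w * x for w, x in zip(target_weights, inputs)) + b
        for target_weights, b in zip(weights, intercepts)
    ]
    # Plain Euclidean distance between prediction and observation.
    score = math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)))
    return predicted, score


# Hypothetical one-variable model with window_size = 2:
# x_t = 2 * x_{t-1} - x_{t-2}, which continues a straight line.
weights = [[-1.0, 2.0]]
intercepts = [0.0]
_, normal_score = score_window([[1.0], [2.0], [3.0]], weights, intercepts, 2)
_, anomaly_score = score_window([[1.0], [2.0], [9.0]], weights, intercepts, 2)
```

The first series continues the line exactly, so its distance is zero; in the second, the last record jumps away from the predicted value and the distance grows accordingly.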

To score a model for multivariate autoregression, use the REST APIs or configuration UIs for the machine learning engine. For more information, see the chapters Managing Machine Learning Engine Using Configuration UIs and Managing Machine Learning Engine Using REST APIs in the guide Configuring SAP Predictive Maintenance and Service, on-premise edition 1.0.