Input Datasets
Depending on where you are in the predictive model lifecycle, your input dataset can be a training dataset or an application dataset (for a classification or regression predictive model), or a single dataset that serves both purposes (for a time series predictive model, which uses only one dataset).
The training dataset contains the past observations that will be used to generate the predictive model. In this set, the values of the target variable, which is the variable corresponding to your business issue, are known. By analyzing the training dataset, Smart Predict generates a predictive model that explains and predicts the target variable, based on the variables identified as Influencers.
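You apply a predictive model to an application dataset (for classification and regression predictive models). The following restrictions apply:
- The application dataset must contain the same number of variables as the training dataset (additional columns will be ignored).
- The application dataset must use the same variable names as the corresponding training dataset.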
- Your training or application input dataset must not contain more than 1,000 columns. When you apply the predictive model to an application dataset, Smart Predict generates additional columns. The application process can be blocked if your application dataset is already close to the 1,000-column limit. For more information, refer to System Sizing, Tuning, and Limits.
The following limits are recommended when using a segmented time series forecast model on an input training or application dataset:
- Number of forecasts (independent of the number of segments): 120 maximum
- Number of segments: 1,000 maximum
If your predictive model is configured for a number of forecasts or segments beyond the recommended maximum limits, its use of resources is likely to create performance issues that can impact other users on the same tenant.
- Your training and application input datasets must also come from the same type of data source. You can't apply a predictive model on a live dataset if it was trained with an acquired dataset, nor can you apply a predictive model on an acquired dataset if it was trained using a live one.
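The restrictions above can be checked before you load your data. The following Python sketch is only an illustration and is not part of Smart Predict: the `Dataset` class, the function names, and the `"acquired"` and `"live"` labels are hypothetical, and the variable-name check assumes that every training column name must also appear in the application dataset, which is one reading of the rules above.

```python
from dataclasses import dataclass

MAX_COLUMNS = 1000    # column limit for training and application datasets
MAX_FORECASTS = 120   # recommended maximum number of forecasts
MAX_SEGMENTS = 1000   # recommended maximum number of segments


@dataclass
class Dataset:
    """Hypothetical stand-in for an input dataset (not a Smart Predict type)."""
    columns: list
    source_type: str  # assumed labels: "acquired" or "live"


def check_application_dataset(training, application):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Both datasets must stay within the column limit.
    for name, dataset in (("training", training), ("application", application)):
        if len(dataset.columns) > MAX_COLUMNS:
            problems.append(f"{name} dataset has more than {MAX_COLUMNS} columns")
    # Every training variable name must be present; extra columns are ignored.
    missing = sorted(set(training.columns) - set(application.columns))
    if missing:
        problems.append(f"application dataset is missing variables: {missing}")
    # Both datasets must come from the same type of data source.
    if training.source_type != application.source_type:
        problems.append("datasets come from different types of data source")
    return problems


def check_time_series_limits(n_forecasts, n_segments):
    """Return warnings for settings beyond the recommended maximums."""
    warnings = []
    if n_forecasts > MAX_FORECASTS:
        warnings.append(f"number of forecasts exceeds {MAX_FORECASTS}")
    if n_segments > MAX_SEGMENTS:
        warnings.append(f"number of segments exceeds {MAX_SEGMENTS}")
    return warnings


if __name__ == "__main__":
    training = Dataset(["age", "income", "churn"], "acquired")
    application = Dataset(["age", "income", "churn", "region"], "acquired")
    print(check_application_dataset(training, application))
    print(check_time_series_limits(n_forecasts=150, n_segments=200))
```

Because the column check runs before Smart Predict adds its generated columns, a dataset that passes this check but sits near 1,000 columns can still be blocked during the application process.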