Partition Strategy

A partition strategy is a technique that decomposes a training data source into two distinct subsets:
  • A training subset
  • A validation subset
The partition is performed as follows:
  • Rows or dimensions are selected at random.
  • The training subset contains 75% of the input rows or dimensions.
  • The validation subset contains the remaining 25% of the input rows or dimensions.
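
The random 75/25 split described above can be sketched as follows. This is a minimal illustration in Python, not the application's actual implementation: the function name partition_rows, the train_fraction parameter, and the seed parameter are hypothetical names introduced for this example.

```python
import random

def partition_rows(rows, train_fraction=0.75, seed=None):
    """Randomly split rows into a training subset and a validation subset.

    Hypothetical helper: the name, parameters, and shuffling approach are
    assumptions made for illustration only.
    """
    rng = random.Random(seed)          # local generator so the split is reproducible
    indices = list(range(len(rows)))
    rng.shuffle(indices)               # random row selection
    cut = int(len(rows) * train_fraction)
    training = [rows[i] for i in indices[:cut]]    # first 75% of shuffled rows
    validation = [rows[i] for i in indices[cut:]]  # remaining 25%
    return training, validation

rows = list(range(100))
training, validation = partition_rows(rows, seed=42)
print(len(training), len(validation))  # prints "75 25"
```

Every input row lands in exactly one of the two subsets, so the validation subset never contains a row the models were trained on.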

With this partition strategy, the application can validate the generated predictive models on data that was not used to train them, and keep the model that performs best.

The following table defines the roles of the two data subsets obtained using partition strategies.
Data subset | Is used to...
Training    | Generate different predictive models. The predictive models generated at this stage are hypothetical.
Validation  | Select, among the models generated using the training subset, the one that offers the best compromise between quality and robustness.
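
The two roles in the table can be sketched as follows: candidate models are fitted on the training subset only, and the validation subset is then used to choose among them. This is an illustrative Python sketch under stated assumptions. The two candidate models (a constant mean and a least-squares line), the mean squared error score, and all function names are hypothetical; the source does not describe which candidates or which selection criterion the application uses, and a single error score does not by itself capture the quality and robustness compromise.

```python
from statistics import mean

def fit_mean(train):
    """Hypothetical candidate 1: always predict the mean training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def fit_line(train):
    """Hypothetical candidate 2: least-squares line fitted on the training subset."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    sxx = sum((x - x_bar) ** 2 for x, _ in train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept

def validation_error(model, validation):
    """Mean squared error on the validation subset (assumed selection metric)."""
    return mean((model(x) - y) ** 2 for x, y in validation)

def select_best(train, validation, fitters):
    """Fit every candidate on train, score each on validation, return the best name."""
    candidates = {name: fit(train) for name, fit in fitters.items()}
    scores = {name: validation_error(m, validation) for name, m in candidates.items()}
    best = min(scores, key=scores.get)  # lowest validation error wins
    return best, scores

# Made-up (x, y) rows for illustration only
training = [(0, 1.0), (1, 3.1), (2, 4.9), (3, 7.2), (4, 9.0), (5, 11.1)]
validation = [(6, 13.0), (7, 15.1)]
best_name, scores = select_best(
    training, validation, {"mean": fit_mean, "line": fit_line}
)
print(best_name)  # prints "line"
```

The models generated from the training subset stay hypothetical until they are scored on rows they have not seen, which is why the selection step uses the validation subset only.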
Note
For Time Series Forecast, the validation subset allows you to calculate the confidence interval (Error Min and Error Max) of the predictions.