How Adding Influencers to Your Dataset Can Potentially Increase the Accuracy of Your Predictive Model

Once you’ve trained your predictive model, its performance indicators may not be good enough for you to consider the predictive model accurate right away (see Predictive Power for a classification predictive model, Root Mean Square Error for a regression predictive model, or Horizon-Wide MAPE for a time series predictive model).

One way to increase your predictive model’s accuracy is to add influencers to your dataset. These influencers can then be used by Smart Predict to improve its understanding of the relationships between your data.

Note

Influencers are only available if your data source is a dataset.

Example

Your company has noticed that the maintenance costs of its stores are getting too high. You need to analyze these costs to see where to cut them, and also to predict future maintenance costs more accurately so that you stay within budget. You create your first predictive scenario with a Time Series predictive model to assess the maintenance costs per store. You choose the overall expenses as the signal, the date of these expenses as the date variable, and the store ID as the entity.

You train your first predictive model excluding the twenty-three possible influencers.

The Horizon-Wide MAPE of your first predictive model in the debrief is at 26.71%.

Note

You want your predictive model’s Horizon-Wide MAPE to be as low as possible, because it indicates the percentage of error you can expect in your predictive forecasts.
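
To make the idea of a percentage error concrete, here is a minimal Python sketch of a plain mean absolute percentage error (MAPE). It is only an illustration: it does not reproduce how Smart Predict computes the Horizon-Wide MAPE across the forecast horizon, and the function name and sample values are hypothetical.

```python
def mape(actuals, forecasts):
    """Plain mean absolute percentage error, returned in percent."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    # Skip zero actuals, for which the percentage error is undefined.
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("at least one non-zero actual is required")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical monthly maintenance costs and forecasts for one store.
actuals = [100.0, 200.0, 400.0]
forecasts = [110.0, 180.0, 400.0]
print(round(mape(actuals, forecasts), 2))  # 6.67
```

A lower value means the forecasts sit closer to the actual values, which is why you aim to reduce this indicator.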

You notice that some of the variables excluded as influencers, such as the number of Saturdays and Sundays, are directly related to the date dimension you used in your predictive model. You realize that they affect the insights and could improve the accuracy of your predictive forecasts if they were included as influencers.
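
If your dataset does not already contain such calendar-based columns, you could derive them before loading the data. The following Python sketch assumes one row per month; the column names (`Date`, `NumSaturdays`, `NumSundays`) and the helper function are hypothetical and not part of Smart Predict.

```python
import calendar


def count_weekend_days(year, month):
    """Return (saturdays, sundays) for the given calendar month."""
    saturdays = sundays = 0
    days_in_month = calendar.monthrange(year, month)[1]
    for day in range(1, days_in_month + 1):
        weekday = calendar.weekday(year, month, day)  # Monday is 0, Sunday is 6
        if weekday == calendar.SATURDAY:
            saturdays += 1
        elif weekday == calendar.SUNDAY:
            sundays += 1
    return saturdays, sundays


# Build one row per month with the two candidate influencer columns.
rows = []
for month in range(1, 4):
    saturdays, sundays = count_weekend_days(2023, month)
    rows.append({
        "Date": f"2023-{month:02d}-01",
        "NumSaturdays": saturdays,
        "NumSundays": sundays,
    })

for row in rows:
    print(row)
```

Because these values depend only on the calendar, they can also be computed for future dates, which a time series model needs in order to use them over the forecast period.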

You create a second predictive model by duplicating your first predictive model. However, this time you include all influencers and train your second model.

The Horizon-Wide MAPE of your second predictive model in the debrief drops to 20.77%.

By simply including variables as influencers, you reduced your predictive model’s Horizon-Wide MAPE by about 22% in relative terms (from 26.71% to 20.77%, a drop of 5.94 percentage points). You may need to try a few influencer combinations to reach the level of accuracy you want.
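
The 22% figure is a relative reduction in error, not a difference in percentage points. As a quick check, the arithmetic can be reproduced with the two Horizon-Wide MAPE values from the example (the variable names are illustrative):

```python
first_mape = 26.71   # Horizon-Wide MAPE of the first model, in percent
second_mape = 20.77  # Horizon-Wide MAPE of the second model, in percent

# Absolute change, expressed in percentage points.
absolute_drop = first_mape - second_mape

# Relative change, expressed as a share of the first model's error.
relative_drop = absolute_drop / first_mape * 100

print(f"{absolute_drop:.2f} percentage points")    # 5.94 percentage points
print(f"{relative_drop:.1f}% relative reduction")  # 22.2% relative reduction
```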