Use the Model Statistics component to generate performance statistics for two-class problems in all scenarios (HANA and non-HANA). Visualize and share results in a range of charts. Use the component together with the Model Compare component to compare two or more models and discover the best one for a predictive problem.
Model Statistics is a component that calculates performance statistics on datasets that are generated by algorithms. It can do so for two algorithm types, classification and regression. In addition, you can configure the component to generate performance statistics for Train, Validate and Test datasets and selected KPIs.
The component works only with two-class problems. A two-class problem is a business problem with a binary outcome, which means that it classifies the elements of a given dataset into two groups by a classification rule.
One example is in churn modeling for a business with a subscription service. In such a case, the two-class problem is to identify subscribers who will stay with the service, and those who will leave.
Another example is fraud detection at a financial institution, where the two-class problem is to identify which transactions are fraudulent, and which are not.
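The idea of a classification rule splitting a dataset into two groups can be sketched in a few lines. The threshold, feature name, and labels below are illustrative assumptions for the churn example, not part of the Model Statistics component itself:

```python
# Hypothetical two-class classification rule: each subscriber record
# is assigned to one of two groups based on a score. The "churn_score"
# field, the 0.5 threshold, and the labels are assumptions for this sketch.

def classify_churn(subscriber: dict, threshold: float = 0.5) -> str:
    """Assign a subscriber to one of the two classes."""
    return "will_leave" if subscriber["churn_score"] >= threshold else "will_stay"

subscribers = [
    {"id": 1, "churn_score": 0.82},
    {"id": 2, "churn_score": 0.10},
    {"id": 3, "churn_score": 0.55},
]

labels = [classify_churn(s) for s in subscribers]
print(labels)  # ['will_leave', 'will_stay', 'will_leave']
```

However the rule is produced by the trained model, the outcome is always one of exactly two classes, which is what makes the problem two-class.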
You must ensure that the predictive quality (Ki) of the model is strong. For example, a Ki of zero means that the model is not trained well and inspires no confidence, since it is essentially equivalent to a random model. Ki is directly linked to the amount of information available to predict the target, so you can improve it by increasing the number of useful variables in the model.
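To make the "Ki of zero means a random model" intuition concrete, here is a hedged sketch. It does not reproduce SAP's exact Ki formula; instead it uses the closely related Gini coefficient (Gini = 2 × AUC − 1), which likewise is 0 for a random model and 1 for a perfect one:

```python
# Illustrative stand-in for Ki: the Gini coefficient derived from the
# area under the ROC curve (AUC). This is an assumption for the sketch,
# not SAP's documented Ki computation.

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) method: the probability that
    a positive case scores higher than a negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(labels, scores):
    return 2 * auc(labels, scores) - 1

labels = [1, 1, 0, 0, 1, 0]
random_scores = [0.5] * 6                        # model carries no information
good_scores   = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]   # model separates the classes

print(gini(labels, random_scores))  # 0.0 -> no better than random
print(gini(labels, good_scores))    # 1.0 -> perfect separation
```

A score column with no information about the target yields 0, matching the statement that a zero predictive quality is equivalent to a random model.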
You can generate and share charts for classification and regression algorithms in the Model Statistics component; the charts visualize the performance of each algorithm type.
You can use the Model Statistics component with the Model Compare component to learn the best algorithm for your predictive problem. First, the Model Statistics component calculates the performance statistics for the classification or regression algorithm type. The Model Compare component then compares the calculated performance statistics to pick the best of the algorithms run at execution.
Note that when you change configurations in the Model Statistics component, it affects the Model Compare component.
When rendering charts during interaction with Model Compare, the Model Statistics component overlays the partitions on top of one another and displays separate results for each partition. The Model Compare component does the same because both components use the same data. Therefore, ensure that you configure the KPIs identically in both components.
When the Partition component is included before the Model Statistics component in an analysis chain, you receive the option to use three different partitions: Train, Validate, and Test. If the Partition component is not included, the Model Statistics component displays a set of statistics and charts for the Train partition only.
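Conceptually, a partition step splits one dataset into the Train, Validate, and Test subsets that the statistics are then computed on. The sketch below is an assumption-laden illustration: the 60/20/20 ratios, the seed, and the helper name are invented for this example and are not the Partition component's actual defaults:

```python
# Hypothetical sketch of a three-way partition into Train/Validate/Test.
# Ratios, seed, and function name are assumptions for illustration only.
import random

def partition(rows, ratios=(0.6, 0.2, 0.2), seed=42):
    """Randomly assign rows to Train, Validate, and Test partitions."""
    rng = random.Random(seed)
    shuffled = rows[:]          # copy so the input is left untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return {
        "Train": shuffled[:n_train],
        "Validate": shuffled[n_train:n_train + n_val],
        "Test": shuffled[n_train + n_val:],
    }

parts = partition(list(range(100)))
print({name: len(rows) for name, rows in parts.items()})
# {'Train': 60, 'Validate': 20, 'Test': 20}
```

Each row lands in exactly one partition, which is why the component can show a separate, non-overlapping set of statistics and charts per partition.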