Maximizing ROI: Evaluating the Business Value of a Machine Learning Model | by Amit Kulkarni

To accurately assess the performance of any model we build, it is essential to compare it against a baseline model. This enables us to evaluate the model’s effectiveness and determine how it performs relative to a basic reference point. Let’s delve deeper into this concept and its significance in the following sections.

Random model: Our baseline is a random model, equivalent to flipping a coin. It assigns every customer a 50% probability of a positive outcome (buying our product), regardless of what we know about that customer. Naturally, our logistic regression model’s performance should surpass this baseline.

Wizard model: On the other end of the spectrum, we have the “Wizard” model. This hypothetical model represents perfect predictions, with close to 100% accuracy. It serves only as an upper bound: a real model that scores this well should not be trusted for business decisions, because such results usually point to overfitting and set unrealistic expectations.

Logistic model: Our logistic regression model should fall somewhere between these two extremes, striking a balance that gives us confidence in making informed business decisions.
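To make the two reference models concrete, here is a minimal sketch of the cumulative gain each one would produce per decile. The function name `reference_gain_curves` and the customer counts are hypothetical, chosen only for illustration; they are not taken from the dataset used in this article.

```python
def reference_gain_curves(n_total, n_positive, n_bins=10):
    """Cumulative gain (% of positives captured) after each decile
    for the random baseline and the perfect "wizard" model."""
    random_gain = []
    wizard_gain = []
    for k in range(1, n_bins + 1):
        contacted = n_total * k / n_bins  # records selected through decile k
        # Random model: positives are spread evenly, so the share of
        # positives captured equals the share of records selected.
        random_gain.append(100.0 * k / n_bins)
        # Wizard model: every selected record is a positive until
        # all positives have been found.
        wizard_gain.append(100.0 * min(contacted, n_positive) / n_positive)
    return random_gain, wizard_gain


# Hypothetical example: 1,000 customers, 200 of whom buy (20% base rate).
random_gain, wizard_gain = reference_gain_curves(n_total=1000, n_positive=200)
print(random_gain[:3])  # [10.0, 20.0, 30.0]
print(wizard_gain[:3])  # [50.0, 100.0, 100.0]
```

With a 20% base rate, the wizard curve reaches 100% at decile 2 and stays flat, while the random curve is a straight diagonal. Any useful model should trace a curve between the two.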

To visually illustrate the performance of these models, we will employ a cumulative gain plot. This plot will provide a clear indication of where the logistic model stands in terms of its performance, allowing us to gauge its potential for delivering business value.

import kds

kds.metrics.plot_cumulative_gain(y_test.to_numpy(), prob_glm[:,1])

Looks good so far: the plot is along the expected lines, and the logistic regression model sits between the two extreme models we have discussed.

Insights from the cumulative gain plot:

Target Class Coverage: By selecting only the top 20% (decile 1 and decile 2), we can achieve coverage of approximately 80% of the target class (blue box in the above plot). This implies that a significant portion of the desired outcomes can be captured by focusing on these deciles.

Decile Distribution: As the cumulative gain plot progresses, we notice that the curve flattens after decile 5. Every decile holds roughly the same number of records, so a flat curve indicates that deciles 6 to 10 contain few or no target-class records. Understanding how the target class is distributed across deciles provides valuable insight into the concentration of positive outcomes.

Wizard Model Reference: The idealistic wizard model captures 100% of the target class by decile 2, which serves as a reference point (red box in the above plot). However, it’s important to acknowledge that this model is unrealistic and should not be treated as a practical benchmark. If our model closely approaches the performance of either the wizard model or the random model, it warrants a thorough review: the former suggests overfitting, the latter a model that has learned very little.
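The numbers read off the cumulative gain plot can also be computed by hand. The sketch below is an illustration in plain Python, not the kds implementation, and kds may bin records slightly differently. The function name `cumulative_gain` and the toy scores and labels are made up for the example.

```python
def cumulative_gain(y_true, y_score, n_bins=10):
    """Percent of all positives captured after each decile,
    with records ranked from highest to lowest predicted score."""
    ranked = [
        label
        for _, label in sorted(
            zip(y_score, y_true), key=lambda pair: pair[0], reverse=True
        )
    ]
    n_total = len(ranked)
    n_positive = sum(ranked)
    gains = []
    for k in range(1, n_bins + 1):
        cutoff = round(n_total * k / n_bins)  # records in the top k deciles
        gains.append(100.0 * sum(ranked[:cutoff]) / n_positive)
    return gains


# Toy data (hypothetical): 10 customers, 4 buyers, one record per decile.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
gains = cumulative_gain(toy_labels, toy_scores)
print(gains[:5])  # [25.0, 50.0, 75.0, 75.0, 100.0]
```

Applied to the objects used in this article, `cumulative_gain(y_test.to_numpy(), prob_glm[:, 1])[1]` would give the share of the target class captured by the top two deciles, which should land close to the value read off the plot.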

Moving forward, we will dive deeper into the decile-level analysis, aiming to gain a comprehensive understanding of the underlying factors at play. Visualizations will aid us in presenting and comprehending the analysis more effectively. To streamline this process, we will leverage the kds package, which offers a convenient function to generate comprehensive metric reports with just a single line of code.

kds.metrics.report(y_test, prob_glm[:,1])

Let us understand each of these plots. Please note that the x-axis of all the plots is Deciles.

Lift Plot: This plot shows how much better the logistic regression model performs than the random model, cumulatively through each decile. For example, at decile 2 the lift is approximately 4, meaning the top two deciles contain about four times as many target-class observations as a random selection of the same size. As we move to higher-numbered deciles, the lift gradually decreases and eventually converges with the random model line. This occurs because the highest probability scores, and with them most of the positive outcomes, are concentrated in the top deciles (1 to 3), as observed in the cumulative gains plot. By decile 10 every record has been included, so the cumulative lift necessarily matches the random model.

Decile-wise Lift Plot: This plot depicts the share of target-class observations that falls in each individual decile, compared with the random model. Decile 1 has the highest share, and the share gradually declines as we move to higher-numbered deciles. After a certain point, it even falls below the random model line. This is because the random model spreads target-class observations evenly across all deciles, while our model concentrates them in the top deciles, leaving fewer for the bottom ones.

Cumulative Gain Plot: As discussed earlier, this plot provides an overview of the model’s performance in capturing positive outcomes. It shows the cumulative percentage of target-class observations captured as we progress through the deciles.

KS Statistic Plot: The KS plot assesses the separation between two cumulative distributions: events and non-events. The KS statistic is the maximum difference between these distributions. In essence, it helps us understand the ML model’s ability to distinguish between the two classes. As a rule of thumb, a KS score greater than 40 is generally considered good, particularly if it occurs within the top three deciles. In our case, we have a KS score of 68.932, with the maximum difference observed in decile 3.
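The quantities behind these plots can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions (both classes are present, deciles are cut by rank), not the code kds runs internally. The function name `decile_metrics` and the toy data are hypothetical.

```python
def decile_metrics(y_true, y_score, n_bins=10):
    """Cumulative lift, decile-wise lift and KS (in percentage points)
    per decile, with records ranked by descending score."""
    ranked = [
        label
        for _, label in sorted(
            zip(y_score, y_true), key=lambda pair: pair[0], reverse=True
        )
    ]
    n_total = len(ranked)
    n_events = sum(ranked)
    n_non_events = n_total - n_events
    expected_random = n_events / n_bins  # events a random model puts in each decile
    rows, prev_cutoff, cum_events = [], 0, 0
    for k in range(1, n_bins + 1):
        cutoff = round(n_total * k / n_bins)
        events_here = sum(ranked[prev_cutoff:cutoff])
        cum_events += events_here
        cum_non_events = cutoff - cum_events
        cum_event_pct = 100.0 * cum_events / n_events
        cum_non_event_pct = 100.0 * cum_non_events / n_non_events
        rows.append({
            "decile": k,
            # Share of events captured so far vs. share of records selected.
            "cumulative_lift": cum_event_pct / (100.0 * k / n_bins),
            # Events in this decile vs. what a random model would place here.
            "decile_lift": events_here / expected_random,
            # Gap between the cumulative event and non-event distributions.
            "ks": cum_event_pct - cum_non_event_pct,
        })
        prev_cutoff = cutoff
    return rows


# Toy data (hypothetical): 10 customers, 4 buyers, one record per decile.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
rows = decile_metrics(toy_labels, toy_scores)
best = max(rows, key=lambda row: row["ks"])
print(best["decile"], round(best["ks"], 1))  # 5 83.3
```

The KS statistic is the largest `ks` value across deciles, and the decile where it occurs is what the KS plot highlights. The cumulative lift always ends at 1 in the last decile, which is why the lift curve meets the random model line there.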

By examining these plots, we gain valuable insights into the performance and differentiating capabilities of our ML model, providing a comprehensive understanding of its strengths and areas for improvement.
