![](https://crypto4nerd.com/wp-content/uploads/2023/06/0G5b1v3MU4gvYbzAE-1024x683.jpeg)
Computing the Gini Coefficient from AUC
The ROC (receiver operating characteristic) curve plots the false positive rate (FPR) on the x-axis against the true positive rate (TPR) on the y-axis across classification thresholds. AUC is the area under this ROC curve.
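To make the definition concrete, here is a minimal pure-Python sketch that sweeps the threshold over a toy dataset, records the (FPR, TPR) points, and integrates them with the trapezoidal rule. The helper names `roc_points` and `trapezoid_auc` and the four data points are made up for illustration; in practice `sklearn.metrics.roc_curve` and `roc_auc_score` do this work.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists from sweeping the threshold over every distinct score."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]  # threshold above every score: nothing is flagged
    # Lower the threshold step by step, from the highest score to the lowest
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy data, made up for illustration: 1 = positive class, 0 = negative class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, y_score)
toy_auc = trapezoid_auc(fpr, tpr)
print(fpr)      # [0.0, 0.0, 0.5, 0.5, 1.0]
print(tpr)      # [0.0, 0.5, 0.5, 1.0, 1.0]
print(toy_auc)  # 0.75
```

Each threshold contributes one point on the curve, which is why AUC summarises performance over all thresholds rather than at a single cut-off.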
The Gini coefficient is derived from the AUC using the formula below:
Gini = 2*AUC - 1
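One way to read the formula: the area between the ROC curve and the random-classifier diagonal is AUC - 0.5, and the largest possible value of that area is 0.5, so dividing one by the other gives 2*AUC - 1. The short sketch below applies the formula at a few reference points; the function name `gini_from_auc` is hypothetical and used only for illustration.

```python
def gini_from_auc(auc_value):
    """Rescale an AUC in [0, 1] to a Gini coefficient in [-1, 1]."""
    if not 0.0 <= auc_value <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    return 2 * auc_value - 1


print(gini_from_auc(0.5))   # 0.0 -> ranking no better than random
print(gini_from_auc(0.75))  # 0.5
print(gini_from_auc(1.0))   # 1.0 -> perfect ranking
```

A negative Gini (AUC below 0.5) means the scores rank the classes in the wrong order more often than not.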
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_auc_score, roc_curve


class ModelEvaluation:
    """
    A class to compute the AUC score and Gini coefficient of a model from its predictions.

    Attributes
    ----------
    predictions : pd.DataFrame
        A DataFrame containing the PD (probability of default) and DV (label) columns.
    PD_column : str
        Name of the PD column in the DataFrame.
    label_column : str
        Name of the DV column.

    Methods
    -------
    plot_roc_curve(): Plots the ROC curve
    compute_auc_gini(): Returns the AUC and Gini coefficient
    """

    def __init__(self, predictions_df, PD_column, label_column):
        self.predictions = predictions_df
        self.PD_column = PD_column
        self.label_column = label_column

    def plot_roc_curve(self):
        # Compute FPR, TPR, and thresholds
        fpr, tpr, thresholds = roc_curve(
            y_true=self.predictions[self.label_column],
            y_score=self.predictions[self.PD_column],
        )
        # Compute AUC from the ROC points
        roc_auc = auc(fpr, tpr)
        # Plot the ROC curve
        plt.figure()
        plt.plot(fpr, tpr, label='ROC curve (AUC = %0.5f)' % roc_auc)
        plt.plot([0, 1], [0, 1], 'r--')  # Diagonal line (random classifier)
        plt.xlim([0.0, 1.0])
        plt.ylim([0.0, 1.05])
        plt.xlabel('False Positive Rate')
        plt.ylabel('True Positive Rate')
        plt.title('Receiver Operating Characteristic (ROC) Curve')
        plt.legend(loc="lower right")
        plt.show()

    def compute_auc_gini(self):
        # Named auc_score so it does not shadow the imported auc() function
        auc_score = roc_auc_score(
            y_true=self.predictions[self.label_column],
            y_score=self.predictions[self.PD_column],
        )
        gini = 2 * auc_score - 1
        return auc_score, gini


# results_df holds the model's predicted PDs and the observed labels
model_eval = ModelEvaluation(results_df, PD_column='Prediction_Probability', label_column='Bads')
auc_score, gini = model_eval.compute_auc_gini()
Gini = 2*0.73043 - 1 = 0.46086
Summary:
- The Gini coefficient has an intuitive interpretation. It represents the degree of separation between positive and negative classes, making it easier to understand and communicate.
- The Gini coefficient provides a single summary measure of the model’s discriminative power, capturing the performance across all possible classification thresholds.
- The Gini coefficient is insensitive to calibration. It does not consider how well the predicted probabilities of default (PDs) are calibrated and depends solely on how they rank the cases. A model with poorly calibrated probabilities can therefore still achieve a high Gini score.
- The Gini coefficient does not provide insights into the underlying factors driving the predictions.
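The calibration point above can be illustrated with a small pure-Python sketch. It computes AUC as the share of (positive, negative) pairs that the scores rank correctly, which is equivalent to the area under the ROC curve, and then divides every score by 10. The ranking is unchanged, so the Gini is identical even though the shrunk scores are far from the observed default rate. The helper name `pairwise_auc` and the data are made up for illustration.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 1 = default, 0 = non-default
y_true = [0, 0, 1, 0, 1, 1]
pd_scores = [0.05, 0.20, 0.30, 0.40, 0.60, 0.90]

# Strictly increasing transform: same ranking, very different probability levels
shrunk_scores = [s / 10 for s in pd_scores]

gini_original = 2 * pairwise_auc(y_true, pd_scores) - 1
gini_shrunk = 2 * pairwise_auc(y_true, shrunk_scores) - 1

print(gini_original == gini_shrunk)             # True
print(sum(y_true) / len(y_true))                # observed default rate: 0.5
print(sum(shrunk_scores) / len(shrunk_scores))  # mean predicted PD, far below 0.5
```

This is why a calibration check is usually run alongside the Gini rather than inferred from it.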
The Gini coefficient is just one of several metrics used to evaluate credit risk models in the finance industry, and a comprehensive evaluation process should incorporate other relevant metrics to ensure a robust and accurate assessment of model performance.