This article covers the model evaluation metrics and techniques most commonly used for classification algorithms.
Table of Contents
- Confusion Matrix
- AUC-ROC
- Lift Chart
- Gain Chart
- KS Statistic
- F1 Score
1. Confusion Matrix
A confusion matrix provides an easy summary of the predictive results in a classification problem. Correct and incorrect predictions are summarized in a table, with counts broken down by class. This section explains the confusion matrix, precision, recall, F1 score, specificity, etc., with examples and code.
A single accuracy value is unreliable when the classes are imbalanced.
For example, suppose we have a dataset of 100 patients in which 5 have diabetes and 95 are healthy. A model that simply predicts the majority class, i.e. that all 100 people are healthy, still achieves 95% accuracy, which gives a completely misleading picture of its performance. This is why we need a confusion matrix.
# python script for confusion matrix creation
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
actual = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat']
results = confusion_matrix(actual, predicted)
print('Confusion Matrix :')
print(results)
print('Accuracy Score :', accuracy_score(actual, predicted))
print('Classification Report :')
print(classification_report(actual, predicted))
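Continuing the script above, the counts in the matrix can be turned into the metrics mentioned earlier. This is a minimal sketch that treats 'dog' as the positive class; the variable names are illustrative.
# derive precision, recall and specificity from the confusion matrix counts
# ('dog' is treated as the positive class)
tn, fp, fn, tp = confusion_matrix(actual, predicted, labels=['cat', 'dog']).ravel()
precision = tp / (tp + fp)
recall = tp / (tp + fn)        # also called sensitivity or true positive rate
specificity = tn / (tn + fp)   # true negative rate
print('Precision :', precision)
print('Recall :', recall)
print('Specificity :', specificity)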
2. AUC-ROC
The ROC, or Receiver Operating Characteristic, curve is a plot of the True Positive Rate, also called Recall (y-axis), against the False Positive Rate (x-axis) for every possible classification threshold.
The ROC curve gives more information than a confusion matrix because it visualizes all possible classification thresholds, whereas a confusion matrix is built for a single threshold. The AUC (Area Under the ROC Curve) condenses this plot into a single number between 0 and 1; the closer it is to 1, the better the model separates the two classes.
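As an illustration, scikit-learn can compute both the ROC curve and the AUC from the true labels and predicted probabilities. The snippet below is a minimal sketch; the synthetic dataset (make_classification) and the logistic regression model are only placeholders for your own data and classifier.
# REPRODUCIBLE EXAMPLE (sketch): ROC curve and AUC with scikit-learn
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt
X, y = make_classification(n_samples=1000, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=3)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)[:, 1]            # probability of the positive class
fpr, tpr, thresholds = roc_curve(y_test, y_prob)    # one (FPR, TPR) point per threshold
print('AUC :', roc_auc_score(y_test, y_prob))
plt.plot(fpr, tpr)
plt.plot([0, 1], [0, 1], linestyle='--')            # chance line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.show()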
3. Lift Chart
Gain and lift charts are based on ordering the predicted probabilities in decreasing order. Steps to create a lift/gain chart:
- Step 1: Predict or calculate the probability for each observation.
- Step 2: Rank these calculated probabilities in decreasing order.
- Step 3: Using the output of Step 2, build deciles by dividing the observations into 10 equal parts.
- Step 4: For each decile, calculate metrics such as the response rate, the number of good (positive) and bad (negative) observations, and the cumulative gain and lift (a manual sketch of this table follows below).
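For intuition, the decile table behind these four steps can be built by hand. The snippet below is a rough sketch using pandas on synthetic labels and scores; the column names (observations, responders, response_rate, cumulative_gain, lift) are illustrative, not part of any library.
# sketch: build the decile table behind a lift/gain chart by hand
import numpy as np
import pandas as pd
rng = np.random.default_rng(0)
y_score = rng.uniform(size=1000)                          # Step 1: predicted probabilities (synthetic)
y_true = (rng.uniform(size=1000) < y_score).astype(int)   # synthetic labels correlated with the scores
df = pd.DataFrame({'y_true': y_true, 'y_score': y_score})
df = df.sort_values('y_score', ascending=False).reset_index(drop=True)  # Step 2: rank by probability
df['decile'] = pd.qcut(df.index, 10, labels=False) + 1                  # Step 3: 10 equal parts
table = df.groupby('decile')['y_true'].agg(observations='count', responders='sum')  # Step 4
table['response_rate'] = table['responders'] / table['observations']
table['cumulative_gain'] = table['responders'].cumsum() / table['responders'].sum()
table['lift'] = table['response_rate'] / df['y_true'].mean()
print(table.round(3))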
# REPRODUCIBLE EXAMPLE
# Load Dataset and train-test split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,random_state=3)
clf = tree.DecisionTreeClassifier(max_depth=1,random_state=3)
clf = clf.fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)
# install the kds library first (run in a shell, not inside this script):
# pip install kds
# The magic happens here
import kds
kds.metrics.plot_lift(y_test, y_prob[:,1])
4. Gain Chart
A cumulative gain chart plots the percentage of all positive (good) observations captured against the percentage of the population targeted, when observations are targeted in decreasing order of predicted probability.
# REPRODUCIBLE EXAMPLE
# Load Dataset and train-test split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,random_state=3)
clf = tree.DecisionTreeClassifier(max_depth=1,random_state=3)
clf = clf.fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)
# install the kds library first (run in a shell, not inside this script):
# pip install kds
# The magic happens here
import kds
kds.metrics.plot_cumulative_gain(y_test, y_prob[:,1])
5. Kolmogorov Smirnov – KS Statistic
The Kolmogorov-Smirnov (KS) plot measures the degree of separation between the good (positive) and bad (negative) observations. The KS statistic of a classification model ranges from 0 to 100; the higher the value, the better the model is at separating the positive from the negative observations.
# REPRODUCIBLE EXAMPLE
# Load Dataset and train-test split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,random_state=3)
clf = tree.DecisionTreeClassifier(max_depth=1,random_state=3)
clf = clf.fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)
# install the kds library first (run in a shell, not inside this script):
# pip install kds
# The magic happens here
import kds
kds.metrics.plot_ks_statistic(y_test, y_prob[:,1])
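For reference, the statistic itself can also be computed without a plot, as the maximum gap between the cumulative score distributions of the positive and negative observations. A minimal sketch using scipy, continuing from the snippet above and treating class 1 as the positive class:
# KS statistic as the maximum distance between the two score distributions
from scipy.stats import ks_2samp
scores_pos = y_prob[y_test == 1, 1]   # predicted scores of the positive observations
scores_neg = y_prob[y_test != 1, 1]   # predicted scores of the negative observations
ks = ks_2samp(scores_pos, scores_neg).statistic
print('KS Statistic :', round(100 * ks, 2))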
6. F1 Score
The F1 score is the harmonic mean of precision and recall. It is mainly used to compare two models that have different precision and recall values; combining both into a single F-score makes them directly comparable. The harmonic mean penalizes extreme values more than the arithmetic mean does, so a high F1 score requires both precision and recall to be high. The higher the F1 score, the better the model.
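In scikit-learn the F1 score is available directly. The snippet below is a small sketch that reuses the actual/predicted lists from the confusion matrix example, treating 'dog' as the positive class, and checks the result against the harmonic-mean formula.
# F1 score = harmonic mean of precision and recall
from sklearn.metrics import f1_score, precision_score, recall_score
actual = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat']
p = precision_score(actual, predicted, pos_label='dog')
r = recall_score(actual, predicted, pos_label='dog')
print('Precision :', p)
print('Recall :', r)
print('F1 Score :', f1_score(actual, predicted, pos_label='dog'))
print('Check (harmonic mean) :', 2 * p * r / (p + r))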