
Machine Learning - Confusion Matrix
What is a confusion matrix?
A confusion matrix is a simple table used in classification problems to measure how well a model is performing. It compares the predictions made by the model with the actual results and shows where the model was right or wrong, so you can see which kinds of mistakes it makes and improve it. It breaks the predictions down into four categories: true positives, true negatives, false positives, and false negatives.

The rows represent the actual classes the outcomes should have been, while the columns represent the predictions we have made. Using this table it is easy to see which predictions are wrong.
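To make the row and column layout concrete, here is a minimal sketch that tallies a 2x2 table by hand for binary labels 0 and 1. The helper name `confusion_matrix_2x2` and the two short label lists are made up for illustration; they are not part of any library.

```python
def confusion_matrix_2x2(actual, predicted):
    """Tally binary labels (0 or 1): rows = actual class, columns = predicted class."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Hypothetical labels, chosen only for illustration.
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 1]

matrix = confusion_matrix_2x2(actual, predicted)
for row in matrix:
    print(row)
```

Each row sums to the number of samples that actually belong to that class, and the off-diagonal cells hold the wrong predictions.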
Creating a Confusion Matrix
Confusion matrices can be created from the predictions of a classification model, such as a logistic regression.
For now, instead of training a model, we will generate random actual and predicted values by utilizing NumPy:
import numpy
Next we will need to generate the numbers for "actual" and "predicted" values.
actual = numpy.random.binomial(1, 0.9, size = 1000)
predicted = numpy.random.binomial(1, 0.9, size = 1000)
In order to create the confusion matrix we need to import metrics from the sklearn module.
from sklearn import metrics
Once metrics is imported we can use the confusion matrix function on our actual and predicted values.
confusion_matrix = metrics.confusion_matrix(actual, predicted)
To create a more interpretable visual display we need to convert the table into a confusion matrix display.
cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = [0, 1])
Visualizing the display requires that we import pyplot from matplotlib.
import matplotlib.pyplot as plt
Finally to display the plot we can use the functions plot() and show() from pyplot.
cm_display.plot()
plt.show()
Results Explained
The Confusion Matrix created has four different quadrants:
- True Negative (Top-Left Quadrant)
- False Positive (Top-Right Quadrant)
- False Negative (Bottom-Left Quadrant)
- True Positive (Bottom-Right Quadrant)
True means that the value was predicted correctly; False means that the prediction was wrong. Positive and Negative refer to the class that was predicted.
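The naming of the four quadrants can be sketched in a few lines of plain Python. The function `quadrant` and the sample labels below are hypothetical and exist only to illustrate the rule: the first word says whether the prediction was correct, the second word is the predicted class.

```python
from collections import Counter


def quadrant(actual_label, predicted_label):
    """Name the quadrant for one (actual, predicted) pair of binary labels."""
    correctness = "True" if actual_label == predicted_label else "False"
    predicted_class = "Positive" if predicted_label == 1 else "Negative"
    return f"{correctness} {predicted_class}"


# Hypothetical labels, chosen only for illustration.
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 1]

counts = Counter(quadrant(a, p) for a, p in zip(actual, predicted))
for name in ("True Negative", "False Positive", "False Negative", "True Positive"):
    print(name, counts[name])
```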
Created Metrics
The matrix provides us with many useful metrics that help us to evaluate our classification model.
The different measures include: Accuracy, Precision, Sensitivity (Recall), Specificity, and the F-score, explained below.
Accuracy
Accuracy measures how often the model is correct.
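As a rough sketch, accuracy can be computed directly from the four quadrant counts: correct predictions divided by all predictions. The counts below are hypothetical, and returning 0.0 for an empty input is a choice made for this sketch only.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts, chosen only for illustration.
acc = accuracy(tp=45, tn=40, fp=5, fn=10)
print(acc)
```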
How to Calculate
(True Positive + True Negative) / Total Predictions
Example
Accuracy = metrics.accuracy_score(actual, predicted)
Precision
Of the positives predicted, what percentage is truly positive?
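That question translates into a short calculation: true positives divided by everything the model labeled positive. The sketch below uses hypothetical counts, and the guard that returns 0.0 when nothing was predicted positive is an assumption of this example.

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical counts: 50 samples predicted positive, 45 of them correctly.
prec = precision(tp=45, fp=5)
print(prec)
```

Note that the true negative and false negative counts do not appear at all, so precision says nothing about how the negative class is handled.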
How to Calculate
True Positive / (True Positive + False Positive)
Example
Precision = metrics.precision_score(actual, predicted)
Sensitivity (Recall)
Of all the positive cases, what percentage are predicted positive?
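A minimal sketch of this idea, working from raw labels instead of counts: keep only the samples that are actually positive and check how many of them were predicted positive. The label lists are hypothetical.

```python
# Hypothetical labels, chosen only for illustration.
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 1, 1, 0, 1]

# Count hits and misses among the samples that are actually positive.
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

recall = tp / (tp + fn)
print(tp, fn, recall)
```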
How to Calculate
True Positive / (True Positive + False Negative)
Example
Sensitivity_recall = metrics.recall_score(actual, predicted)
Specificity
How well does the model predict negative results? Specificity is similar to sensitivity, but looks at it from the perspective of the negative class.
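One way to see specificity is as recall with the roles of the two classes swapped: correctly predicted negatives divided by all actual negatives. The sketch below makes that relationship explicit with two hypothetical helper functions and made-up counts.

```python
def recall(hits, misses):
    """Share of one class that the model identified correctly."""
    total = hits + misses
    return hits / total if total else 0.0


def specificity(tn, fp):
    """Recall of the negative class: true negatives over all actual negatives."""
    return recall(hits=tn, misses=fp)


# Hypothetical counts: 45 actual negatives, 40 of them predicted negative.
spec = specificity(tn=40, fp=5)
print(spec)
```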
How to Calculate
True Negative / (True Negative + False Positive)
Example
Specificity = metrics.recall_score(actual, predicted, pos_label=0)
F-score
F-score is the "harmonic mean" of precision and sensitivity.
It considers both false positive and false negative cases and is good for imbalanced datasets.
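To illustrate why this matters on imbalanced data, here is a small sketch with hypothetical counts. It uses the fact that the harmonic mean of precision and sensitivity simplifies to 2·TP / (2·TP + FP + FN). The second case assumes a made-up model that predicts "negative" for every sample in a dataset with 5 positives and 95 negatives.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, fairly balanced counts.
f1_balanced = f1_score(tp=45, fp=5, fn=10)

# Hypothetical imbalanced case: every sample is predicted negative.
acc_all_negative = accuracy(tp=0, tn=95, fp=0, fn=5)
f1_all_negative = f1_score(tp=0, fp=0, fn=5)

print(f1_balanced, acc_all_negative, f1_all_negative)
```

In the second case accuracy looks high even though no positive sample was found, while the F-score drops to zero.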
How to Calculate
2 * ((Precision * Sensitivity) / (Precision + Sensitivity))
Example
F1_score = metrics.f1_score(actual, predicted)
All calculations in one:
Example
print({"Accuracy":Accuracy, "Precision":Precision, "Sensitivity_recall":Sensitivity_recall, "Specificity":Specificity, "F1_score":F1_score})