Machine Learning: Accuracy for classification models (Part 17)

Assume we want to detect whether a patient has cancer from an X-ray image.

So, we can use ML to predict Positive (1) and Negative (0).

Here, Actual POS means the patient actually has cancer, and Actual NEG means the patient does not have cancer.

Again, Prediction NEG means the model predicted that the patient has no cancer, and Prediction POS means the model predicted that the patient has cancer.

We can also see it like this:

Negative means 0 and Positive means 1

So, if the prediction is Negative and the actual value is Negative, we get a True Negative (TN).

If the prediction is Positive and the actual value is Positive, we get a True Positive (TP).

If the prediction is Negative but the actual value is Positive, we get a False Negative (FN).

Finally, if the prediction is Positive but the actual value is Negative, we get a False Positive (FP).

Both FN & FP are dangerous!!!
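The four counts above can be computed directly in plain Python. This is a minimal sketch with hypothetical labels, not the article's dataset:

```python
# Hypothetical example: 1 = has cancer (Positive), 0 = no cancer (Negative)
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # True Positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # True Negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # False Positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # False Negatives

print(tp, tn, fp, fn)  # 3 3 1 1
```

The same four numbers are what scikit-learn's `confusion_matrix` returns, arranged as a 2x2 grid.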

Now let's check all of the classification models with our breast cancer dataset.

If we apply the KNN algorithm, we get 94.7% accuracy.

For Decision Tree, we get 95.9% accuracy.

For Kernel SVM, we get 95.3% accuracy.

Using the Random Forest algorithm, we get 93.5% accuracy.

Using Naive Bayes, we get 94.1% accuracy.

Using Logistic Regression, we get 94.7% accuracy.

Using Support Vector Machine, we get 94.1% accuracy.
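A comparison like the one above can be sketched with scikit-learn. Note that this uses sklearn's built-in breast cancer dataset and a hypothetical train/test split, so the exact accuracies listed above will not be reproduced; the figures depend on the dataset version, split, and preprocessing:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Feature scaling helps the distance- and margin-based models (KNN, SVM)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Kernel SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(),
    "Support Vector Machine": SVC(kernel="linear"),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.3f}")
```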

So, we will basically check the accuracy and the confusion matrix to know which classification model performs better than the others.

Moreover, the confusion matrix is exactly the matrix of TP, TN, FP, and FN we mentioned earlier.
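Accuracy is simply the correct predictions divided by all predictions, which can be read straight off the confusion matrix. A minimal sketch with hypothetical counts:

```python
# Hypothetical confusion-matrix counts for illustration
tp, tn, fp, fn = 50, 40, 3, 7

# Accuracy = correct predictions / all predictions
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.9
```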

We can also see this with an example that has a single variable to check: we will consider the y-axis value Positive if it is greater than 0.5; otherwise, we will consider it Negative.

Here, for case 2, the actual value should have been 1 (Positive), but the prediction is Negative (0).

Again, for case 3, the actual value should have been 0 (Negative), but the prediction is Positive (1).
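The 0.5 threshold rule described above can be sketched as follows, with hypothetical predicted probabilities standing in for the y-axis values:

```python
# Hypothetical predicted probabilities (the y-axis values) for four cases
probs  = [0.9, 0.3, 0.7, 0.1]
actual = [1,   1,   0,   0]

# Positive (1) if greater than 0.5, otherwise Negative (0)
predicted = [1 if p > 0.5 else 0 for p in probs]

# case 2 (prob 0.3, actual 1) -> predicted 0: a False Negative
# case 3 (prob 0.7, actual 0) -> predicted 1: a False Positive
print(predicted)  # [1, 0, 1, 0]
```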

So, False Positives and False Negatives are always bad, but a False Negative is much worse here: a patient who actually has cancer would be told they are healthy.

Pros and Cons of the Classification Models

Done!!