Machine Learning : SVM (Part 12)
Here we can see 2 categories: Category 1 has red crosses and Category 2 has green crosses.
Now we can draw a line, and using 2 support vectors we can draw 2 more dotted lines (the margins) on either side of that blue line (the decision boundary).
Here, the red cross and the green cross closest to the line are the support vectors: they "support" (define) those dotted lines, and each one can be represented as a vector.
Here, we can see the maximum margin hyperplane/classifier.
Anything to the right of the line falls on the positive hyperplane side, and anything to the left falls on the negative hyperplane side.
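As a quick sketch (using tiny synthetic points, not the article's dataset), scikit-learn's `decision_function` shows which side of the hyperplane a point falls on: its sign is positive on one side and negative on the other, and `support_vectors_` exposes the points that define the margin.

```python
import numpy as np
from sklearn.svm import SVC

# Two tiny clusters on either side of the vertical line x = 1.5
X = np.array([[0, 0], [0, 1], [3, 0], [3, 1]], dtype=float)
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear').fit(X, y)

print(clf.decision_function([[3, 0.5]]))  # positive: right of the hyperplane
print(clf.decision_function([[0, 0.5]]))  # negative: left of the hyperplane
print(clf.support_vectors_)               # the points that define the margin
```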
Let's take an example. Assume that we have apples and oranges, and let's apply this to our 2 categories.
In general, this is what a cluster of such data might look like.
But in SVM, we act differently.
Here, we take 2 vectors: an apple that almost looks like a yellow orange, and an orange that almost looks like a green apple.
The red cross (the orange-like apple) is very close to being an orange.
Likewise, the green cross (the apple-like orange) is very close to being an apple.
Problem statement: We are launching a new SUV in the market and we want to know which people are likely to buy it. We have a list of people with their age and salary, along with previous data on whether or not each of them bought an SUV before.
Let's import the libraries and the dataset, and split the dataset into training and test sets.
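A minimal sketch of this step, assuming the usual pandas/scikit-learn workflow (the CSV file name below is a placeholder, not from this article); a synthetic stand-in is generated so the snippet runs on its own:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# With a real file (placeholder name):
# dataset = pd.read_csv('your_dataset.csv')
# X = dataset.iloc[:, :-1].values   # age and salary columns
# y = dataset.iloc[:, -1].values    # purchased (0/1) column

# Synthetic stand-in so the sketch runs without a CSV file
rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(18, 60, 400),         # ages
                     rng.integers(15_000, 150_000, 400)])  # salaries
y = rng.integers(0, 2, 400)                             # bought before? (0/1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
```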
Feature scaling
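A sketch of the scaling step with `StandardScaler` (the tiny arrays here are illustrative, not the article's data): the scaler is fitted on the training set only, and the same statistics are reused to transform the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[19, 19000], [35, 20000],
                    [26, 43000], [27, 57000]], dtype=float)  # illustrative rows
X_test = np.array([[30, 87000]], dtype=float)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit on the training set only
X_test = sc.transform(X_test)        # reuse training mean/std on the test set
```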
Training the SVM model on the Training set
from sklearn.svm import SVC
Importing the SVC class from the sklearn.svm module.
classifier = SVC(kernel = 'linear', random_state = 0)
The default value for kernel is 'rbf', but we want a linear one here to keep things simple. Alternatively, we could use the LinearSVC class and wouldn't need to set the kernel at all.
classifier.fit(X_train, y_train)
Fitting the model.
Predicting a new result
Predicting for a person who is 30 years old with a salary of 87,000:
print(classifier.predict(sc.transform([[30,87000]])))
The output shows 0, which means the person won't buy.
Predicting the Test set results
Let's compare our predictions with the y_test data side by side.
y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))
So, the left column holds the predicted values and the right column holds the actual test values.
Making the Confusion Matrix
It counts how many values we predicted correctly and how many incorrectly.
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
let's check the accuracy
print(accuracy_score(y_test, y_pred))
we have a 90% accuracy!
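To see how to read the matrix, here is a tiny worked example with made-up labels (not the article's data): the rows are the true classes, the columns are the predicted classes, so the diagonal holds the correct predictions.

```python
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred)
# cm[0][0] = true negatives, cm[0][1] = false positives
# cm[1][0] = false negatives, cm[1][1] = true positives
print(cm)                            # [[2 1]
                                     #  [1 2]]
print(accuracy_score(y_true, y_pred))  # 4 correct out of 6
```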
Visualizing the Training set results
Visualizing the Test set results
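Both visualizations follow the same idea: lay a fine grid over the two scaled features, predict the class at every grid point to colour the decision regions, then scatter the actual points on top. Here is a self-contained sketch using synthetic data (colours, step size, and the output file name are illustrative choices, not from this article):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.svm import SVC

# Synthetic stand-in for the scaled (age, salary) training set
rng = np.random.default_rng(0)
X_set = rng.uniform(-2, 2, size=(100, 2))
y_set = (X_set[:, 0] + X_set[:, 1] > 0).astype(int)
clf = SVC(kernel='linear', random_state=0).fit(X_set, y_set)

# Grid over the feature plane; predict each point to colour the regions
X1, X2 = np.meshgrid(
    np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.05),
    np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.05))
plt.contourf(X1, X2,
             clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.5, cmap=ListedColormap(('salmon', 'lightgreen')))

# Scatter the actual points, coloured by class
for j in np.unique(y_set):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], label=j)
plt.title('SVM (Training set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.savefig('svm_regions.png')
```

For the test-set plot, the same code is reused with X_set and y_set swapped for the test data.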
Done!
The whole code