Machine Learning - Polynomial Regression (Part 4)
This is the mathematical equation for polynomial regression.
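For a single feature x1 and a chosen degree n, the equation can be written out as follows, using the same b0, b1, ... coefficient names that appear in the code comments later in this post:

```latex
y = b_0 + b_1 x_1 + b_2 x_1^{2} + \dots + b_n x_1^{n}
```

The model is still linear in the coefficients b0 to bn, which is why an ordinary linear regression can fit it once the powers of x1 are supplied as features.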
Now, let's see where we use it.
If we use linear regression here, the straight line does not even come close to fitting the data.
But if we use polynomial regression, the curve fits the data well.
Let's learn it by coding.
Problem statement: We want to hire a regional manager. In the interview, the candidate said that he had been a regional manager for 2 years at XYZ company.
We collected a salary data sheet and found that a regional manager (level 6) earns 150k a year. Since the candidate has already worked at that level for 2 years, his salary should fall somewhere between level 6 and level 7. It will certainly be less than a Partner's 200k (level 7).
So, we can place him at level 6.5 to get an approximate figure and offer him a salary that matches his experience.
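As a quick sanity check before fitting anything, straight-line interpolation between the two salaries quoted above gives a rough reference point for level 6.5. This is only a sketch: the variable names are illustrative, and the two data points are the level 6 and level 7 figures from the problem statement.

```python
import numpy as np

# Levels and yearly salaries quoted in the problem statement
levels = [6, 7]
salaries = [150_000, 200_000]

# Straight-line interpolation halfway between level 6 and level 7
rough_estimate = float(np.interp(6.5, levels, salaries))
print(rough_estimate)  # 175000.0
```

A model prediction for level 6.5 that lands far outside the 150k to 200k range is a sign that the model does not describe this data well.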
Here is the dataset
So, let's do the data processing
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Here, we just want level column as X and Salary as y
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:-1].values
y = dataset.iloc[:, -1].values
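The slice 1:-1 may look odd for a single column, but it keeps X two-dimensional, which is the shape scikit-learn estimators expect for features. The sketch below shows the difference on a tiny in-memory frame. The column names are assumptions, and the two rows are just the level 6 and level 7 figures from the problem statement, not the real file.

```python
import pandas as pd

# Tiny stand-in for Position_Salaries.csv (column names are assumed)
demo = pd.DataFrame({
    "Position": ["Regional Manager", "Partner"],
    "Level": [6, 7],
    "Salary": [150_000, 200_000],
})

X_2d = demo.iloc[:, 1:-1].values  # slice -> 2D array, shape (2, 1)
X_1d = demo.iloc[:, 1].values     # single index -> 1D array, shape (2,)
y_demo = demo.iloc[:, -1].values  # target stays 1D, shape (2,)

print(X_2d.shape, X_1d.shape, y_demo.shape)  # (2, 1) (2,) (2,)
```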
Training the Linear Regression model on the whole dataset
from sklearn.linear_model import LinearRegression
#imported the LinearRegression class
lin_regressor = LinearRegression()
#creating the regressor object
lin_regressor.fit(X, y)
#we are using the whole dataset and not splitting it
We have used all of our data to fit the linear model.
Training the Polynomial Regression model on the whole dataset
from sklearn.preprocessing import PolynomialFeatures
#we are importing PolynomialFeatures
poly_regressor = PolynomialFeatures(degree=2)
#degree 2 means the features x1 and x1^2 (plus a constant column of 1s)
X_poly = poly_regressor.fit_transform(X)
#transforming the matrix of features X into its polynomial terms
#now building y=b0+b1x1+b2x1^2
lin_regressor2 = LinearRegression()
lin_regressor2.fit(X_poly, y)
#fitting a new linear regression model on X_poly
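To see what PolynomialFeatures actually produces, it helps to run it on a couple of small values. A minimal sketch, using the made-up inputs 2 and 3 rather than the salary data (the names X_demo and poly_demo are illustrative):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two made-up sample values, shaped as a single-feature matrix
X_demo = np.array([[2], [3]])

poly_demo = PolynomialFeatures(degree=2)
X_demo_poly = poly_demo.fit_transform(X_demo)

# Each row becomes [1, x1, x1^2]; the leading 1 is the bias column
print(X_demo_poly)
# [[1. 2. 4.]
#  [1. 3. 9.]]
```

The leading column of ones plays the role of b0. LinearRegression also fits its own intercept by default, so that column is redundant here rather than harmful.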
Visualizing the Linear Regression results
plt.scatter(X,y,color='red')
#scatter plot of the real data points
plt.plot(X, lin_regressor.predict(X), color='blue')
#plot(X coordinates, y coordinates); lin_regressor.predict(X) gives the predicted salaries
plt.title('Linear regression model')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
Clearly, the straight line is not fitting the data, so let's try the polynomial model.
Visualizing the Polynomial Regression results
plt.scatter(X,y,color='red')
#scatter plot of the real data points
plt.plot(X, lin_regressor2.predict(X_poly))
#plot(X coordinates, y coordinates); lin_regressor2.predict(X_poly) gives the predicted salaries
plt.title('Polynomial regression model')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
As we can see, the degree 2 model still does not fit the data properly. So, we can increase the degree and check again.
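One way to check several degrees without repeating the code by hand is a small loop that refits the model and reports the training R² score for each degree. The sketch below uses synthetic data (a made-up, steeply rising curve over levels 1 to 10), not the real salary file, and the names X_demo, y_demo, and scores are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data: levels 1..10 with a steeply rising target
X_demo = np.arange(1, 11).reshape(-1, 1)
y_demo = 40_000 * np.exp(0.3 * X_demo.ravel())

scores = {}
for degree in range(1, 5):
    poly = PolynomialFeatures(degree=degree)
    X_demo_poly = poly.fit_transform(X_demo)
    model = LinearRegression().fit(X_demo_poly, y_demo)
    # R^2 measured on the same data the model was trained on
    scores[degree] = model.score(X_demo_poly, y_demo)

for degree, r2 in scores.items():
    print(f"degree={degree}: training R^2 = {r2:.4f}")
```

Training R² does not go down as the degree grows, so a high score on a handful of points only tells us how closely the curve follows those points, not how well it would predict new ones.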
Practice with much higher degree
from sklearn.preprocessing import PolynomialFeatures
#now we are targeting x1, x1^2, x1^3, x1^4, x1^5, x1^6
poly_regressor = PolynomialFeatures(degree=6)
X_poly = poly_regressor.fit_transform(X)
#transforming the matrix of features X into its polynomial terms
#now building y=b0+b1x1+b2x1^2+b3x1^3+b4x1^4+b5x1^5+b6x1^6
lin_regressor2 = LinearRegression()
lin_regressor2.fit(X_poly, y)
#fitting a new linear regression model on X_poly
#the plot below fits the data closely when the degree is 6
plt.scatter(X, y, color='red')
#scatter plot of the real data points
plt.plot(X, lin_regressor2.predict(X_poly))
#plot(X coordinates, y coordinates); lin_regressor2.predict(X_poly) gives the predicted salaries
plt.title('Polynomial regression model')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
Optional: Visualizing the Polynomial Regression results (for higher resolution and smoother curve)
X_grid = np.arange(X.min(), X.max(), 0.1)
#instead of only the integer levels in X, we step through the levels in increments of 0.1 so the curve looks smoother
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color='red')
plt.plot(X_grid, lin_regressor2.predict(poly_regressor.transform(X_grid)), color='blue')
#poly_regressor is already fitted, so transform is enough here
plt.title('Polynomial regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
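A small detail about the grid: np.arange stops before the end value, so the smooth curve ends just short of the last level. If the endpoint matters, np.linspace is one alternative. A minimal sketch, assuming levels run from 1 to 10 (an assumption for illustration, since the dataset itself is not reproduced here):

```python
import numpy as np

# Assumed level range, for illustration only
low, high = 1.0, 10.0

# arange excludes the stop value; linspace includes it by default
grid_arange = np.arange(low, high, 0.1).reshape(-1, 1)
grid_linspace = np.linspace(low, high, 91).reshape(-1, 1)

# Compare the shapes and the last grid point of each approach
print(grid_arange.shape, grid_linspace.shape)
print(grid_arange[-1, 0], grid_linspace[-1, 0])
```

With 91 points between 1 and 10, the linspace grid has the same 0.1 spacing and also covers the final level.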
Let's solve our problem now:
Predicting a new result with Linear Regression
lin_regressor.predict([[6.5]])
#we want the salary of someone who has been a regional manager for 2 years, so the level should be between 6 and 7, and not above 7
#predict expects a 2D array, so we pass [[6.5]]
#the result is misleading as it exceeds the salary of level 7
A prediction of 330k is not plausible, as the level 7 role has a salary of 200k.
Let's solve our problem now: Predicting a new result with Polynomial Regression
#for the polynomial model, we need to provide x1, x1^2, x1^3, x1^4, x1^5, x1^6
lin_regressor2.predict(poly_regressor.transform([[6.5]]))
Now the result looks reasonable: it is about 170k, which is between 150k and 200k.
So, the result is between level 6 and level 7.
We can offer this salary to our regional manager!
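Because the polynomial transform and the linear model always have to be applied together, one optional refinement is to bundle them in a scikit-learn Pipeline so that predict accepts the raw level directly. This is a sketch on synthetic stand-in data, not this post's dataset, and the names X_demo, y_demo, and poly_model are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data: levels 1..10 with a made-up quadratic target
X_demo = np.arange(1, 11).reshape(-1, 1)
y_demo = 1_000 * X_demo.ravel() ** 2 + 40_000

# Pipeline = polynomial expansion followed by linear regression
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X_demo, y_demo)

# The raw level goes in; the pipeline applies the transform itself
pipeline_pred = poly_model.predict([[6.5]])

# Same prediction done in two manual steps, for comparison
poly = PolynomialFeatures(degree=2)
manual_model = LinearRegression().fit(poly.fit_transform(X_demo), y_demo)
manual_pred = manual_model.predict(poly.transform([[6.5]]))

print(pipeline_pred, manual_pred)
```

Both routes give the same number. The pipeline simply removes the risk of forgetting the transform step, or of passing untransformed values to the second regressor.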
Try the whole code
Previous Blogs
Machine learning - Multiple Linear Regression Model (Part 3)