This tutorial guides you through implementing regression models in machine learning. We'll cover linear and polynomial regression with step-by-step explanations and practical examples.
By the end of this tutorial, you will be able to:
- Explain what regression models are and what they predict
- Implement a Linear Regression model with Scikit-learn
- Implement a Polynomial Regression model and compare it to a linear fit
Before starting this tutorial, you should have a basic understanding of:
- Python programming
- NumPy arrays
- Core machine learning ideas such as training and test sets
Regression models are supervised learning models that predict a continuous outcome variable (y) based on one or more predictor variables (x).
Linear regression is the most basic type of regression. It assumes a linear relationship between the input variables (x) and the single output variable (y), i.e. the model has the form y = b0 + b1*x, where b0 is the intercept and b1 is the slope learned from the data.
# Import necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
# Split the data into train and test sets (X and y are assumed to be defined already)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Linear Regression model
lin_reg = LinearRegression()
# Train the model
lin_reg.fit(X_train, y_train)
# Make predictions
y_pred = lin_reg.predict(X_test)
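To gauge how well the trained model performs, you can compare its predictions against the held-out test labels. A minimal sketch using Scikit-learn's metrics module, reusing y_test and y_pred from the snippet above:
# Evaluate the model on the test set
from sklearn.metrics import mean_squared_error, r2_score
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.3f}, R^2: {r2:.3f}")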
If your data points clearly cannot be represented by a linear relationship, you can use polynomial regression. It models y as a polynomial function of x (for example, y = b0 + b1*x + b2*x^2 for degree 2) by expanding the input features and then applying ordinary linear regression to the expanded features.
# Import necessary libraries
from sklearn.preprocessing import PolynomialFeatures
# Initialize Polynomial Features
poly_features = PolynomialFeatures(degree=2)
# Transform the training features into polynomial features
X_train_poly = poly_features.fit_transform(X_train)
# Fit a Linear Regression model on the transformed features
poly_model = LinearRegression()
poly_model.fit(X_train_poly, y_train)
# Predict on the training set
y_train_predicted = poly_model.predict(X_train_poly)
# Predict on the test set (use transform, not fit_transform, so the test data
# is expanded with the transformation already fitted on the training data)
y_test_predicted = poly_model.predict(poly_features.transform(X_test))
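A common sanity check with polynomial features is to compare training and test error; a much larger test error suggests overfitting. A small sketch, assuming the variables from the snippet above:
# Compare train and test RMSE to spot overfitting
import numpy as np
from sklearn.metrics import mean_squared_error
rmse_train = np.sqrt(mean_squared_error(y_train, y_train_predicted))
rmse_test = np.sqrt(mean_squared_error(y_test, y_test_predicted))
print(f"Train RMSE: {rmse_train:.3f}, Test RMSE: {rmse_test:.3f}")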
Let's implement a simple Linear Regression model using Scikit-learn.
# Import necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np
# Create a random dataset with a cubic (nonlinear) relationship plus noise
np.random.seed(0)
x = 2 - 3 * np.random.normal(0, 1, 20)
y = x - 2 * (x ** 2) + 0.5 * (x ** 3) + np.random.normal(-3, 3, 20)
# Reshape data
x = x[:, np.newaxis]
y = y[:, np.newaxis]
# Split data into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
# Initialize the Linear Regression model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
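Because the underlying data is cubic, the straight-line fit will be visibly poor. A small visualization sketch, assuming x, y, and model from the example above:
# Visualize the straight-line fit against the nonlinear data
import matplotlib.pyplot as plt
order = x[:, 0].argsort()  # sort by x so the line is drawn left to right
plt.scatter(x, y, color='red')
plt.plot(x[order], model.predict(x)[order], color='blue')
plt.title('Linear Regression on nonlinear data')
plt.show()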
Let's implement a Polynomial Regression model using Scikit-learn.
# Import necessary libraries
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
# Create polynomial features of degree 3
polynomial_features = PolynomialFeatures(degree=3)
# Transform the features into polynomial features
x_poly = polynomial_features.fit_transform(x)
# Fit a Linear Regression model on the transformed features
model = LinearRegression()
model.fit(x_poly, y)
# Visualize the Polynomial Regression results
# (sort by x so the fitted curve is drawn as a smooth line rather than a zigzag)
order = x[:, 0].argsort()
plt.scatter(x, y, color='red')
plt.plot(x[order], model.predict(x_poly)[order], color='blue')
plt.title('Predictions with Polynomial Regression')
plt.show()
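It can also be instructive to inspect what the model learned: with degree=3, PolynomialFeatures expands x into the columns [1, x, x^2, x^3], so the fitted weights should roughly resemble the cubic used to generate the data. A quick check, assuming model from above:
# Inspect the learned intercept and the coefficients for the columns 1, x, x^2, x^3
print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)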
In this tutorial, we learned about regression models and implemented Linear and Polynomial Regression models using Python and Scikit-learn. Next, you could explore regularized variants such as Ridge Regression and Lasso Regression, or Logistic Regression (which, despite its name, is used for classification).
Create a dataset with a non-linear relationship and try fitting a linear regression model. Observe the result, then fit a polynomial regression model to the same data and compare the results.
Experiment with different degrees of polynomial regression on the same dataset and observe the results.
# Import necessary libraries
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# Create a random dataset with a cubic (nonlinear) relationship plus noise
np.random.seed(0)
x = 2 - 3 * np.random.normal(0, 1, 20)
y = x - 2 * (x ** 2) + 0.5 * (x ** 3) + np.random.normal(-3, 3, 20)
# Reshape data
x = x[:, np.newaxis]
y = y[:, np.newaxis]
# Initialize the Linear Regression model
model = LinearRegression()
# Train the model
model.fit(x, y)
# Make predictions
y_pred = model.predict(x)
# Create polynomial features of degree 3
polynomial_features = PolynomialFeatures(degree=3)
x_poly = polynomial_features.fit_transform(x)
# Fit a Linear Regression model on the transformed features
model_poly = LinearRegression()
model_poly.fit(x_poly, y)
# Make predictions with Polynomial Regression
y_poly_pred = model_poly.predict(x_poly)
# Create polynomial features of degree 2
polynomial_features2 = PolynomialFeatures(degree=2)
x_poly2 = polynomial_features2.fit_transform(x)
# Fit a Linear Regression model on the transformed features
model_poly2 = LinearRegression()
model_poly2.fit(x_poly2, y)
# Create polynomial features of degree 4
polynomial_features4 = PolynomialFeatures(degree=4)
x_poly4 = polynomial_features4.fit_transform(x)
# Fit a Linear Regression model on the transformed features
model_poly4 = LinearRegression()
model_poly4.fit(x_poly4, y)
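To make the comparison concrete, you might print each model's error on the training data. A minimal sketch using mean_squared_error, assuming the models and variables defined above:
# Compare training MSE across the linear and polynomial models
from sklearn.metrics import mean_squared_error
print("Linear MSE:", mean_squared_error(y, y_pred))
print("Degree 2 MSE:", mean_squared_error(y, model_poly2.predict(x_poly2)))
print("Degree 3 MSE:", mean_squared_error(y, y_poly_pred))
print("Degree 4 MSE:", mean_squared_error(y, model_poly4.predict(x_poly4)))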
By running these scripts, you will observe how changing the degree of the polynomial regression model affects the fit to the data.