Introduction to Supervised and Unsupervised Learning

Tutorial 2 of 5

Introduction

The goal of this tutorial is to introduce you to two main types of Machine Learning: Supervised and Unsupervised Learning. You'll learn what they are, how they work, their differences, and their typical use cases.

By the end of this tutorial, you will be able to:

  • Understand the fundamental concepts of Supervised and Unsupervised Learning.
  • Distinguish between Supervised and Unsupervised Learning.
  • Understand the typical use cases for each type of learning.
  • Write simple code examples implementing these learning models.

Prerequisites: Basic knowledge of Python and a general understanding of Machine Learning concepts are recommended.

Step-by-Step Guide

Supervised Learning

Supervised Learning is a type of Machine Learning in which the model is trained on a labelled dataset, that is, a dataset in which the target outcome for every example is already known.

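Supervised learning problems fall into two broad groups: regression, where the target is a continuous value such as a price, and classification, where the target is a discrete category such as "spam" or "not spam".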

Example:

Let's consider a scenario where you need to predict house prices based on different features like the number of rooms, location, size, etc. Here, you already have the price (label) for each house in the training dataset. The model will 'learn' from this data and then predict prices for new, unseen houses.
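
The house-price scenario above is a regression problem. For the classification case, a minimal sketch might look like the following. It assumes the Iris dataset that ships with scikit-learn and a k-nearest-neighbours classifier; both are illustrative choices made here, not something this tutorial prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris ships with scikit-learn: 150 flowers, 4 measurements each,
# and a known species label (0, 1 or 2) for every flower.
X, y = load_iris(return_X_y=True)

# Hold back 20% of the flowers to check the model on unseen examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The labels in y_train are what make this supervised learning
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)

# score() returns the fraction of test flowers classified correctly
accuracy = classifier.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The pattern is the same as for regression: fit on features plus known labels, then predict labels for examples the model has not seen.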

Unsupervised Learning

Unsupervised Learning is a type of Machine Learning in which the model is trained on an unlabelled dataset, that is, a dataset with no target outcome attached to the examples. The model has to find structure in the features alone.

Common unsupervised learning tasks include clustering and association rule learning.

Example:

Consider a scenario where you have a dataset of customers with various features like age, income, browsing history, etc., but no specific labels. The goal here might be to identify distinct groups or 'clusters' in the data, such as different customer segments.

Code Examples

Supervised Learning: Simple Linear Regression

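Here is an example of a simple linear regression model using the Python library scikit-learn (imported as sklearn). It counts as supervised learning because the model is fitted on houses whose prices are already known. The code expects a file named house_prices.csv with size and price columns.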

# Import necessary libraries
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the dataset
dataset = pd.read_csv('house_prices.csv')

# Split the data into features and target
# scikit-learn expects a 2-D feature array, so the single column is reshaped;
# the target can stay 1-D
X = dataset['size'].values.reshape(-1, 1)
y = dataset['price'].values

# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train the model
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Use the model to make predictions
y_pred = regressor.predict(X_test)

# Output the predicted values
print(y_pred)
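
Printing raw predictions does not say how good they are. As a hedged sketch of one way to check, the block below repeats the same workflow on synthetic data and compares the predictions with the held-out prices. The numbers are invented stand-ins for house_prices.csv, and the choice of mean absolute error and R^2 as metrics is an assumption made here, not part of the original example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for house_prices.csv (invented numbers, not real data):
# price is roughly 3000 per unit of size plus a base of 50000, with noise.
rng = np.random.default_rng(0)
size = rng.uniform(50, 250, size=200)
price = 3000 * size + 50_000 + rng.normal(0, 20_000, size=200)

X = size.reshape(-1, 1)  # 2-D feature array, one column
y = price                # 1-D target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Compare predictions with the true prices the model never saw
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Learned slope: {regressor.coef_[0]:.1f}")
print(f"Mean absolute error: {mae:.0f}")
print(f"R^2: {r2:.3f}")
```

The mean absolute error is in the same units as the price, which makes it easy to interpret; R^2 close to 1 indicates that size explains most of the variation in price for this synthetic data.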

Unsupervised Learning: K-means Clustering

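Here is an example of k-means clustering, an unsupervised method: it receives only the features, with no labels, and groups the rows into a number of clusters that you choose in advance (three here). The code expects a file named customers.csv with age and income columns.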

# Import necessary libraries
from sklearn.cluster import KMeans
import pandas as pd

# Load the dataset
dataset = pd.read_csv('customers.csv')

# Select the features for clustering
X = dataset[['age', 'income']]

# Create the KMeans object and fit the data
# n_init is set explicitly because its default differs between scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Output the cluster centers
print(kmeans.cluster_centers_)
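
K-means groups points by distance, so a feature with large values (income, in the tens of thousands) can outweigh one with small values (age, in the tens). A common remedy is to standardize the features first. The sketch below shows one way to do that with a pipeline; the synthetic customers, the three groups, and the new_customer value are all invented for illustration and stand in for customers.csv.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for customers.csv (invented numbers, not real data):
# three loose groups of (age, income) pairs, 50 customers each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([25, 30_000], [3, 4_000], size=(50, 2)),
    rng.normal([45, 80_000], [3, 4_000], size=(50, 2)),
    rng.normal([65, 40_000], [3, 4_000], size=(50, 2)),
])

# Standardize both columns so income does not dominate age
# in the distance calculation, then cluster.
model = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = model.fit_predict(X)  # one cluster id per customer

print(np.bincount(labels))     # number of customers in each cluster

# Assign a previously unseen customer to one of the learned clusters
new_customer = np.array([[30, 32_000]])
print(model.predict(new_customer))
```

The cluster ids (0, 1, 2) are arbitrary names: they identify groups but carry no meaning of their own, so interpreting each segment is still up to you.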

Summary

In this tutorial, you've learned about two main types of Machine Learning: Supervised and Unsupervised Learning. You've seen how they work, their differences, and their typical use cases. You've also seen some simple code examples.

For further learning, consider looking into other types of Machine Learning such as semi-supervised learning and reinforcement learning.

Practice Exercises

  1. Use the sklearn library to create a logistic regression model (a type of supervised learning) on a dataset of your choice.

  2. Use the sklearn library to create a hierarchical clustering model (a type of unsupervised learning) on a dataset of your choice.

Remember, practice is key when learning new concepts, so try to spend plenty of time on these exercises.