Using re Module for Pattern Matching

Tutorial 2 of 5

1. Introduction

This tutorial covers Python's 're' module, which provides support for regular expressions. Regular expressions (regex) are a powerful tool for pattern matching and text manipulation.

By the end of this tutorial, you will learn:
- How to use the 're' module for string searching and manipulation
- Various functions provided by the 're' module
- Practical examples of using these functions

Prerequisites

A basic understanding of the Python programming language is required.

2. Step-by-Step Guide

The Python 're' module provides several functions to work with regular expressions. The most commonly used ones are:
- re.match()
- re.search()
- re.findall()
- re.split()
- re.sub()

re.match()

The re.match() function matches the pattern only at the start of the string. It returns a match object on success and None otherwise.

import re

# check if the pattern "Python" starts the string
result = re.match('Python', 'Python is fun')

# if match is found, print the match
if result:
  print("Match found:", result.group())
else:
  print("No match")

re.search()

The re.search() function scans the string and returns a match object for the first location where the pattern matches, or None if the pattern does not occur anywhere.

import re

# search for the pattern "fun" in the string
result = re.search('fun', 'Python is fun')

# if match is found, print the match
if result:
  print("Match found:", result.group())
else:
  print("No match")
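To make the difference between the two functions concrete, the sketch below runs both against the same string. The helper name describe is only illustrative and is not part of the 're' module.

```python
import re

text = "Python is fun"

def describe(match):
    """Return the matched text, or None when there was no match."""
    return match.group() if match else None

# re.match() only looks at the beginning of the string
print(describe(re.match("fun", text)))     # None
# re.search() scans forward until it finds the first match
print(describe(re.search("fun", text)))    # fun
# both succeed when the pattern sits at the start
print(describe(re.match("Python", text)))  # Python
```

A practical rule of thumb: reach for re.search() unless the pattern must begin at the first character.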

3. Code Examples

re.findall()

re.findall() returns all non-overlapping matches of pattern in string, as a list of strings.

import re

# find all occurrences of "is" in the string
result = re.findall('is', 'Python is fun and it is easy')

print("Matches found:", result)
# Output: Matches found: ['is', 'is']
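One thing to watch for: a plain pattern such as 'is' also matches inside longer words. The sketch below uses a slightly different sample string (an assumption for illustration) to show how the \b word-boundary anchor restricts the result to whole words.

```python
import re

text = "This is fun and it is easy"

# The bare pattern also matches the "is" inside "This"
loose = re.findall(r"is", text)
print(loose)   # ['is', 'is', 'is']

# \b requires a word boundary on each side, so only the standalone word matches
strict = re.findall(r"\bis\b", text)
print(strict)  # ['is', 'is']
```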

re.split()

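The re.split(pattern, string, maxsplit=0) function splits the string at each match of the pattern and returns the pieces as a list. If maxsplit is nonzero, at most maxsplit splits are made and the rest of the string becomes the final element.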

import re

# split the string on runs of one or more whitespace characters
# (a raw string keeps Python from treating \s as a string escape)
result = re.split(r'\s+', 'Python is fun and easy')

print("Split string:", result)
# Output: Split string: ['Python', 'is', 'fun', 'and', 'easy']
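The maxsplit parameter from the signature above is easiest to understand by example. This minimal sketch limits the split to two cuts; passing maxsplit as a keyword argument is the safer habit.

```python
import re

text = "Python is fun and easy"

# Stop after two splits; the remainder stays together as the last element
limited = re.split(r"\s+", text, maxsplit=2)
print(limited)  # ['Python', 'is', 'fun and easy']
```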

re.sub()

The re.sub(pattern, repl, string, count=0) function returns a copy of string in which matches of the pattern are replaced with repl. By default every match is replaced; a nonzero count limits the number of replacements.

import re

# replace all occurrences of "is" with "was"
result = re.sub('is', 'was', 'Python is fun and it is easy')

print("Modified string:", result)
# Output: Modified string: Python was fun and it was easy
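The count parameter can be sketched the same way. Here only the first match is replaced, and the rest of the string is left untouched.

```python
import re

text = "Python is fun and it is easy"

# Replace only the first occurrence of "is"
first_only = re.sub(r"is", "was", text, count=1)
print(first_only)  # Python was fun and it is easy
```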

4. Summary

In this tutorial, we learned how to use the 're' module in Python for pattern matching and text manipulation. We looked at several functions provided by this module and their practical applications.

Next, try these functions with different patterns and strings. You can also explore more advanced topics such as grouping, lookahead assertions, and lookbehind assertions.
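As a small taste of those advanced topics, here is a minimal sketch. The sample strings are made up for illustration, and the patterns are deliberately simple rather than production-grade.

```python
import re

# Grouping: parentheses capture parts of the match for later use
m = re.search(r"(\w+)@(\w+)\.com", "contact: alice@example.com")
user, domain = m.group(1), m.group(2)
print(user, domain)  # alice example

# Lookahead (?=...): match digits only when "px" follows, without consuming it
pixels = re.findall(r"\d+(?=px)", "10px 20em 30px")
print(pixels)  # ['10', '30']

# Lookbehind (?<=...): match digits only when "$" comes right before them
prices = re.findall(r"(?<=\$)\d+", "cost $15 or 20 euros")
print(prices)  # ['15']
```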

5. Practice Exercises

  1. Write a Python program to check whether a string starts with "The" and ends with "end".
  2. Write a Python program to find all five-character words in a string.
  3. Write a Python program to replace each whitespace character with an underscore.

Solutions:

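import re

# Solution 1: starts with "The" and ends with "end"
def check_string(s):
    # re.match() already anchors at the start, so '^' is not needed
    return bool(re.match(r'The', s) and re.search(r'end$', s))

print(check_string("The quick brown fox ends"))  # False
print(check_string("The quick brown fox end"))   # True

# Solution 2: find all five-character words
def find_words(s):
    # \b marks a word boundary; \w{5} matches exactly five word characters
    return re.findall(r'\b\w{5}\b', s)

print(find_words("The quick brown fox jumps over the lazy dog"))
# Output: ['quick', 'brown', 'jumps']

# Solution 3: replace each whitespace character with an underscore
def replace_spaces(s):
    return re.sub(r'\s', '_', s)

print(replace_spaces("The quick brown fox jumps over the lazy dog"))
# Output: The_quick_brown_fox_jumps_over_the_lazy_dog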

For further practice, try to solve more complex problems using regular expressions.