Python Regular Expressions: The re Module Explained with Examples

1. Introduction

Regular expressions (regex) describe text patterns and are used for matching, searching, and manipulating strings. Python's built-in re module provides the functions for working with them, such as re.match(), re.search(), re.findall(), and re.sub().

2. Basics of Regular Expressions

2.1 Character Classes

A character class defines a set of characters, any one of which can match at a given position in a string. It is written by enclosing the characters in square brackets [].

import re

text = "Hello, world!"
pattern = r'[llo]'  # same set as [lo]; the repeated 'l' adds nothing
matches = re.findall(pattern, text)
print(matches)  # ['l', 'l', 'o', 'o', 'l']

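In the above code, the pattern [llo] matches any single character that is either 'l' or 'o'. The repeated 'l' is redundant, so the class is equivalent to [lo]. The findall() function returns a list of all non-overlapping matches found in the text, here ['l', 'l', 'o', 'o', 'l'].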
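Character classes also support ranges and negation. The following is a minimal sketch on the same sample text; the variable names are illustrative only.

```python
import re

text = "Hello, world!"

# A range: any lowercase ASCII letter.
lowercase = re.findall(r'[a-z]', text)

# A negated class (leading ^ inside the brackets): any character that is NOT a letter.
non_letters = re.findall(r'[^a-zA-Z]', text)

# Duplicates inside a class are ignored, so [llo] and [lo] give the same result.
same = re.findall(r'[llo]', text) == re.findall(r'[lo]', text)

print(lowercase)    # ['e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
print(non_letters)  # [',', ' ', '!']
print(same)         # True
```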

2.2 Quantifiers

Quantifiers specify how many times the preceding character, character class, or group may repeat in the input string.

import re

text = "Hello, world!"
pattern = r'o{1,2}'  # one or two consecutive 'o' characters
matches = re.findall(pattern, text)
print(matches)  # ['o', 'o']

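In the above code, the pattern 'o{1,2}' matches one or two consecutive occurrences of the letter 'o'. Quantifiers are greedy by default, so two are consumed when available. In "Hello, world!" each 'o' stands alone, so the result is ['o', 'o'].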
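To make the difference between quantifiers visible, the sketch below uses a made-up string with longer runs of 'o'. It compares a greedy quantifier, a bounded one, a lazy one, and the optional quantifier ?.

```python
import re

# Hypothetical sample text chosen so that the runs of 'o' differ in length.
text = "goood food"

# + means "one or more" and is greedy: it takes the longest run it can.
greedy = re.findall(r'o+', text)

# {1,2} caps each match at two characters, so the run 'ooo' is split.
bounded = re.findall(r'o{1,2}', text)

# A ? after a quantifier makes it lazy: it takes as few characters as possible.
lazy = re.findall(r'o+?', text)

# ? on its own means "zero or one" of the preceding item.
optional = re.findall(r'colou?r', "color colour")

print(greedy)    # ['ooo', 'oo']
print(bounded)   # ['oo', 'o', 'oo']
print(lazy)      # ['o', 'o', 'o', 'o', 'o']
print(optional)  # ['color', 'colour']
```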

3. Special Sequences

3.1 Anchors

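Anchors match positions in a string rather than characters. Commonly used anchors are:

^ - Matches the beginning of a string (and the beginning of each line when the re.MULTILINE flag is set).

$ - Matches the end of a string, or just before a newline at the very end of the string.

\b - Matches a word boundary: the position between a word character (\w) and a non-word character or the start or end of the string.

\B - Matches any position that is not a word boundary.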

import re

text = "Hello, world!"
pattern = r'^H'  # 'H' only at the very start of the string
matches = re.findall(pattern, text)
print(matches)  # ['H']

In the above code, the pattern '^H' matches the 'H' at the beginning of the string.
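The other anchors can be illustrated the same way. The sketch below uses made-up sample strings to show \b, \B, $, and the effect of the re.MULTILINE flag on ^.

```python
import re

# Hypothetical sample text containing 'cat' in three different positions.
text = "cat concat category"

# \b on both sides: 'cat' only as a whole word.
whole_word = re.findall(r'\bcat\b', text)

# \b on the left only: 'cat' at the start of a word ('cat' and 'category').
word_start = re.findall(r'\bcat', text)

# \B on the left: 'cat' preceded by another word character ('concat').
inside_word = re.findall(r'\Bcat', text)

# $ anchors the match to the end of the string.
at_end = re.findall(r'category$', text)

# By default ^ matches only at the start of the string;
# with re.MULTILINE it also matches at the start of every line.
lines = "first line\nsecond line"
default_mode = re.findall(r'^\w+', lines)
multiline_mode = re.findall(r'^\w+', lines, flags=re.MULTILINE)

print(whole_word)      # ['cat']
print(word_start)      # ['cat', 'cat']
print(inside_word)     # ['cat']
print(at_end)          # ['category']
print(default_mode)    # ['first']
print(multiline_mode)  # ['first', 'second']
```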

3.2 Groups

Groups are parts of a pattern enclosed in parentheses. The text matched by each group is captured separately, so it can be retrieved on its own after a match.

import re

text = "Hello, world!"
pattern = r'(Hello), (world!)'  # two capturing groups
matches = re.findall(pattern, text)
print(matches)  # [('Hello', 'world!')]

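In the above code, the pattern '(Hello), (world!)' defines two groups: 'Hello' and 'world!'. When a pattern contains more than one group, findall() returns a list of tuples with one item per group, here [('Hello', 'world!')].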
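When only the first match is needed, re.search() returns a match object whose groups can be read individually. The sketch below also shows named groups using the (?P<name>...) syntax; the group names greeting and target are chosen here for illustration.

```python
import re

text = "Hello, world!"

# re.search() returns a match object, or None if nothing matches.
match = re.search(r'(Hello), (world!)', text)
if match:
    print(match.group(0))  # 'Hello, world!' (the whole match)
    print(match.group(1))  # 'Hello'
    print(match.group(2))  # 'world!'
    print(match.groups())  # ('Hello', 'world!')

# Named groups make the intent clearer than numeric indexes.
named = re.search(r'(?P<greeting>\w+), (?P<target>\w+)', text)
if named:
    print(named.group('greeting'))  # 'Hello'
    print(named.group('target'))    # 'world'
    print(named.groupdict())        # {'greeting': 'Hello', 'target': 'world'}
```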

4. Practical Examples

4.1 Email Validation

One practical example of using regex is email validation.

import re

def validate_email(email):
    # Simplified pattern: local part, '@', domain, then a dot and 2+ letters.
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    # re.match() returns a match object or None; convert it to a bool.
    return bool(re.match(pattern, email))

email = "test@example.com"
if validate_email(email):
    print("Valid email")
else:
    print("Invalid email")

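The above code uses a simplified regex pattern to check the general shape of an email address: a local part, an '@' sign, a domain, and a top-level domain of at least two letters. It returns True if the given email matches the pattern and False otherwise. The pattern does not cover every address the email standards allow, so treat it as a basic format check.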
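One subtlety with the ^...$ form is that $ also matches just before a trailing newline, so an address followed by "\n" is accepted. A possible variation, sketched below, uses re.fullmatch() on a precompiled pattern instead. The names EMAIL_RE and validate_email_strict are introduced here for illustration and are not part of the re module.

```python
import re

# Same simplified pattern as above, without ^ and $,
# because fullmatch() already requires the whole string to match.
EMAIL_RE = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

def validate_email_strict(email):
    """Return True only if the entire string has the expected shape."""
    return EMAIL_RE.fullmatch(email) is not None

# With re.match() and $, a trailing newline slips through.
LOOSE_PATTERN = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
loose_accepts_newline = bool(re.match(LOOSE_PATTERN, "test@example.com\n"))

print(loose_accepts_newline)                         # True
print(validate_email_strict("test@example.com"))     # True
print(validate_email_strict("test@example.com\n"))   # False
print(validate_email_strict("user@localhost"))       # False (no dot in the domain)
```

Compiling the pattern once with re.compile() is also convenient when the same check runs many times.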

4.2 Phone Number Extraction

Another practical example is extracting phone numbers from a text.

import re

text = "Contact us at 123-456-7890 or email@example.com"
pattern = r'\d{3}-\d{3}-\d{4}'  # three digits, hyphen, three digits, hyphen, four digits
matches = re.findall(pattern, text)
print(matches)  # ['123-456-7890']

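The above code uses a regex pattern to find phone numbers in the given text. It searches for three digits, a hyphen, three digits, a hyphen, and four digits (the format '123-456-7890') and returns a list of the matching strings. Numbers written in any other format are not found.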
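If the text may contain other common layouts, the pattern can be extended. The following is a sketch only: the sample text, the set of accepted formats, and the name PHONE_RE are assumptions made for this example. It also uses re.finditer() to report where each number was found.

```python
import re

# Hypothetical sample text with three US-style formats.
text = "Call 123-456-7890, (123) 456-7890, or 123.456.7890 today."

# Area code either as "(ddd)" with an optional space, or as "ddd" followed
# by '-' or '.'; then three digits, a separator, and four digits.
# (?:...) is a non-capturing group, so findall() returns whole matches.
PHONE_RE = re.compile(r'(?:\(\d{3}\) ?|\d{3}[-.])\d{3}[-.]\d{4}')

numbers = PHONE_RE.findall(text)

# finditer() yields match objects, which also carry the match position.
found = [(m.group(), m.start()) for m in PHONE_RE.finditer(text)]

print(numbers)  # ['123-456-7890', '(123) 456-7890', '123.456.7890']
for number, start in found:
    print(f"{number!r} starts at index {start}")
```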

5. Conclusion

The re module gives Python a flexible way to work with regular expressions for pattern matching and text manipulation. Regular expressions are useful in many scenarios, such as data validation, text extraction, and search operations. The building blocks covered here (character classes, quantifiers, anchors, and groups) are enough to handle common text-processing tasks like the email and phone number examples.
