A Detailed Guide to Python's re Module

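1. Introduction

Python's re module provides regular expression operations. A regular expression is a pattern that describes a set of strings; whenever you need to match strings that follow a particular rule, you can express that rule as a regular expression.

In Python, importing the re module makes regular expressions available for string processing.

The sections below walk through the most commonly used functions of the re module.

2. The re.match() function

The re module provides several functions for matching strings against a regular expression. The most basic one is re.match().

re.match() tries to match a pattern against a string. It returns a Match object on success and None otherwise.

2.1 Basic usage of re.match()

Here is a simple example of re.match():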

import re
pattern = r'hello'
string = 'hello, world!'
result = re.match(pattern, string)
print(result)

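Output:

<re.Match object; span=(0, 5), match='hello'>

In this example, pattern holds the regular expression (here it simply matches the literal text hello) and string is the text to be matched. Calling re.match(pattern, string) performs the match and returns a Match object.

Note that re.match() only matches at the beginning of the string. If the beginning of the string does not fit the pattern, the match fails, even when the pattern occurs later in the string.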
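To make the start-of-string rule concrete, here is a minimal sketch that reuses the sample text from the example above. The variable names at_start and not_at_start are illustrative only.

```python
import re

text = 'hello, world!'

# 'hello' sits at position 0, so re.match() succeeds.
at_start = re.match(r'hello', text)

# 'world' occurs in the text, but not at position 0, so re.match() returns None.
not_at_start = re.match(r'world', text)

print(at_start is not None)  # True
print(not_at_start)          # None

# Check for None before calling methods on the result.
if not_at_start is None:
    print('no match at the start of the string')
```

Because a failed match returns None rather than raising an exception, calling group() on the result without such a check raises AttributeError.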

2.2 Advanced usage of re.match()

The Match object returned by re.match() has a group() method that retrieves the matched text.

Here is an example:

import re
pattern = r'hello.*'
string = 'hello, world!'
result = re.match(pattern, string)
print(result.group())

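Output:

hello, world!

In this example, the pattern hello.* matches "hello" followed by everything up to the end of the line. re.match() applies it to string, and calling group() on the returned Match object gives the full matched text.

Note that group(0), like group() with no argument, returns the entire match, while group(n) for n >= 1 returns the text captured by the n-th pair of parentheses in the pattern.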
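The group numbering can be illustrated with a pattern that has two pairs of parentheses. This is a small sketch; the pattern and the variable name m are chosen for illustration and are not part of the earlier example.

```python
import re

# Two capturing groups: the word before the comma and the word after it.
m = re.match(r'(\w+), (\w+)!', 'hello, world!')

if m:
    print(m.group(0))  # hello, world!  (the entire match)
    print(m.group(1))  # hello          (first pair of parentheses)
    print(m.group(2))  # world          (second pair of parentheses)
    print(m.groups())  # ('hello', 'world')
```

groups() returns all captured groups at once as a tuple, which is convenient when a pattern has several of them.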

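3. The re.search() function

re.search() is used much like re.match(), but it is not restricted to the beginning of the string. It scans the whole string and returns the first match it finds.

3.1 Basic usage of re.search()

Here is a simple example of re.search():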

import re
pattern = r'world'
string = 'hello, world!'
result = re.search(pattern, string)
print(result)

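Output:

<re.Match object; span=(7, 12), match='world'>

In this example, pattern holds the regular expression and string is the text to be searched. re.search() performs the search and returns a Match object on success, or None if nothing matches.

3.2 Advanced usage of re.search()

The Match object returned by re.search() also supports group() for retrieving the matched text.

Here is an example: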

import re
pattern = r'hello, (.*?)!'
string = 'hello, world! hello, python!'
result = re.search(pattern, string)
if result:
    print(result.group(1))

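Output:

world

In this example, the pattern matches text that starts with "hello" and runs up to the nearest "!"; the lazy quantifier .*? captures as few characters as possible. re.search() finds the first such match in string, and group(1) returns the text captured by the first pair of parentheses.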
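The effect of the lazy quantifier is easiest to see next to its greedy counterpart. The following sketch runs both variants on the same sample text; the variable names lazy and greedy are illustrative.

```python
import re

text = 'hello, world! hello, python!'

# .*? is lazy: it stops at the first "!" that lets the match succeed.
lazy = re.search(r'hello, (.*?)!', text)

# .* is greedy: it extends to the last "!" in the string.
greedy = re.search(r'hello, (.*)!', text)

print(lazy.group(1))    # world
print(greedy.group(1))  # world! hello, python
```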

4. The re.findall() function

re.findall() scans the whole string and returns a list of all non-overlapping matches.

4.1 Basic usage of re.findall()

Here is a simple example of re.findall():

import re
pattern = r'hello'
string = 'hello, world! hello, python!'
result = re.findall(pattern, string)
print(result)

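Output:

['hello', 'hello']

In this example, the pattern matches every occurrence of the literal text "hello". re.findall() applies it to string and returns all matches as a list.

4.2 Advanced usage of re.findall()

When the pattern contains a capturing group, re.findall() returns only the text captured by that group rather than the whole match.

Here is an example: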

import re
pattern = r'apple (\d+)'
string = 'apple 10, apple 20, orange 30'
result = re.findall(pattern, string)
print(result)

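Output:

['10', '20']

In this example, the pattern matches "apple " followed by one or more digits, with the digits inside a capturing group. re.findall() therefore returns only the numbers that follow "apple"; the 30 after "orange" is not included.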
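When a pattern has more than one capturing group, re.findall() returns a list of tuples, one tuple per match. The sketch below shows this on sample data similar to the example above; the variable names pairs and whole are illustrative.

```python
import re

text = 'apple 10, banana 20, orange 30'

# Two groups: each match becomes a (name, number) tuple.
pairs = re.findall(r'(\w+) (\d+)', text)
print(pairs)  # [('apple', '10'), ('banana', '20'), ('orange', '30')]

# No groups: each match is returned as the whole matched string.
whole = re.findall(r'\w+ \d+', text)
print(whole)  # ['apple 10', 'banana 20', 'orange 30']
```

Note that the captured numbers are strings; convert them with int() if arithmetic is needed.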

5. The re.sub() function

re.sub() replaces the substrings of a string that match a pattern.

5.1 Basic usage of re.sub()

Here is a simple example of re.sub():

import re
pattern = r'world'
string = 'hello, world! hello, python!'
replace = 'python'
result = re.sub(pattern, replace, string)
print(result)

Output:

hello, python! hello, python!

In this example, pattern matches "world" in the string, and re.sub() replaces every match with "python". The first argument is the pattern, the second the replacement, and the third the string to process.
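Two further options of re.sub() are worth a brief sketch: the count argument, which limits how many matches are replaced, and backreferences such as \1 in the replacement string, which insert the text of a capturing group. The replacement texts below are made up for illustration.

```python
import re

text = 'hello, world! hello, python!'

# count=1 replaces only the first match and leaves the rest untouched.
first_only = re.sub(r'hello', 'goodbye', text, count=1)
print(first_only)  # goodbye, world! hello, python!

# \1 in the replacement stands for the text captured by the first group.
swapped = re.sub(r'hello, (\w+)!', r'\1 says hello!', text)
print(swapped)  # world says hello! python says hello!
```

Writing the replacement as a raw string (r'...') keeps Python from interpreting the backslash before the regex engine sees it.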

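5.2 Advanced usage of re.sub()

The replacement argument of re.sub() can also be a function. The function is called with the Match object for each match and returns the replacement string, which allows more complex processing of the matched text.

Here is an example: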

import re
pattern = r'\d+'
string = 'apple 10, banana 20, orange 30'
def func(match_obj):
    return str(int(match_obj.group()) + 1)
result = re.sub(pattern, func, string)
print(result)

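Output:

apple 11, banana 21, orange 31

In this example, pattern matches every run of digits in string, and re.sub() calls func for each one. func converts the matched text to an integer, adds 1, and returns the result as a string, which becomes the replacement.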
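A replacement function can also consult other data. As a sketch under assumed sample data, the following looks up each matched word in a dictionary; the names prices and add_price are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical lookup table used only for this illustration.
prices = {'apple': 10, 'banana': 20}

def add_price(match_obj):
    """Append the price to a known fruit name; leave other words unchanged."""
    name = match_obj.group()
    if name in prices:
        return f'{name}={prices[name]}'
    return name

result = re.sub(r'\w+', add_price, 'apple, banana, cherry')
print(result)  # apple=10, banana=20, cherry
```

Returning the matched text unchanged, as done for cherry here, is the usual way to skip a match inside a replacement function.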

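6. Metacharacters in regular expressions

Metacharacters are special characters that have a syntactic meaning in a regular expression. They are used to specify the matching rules.

6.1 Common metacharacters

The common metacharacters are:

. matches any character except a newline

^ matches the start of the string

$ matches the end of the string (or the position just before a trailing newline)

* matches zero or more repetitions of the preceding element

+ matches one or more repetitions of the preceding element

? matches zero or one occurrence of the preceding element

| matches either the expression on its left or the one on its right

( ) groups the enclosed pattern into a single unit and captures the text it matches

[ ] matches any single character from the listed set or range

{ } matches the preceding element a specified number of times, for example {3} or {2,5}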
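The list above can be checked with a few one-line experiments. The sample strings in this sketch are invented for illustration; each line exercises one metacharacter.

```python
import re

# .  any character except a newline ('a\nc' is therefore not matched)
dot = re.findall(r'a.c', 'abc a-c a\nc')
# *  zero or more of the preceding element
star = re.findall(r'ab*', 'a ab abbb')
# +  one or more of the preceding element
plus = re.findall(r'ab+', 'a ab abbb')
# ?  zero or one of the preceding element
optional = re.findall(r'colou?r', 'color colour')
# |  either alternative
either = re.findall(r'cat|dog', 'cat dog bird')
# [] any single character from the set
vowels = re.findall(r'[aeiou]', 'regex')
# {} a bounded number of repetitions (here 2 to 3 digits)
counted = re.findall(r'\d{2,3}', '1 22 333 4444')
# () treats "ab" as one unit, so + repeats the whole pair
unit = re.search(r'(ab)+', 'ababab').group()

print(dot)       # ['abc', 'a-c']
print(star)      # ['a', 'ab', 'abbb']
print(plus)      # ['ab', 'abbb']
print(optional)  # ['color', 'colour']
print(either)    # ['cat', 'dog']
print(vowels)    # ['e', 'e']
print(counted)   # ['22', '333', '444']
print(unit)      # ababab
```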

6.2 Examples using metacharacters

Here are some examples that use metacharacters.

Match a string that starts with "hello":

import re
pattern = r'^hello'
string = 'hello, world!'
result = re.match(pattern, string)
print(result)

Match a string that ends with "world":

pattern = r'world$'
string = 'hello, world'  # no trailing "!", so "world" really is at the end
result = re.search(pattern, string)
print(result)

Find all numbers:

pattern = r'\d+'
string = 'apple 10, banana 20, orange 30'
result = re.findall(pattern, string)
print(result)

Replace all numbers:

import re
pattern = r'\d+'
string = 'apple 10, banana 20, orange 30'
def func(match_obj):
    return str(int(match_obj.group()) + 1)
result = re.sub(pattern, func, string)
print(result)

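Summary

The re module provides several functions for matching strings against regular expressions. The most basic are re.match(), which matches only at the beginning of a string, and re.search(), which finds the first match anywhere in the string. re.findall() returns all matching substrings, and re.sub() replaces them. Regular expressions also offer many metacharacters for specifying how strings are matched, which makes complex text processing possible.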
