Core Regular Expression Syntax + Common Methods in Python's re Library

Enterprise Development, 2025-04-11 23:21:57

A regular expression (abbreviated "re" below) is a powerful tool for matching, finding, or replacing specific patterns in text.

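I. Core regular expression syntax

1. Basic matching

Syntax	Description	Example (pattern → sample match)
`abc`	Matches the literal text `"abc"`	`"abc"` → `"abc"`
`.`	Matches any single character except a newline (`\n`)	`"a.c"` → `"abc"`, `"a c"`
`\`	Escapes a special character (for example, `\.` matches a literal dot)	`"a\.c"` → `"a.c"`
`|`	Alternation: matches the expression on the left or the one on the right	`"cat|dog"` → `"cat"` or `"dog"`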
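
The rows above can be checked directly in Python. This is a minimal sketch using `re.findall`; the sample strings and variable names are illustrative and not part of the original table.

```python
import re

# "." matches any single character except a newline
dot_matches = re.findall(r"a.c", "abc a c a\nc")
print(dot_matches)  # ['abc', 'a c']

# Escaping the dot makes it match only a literal "."
literal_dot = re.findall(r"a\.c", "abc a.c")
print(literal_dot)  # ['a.c']

# "|" matches either the left or the right alternative
pets = re.findall(r"cat|dog", "cat, dog, bird")
print(pets)  # ['cat', 'dog']
```

Note that `"a\nc"` is not matched by `a.c`, because `.` skips newlines by default.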

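2. Character classes

Syntax	Description	Example (pattern → sample match)
`[abc]`	Matches `a`, `b`, or `c`	`"[aeiou]"` → `"e"` in `"hello"`
`[^abc]`	Matches any character other than `a`, `b`, or `c`	`"[^0-9]"` → `"a"` in `"a1"`
`[a-z]`	Matches a lowercase letter (range)	`"[a-z]"` → `"i"` in `"Hi"`
`[A-Z0-9]`	Matches an uppercase letter or a digit	`"[A-Z0-9]"` → `"H"`, `"1"`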
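
As a quick check of the character classes above, the following sketch collects every match with `re.findall`. The input strings and variable names are assumptions chosen for illustration.

```python
import re

vowels = re.findall(r"[aeiou]", "hello")
print(vowels)  # ['e', 'o']

# "^" right after "[" negates the class
non_digits = re.findall(r"[^0-9]", "a1b2")
print(non_digits)  # ['a', 'b']

# Ranges are case-sensitive: "H" is not in [a-z]
lowercase = re.findall(r"[a-z]", "Hi")
print(lowercase)  # ['i']

upper_or_digit = re.findall(r"[A-Z0-9]", "Hi 1 there")
print(upper_or_digit)  # ['H', '1']
```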

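3. Quantifiers (repetition)

Syntax	Description	Example (pattern → sample match)
`*`	Matches the preceding item 0 or more times	`"a*"` → `""`, `"aaa"`
`+`	Matches the preceding item 1 or more times	`"a+"` → `"a"`, `"aaa"`
`?`	Matches the preceding item 0 or 1 time	`"a?"` → `""`, `"a"`
`{n}`	Matches the preceding item exactly n times	`"a{2}"` → `"aa"`
`{n,}`	Matches the preceding item at least n times	`"a{2,}"` → `"aa"`, `"aaa"`
`{n,m}`	Matches the preceding item between n and m times	`"a{2,3}"` → `"aa"`, `"aaa"`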
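
One way to see the limits of each quantifier is to require that the whole string matches. The sketch below uses `re.fullmatch` for that; `matches_whole` is a hypothetical helper introduced only for this example.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches_whole(r"a*", ""))          # True: zero repetitions are allowed
print(matches_whole(r"a+", ""))          # False: at least one "a" is required
print(matches_whole(r"a?", "aa"))        # False: at most one "a"
print(matches_whole(r"a{2}", "aa"))      # True: exactly two
print(matches_whole(r"a{2,}", "a"))      # False: fewer than two
print(matches_whole(r"a{2,3}", "aaaa"))  # False: more than three
```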

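4. Greedy vs. non-greedy

Syntax	Description	Example (pattern → sample match)
`*`	Greedy match (as much as possible)	`"a.*b"` → `"aabbaab"` in `"aabbaab"`
`*?`	Non-greedy match (as little as possible)	`"a.*?b"` → `"aab"` in `"aabbaab"`
`+?`	Non-greedy form of `+`	`"a.+?b"` → `"aab"` in `"aabbaab"`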
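
The difference is easiest to see by running the greedy and non-greedy forms on the same input. This sketch reuses the table's string `"aabbaab"` and adds a tag-like string as an assumed extra example.

```python
import re

text = "aabbaab"

greedy = re.search(r"a.*b", text).group()
print(greedy)     # aabbaab (extends to the last "b")

lazy_star = re.search(r"a.*?b", text).group()
print(lazy_star)  # aab (stops at the first "b" it can reach)

lazy_plus = re.search(r"a.+?b", text).group()
print(lazy_plus)  # aab (".+?" must still consume at least one character)

# A common use: matching each tag separately instead of the whole span
markup = "<b>bold</b>"
greedy_tags = re.findall(r"<.*>", markup)
lazy_tags = re.findall(r"<.*?>", markup)
print(greedy_tags)  # ['<b>bold</b>']
print(lazy_tags)    # ['<b>', '</b>']
```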

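5. Predefined character classes

Syntax	Description (equivalent class)	Example (pattern → sample match)
`\d`	Digit (`[0-9]`)	`"a\d"` → `"a1"`
`\D`	Non-digit (`[^0-9]`)	`"a\D"` → `"ab"`
`\w`	Word character (`[a-zA-Z0-9_]`)	`"\w+"` → `"word_"`
`\W`	Non-word character	`"\W"` → `"!"`
`\s`	Whitespace character (space, tab, newline, etc.)	`"a\sb"` → `"a b"`
`\S`	Non-whitespace character	`"a\Sb"` → `"a1b"`

Note: for `str` patterns in Python 3, `\d`, `\w`, and `\s` also match Unicode digits, letters, and whitespace by default; the bracketed equivalents hold exactly only when the `re.ASCII` flag is set.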
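
The following sketch exercises the predefined classes on one sample string, and shows how `\w` behaves with and without the `re.ASCII` flag in Python 3. The sample string (which contains the non-ASCII letter `é`) and the variable names are illustrative assumptions.

```python
import re

sample = "id_42 caf\u00e9\t!"  # "id_42 café", a tab, then "!"

digits = re.findall(r"\d", sample)
print(digits)       # ['4', '2']

# By default, \w on a str pattern also matches non-ASCII letters
words = re.findall(r"\w+", sample)
print(words)        # ['id_42', 'café']

# With re.ASCII, \w is limited to [a-zA-Z0-9_]
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)
print(ascii_words)  # ['id_42', 'caf']

whitespace = re.findall(r"\s", sample)
print(whitespace)   # [' ', '\t']

non_word = re.findall(r"\W", sample)
print(non_word)     # [' ', '\t', '!']
```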

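6. Anchors (boundary matching)

Syntax	Description	Example (pattern → sample match)
`^`	Matches at the start of the string	`"^a"` → `"a"` in `"abc"`
`$`	Matches at the end of the string (or just before a trailing newline)	`"c$"` → `"c"` in `"abc"`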
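
Anchors match positions rather than characters, which the sketch below illustrates with `re.search`. The test strings and variable names are assumptions made for the example.

```python
import re

starts_with_a = re.search(r"^a", "abc") is not None
print(starts_with_a)   # True

# "b" is present, but not at the start of the string
b_at_start = re.search(r"^b", "abc") is not None
print(b_at_start)      # False

ends_with_c = re.search(r"c$", "abc") is not None
print(ends_with_c)     # True

# "$" also matches just before a single trailing newline
before_newline = re.search(r"c$", "abc\n") is not None
print(before_newline)  # True

# Anchoring both ends requires the whole string to fit the pattern
only_digits = re.search(r"^\d+$", "123abc") is not None
print(only_digits)     # False
```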

II. Common basic methods in Python's re library

1. Core matching methods

Method	Syntax	Return value	Description	Example
`re.match()`	`re.match(pattern, string)`	`Match` object or `None`	Matches only at the start of the string	`re.match(r'\d+', '123abc').group()` → `'123'`
`re.search()`	`re.search(pattern, string)`	`Match` object or `None`	Scans the whole string and returns the first match	`re.search(r'\d+', 'abc123').group()` → `'123'`
`re.findall()`	`re.findall(pattern, string)`	list	Returns all non-overlapping matching substrings	`re.findall(r'\d+', 'a1b22c333')` → `['1', '22', '333']`

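Code examples:

import re

# Incorrect use of match()
# (user@example.com is a placeholder address)
text = "Email: user@example.com"
email_pattern = r'^[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
res = re.match(email_pattern, text)
if res:
    print(res.group())
# Prints nothing: the text starts with "Email: " rather than the address,
# and match() only matches at the start of the string, so res is None


# Correct use of match(): change text so that it starts with the address
# (switching to search() would also require dropping the ^ anchor)
text = "user@example.com"
email_pattern = r'^[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
res = re.match(email_pattern, text)
if res:
    print(res.group())
# Output: user@example.com

# search()
text = "abc123def456"
result = re.search(r'\d+', text)  # find the first run of digits

if result:
    print("Found number:", result.group())  # Output: Found number: 123
else:
    print("No number found")

# findall()
text = "a156b22c333d"
results = re.findall(r'\d+', text)  # find every run of digits

print("All numbers:", results)  # Output: All numbers: ['156', '22', '333']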
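
The email example above mentions `search()` as an alternative to editing the text. Because that pattern is anchored with `^` and `$`, `search()` would fail in the same way, so the sketch below removes the anchors. The address `user@example.com` is a placeholder, and `anchored_pattern` and `unanchored_pattern` are names assumed for this example.

```python
import re

labelled = "Email: user@example.com"  # placeholder address

anchored_pattern = r'^[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
unanchored_pattern = r'[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

# match() fails: the string does not begin with the address
via_match = re.match(anchored_pattern, labelled)
print(via_match)  # None

# search() with the same anchored pattern fails for the same reason
via_anchored_search = re.search(anchored_pattern, labelled)
print(via_anchored_search)  # None

# search() without anchors finds the address inside the text
via_search = re.search(unanchored_pattern, labelled)
print(via_search.group())  # user@example.com
```

Keeping the anchors is useful for validating that a whole string is an address; dropping them is useful for extracting an address from surrounding text.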

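Note: the `.group()` method extracts the matched text. `re.match()` and `re.search()` return a `Match` object (or `None`), not the matched string itself, so call `group()` on the result to get the text.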
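
Besides `group()`, a `Match` object also reports where the match was found. The sketch below shows `start()`, `end()`, and `span()`, and guards against the `None` result, since calling `.group()` on `None` raises `AttributeError`. `first_number` is a hypothetical helper written for this example.

```python
import re
from typing import Optional

def first_number(text: str) -> Optional[str]:
    """Return the first run of digits in text, or None if there is none."""
    m = re.search(r"\d+", text)
    return m.group() if m is not None else None

m = re.search(r"\d+", "abc123def456")
found = (m.group(), m.start(), m.end(), m.span())
print(found)  # ('123', 3, 6, (3, 6))

print(first_number("abc123def456"))    # 123
print(first_number("no digits here"))  # None
```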

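2. Substitution and splitting

Method	Syntax	Return value	Description	Example
`re.sub()`	`re.sub(pattern, repl, string, count=0)`	string	Replaces matched substrings. `count` is the maximum number of replacements (the default `0` replaces all matches)	`re.sub(r'\d+', 'X', 'a1b22')` → `'aXbX'`
`re.split()`	`re.split(pattern, string, maxsplit=0)`	list	Splits the string at each match of the pattern	`re.split(r'\d+', 'a1b22c3')` → `['a', 'b', 'c', '']`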

Code examples:

import re

# Replace every match
text = "Python is great. Python is easy."
result = re.sub(r'Python', 'Java', text)
print(result)
# Output: Java is great. Java is easy.


# Replace only the first match
result = re.sub(r'Python', 'Java', text, count=1)
print(result)
# Output: Java is great. Python is easy.

text = "apple?banana,cherry.egg right"
result = re.split(r'[,.? ]', text)  # split on any of , . ? or space
print(result)
# Output: ['apple', 'banana', 'cherry', 'egg', 'right']


result = re.split(r'[,.? ]', text, maxsplit=1)  # split only once
print(result)
# Output: ['apple', 'banana,cherry.egg right']
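
The `re.split()` row in the table returns a trailing empty string, which is easy to overlook. The sketch below shows where such empty strings come from and two common ways to avoid them; the variable names and the second input string are assumptions for illustration.

```python
import re

# A delimiter at the end of the string leaves an empty final element
parts = re.split(r"\d+", "a1b22c3")
print(parts)      # ['a', 'b', 'c', '']

# Option 1: filter out empty strings afterwards
non_empty = [p for p in parts if p]
print(non_empty)  # ['a', 'b', 'c']

# Adjacent delimiters (", ") produce an empty string between them
loose = re.split(r"[,.? ]", "apple, banana")
print(loose)      # ['apple', '', 'banana']

# Option 2: add "+" so a run of delimiters counts as one separator
tight = re.split(r"[,.? ]+", "apple, banana")
print(tight)      # ['apple', 'banana']
```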

If you spot any errors in this article, corrections are welcome. See you next time.

Reposted from blog.csdn.net/weixin_74268817/article/details/146910972
