Basics of Python's re regular expression library

Regex syntax:

[ABC]

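Matches any single character listed inside [...]
[A-Z]: matches any single uppercase letter from A to Z
.: matches any character except a newline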
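As a minimal sketch of the character classes above (the sample strings are made up for illustration), the following shows what each one picks out:

```python
import re

text = "CAB-42 xyz\nQ"  # hypothetical sample input

# [ABC] matches one character that is A, B, or C.
abc_hits = re.findall(r"[ABC]", text)

# [A-Z] matches one uppercase ASCII letter.
upper_hits = re.findall(r"[A-Z]", text)

# . matches any character except a newline (unless re.DOTALL is set).
dot_hits = re.findall(r".", "a\nb")

print(abc_hits)    # ['C', 'A', 'B']
print(upper_hits)  # ['C', 'A', 'B', 'Q']
print(dot_hits)    # ['a', 'b']
```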

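[\s\S]

Matches any character, including newlines
\s: matches any whitespace character
\S: matches any non-whitespace character (so it never matches a newline)
Whitespace characters include:
space ( )
tab (\t)
newline (\n)
carriage return (\r)
form feed (\f)
vertical tab (\v)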
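A short sketch of how \s, \S and [\s\S] behave, using an invented sample string. It also shows why [\s\S] is sometimes used instead of . when a match has to cross line breaks:

```python
import re

sample = "a b\tc\nd"  # hypothetical sample: space, tab and newline inside

whitespace = re.findall(r"\s", sample)
non_whitespace = re.findall(r"\S", sample)

# . stops at the newline, while [\s\S] keeps going.
dot_match = re.match(r".+", sample).group()
any_match = re.match(r"[\s\S]+", sample).group()

print(whitespace)       # [' ', '\t', '\n']
print(non_whitespace)   # ['a', 'b', 'c', 'd']
print(repr(dot_match))  # 'a b\tc'
print(any_match == sample)  # True
```

Passing re.DOTALL makes . match newlines too, which is usually the more readable option in Python.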

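\w

Matches a letter, digit, or underscore
Equivalent to [A-Za-z0-9_] for ASCII text (Python str patterns also match Unicode word characters unless re.ASCII is set)

\d

Matches any single Arabic digit
Equivalent to [0-9] for ASCII text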
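The sketch below illustrates \w and \d on made-up input. One caveat worth knowing: for str patterns Python treats both as Unicode-aware by default, so the [A-Za-z0-9_] and [0-9] equivalences hold exactly only with the re.ASCII flag (or for bytes patterns).

```python
import re

sample = "user_01 paid 250 yuan"  # hypothetical sample input

words = re.findall(r"\w+", sample)
digits = re.findall(r"\d+", sample)

# Default (Unicode) behaviour versus ASCII-only behaviour.
unicode_words = re.findall(r"\w+", "价格100")
ascii_words = re.findall(r"\w+", "价格100", re.ASCII)

print(words)          # ['user_01', 'paid', '250', 'yuan']
print(digits)         # ['01', '250']
print(unicode_words)  # ['价格100']
print(ascii_words)    # ['100']
```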

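i: ignore case (in Python, pass the re.IGNORECASE flag, also spelled re.I)

g: global matching, find all matches (Python has no g flag; use re.findall or re.finditer instead)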
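In Python these two modifiers are not written as suffixes on the pattern. A minimal sketch, with an invented sample string, of how case-insensitive and find-all matching are usually expressed:

```python
import re

text = "Hello hello HELLO"  # hypothetical sample input

# Case-sensitive search finds only the lowercase occurrence.
first_exact = re.search(r"hello", text)

# re.IGNORECASE (short form: re.I) makes the match case-insensitive.
first_any_case = re.search(r"hello", text, re.IGNORECASE)

# "Global" matching is a function choice, not a flag: findall returns every match.
all_any_case = re.findall(r"hello", text, re.I)

print(first_exact.start())     # 6
print(first_any_case.group())  # Hello
print(all_any_case)            # ['Hello', 'hello', 'HELLO']
```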

Quantifiers:

*: zero or more times
+: one or more times
?: zero or one time
{n}: exactly n times
{n,}: at least n times
{n,m}: between n and m times (inclusive)
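To make the quantifiers concrete, here is a small sketch that checks whole strings against a few patterns. The helper name fully_matches is hypothetical and exists only for this example:

```python
import re

def fully_matches(pattern, s):
    """Return True when the entire string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

star_ok = fully_matches(r"ab*", "a")            # b may appear zero times
plus_fail = fully_matches(r"ab+", "a")          # b must appear at least once
optional_fail = fully_matches(r"ab?", "abb")    # at most one b allowed
exact_ok = fully_matches(r"\d{3}", "123")       # exactly three digits
at_least_fail = fully_matches(r"\d{2,}", "1")   # needs two or more digits
range_fail = fully_matches(r"\d{2,4}", "12345") # five digits is too many

print(star_ok, plus_fail, optional_fail)    # True False False
print(exact_ok, at_least_fail, range_fail)  # True False False
```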

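Anchors:

^: matches at the start of the string. With the re.MULTILINE flag, ^ also matches immediately after each newline (\n)
$: matches at the end of the string (or just before a trailing newline). With the re.MULTILINE flag, $ also matches immediately before each newline (\n)
\b: matches a word boundary, i.e. the position between a word character (\w) and a non-word character or the start/end of the string
\B: matches a position that is not a word boundary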
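The following sketch, on an invented three-line string, shows how the anchors behave in Python with and without re.MULTILINE, and how \b and \B differ:

```python
import re

text = "cat\ncatalog\nbobcat"  # hypothetical sample with three lines

# Without re.MULTILINE, ^ and $ only anchor to the whole string.
start_default = re.findall(r"^cat", text)
end_default = re.findall(r"cat$", text)

# With re.MULTILINE, they also anchor to each line.
start_multiline = re.findall(r"^cat", text, re.MULTILINE)
end_multiline = re.findall(r"cat$", text, re.MULTILINE)

# \bcat\b matches "cat" only as a whole word; \Bcat only inside a word.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]

print(start_default, end_default)      # ['cat'] ['cat']
print(start_multiline, end_multiline)  # ['cat', 'cat'] ['cat', 'cat']
print(whole_word, inside_word)         # [0] [15]
```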

Note: a quantifier cannot be applied directly to an anchor

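Grouping:

(): capturing group. When the pattern contains a capturing group, functions such as findall return the group's contents rather than the whole match
Example: ptn_get = re.compile(br"\$_GET\['(\w+)'\]") captures the parameter name inside $_GET['...']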
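A minimal sketch of the capturing-group behaviour, built around the ptn_get pattern from the note above with its escapes restored. The PHP snippet in php_source is made up for illustration; note that a bytes pattern (br"...") must be applied to bytes input.

```python
import re

# $, [ and ] are metacharacters, so they are escaped to match literally.
ptn_get = re.compile(br"\$_GET\['(\w+)'\]")

# Hypothetical input, as bytes because the pattern is a bytes pattern.
php_source = b"<?php $id = $_GET['id']; $page = $_GET['page']; ?>"

# With one capturing group, findall returns only the captured part.
params = ptn_get.findall(php_source)

# A Match object exposes both the whole match and the group.
first = ptn_get.search(php_source)
whole = first.group(0)
name = first.group(1)

print(params)  # [b'id', b'page']
print(whole)   # b"$_GET['id']"
print(name)    # b'id'
```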

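exp1(?=exp2): matches exp1 only when it is followed by exp2 (positive lookahead)
(?<=exp2)exp1: matches exp1 only when it is preceded by exp2 (positive lookbehind)
exp1(?!exp2): matches exp1 only when it is not followed by exp2 (negative lookahead)
(?<!exp2)exp1: matches exp1 only when it is not preceded by exp2 (negative lookbehind)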
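The four lookaround forms can be compared side by side. In this sketch exp1 is foo and exp2 is bar, both chosen arbitrarily, and the start index of each match is recorded so it is clear which occurrence was found:

```python
import re

# Hypothetical sample: "foo" starts at index 0, 7 and 17.
text = "foobar foobaz barfoo"

def starts(pattern):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

followed_by_bar = starts(r"foo(?=bar)")       # lookahead
preceded_by_bar = starts(r"(?<=bar)foo")      # lookbehind
not_followed_by_bar = starts(r"foo(?!bar)")   # negative lookahead
not_preceded_by_bar = starts(r"(?<!bar)foo")  # negative lookbehind

print(followed_by_bar)      # [0]
print(preceded_by_bar)      # [17]
print(not_followed_by_bar)  # [7, 17]
print(not_preceded_by_bar)  # [0, 7]
```

Lookarounds check the surrounding text without consuming it, so the matched text is always just foo.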

The re library:

Check whether a string matches a pattern:

import re
if re.match(r'^\d+$', '12345'):
    print("All digits")
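One subtlety worth illustrating: re.match only anchors at the start of the string, and $ also matches just before a trailing newline. A sketch of a stricter check using re.fullmatch follows; the helper name is_all_digits is hypothetical.

```python
import re

def is_all_digits(s):
    """Return True only when every character of s is a digit."""
    return re.fullmatch(r"\d+", s) is not None

# match succeeds as long as the beginning of the string fits the pattern.
prefix_only = re.match(r"\d+", "123abc").group()

# $ tolerates one trailing newline, so this still matches.
trailing_newline = re.match(r"^\d+$", "12345\n") is not None

print(prefix_only)               # 123
print(trailing_newline)          # True
print(is_all_digits("12345"))    # True
print(is_all_digits("12345\n"))  # False
print(is_all_digits("123abc"))   # False
```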

Search a string for content that matches a pattern:

result = re.search(r'\d+', 'The price is 100 yuan')
if result:
    print(result.group())  # Output: 100

Find all matches:

numbers = re.findall(r'\d+', 'a1b22c333')
print(numbers)  # ['1', '22', '333']

Replace text:

text = re.sub(r'\d+', 'X', 'a1b22c333')
print(text)  # aXbXcX

Split a string:

parts = re.split(r'\W+', 'hello, world!')
print(parts)  # ['hello', 'world', '']

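Reusing a pattern with re.compile:

Basic matching
import re
# Compile a regex that matches email addresses
email_pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
text = "Contact me: user@example.com or admin@test.org"
# Search using the compiled object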
matches = email_pattern.findall(text)
print(matches)  # ['user@example.com', 'admin@test.org']

Compiling with flags (e.g. ignoring case)
pattern = re.compile(r'hello', re.IGNORECASE)
print(pattern.search("HELLO world"))  # Match succeeds: prints a Match object

Using the same pattern many times (performance benefit)
digits = re.compile(r'\d+')
lines = ["abc123", "def456", "xyz789"]
for line in lines:
    match = digits.search(line)
    if match:
        print(match.group())

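Common methods of a compiled pattern object:
.match(string): match starting at the beginning of the string
.search(string): scan the whole string for the first match
.findall(string): return a list of all non-overlapping matches
.finditer(string): return an iterator of Match objects
.sub(repl, string): replace matches
.split(string): split the string at each match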
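As a quick sketch, the methods listed above can be exercised on one compiled pattern. The sample string is reused from the earlier examples; the count argument to sub is an optional extra that limits the number of replacements.

```python
import re

digits = re.compile(r"\d+")
text = "a1b22c333"

# match only tries at the start of the string; text begins with "a".
at_start = digits.match(text)

# search scans forward and returns the first match.
first = digits.search(text).group()

# finditer yields Match objects, which carry positions as well as text.
spans = [(m.group(), m.span()) for m in digits.finditer(text)]

# sub replaces matches; count=1 replaces only the first one.
replaced_once = digits.sub("X", text, count=1)

pieces = digits.split(text)

print(at_start)       # None
print(first)          # 1
print(spans)          # [('1', (1, 2)), ('22', (3, 5)), ('333', (6, 9))]
print(replaced_once)  # aXb22c333
print(pieces)         # ['a', 'b', 'c', '']
```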
