Non-greedy and multi-line regular expression matching in Python
jopen
11 years ago
Some regular expression tips:
1 Non-greedy quantifiers (append ? to the quantifier)
>>> import re
>>> re.findall(r"a(\d+?)", "a23b")
['2']
>>> re.findall(r"a(\d+)", "a23b")
['23']
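Compare that with this case, where the pattern continues after the group. The non-greedy quantifier still has to consume both digits before b can match: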
>>> re.findall(r"a(\d+)b", "a23b")
['23']
>>> re.findall(r"a(\d+?)b", "a23b")
['23']
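The comparison above shows that a non-greedy quantifier only takes as little as the rest of the pattern allows. A minimal sketch of a case where greedy and non-greedy results really differ follows; the sample text and variable names are assumptions made up for illustration, not part of the original post.

```python
import re

# Hypothetical sample text, chosen only to show the difference.
text = "<b>one</b> and <b>two</b>"

# Greedy: .+ runs as far as it can, up to the LAST "</b>" in the string.
greedy = re.findall(r"<b>(.+)</b>", text)

# Non-greedy: .+? stops at the FIRST "</b>" that lets the match succeed.
lazy = re.findall(r"<b>(.+?)</b>", text)

print(greedy)  # ['one</b> and <b>two']
print(lazy)    # ['one', 'two']
```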
2 For matching across multiple lines, add the re.S and re.M flags
re.S: . matches newline characters as well; by default . does not match a newline.
>>> re.findall(r"a(\d+)b.+a(\d+)b", "a23b\na34b")
[]
>>> re.findall(r"a(\d+)b.+a(\d+)b", "a23b\na34b", re.S)
[('23', '34')]
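A common use of re.S (its long name is re.DOTALL) is pulling out a block that spans several lines. The sketch below assumes an invented text with start/end marker lines; the markers and variable names are hypothetical.

```python
import re

# Hypothetical input: two blocks delimited by "start" and "end" lines.
text = "start\nline one\nline two\nend\nstart\nline three\nend"

# With re.S, .*? may cross newlines, so each whole block is captured.
blocks = re.findall(r"start\n(.*?)\nend", text, re.S)

# Without re.S, . stops at a newline, so only the one-line block matches.
no_flag = re.findall(r"start\n(.*?)\nend", text)

print(blocks)   # ['line one\nline two', 'line three']
print(no_flag)  # ['line three']
```

The non-greedy .*? matters here too: with a greedy .* and re.S, the first match would run to the last "end" in the text.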
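re.M: ^ and $ match at the start and end of every line. By default ^ matches only at the start of the string, and $ only at the end of the string (or just before a trailing newline).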
>>> re.findall(r"^a(\d+)b", "a23b\na34b")
['23']
>>> re.findall(r"^a(\d+)b", "a23b\na34b", re.M)
['23', '34']
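The same flag also affects $. A small sketch, assuming a three-line sample string invented for this example, that anchors both ends of each line (re.MULTILINE is the long name of re.M):

```python
import re

# Hypothetical input: the third line does not start with "a".
text = "a23b\na34b\nxa45b"

# Without re.M, ^ and $ refer to the whole string, so nothing matches.
whole_string = re.findall(r"^a(\d+)b$", text)

# With re.M, ^ and $ apply per line; only lines that are exactly a<digits>b match.
per_line = re.findall(r"^a(\d+)b$", text, re.MULTILINE)

print(whole_string)  # []
print(per_line)      # ['23', '34']
```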
However, if the pattern has no ^ anchor:
>>> re.findall(r"a(\d+)b", "a23b\na23b")
['23', '23']
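So re.M is not needed here: findall already scans the whole string, and re.M only changes where ^ and $ can match.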
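To tie the two flags together, here is a short sketch that reuses the sample string from the examples above. It checks that re.M makes no difference when the pattern has no anchors, and shows one way to combine both flags with the | operator; the variable names are only illustrative.

```python
import re

text = "a23b\na34b"

# No ^ or $ in the pattern: the result is the same with or without re.M.
plain = re.findall(r"a(\d+)b", text)
with_m = re.findall(r"a(\d+)b", text, re.M)

# Needs both behaviours: . must cross the newline (re.S) and the
# second ^ must match at the start of the second line (re.M).
both = re.findall(r"^a(\d+)b.+^a(\d+)b", text, re.S | re.M)

# With only one of the two flags, the same pattern finds nothing.
only_s = re.findall(r"^a(\d+)b.+^a(\d+)b", text, re.S)
only_m = re.findall(r"^a(\d+)b.+^a(\d+)b", text, re.M)

print(plain, with_m)   # ['23', '34'] ['23', '34']
print(both)            # [('23', '34')]
print(only_s, only_m)  # [] []
```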