Python regular expressions (\ and []) (c)

Escaping metacharacters:

The backslash \ has several uses in regular expressions.

For example, suppose we want to find, in the text below, every string that comes before a dot, including the dot itself.

apple. is green
orange. is orange
banana. is yellow

If we write the regular expression .*. you will quickly notice that something is wrong.

Because the dot is a metacharacter, when it appears directly in a regular expression it matches any single character; it cannot stand for the . character itself.

If the content we are searching for contains metacharacters, we can escape them with a backslash.

Here we should use this expression: .*\.

For example, the Python program is as follows:

import re

content = '''
apple. is green
orange. is orange
banana. is yellow
'''
# \. matches a literal dot; .* matches everything before it on the same line
p = re.compile(r'.*\.')
for one in p.findall(content):
    print(one)
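To see why the escape matters, the two patterns can be run side by side. This is a minimal sketch; the sample text and the variable names unescaped and escaped are illustrative assumptions, not part of the original program.

```python
import re

# Illustrative sample text: one dot per line
content = "apple. is green\norange. is orange\nbanana. is yellow"

# Unescaped: the trailing . is a metacharacter and matches any character,
# so each match runs to the end of the line.
unescaped = re.findall(r'.*.', content)

# Escaped: \. matches only a literal dot, so each match stops at the dot.
escaped = re.findall(r'.*\.', content)

print(unescaped)
print(escaped)
```

The first list contains the three full lines, while the second contains only the part of each line up to and including the dot.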

Matching a type of character

A backslash followed by certain characters matches a particular type of character.

For example:

\d matches any digit character between 0 and 9, equivalent to the expression [0-9]

\D matches any character that is not a digit, equivalent to the expression [^0-9]

\s matches any whitespace character, including space, tab, and newline, equivalent to the expression [ \t\n\r\f\v]

\S matches any non-whitespace character, equivalent to the expression [^ \t\n\r\f\v]

\w matches any word character, including uppercase and lowercase letters, digits, and the underscore, equivalent to the expression [a-zA-Z0-9_]

\W matches any non-word character, equivalent to the expression [^a-zA-Z0-9_]

Note that in Python 3 these equivalences hold exactly for str patterns only when the re.A (re.ASCII) flag is set; by default \d, \s and \w also match the corresponding Unicode characters.
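The six classes above can be compared on one short string. This is a minimal sketch; the sample string and the results dictionary are illustrative assumptions, and re.A is passed so that the classes behave like the ASCII sets listed above.

```python
import re

text = "a1 b_2"  # illustrative sample: letters, digits, a space, an underscore

# Map each class to the characters it matches in the sample string
results = {
    name: re.findall(pattern, text, re.A)
    for name, pattern in [
        ("d", r"\d"), ("D", r"\D"),
        ("s", r"\s"), ("S", r"\S"),
        ("w", r"\w"), ("W", r"\W"),
    ]
}

for name, matches in results.items():
    print("\\" + name, matches)
```

Each uppercase class matches exactly the characters its lowercase counterpart does not.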

A backslash sequence can also be used inside square brackets. For example, [\s,.] matches any whitespace character, a comma, or a dot.
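One possible use of such a set is splitting text on several kinds of separator at once. The sketch below is only an illustration; the sample string and the + quantifier (which treats a run of separators as one) are assumptions added here.

```python
import re

text = "apple, orange.banana grape"  # illustrative sample

# Every single character that belongs to the set [\s,.]
separators = re.findall(r'[\s,.]', text)

# Split on runs of whitespace, commas, or dots
words = re.split(r'[\s,.]+', text)

print(separators)
print(words)
```

Inside the brackets the dot is a literal dot, so it does not need to be escaped.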

For example:

source=''' 
王亚辉
tony
刘文武
'''
import re
# re.A (re.ASCII) limits \w to [a-zA-Z0-9_], so the Chinese names are not matched
p = re.compile(r'\w{2,4}', re.A)
print(p.findall(source))



'''['tony']'''
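The effect of the re.A flag is easier to see when the same pattern is run with and without it. This is a minimal sketch reusing the sample names from the example above; the variable names ascii_only and unicode_default are illustrative.

```python
import re

source = "\n王亚辉\ntony\n刘文武\n"

# With re.A, \w matches only ASCII letters, digits, and the underscore
ascii_only = re.findall(r'\w{2,4}', source, re.A)

# Without the flag, \w in a str pattern also matches Unicode word characters
unicode_default = re.findall(r'\w{2,4}', source)

print(ascii_only)
print(unicode_default)
```

Without the flag, the two Chinese names are matched as well, because their characters count as word characters in Unicode mode.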

  

 

Square brackets - matching one of several characters

Square brackets indicate that you want to match any one of the characters listed inside them.

For example:

[abc] matches any one of the characters a, b, or c; it is equivalent to [a-c]

In [a-c], the - in the middle denotes a range from a to c

To match any lowercase letter, you can use [a-z]

Some metacharacters lose their special meaning inside square brackets and behave like ordinary characters

For example,

[akm.] matches any one of the characters a, k, m, or .

Here the . inside the brackets does not mean "match any character"; it matches the literal . character
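The bracket forms described above can be checked with a short script. This is a minimal sketch; the sample string and variable names are illustrative assumptions.

```python
import re

text = "back.mark"  # illustrative sample

listed = re.findall(r'[abc]', text)     # any one of a, b, c
ranged = re.findall(r'[a-c]', text)     # the same set written as a range
lowercase = re.findall(r'[a-z]', text)  # any lowercase ASCII letter
with_dot = re.findall(r'[akm.]', text)  # a, k, m, or a literal dot

print(listed)
print(ranged)
print(lowercase)
print(with_dot)
```

The dot in the sample is matched only by the last pattern, which confirms that . is treated as an ordinary character inside the brackets.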

Start position

^ matches the start position of the text

In multi-line mode, it matches the start position of each line

For example, in the following text we want to select the string before the comma on each line, including the comma itself

apple, apple is green
orange, orange is orange
banana, banana is yellow

You can write the regular expression ^.*, and use multi-line mode (the re.M flag in Python) so that ^ matches at the start of every line.
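The difference between the default mode and multi-line mode can be sketched as follows. The sample text and variable names are illustrative assumptions; re.M is the standard re.MULTILINE flag.

```python
import re

content = "apple, apple is green\norange, orange is orange\nbanana, banana is yellow"

# Default mode: ^ matches only at the very start of the text
single_mode = re.findall(r'^.*,', content)

# Multi-line mode: ^ also matches right after each newline
multi_mode = re.findall(r'^.*,', content, re.M)

print(single_mode)
print(multi_mode)
```

Only the first line is matched in the default mode, while all three lines are matched once re.M is passed.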

 

Inside square brackets, a leading ^ negates the set: it matches any character that is not listed in the brackets (for example, [^\d] matches any non-digit character)

For example:

content='a1b2c3d4e5'
import re
p=re.compile(r'[^\d]')
for one in p.findall(content):
    print(one)

'''
Output:
a
b
c
d
e
'''

  

 


Origin www.cnblogs.com/wxcx/p/12643156.html