Escaping metacharacters:
The backslash \ has several uses in regular expressions.
For example, suppose we want to find, in the text below, every string that precedes a dot, including the dot itself.
    apple.is green
    orange.is orange
    banana.is yellow
If we write the regular expression .*. you will quickly notice that it is wrong.
Because the dot is a metacharacter, when it appears directly in a regular expression it matches any single character; it cannot stand for the literal . character itself.
If the content we want to search for itself contains a metacharacter, we escape it with a backslash.
Here we should use this expression: .*\.
An example Python program is as follows:
    import re

    content = '''apple.is green
    orange.is orange
    banana.is yellow'''

    # \. matches a literal dot, so each match ends at the dot
    p = re.compile(r'.*\.')
    for one in p.findall(content):
        print(one)
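To see why the escape matters, the following sketch runs both patterns on the same text. The sample text and the variable names are assumptions chosen for illustration, not part of the original program.

```python
import re

# Sample text (assumed for illustration): one dot per line.
content = "apple.is green\norange.is orange\nbanana.is yellow"

# Unescaped: the final . matches any character, so each whole line is matched.
unescaped = re.findall(r'.*.', content)

# Escaped: \. matches only a literal dot, so each match stops at the dot.
escaped = re.findall(r'.*\.', content)

print(unescaped)
print(escaped)
```

The unescaped pattern returns complete lines, while the escaped pattern returns only the part of each line up to and including the dot.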
Matching a type of character
A backslash followed by certain letters matches a particular type of character.
For example:
\d matches any digit from 0 to 9, equivalent to the expression [0-9]
\D matches any character that is not a digit from 0 to 9, equivalent to the expression [^0-9]
\s matches any whitespace character, including space, tab, and newline, equivalent to the expression [ \t\n\r\f\v] (note the leading space)
\S matches any non-whitespace character, equivalent to the expression [^ \t\n\r\f\v]
\w matches any word character, including uppercase and lowercase letters, digits, and the underscore, equivalent to the expression [a-zA-Z0-9_]
\W matches any non-word character, equivalent to the expression [^a-zA-Z0-9_]
These backslash sequences can also be used inside square brackets. For example, [\s,.] matches any whitespace character, a comma, or a dot.
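The shorthands above can be tried out with a short sketch. The sample string and variable names are made up for illustration.

```python
import re

# Sample string (assumed for illustration); \t is a tab character.
text = "Room 42,\tfloor 7. Code: A_1"

# \d: every single digit character
digits = re.findall(r'\d', text)

# \w+: runs of letters, digits, and underscores
words = re.findall(r'\w+', text)

# [\s,.]+: split on runs of whitespace, commas, or dots
pieces = re.split(r'[\s,.]+', text)

print(digits)
print(words)
print(pieces)
```

Note that the colon is not in the class [\s,.], so it stays attached to the piece 'Code:' after the split, while \w+ leaves it out.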
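In Python 3, string patterns are Unicode-aware by default, so \w also matches characters such as Chinese text. With the re.A (ASCII) flag, \w matches only [a-zA-Z0-9_]. For example: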
    import re

    source = '''
    王亚辉
    tony
    刘文武
    '''

    # re.A restricts \w to ASCII letters, digits, and underscore
    p = re.compile(r'\w{2,4}', re.A)
    print(p.findall(source))
    # Output: ['tony']
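For comparison, here is a minimal sketch of the same search with and without re.A, assuming Python 3 and a str (not bytes) pattern. The variable names are illustrative.

```python
import re

source = "王亚辉 tony 刘文武"

# Default (Unicode-aware): \w also matches the Chinese characters.
unicode_matches = re.findall(r'\w{2,4}', source)

# ASCII-only: \w is limited to [a-zA-Z0-9_], so only 'tony' matches.
ascii_matches = re.findall(r'\w{2,4}', source, re.A)

print(unicode_matches)
print(ascii_matches)
```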
Square brackets: matching one of several characters
Square brackets match any one character from the set listed inside them.
For example:
[abc] matches any one of the characters a, b, or c, and is equivalent to [a-c]
In [a-c], the - in the middle denotes a range from a to c
If you want to match any lowercase letter, you can use [a-z]
Some metacharacters lose their special meaning inside square brackets and behave like ordinary characters.
For example,
[akm.] matches any one of the characters a, k, m, or .
Here the . inside the brackets does not match any character; it matches only the literal . character.
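The bracket expressions above can be checked with a small sketch. The sample word is an arbitrary assumption chosen so that each pattern gives a different result.

```python
import re

# Sample text (assumed for illustration).
text = "back.mark"

# [abc] and [a-c] describe the same set of three characters.
listed = re.findall(r'[abc]', text)
ranged = re.findall(r'[a-c]', text)

# [a-z]+ matches runs of lowercase letters; the dot separates the two runs.
lowercase_runs = re.findall(r'[a-z]+', text)

# Inside brackets the dot is literal: it matches only '.', not any character.
akm_dot = re.findall(r'[akm.]', text)

print(listed, ranged)
print(lowercase_runs)
print(akm_dot)
```

If the dot in [akm.] kept its usual meaning, the last pattern would match every character of the text; instead b, c, and r are skipped.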
Start position
^ matches the starting position of the text.
In multi-line mode, it matches the starting position of each line of the text.
For example, in the following text we want to select, on each line, the string before the comma, including the comma:
    Apple, apples are green
    Orange, oranges are orange
    Banana, bananas are yellow
In multi-line mode, you can write the regular expression ^.*,
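A minimal sketch of the difference between the two modes follows, using the re.M (re.MULTILINE) flag. The sample text and variable names are assumptions for illustration.

```python
import re

# Sample text (assumed for illustration): one comma per line.
text = ("Apple, apples are green\n"
        "Orange, oranges are orange\n"
        "Banana, bananas are yellow")

# Default mode: ^ matches only at the very start of the text.
single_line = re.findall(r'^.*,', text)

# Multi-line mode: ^ matches at the start of every line.
multi_line = re.findall(r'^.*,', text, re.M)

print(single_line)
print(multi_line)
```

Without re.M only the first line produces a match; with it, each line contributes its own prefix up to the comma.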
Inside square brackets, a leading ^ negates the set: it matches any character not listed in the brackets (for example, [^\d] matches any non-digit character).
For example:
    import re

    content = 'a1b2c3d4e5'

    # [^\d] matches any single character that is not a digit
    p = re.compile(r'[^\d]')
    for one in p.findall(content):
        print(one)

    # Output:
    # a
    # b
    # c
    # d
    # e