Common regular expressions:
\d matches a digit
\w matches a letter, digit, or underscore
. matches any single character (except a newline by default)
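As a rough illustration, the three classes can be tried with Python's re module. The sample strings below are made up for this sketch, and re.match only tests at the start of the string.

```python
import re

# \d matches one digit, \w one letter/digit/underscore, . any character except a newline.
digit = re.match(r'\d', '7up')
word = re.match(r'\w', '_x')
any_char = re.match(r'.', '#!')
no_digit = re.match(r'\d', 'up7')  # fails: the first character is not a digit

print(digit.group())     # 7
print(word.group())      # _
print(any_char.group())  # #
print(no_digit)          # None
```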
To match a variable number of characters:
* matches zero or more of the preceding character
+ matches one or more of the preceding character
? matches zero or one of the preceding character
{n} matches exactly n of the preceding character
{n,m} matches n to m of the preceding character
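The quantifiers can be checked one by one. This is a minimal sketch: the helper name full and the test strings are assumptions made for illustration, and re.fullmatch requires the whole string to match.

```python
import re

def full(pattern, text):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(full(r'\d*', ''))           # True: zero digits is allowed
print(full(r'\d+', ''))           # False: at least one digit is required
print(full(r'colou?r', 'color'))  # True: the 'u' may be absent
print(full(r'\d{3}', '010'))      # True: exactly three digits
print(full(r'\d{3,8}', '12'))     # False: fewer than three digits
print(full(r'\d{3,8}', '12345'))  # True: between three and eight digits
```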
For a more precise match, use [] to express a range, for example:
- [0-9a-zA-Z\_] matches a single digit, letter, or underscore;
- [0-9a-zA-Z\_]+ matches a string of at least one digit, letter, or underscore, such as 'a100', '0_Z', 'Py3000', etc.;
- [a-zA-Z\_][0-9a-zA-Z\_]* matches a string that starts with a letter or underscore, followed by any number of digits, letters, or underscores, which is a legal variable name in Python;
- [a-zA-Z\_][0-9a-zA-Z\_]{0,19} limits the variable name more precisely to 1-20 characters (1 leading character plus up to 19 following characters).
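The length-limited pattern can be wrapped in a small check. This is only a sketch: the names NAME and is_short_identifier are hypothetical, and the pattern covers ASCII names only, without excluding Python keywords.

```python
import re

# One leading letter or underscore, then up to 19 more letters, digits, or underscores.
NAME = re.compile(r'[a-zA-Z\_][0-9a-zA-Z\_]{0,19}')

def is_short_identifier(text):
    """Return True if the whole string fits the 1-20 character pattern."""
    return NAME.fullmatch(text) is not None

for candidate in ['a100', '0_Z', 'Py3000', '_', 'x' * 21]:
    print(candidate, is_short_identifier(candidate))
```

Here '0_Z' is rejected because it starts with a digit, and the 21-character name is rejected for length. The quantifier is written as {0,19} with no space after the comma, since Python's re module would otherwise read the braces as literal text.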
A|B matches either A or B, so (P|p)ython matches 'Python' or 'python'.
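A short sketch of the alternation, using made-up test words. The all-uppercase spelling is included to show that only the first letter is allowed to vary.

```python
import re

pattern = re.compile(r'(P|p)ython')

# Map each test word to whether the whole word matches the pattern.
results = {word: pattern.fullmatch(word) is not None
           for word in ['Python', 'python', 'PYTHON']}
print(results)  # {'Python': True, 'python': True, 'PYTHON': False}
```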
^ indicates the beginning of a line; ^\d means the string must start with a digit.
$ indicates the end of a line; \d$ means the string must end with a digit.
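The two anchors can be tried with re.search, which scans the whole string, so the anchors alone decide where a match may occur. The function names and sample strings are assumptions for this sketch.

```python
import re

def starts_with_digit(text):
    """True if the first character of the string is a digit."""
    return re.search(r'^\d', text) is not None

def ends_with_digit(text):
    """True if the last character of the string is a digit."""
    return re.search(r'\d$', text) is not None

print(starts_with_digit('3 apples'))  # True
print(starts_with_digit('apples 3'))  # False
print(ends_with_digit('apples 3'))    # True
print(ends_with_digit('3 apples'))    # False
```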
Exercise questions:
#! /usr/bin/env python3
# -*- coding:gbk -*-
import re
email1 = '[email protected]'
email2 = '[email protected]'
# A raw string passes the backslashes to the regex engine unchanged.
if re.match(r'[a-zA-Z0-9\.\_]+\@[a-zA-Z0-9\.\_]+\.com$', email1):
    print('ok')
else:
    print('error')
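To test several addresses with the same pattern, the check could be moved into a function. This is a sketch only: the names EMAIL and check are hypothetical, and the addresses are invented examples, not the ones from the exercise.

```python
import re

# Same pattern as in the exercise, compiled once as a raw string.
EMAIL = re.compile(r'[a-zA-Z0-9\.\_]+\@[a-zA-Z0-9\.\_]+\.com$')

def check(address):
    """Return 'ok' if the address fits the exercise pattern, else 'error'."""
    return 'ok' if EMAIL.match(address) else 'error'

# Hypothetical addresses chosen to show both outcomes.
samples = ['someone@example.com', 'bill.gates@example.com',
           'bob#example.com', 'someone@example.org']
for address in samples:
    print(address, check(address))
```

The pattern only accepts addresses ending in .com, so the .org address is reported as an error along with the one missing its @ sign.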