Advanced shell: regular expressions

Regular expressions overview

A regular expression is a pattern, written according to defined rules, that Linux tools such as sed and gawk use to filter text.

Basic regular expressions

Plain Text

[root@node1 ~]# echo "this is a cat" | sed -n '/cat/p'
this is a cat
[root@node1 ~]# echo "this is a cat" | gawk  '/cat/{print $0}'
this is a cat

Regular expression matching is strict. In particular, remember that patterns are case-sensitive.
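
As a small sketch of the case-sensitivity point, the commands below reuse the sample sentence from above. The variable names lower and upper are only illustrative.

```shell
# The pattern 'cat' matches the lowercase word in the input.
lower=$(echo "this is a cat" | sed -n '/cat/p')
# The pattern 'Cat' finds nothing, because matching is case-sensitive.
upper=$(echo "this is a cat" | sed -n '/Cat/p')
echo "lower=[$lower] upper=[$upper]"
# prints: lower=[this is a cat] upper=[]
```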

Special characters

The special characters that regular expressions recognize include:

.*[]^${}\+?|()

To use a special character as a literal text character, you must escape it, usually with a backslash (\).

[root@node1 ~]# echo "this is  a $" | sed -n '/\$/p'
this is  a $
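
The same escaping rule applies to the dot. The following sketch uses made-up input lines (3.14 and 3x14) to show the difference between an unescaped and an escaped dot; the variable names are illustrative.

```shell
# An unescaped dot matches any character, so both lines are printed.
unescaped=$(printf '3.14\n3x14\n' | sed -n '/3.14/p')
# An escaped dot matches only a literal dot, so only 3.14 is printed.
escaped=$(printf '3.14\n3x14\n' | sed -n '/3\.14/p')
printf 'unescaped:\n%s\nescaped:\n%s\n' "$unescaped" "$escaped"
```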

Anchor characters

Two special characters anchor a pattern to the beginning or the end of a line in the data stream.

The caret (^) anchors the pattern to the beginning of a line.

The dollar sign ($) anchors the pattern to the end of a line.

[root@node1 ~]# echo "this is a cat" | sed -n '/^this/p'
this is a cat
[root@node1 ~]# echo "this is a cat" | sed -n '/cat$/p' 
this is a cat
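
For contrast, here is a sketch of the anchors failing when the text is in the wrong position. It reuses the sample sentence; the variable names are illustrative.

```shell
# 'cat' is at the end of the line, so anchoring it to the start fails.
at_start=$(echo "this is a cat" | sed -n '/^cat/p')
# 'this' is at the start of the line, so anchoring it to the end fails.
at_end=$(echo "this is a cat" | sed -n '/this$/p')
echo "at_start=[$at_start] at_end=[$at_end]"
# prints: at_start=[] at_end=[]
```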

In some cases the two anchors can be used together.

1. For example, to find lines that contain only a specific piece of text:

[root@node1 ljy]# more test.txt            
this is a dog
what
how
this is a cat
is a dog
[root@node1 ljy]# sed -n '/^is a dog$/p' test.txt 
is a dog

2. Placing the two anchors next to each other matches blank lines, so they can be filtered out directly:

[root@node1 ljy]# more test.txt           
this is a dog
what
how

this is a cat
is a dog
[root@node1 ljy]# sed  '/^$/d' test.txt    
this is a dog
what
how
this is a cat
is a dog
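
One caveat worth sketching: /^$/ matches only lines that are truly empty. In the made-up input below, the second line is empty and the fourth holds a single space; only the empty one is deleted.

```shell
# Hypothetical input: line 2 is empty, line 4 contains one space.
filtered=$(printf 'how\n\nwhat\n \nend\n' | sed '/^$/d')
# The empty line is gone; the space-only line is still there.
printf '%s\n' "$filtered"
```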

Dot character

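The dot matches any single character except a newline. It must match a character: if nothing occupies the dot's position, the pattern fails. In the example below, the line "at" is not printed for that reason.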

[root@node1 ljy]# more test.txt 
this is a dog
what
how
this is a cat
is a dog
at
[root@node1 ljy]# sed -n '/.at/p' test.txt 
what
this is a cat
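
A space also counts as a character for the dot. The sketch below uses a made-up sentence to show that, alongside the failing case of a bare "at"; the variable names are illustrative.

```shell
# The space before 'at' satisfies the dot, so the line is printed.
with_space=$(echo "look at this" | sed -n '/.at/p')
# Nothing precedes 'at' here, so the dot has nothing to match.
no_char=$(echo "at" | sed -n '/.at/p')
echo "with_space=[$with_space] no_char=[$no_char]"
# prints: with_space=[look at this] no_char=[]
```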

Character groups

A character group defines a specific set of characters to match at one position. Use square brackets to define a character group.

[root@node1 ljy]# more test.txt 
this is a dog
this is a Dog
this is a DoG
this is a cat
[root@node1 ljy]# sed -n '/[dD]og/p' test.txt 
this is a dog
this is a Dog
[root@node1 ljy]# sed -n '/[dD]o[gG]/p' test.txt  
this is a dog
this is a Dog
this is a DoG
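
Character groups combine naturally with anchors. As a sketch, the pattern below accepts "yes" in any mix of upper and lower case, but only when it is the whole line. The input lines are made up for illustration.

```shell
# Each bracket pair allows either case for one letter; ^ and $ require
# the whole line to be exactly three letters.
answers=$(printf 'Yes\nyEs\nno\nyes please\n' | sed -n '/^[Yy][Ee][Ss]$/p')
printf '%s\n' "$answers"
# prints:
# Yes
# yEs
```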

Negated character groups

To exclude certain characters instead, add a caret at the start of the character group.

[root@node1 ljy]# sed -n '/[dD]o[gG]/p' test.txt  
this is a dog
this is a Dog
this is a DoG
[root@node1 ljy]# sed -n '/[^D]og/p' test.txt       
this is a dog
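
A negated group still has to match one character. The sketch below, with made-up input, shows a line that matches because some other letter precedes "og", and a line that fails because nothing does.

```shell
# 'f' is not 'D', so [^D] matches it and the line is printed.
other=$(echo "this is a fog" | sed -n '/[^D]og/p')
# Nothing comes before 'og', so [^D] has no character to match.
bare=$(echo "og" | sed -n '/[^D]og/p')
echo "other=[$other] bare=[$bare]"
# prints: other=[this is a fog] bare=[]
```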

Ranges

A range, written with a dash, lets a character group include every character between two endpoints, for example [0-9] for any digit.

[root@node1 ljy]# more test.txt 
123123
1231
121222222
412345341613
vsdvs
qwer12344123
12345
34211
444444
[root@node1 ljy]# sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' test.txt 
12345
34211
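
Several ranges can share one pair of brackets. The sketch below matches four-character lowercase hexadecimal strings from made-up input. Setting LC_ALL=C is a precaution assumed here, since the characters a range covers can depend on the locale.

```shell
# [0-9a-f] accepts a digit or a letter from a to f at each position.
hex=$(printf '1a2f\n12g4\nbeef\n123\n' |
  LC_ALL=C sed -n '/^[0-9a-f][0-9a-f][0-9a-f][0-9a-f]$/p')
printf '%s\n' "$hex"
# prints:
# 1a2f
# beef
```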

Extended regular expressions

Question mark

The question mark indicates that the preceding character may appear zero or one time, and no more.

[root@node1 ljy]# echo "bat" | gawk '/ba?t/{print $0}'  
bat
[root@node1 ljy]# echo "baat" | gawk '/ba?t/{print $0}'
[root@node1 ljy]# echo "bt" | gawk '/ba?t/{print $0}'  
bt

The question mark can also be used with a character group:

[root@node1 ljy]# echo "bt" | gawk '/b[ae]?t/{print $0}' 
bt
[root@node1 ljy]# echo "bat" | gawk '/b[ae]?t/{print $0}'
bat
[root@node1 ljy]# echo "bet" | gawk '/b[ae]?t/{print $0}' 
bet
[root@node1 ljy]# echo "baat" | gawk '/b[ae]?t/{print $0}'
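
A common use of the question mark is tolerating spelling variants. The sketch below uses grep -E in place of gawk (an assumption about which tools are installed; both accept extended regular expressions) and made-up input lines.

```shell
# 'u?' allows zero or one 'u', so 'colouur' with two is rejected.
spellings=$(printf 'color\ncolour\ncolouur\n' | grep -E 'colou?r')
printf '%s\n' "$spellings"
# prints:
# color
# colour
```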

Plus sign

The plus sign indicates that the preceding character may appear one or more times; it must appear at least once.

[root@node1 ljy]# echo "baat" | gawk '/b[ae]+t/{print $0}' 
baat
[root@node1 ljy]# echo "bt" | gawk '/b[ae]+t/{print $0}'  
[root@node1 ljy]# echo "bt" | gawk '/ba+t/{print $0}'   
[root@node1 ljy]# echo "bat" | gawk '/ba+t/{print $0}'
bat
[root@node1 ljy]# echo "baat" | gawk '/ba+t/{print $0}'
baat
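
The same checks can be run in one pass over several lines. This sketch uses grep -E rather than gawk (an assumption about which tools are installed) and made-up input.

```shell
# 'a+' needs at least one 'a', so 'bt' is dropped and the rest are kept.
plus=$(printf 'bt\nbat\nbaaat\n' | grep -E 'ba+t')
printf '%s\n' "$plus"
# prints:
# bat
# baaat
```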

Curly braces

In ERE, curly braces let you limit how many times the preceding pattern may repeat.

{m,n} means the pattern appears at least m times and at most n times.

[root@node1 ljy]# echo "baat" | gawk '/b[ae]{1,2}t/{print $0}'  
baat
[root@node1 ljy]# echo "baaat" | gawk '/b[ae]{1,2}t/{print $0}'
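
Curly braces also accept a single number, {m}, meaning exactly m repetitions. As a sketch, the five-digit pattern from the ranges section can be shortened this way; grep -E is used here instead of gawk as an assumption about which tools are installed, and the input is a subset of the earlier sample.

```shell
# [0-9]{5} stands for five digits in a row; the anchors forbid any more.
five=$(printf '1231\n12345\n123123\n34211\n' | grep -E '^[0-9]{5}$')
printf '%s\n' "$five"
# prints:
# 12345
# 34211
```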

The pipe symbol

The pipe symbol specifies a logical OR between two or more patterns: a line matches if it satisfies any one of them.
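
A minimal sketch of alternation, using grep -E in place of gawk (an assumption about which tools are installed) and made-up input lines:

```shell
# A line is printed if it contains 'cat' or 'dog'.
pets=$(printf 'this is a cat\nthis is a dog\nthis is a bird\n' | grep -E 'cat|dog')
printf '%s\n' "$pets"
# prints:
# this is a cat
# this is a dog
```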

Grouping expressions

Parts of a regular expression can be grouped with parentheses; the group is then treated as a single unit.

[root@node1 ljy]# echo "bat" | gawk '/b(a|e)t/{print $0}'           
bat
[root@node1 ljy]# echo "baat" | gawk '/b(a|e)t/{print $0}'
[root@node1 ljy]# echo "bet" | gawk '/b(a|e)t/{print $0}'  
bet
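
Because a group behaves like a single unit, a quantifier placed after it applies to the whole group. The sketch below makes the suffix "urday" optional; it uses grep -E instead of gawk (an assumption about which tools are installed) and made-up input.

```shell
# '(urday)?' allows the whole suffix zero or one time, so 'Satur' fails.
days=$(printf 'Sat\nSaturday\nSatur\n' | grep -E '^Sat(urday)?$')
printf '%s\n' "$days"
# prints:
# Sat
# Saturday
```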

 


Origin www.cnblogs.com/jinyuanliu/p/10937795.html