Shell Scripting: Common Text-Processing Commands (4) sed

Contents

Syntax

Workflow
Options
Script

Pattern space

Basic commands
Advanced commands

Hold space
Changing the flow
Pattern substitution

sed (Stream Editor) is a stream editor used for editing text.

Syntax

sed [OPTIONS] 'SCRIPT' FILE...

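Workflow

sed reads the input one line at a time into the pattern space.
It then checks the first address in SCRIPT: if the line matches, the command runs; either way, sed moves on to the next address/command pair.
Once every address/command pair has been tried, sed prints the pattern space (unless -n is given) and reads the next line.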
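
The cycle described above can be observed with a two-command script. This is a minimal sketch; the variable name `result` is only illustrative. Every command runs against the same pattern space in order, so the second `s` sees the output of the first, and the pattern space is printed automatically at the end of each cycle.

```shell
# Each line is loaded into the pattern space, then every command runs in order.
# Line "a": s/a/A/ turns it into "A", then s/A/X/ turns that into "X".
# Line "b": neither command matches, so it is printed unchanged.
result=$(printf 'a\nb\n' | sed 's/a/A/;s/A/X/')
printf '%s\n' "$result"
```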

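Options

-f SCRIPT_FILE: read the sed script from SCRIPT_FILE
-n: suppress the default output (the automatic printing of the pattern space)
-r: enable extended metacharacters (extended regular expressions)
-i[SUFFIX]: in place; modify the source file directly and, if SUFFIX is given, keep a backup whose name is the source file name with SUFFIX appended (for example, -i.bak creates FILE.bak)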
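
The options can be tried together in a scratch directory. The sketch below assumes GNU sed; the file names `data.txt` and `upper.sed` and all variable names are hypothetical and exist only for this illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'foo 1\nbar 2\n' > "$workdir/data.txt"

# -f: read the sed script from a file instead of the command line.
printf 's/foo/FOO/\n' > "$workdir/upper.sed"
from_file=$(sed -f "$workdir/upper.sed" "$workdir/data.txt")

# -n: suppress automatic printing, so only lines printed with p appear.
only_bar=$(sed -n '/bar/p' "$workdir/data.txt")

# -r: extended regular expressions, so + and ( ) need no backslashes.
digits=$(sed -r 's/^[a-z]+ ([0-9]+)$/\1/' "$workdir/data.txt")

# -i.bak: edit data.txt in place and keep the original as data.txt.bak.
sed -i.bak 's/foo/baz/' "$workdir/data.txt"
edited=$(cat "$workdir/data.txt")
backup=$(cat "$workdir/data.txt.bak")

rm -rf "$workdir"
```

Note that the suffix is appended as written, so the dot in `.bak` has to be part of the option.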

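Script

A script has the form [address][!]command
- address: selects the lines the command applies to
  - empty: matches every line
  - $: matches the last line
  - N: matches line N
  - N,M: matches lines N through M
  - N~M: starting at line N, matches every M-th line (M is the step; GNU extension)
  - N,+M: matches line N and the M lines after it (GNU extension)
  - /PATTERN/: matches lines against a regular expression
    - /PATTERN1/,/PATTERN2/: from a line matching PATTERN1 through the next line matching PATTERN2; for example, sed -n '/^a/,/^b/p' FILE prints from a line starting with a through a line starting with b
- !: negates the address, so the command runs on the lines that do not match
- command: a command that operates on the pattern space or the hold space
Multiple scripts: SCRIPT1; SCRIPT2; SCRIPT3...
Multiple commands on the same address: address{[address1]command1; ...}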
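
The address forms listed above can be compared on one small input. This is a sketch that assumes GNU sed (N~M and N,+M are GNU extensions); the five-line input and the variable names are made up for the illustration.

```shell
input=$(printf 'a\nb\nc\nd\ne')

# N,M: a line range (lines 2 and 3).
range=$(printf '%s\n' "$input" | sed -n '2,3p')
# $: the last line.
last=$(printf '%s\n' "$input" | sed -n '$p')
# N~M: line 1, then every 2nd line after it (lines 1, 3, 5).
step=$(printf '%s\n' "$input" | sed -n '1~2p')
# N,+M: line 2 and the 2 lines after it (lines 2, 3, 4).
plus=$(printf '%s\n' "$input" | sed -n '2,+2p')
# !: delete every line that does NOT match /[bd]/, leaving b and d.
negated=$(printf '%s\n' "$input" | sed '/[bd]/!d')
# address{...}: within lines 2-4, delete only the line matching /c/.
grouped=$(printf '%s\n' "$input" | sed '2,4{/c/d}')
```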

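Pattern space

sed loads the file into the pattern space one line at a time, and all text processing happens in the pattern space.

Basic commands

d: delete the contents of the pattern space

# Delete the first line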
[root@localhost ~]# echo -e 'a\nb\naaa' | sed '1d' 
b
aaa

p: print the contents of the pattern space

[root@localhost ~]# echo -e 'a\nb\naaa' | sed 'p' 
a
a
b
b
aaa
aaa
# Suppress the default output
[root@localhost ~]# echo -e 'a\nb\naaa' | sed -n 'p' 
a
b
aaa

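s/PATTERN/REPLACE/[g]: substitute within the pattern space; without g only the first match on each line is replaced, with g every match is replaced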

```
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed 's/aa/b/' 
a
b
baa
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed 's/aa/b/g' 
a
b
bb
```

a\STRING: append; adds STRING as a new line after the matched line

# Append after the second line
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed '2a\c' 
a
b
c
aaaa

i\STRING: insert; adds STRING as a new line before the matched line

# Insert before the second line
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed '2i\c' 
a
c
b
aaaa

c\STRING: change; replaces the contents of the pattern space with STRING

# Replace the second line
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed '2c\ccc' 
a
ccc
aaaa

=: print the line number (on its own line, before the matched line)

[root@localhost ~]# echo -e 'a\nb\naaaa' | sed '=' 
1
a
2
b
3
aaaa
# Find the line numbers of the lines that start with a
[root@localhost ~]# echo -e 'a\nb\naaaa' | sed -n '/^a/=' 
1
3

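l: print the pattern space in an unambiguous form, with control characters made visible

For example, a tab is shown as \t, and a $ marks the end of each line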

[root@localhost ~]# echo -e 'a\nb\na\taa' | sed 'l' 
a$
a
b$
b
a\taa$
a	aa
[root@localhost ~]# echo -e 'a\nb\na\taa' | sed -n 'l' 
a$
b$
a\taa$

y/SOURCE/DEST/: transliterate characters one for one, similar to the tr command

[root@localhost ~]# echo -e 'a\nb\naaa' | sed 'y/ab/AB/' 
A
B
AAA
[root@localhost ~]# echo -e 'a\nb\naaa' | tr a-b A-B
A
B
AAA

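n: next; read the next line into the pattern space (the current line is printed first unless -n is given)

# Print odd-numbered lines (n reads each even-numbered line, but nothing is done with it)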
[root@localhost ~]# echo -e 'a\nb\naaa' | sed -n 'p;n' 
a
aaa
# Print even-numbered lines
[root@localhost ~]# echo -e 'a\nb\naaa' | sed -n 'n;p' 
b

r FILE: read; append the contents of FILE after the matched line

# Insert the file's contents after the first line
[root@localhost ~]# echo -e 'a\nb\naaa' | sed '1r /etc/fstab' 
a
# /etc/fstab
# Created by anaconda on Wed Jan 15 04:14:26 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=e3ee29aa-3286-4515-bae2-1558e60a0c7e /                       ext4    defaults        1 1
UUID=4111b6a8-bc5a-4781-b3d3-5c8167353b7f /boot                   ext4    defaults        1 2
UUID=e60a9f33-df7b-467d-a284-b18b731565ae swap                    swap    defaults        0 0

b
aaa

w FILE: write; save the matched lines to FILE
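
The w command has no example in the original, so here is a minimal sketch. The output file name `matches.txt` and the variable names are hypothetical; the file name after w runs to the end of the script, which is why it comes last.

```shell
outdir=$(mktemp -d)

# Lines starting with "a" are written to matches.txt; -n keeps stdout empty.
printf 'a\nb\naaa\n' | sed -n "/^a/w $outdir/matches.txt"

saved=$(cat "$outdir/matches.txt")
printf '%s\n' "$saved"
rm -rf "$outdir"
```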

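q: quit; exit as soon as the specified line has been read

# Suppose test is a file with one million lines and we want the first ten
# sed reads line by line and exits at line ten
[root@localhost ~]# sed '10q' test
# head -n 10 prints the same lines and also stops after the tenth
[root@localhost ~]# head -n 10 test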
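
A related use of q is extracting a single line and stopping right away. This is a small sketch with a made-up four-line input; the variable names are illustrative.

```shell
# Print only line 3, then quit so the remaining lines are never processed.
third=$(printf 'a\nb\nc\nd\n' | sed -n '3{p;q;}')

# Without -n, q prints the current line before exiting, so '2q' keeps lines 1-2.
first_two=$(printf 'a\nb\nc\nd\n' | sed '2q')
```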

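Advanced commands

N: append the next input line to the pattern space, joined to the current contents with \n
P: print the first line of the pattern space (everything up to the first \n)

D: delete the first line of the pattern space (everything up to and including the first \n), then restart the command cycle without reading a new line

# Assembly-line processing: a sliding two-line window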
[root@localhost ~]# echo -e 'a\nb\nc\nd' | sed 'N;P;D'
a
b
c
d
[root@localhost ~]# echo -e 'a\nb\nc\nd' | sed -n 'N;P;D'
a
b
c

# Collapse consecutive blank lines into one
[root@localhost ~]# echo -e 'a\n\nb\n\n\nc\n\n\n\nd' | sed '/^$/{N;/\n$/D}'
a

b

c

d
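
A simpler way to see what N and P do is to process the input in pairs. This is a sketch with illustrative variable names; it uses an even number of input lines so that N always finds a partner line.

```shell
# N appends the next line, so the pattern space holds "a\nb";
# s then replaces the embedded newline with a space.
pairs=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')

# With -n, P prints only the part before the newline: the first line of each pair.
firsts=$(printf 'a\nb\nc\nd\n' | sed -n 'N;P')
```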

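Hold space

Saving the pattern space to the hold space makes it possible to process several lines together

h/H: hold; copy/append the pattern space to the hold space (h overwrites, H appends)
g/G: get; copy/append the hold space to the pattern space (g overwrites, G appends)
x: exchange; swap the contents of the two spaces

# Output all lines on a single line
[root@localhost ~]# echo -e 'a\nb\nc\nd' | sed 'H;${x;s/\n/,/g; s/^,//}; $!d'
a,b,c,d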

# Print the lines in reverse order
[root@localhost ~]# echo -e 'a\nb\nc\nd' | sed -n '1!G;h;$p'
d
c
b
a
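
Another common use of the hold space is remembering the previous line. The sketch below prints the line that precedes a match; the pattern /c/ and the variable name are chosen only for illustration. If the very first line matched, the hold space would still be empty and an empty line would be printed.

```shell
# h saves each line in the hold space at the end of its cycle.
# When a line matches /c/, x swaps in the saved previous line, p prints it,
# and the second x swaps the current line back before h stores it.
before=$(printf 'a\nb\nc\nd\n' | sed -n '/c/{x;p;x;};h')
```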

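Changing the flow

Branch commands change sed's default flow of control (running the script from the first command to the last)

branch: unconditional jump

[address]b: with no label, skips all remaining commands (jumps to the end of the script)
[address]b LABEL: jumps to LABEL and continues with the commands after it
Defining a label: :LABEL

# Contents of the file test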
This is the header line.
This is the first data line.
This is the second data line.
This is the last data line.

[root@localhost ~]# sed '2,3b;s/This is/Is this/;s/line./test?/' test
Is this the header test?
This is the first data line.
This is the second data line.
Is this the last data test?
[root@localhost ~]# sed '/first/b jump1;s/This is/Is this/;:jump1;s/line./test?/' test
Is this the header test?
This is the first data test?
Is this the second data test?
Is this the last data test?
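
Besides skipping commands, b can jump backwards to build a loop, as long as an address limits when the jump is taken. The sketch below joins all lines with commas; the label name `a` and the variable name are illustrative, and each command is passed with -e so the label ends unambiguously.

```shell
# :a marks the loop start; N appends the next input line to the pattern space.
# $!ba jumps back to :a on every line except the last, which ends the loop.
# Once all lines are collected, s replaces the embedded newlines with commas.
joined=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
```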

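# branch is unconditional, so it can cause an infinite loop
[root@localhost ~]# echo "This, is , a, test, to, remove, coomas."  | sed ':start;s/,//;b start'
# Hangs with no output while the CPU spins

test: conditional jump; branches to LABEL only if an s command has made a successful substitution since the last input line was read or the last t was taken (with no label, it jumps to the end of the script)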

[address]t [LABEL]

[root@localhost ~]# echo "This, is , a, test, to, remove, coomas."  | sed ':start;s/,//; t start'
This is  a test to remove coomas.
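
The same loop-until-no-change idea can insert thousands separators. This is a sketch assuming GNU sed with -r; the label `a` and the variable name are illustrative. Each pass inserts one comma, and t repeats the substitution until it no longer matches.

```shell
# Find a digit followed by exactly three digits and then the end of the line or a comma,
# and insert a comma after the first digit. t jumps back to :a while s keeps succeeding.
grouped_number=$(echo 1234567 | sed -r -e ':a' -e 's/([0-9])([0-9]{3})($|,)/\1,\2\3/' -e 'ta')
```

The first pass produces 1234,567 and the second produces 1,234,567, after which the substitution fails and the loop ends.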

Pattern substitution

Mainly used in the s/// construct

&: stands for the entire text matched by the regular expression

[root@localhost ~]# echo 1234567 | sed -r 's/[0-9]+/abc&efg/g'
abc1234567efg

\N: back-reference to the text matched by the N-th parenthesized group

[root@localhost ~]# echo "abc 123 def "|sed -r 's/(^.*)(\b[0-9]+\b.)(.*$)/\2\1\3/g' 
123 abc def
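
Two shorter cases may make & and \N easier to tell apart. This is a sketch with made-up input strings and variable names, using -r as in the examples above.

```shell
# &: reuse the whole match, here wrapping every run of digits in brackets.
bracketed=$(echo 'id 42 and 7' | sed -r 's/[0-9]+/[&]/g')

# \1 and \2: reuse individual groups, here swapping two space-separated words.
swapped=$(echo 'first last' | sed -r 's/^([^ ]+) ([^ ]+)$/\2, \1/')
```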

努力的阿玮
