Article directory

awk overview
Basic use of awk
awk common built-in variables
Patterns
Conditional judgment if-else
while loop

awk overview

The name awk is formed from the first letters of the surnames of its three creators (Alfred Aho, Peter Weinberger and Brian Kernighan).
awk is a report generator that produces formatted text output. The awk shipped with GNU/Linux is developed and maintained by the Free Software Foundation (FSF) and is usually called GNU AWK (gawk).

You can run "rpm -qi gawk" to see the package's official description of awk: "The gawk package contains the GNU version of AWK text processing utility. AWK is a programming language designed for text processing and typically used as a data extraction and reporting tool."

The key phrase is "AWK is a programming language": awk is a full programming language rather than just a command, which is why it is regarded as the leader of the "three musketeers" of Linux text processing.

[root@rocky8 ~]# rpm -qi gawk
Name        : gawk
Version     : 4.2.1
Release     : 2.el8
Architecture: x86_64
Install Date: Sat 12 Nov 2022 05:24:52 PM CST
Group       : Unspecified
Size        : 2699078
License     : GPLv3+ and GPLv2+ and LGPLv2+ and BSD
Signature   : RSA/SHA256, Wed 19 May 2021 01:32:23 PM CST, Key ID 15af5dac6d745a60
Source RPM  : gawk-4.2.1-2.el8.src.rpm
Build Date  : Wed 19 May 2021 10:35:13 AM CST
Build Host  : ord1-prod-x86build001.svc.aws.rockylinux.org
Relocations : (not relocatable)
Packager    : [email protected]
Vendor      : Rocky
URL         : https://www.gnu.org/software/gawk/
Summary     : The GNU version of the AWK text processing utility
Description :
The gawk package contains the GNU version of AWK text processing utility.
AWK is a programming language designed for text processing and typically used as a data extraction and reporting tool.

The gawk utility can be used to do quick and easy text pattern matching, extracting or reformatting.
It is considered to be a standard Linux tool for text processing.

Basic use of awk

awk [options] 'program' var=value file...
awk [options] -f programfile var=value file...

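Program format:  pattern { action statements; ... }
   pattern: decides when the action statements are triggered, for example BEGIN, END, or a regular expression
   action statements: process the data; they are written inside { }, and the most common ones are print and printf

   The program is usually placed in single quotes and can consist of three parts (a BEGIN block, a general pattern-matching block, and an END block)

   Common options
     -F "separator"  specifies the field separator used for input; the default separator is any run of consecutive whitespace characters
     -v var=value    assigns a value to a variable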
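As a minimal sketch of the three-part structure described above, the command below combines a BEGIN block, a general block, and an END block. The two input lines and the variable name "count" are made up for illustration.

```shell
# BEGIN runs once before any input is read, the middle block runs once
# per input line, and END runs once after the last line.
printf 'alice\nbob\n' | awk '
BEGIN { print "start" }
      { count++; print "line:", $0 }
END   { print "total:", count }'
# prints:
# start
# line: alice
# line: bob
# total: 2
```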

Run without a file argument, "awk '{print "hello world"}'" reads from standard input, so you can supply its input through a pipe or standard input redirection.

Strings printed in awk must be enclosed in "" double quotes.
Java works the same way: anything in double quotes is a string; without the quotes awk treats the word as a variable.
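The two points above can be tried with a short sketch. The variable name "greeting" and its value are hypothetical, chosen only to show the difference between a quoted string and a variable.

```shell
# With no file argument awk reads standard input; the single empty line
# produced by echo triggers the action once.
echo | awk '{print "hello world"}'
# prints: hello world

# "greeting" in double quotes is a literal string; without quotes it is
# looked up as a variable (set here with -v).
awk -v greeting="hi" 'BEGIN { print "greeting"; print greeting }'
# prints:
# greeting
# hi
```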


[root@rocky8 ~]# awk --help
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:		GNU long options: (standard)
	-f progfile		--file=progfile
	-F fs			--field-separator=fs
	-v var=val		--assign=var=val
Short options:		GNU long options: (extensions)
	-b			--characters-as-bytes
	-c			--traditional
	-C			--copyright
	-d[file]		--dump-variables[=file]
	-D[file]		--debug[=file]
	-e 'program-text'	--source='program-text'
	-E file			--exec=file
	-g			--gen-pot
	-h			--help
	-i includefile		--include=includefile
	-l library		--load=library
	-L[fatal|invalid]	--lint[=fatal|invalid]
	-M			--bignum
	-N			--use-lc-numeric
	-n			--non-decimal-data
	-o[file]		--pretty-print[=file]
	-O			--optimize
	-p[file]		--profile[=file]
	-P			--posix
	-r			--re-interval
	-s			--no-optimize
	-S			--sandbox
	-t			--lint-old
	-V			--version

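Default field separator in awk

When checking disk usage, the df command prints six columns of formatted information, and the columns are separated by spaces.
awk automatically recognizes columns separated by spaces or tabs.
In awk, $1, $2, $3 ... $n refer to the first, second, third ... nth column.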
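A small sketch of column selection with the default separator. The two lines below only imitate df output (the numbers are made up) so that the result is predictable; on a real system the same program can be fed from "df" directly.

```shell
# Whitespace-separated columns are addressed as $1, $2, ...
# Here the first and fifth columns are printed.
printf '%s\n' \
  'Filesystem 1K-blocks Used Available Use% Mounted on' \
  '/dev/sda1 1038336 199508 838828 20% /boot' |
awk '{print $1, $5}'
# prints:
# Filesystem Use%
# /dev/sda1 20%
```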

Extract the username and UID from /etc/passwd
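One way to do this, sketched on two sample lines in /etc/passwd format (the lines are illustrative; point the same program at /etc/passwd on a real system): split on ":" with -F and print the first and third fields.

```shell
# Field 1 is the username and field 3 is the UID.
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' |
awk -F: '{print $1, $3}'
# prints:
# root 0
# bin 1
```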

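Find the most frequently occurring IPs

Extract the IP column with awk, then count and rank it: "sort" groups identical addresses, "uniq -c" counts each group, "sort -nr" orders the counts from highest to lowest, and "head -3" keeps the top three.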

awk '{print $1}' /data/nginx.access.log-20230721 |sort |uniq -c|sort -nr|head -3

The same result using cut to extract the first space-separated field:

cut -d" " -f1  /data/nginx.access.log-20230721 |sort |uniq -c|sort -nr|head -3


Get the partition utilization percentages
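A possible approach, shown on made-up lines that imitate df output: treat any run of spaces or "%" characters as the field separator so that the fifth field is the bare number, and skip the header line with NR>1. The separator regex is an illustrative choice, not necessarily the command from the original screenshot.

```shell
# -F'[ %]+' splits on runs of spaces and percent signs, so "20% /boot"
# yields the fields "20" and "/boot".
printf '%s\n' \
  'Filesystem 1K-blocks Used Available Use% Mounted on' \
  'devtmpfs 909368 0 909368 0% /dev' \
  '/dev/sda1 1038336 199508 838828 20% /boot' |
awk -F'[ %]+' 'NR>1 {print $5}'
# prints:
# 0
# 20
```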

Extract the IP address from the ifconfig output of a network interface

ifconfig ens160 | sed -n '2p' | awk '{print $2}'
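The pipeline above keeps only the second line with sed and then prints its second field. As a sketch, the same selection can be done in awk alone with NR==2. The two sample lines below are a simplified, made-up imitation of ifconfig output.

```shell
# Simplified stand-in for the first two lines of ifconfig output.
sample='ens160: flags=4163  mtu 1500
        inet 10.0.0.8  netmask 255.255.255.0  broadcast 10.0.0.255'

# Original approach: sed selects line 2, awk prints field 2.
printf '%s\n' "$sample" | sed -n '2p' | awk '{print $2}'

# awk-only variant: NR==2 selects the second line.
printf '%s\n' "$sample" | awk 'NR==2 {print $2}'
# both print: 10.0.0.8
```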


awk common built-in variables

Variables in awk fall into two groups: built-in variables and user-defined variables

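The man page documents awk's built-in variables. While viewing it:
   press "/" to start searching for a keyword,
   press "n" (lowercase) to jump to the next match,
   press "N" (uppercase) to jump to the previous match
[root@rocky8 ~]# man awk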

FS

 FS    The input field separator, a space by default.  See Fields, above.

 FS: the input field separator, whitespace by default; setting it has the same effect as the -F option

	-F fs			--field-separator=fs

Select the first field of each line in /etc/passwd

awk -v FS=":" '{print $1}' /etc/passwd


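Although "-F" achieves the same effect as "FS", it helps to know how the two relate:
"-F" is a command-line option that sets FS, while "FS" is the variable itself, so it can be referenced anywhere inside the program.

Most of the content of /etc/passwd is delimited by ":", so when printing I can reference the FS variable again as the separator for my output.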
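A short sketch of reusing FS in the output. The sample line is in /etc/passwd format and is only illustrative.

```shell
# FS holds ":" and is placed between the two printed fields, so the
# output uses the same separator as the input.
printf 'root:x:0:0:root:/root:/bin/bash\n' |
awk -v FS=":" '{print $1 FS $3}'
# prints: root:0
```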

OFS

OFS         The output field separator, a space by default.
OFS: the output field separator, a single space by default.
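As a sketch, setting OFS changes what print writes between comma-separated arguments. The sample line and the "-" separator are arbitrary choices for illustration.

```shell
# The comma in "print $1, $3" is replaced by OFS on output.
printf 'root:x:0:0:root:/root:/bin/bash\n' |
awk -v FS=":" -v OFS="-" '{print $1, $3}'
# prints: root-0
```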

RS

RS          The input record separator, by default a newline.
RS: the input record separator, which defines where each input record ends.

Use a semicolon ";" as the record separator so that each semicolon-delimited piece of test.txt is printed on its own line.

awk -v RS=";" '{print}' test.txt
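The contents of test.txt are not shown in the text, so the sketch below pipes in a made-up string instead. It assumes records separated by ";" and no trailing newline, which keeps the last record clean.

```shell
# Each ";" ends a record; print then writes every record followed by
# the default output record separator (a newline).
printf 'a,b;c,d;e,f' | awk -v RS=";" '{print}'
# prints:
# a,b
# c,d
# e,f
```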


ORS

ORS         The output record separator, by default a newline.
ORS: the output record separator; on output, the specified string is written in place of the newline

awk -v RS=";" -v FS="," -v ORS="->" '{print $1}' test.txt
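The same command can be tried on a made-up input string standing in for test.txt (an assumption, since the file's contents are not given): records end at ";", fields are split on ",", and every printed record is followed by "->" instead of a newline.

```shell
# $1 of each record is printed, and ORS="->" is appended after each one.
printf 'a,b;c,d;e,f' | awk -v RS=";" -v FS="," -v ORS="->" '{print $1}'
# prints: a->c->e->
```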


NF

 NF          The number of fields in the current input record.
 NF: the number of fields in the current input record.

Each line of /etc/passwd has 7 fields separated by ":", so NF is 7 and "$NF" is the value of the last field.

awk -F: '{print $NF}' /etc/passwd


Get connection status information

netstat -nta | awk '{print $NF}'


Get the second-to-last field of the /etc/passwd file
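Because NF is a number, arithmetic on it selects fields counted from the end. The sketch below uses one sample line in /etc/passwd format; on a real system the same program can read /etc/passwd.

```shell
# NF is the field count, $NF the last field, $(NF-1) the one before it.
printf 'root:x:0:0:root:/root:/bin/bash\n' |
awk -F: '{print NF; print $NF; print $(NF-1)}'
# prints:
# 7
# /bin/bash
# /root
```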

NR

 NR          The total number of input records seen so far.
 NR: the number of the current record, that is, the line number counted across all input read so far.
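A minimal sketch with three made-up input lines: NR prefixes each line with its number, and in the END block it holds the total number of records read.

```shell
# NR increases by one for every record read.
printf 'a\nb\nc\n' | awk '{print NR, $0} END {print "records:", NR}'
# prints:
# 1 a
# 2 b
# 3 c
# records: 3
```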

FNR

FNR         The input record number in the current input file.
FNR: the record (line) number counted separately for each input file.
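The difference between NR and FNR only shows up with more than one input file. The sketch below creates two small temporary files (the names one.txt and two.txt are hypothetical) and prints both counters side by side.

```shell
# Create two throwaway input files in a temporary directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\n' > "$dir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
awk '{print NR, FNR, $0}' "$dir/one.txt" "$dir/two.txt"
# prints:
# 1 1 a
# 2 2 b
# 3 1 c
# 4 2 d

rm -rf "$dir"
```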

ARGC and ARGV

ARGC is the number of command-line arguments, and ARGV is the array that holds the arguments themselves.
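A small sketch: the file names one.txt and two.txt are hypothetical and do not need to exist, because a program that contains only a BEGIN block never opens its input files. ARGV[0] holds the name of the awk command itself (typically "awk"), which is why ARGC is 3 here rather than 2.

```shell
# ARGC counts the command name plus the two operands;
# the program text and options are not included in ARGV.
awk 'BEGIN { print ARGC; print ARGV[1]; print ARGV[2] }' one.txt two.txt
# prints:
# 3
# one.txt
# two.txt
```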

Patterns

PATTERN: lines are filtered by the pattern condition, and only the matching lines are processed

1. If not specified: empty pattern, matches every line

     For example: awk -F: '{print $1,$3}' /etc/passwd

2. /regular expression/: only lines that match the regular expression are processed; the expression must be enclosed in slashes

     For example: awk '/UUID/{print $1}' /etc/fstab
                  awk '!/UUID/{print $1}' /etc/fstab    ("!" negates the match)
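The two commands above can be tried on a few made-up lines in /etc/fstab format (the UUID and device names are invented for illustration):

```shell
# Sample text standing in for /etc/fstab.
sample='# /etc/fstab
UUID=1111-aaaa / xfs defaults 0 0
/dev/mapper/rl-swap none swap defaults 0 0'

# Only lines containing "UUID" are processed.
printf '%s\n' "$sample" | awk '/UUID/{print $1}'
# prints: UUID=1111-aaaa

# "!" inverts the match: every line that does not contain "UUID".
printf '%s\n' "$sample" | awk '!/UUID/{print $1}'
# prints:
# #
# /dev/mapper/rl-swap
```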

3. Relational expression: the line is processed only if the expression evaluates to "true"

 True: the result is a non-zero value or a non-empty string
 False: the result is 0 or an empty string
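A sketch of both ideas on two sample lines in /etc/passwd format (the user "alice" and the threshold 1000 are made up): a comparison used as a pattern, and a bare field used as a pattern, which is true only when its value is non-zero.

```shell
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash'

# Comparison: only lines whose UID (field 3) is at least 1000.
printf '%s\n' "$sample" | awk -F: '$3>=1000 {print $1, $3}'
# prints: alice 1000

# Bare field as pattern: UID 0 counts as false, so root is skipped.
printf '%s\n' "$sample" | awk -F: '$3 {print $1}'
# prints: alice
```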

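4. Line ranges

 Line numbers cannot be written directly in a range, but a range of lines can be selected indirectly through the NR variable
 /part1/,/part2/   a range between two patterns; plain numbers are not accepted in this form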
Print the lines starting with b and ending with h in the /etc/passwd file
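The original command is not shown, so the sketch below is one possible reading of the task: print the range that begins at a line starting with "b" and ends at the next line starting with "h". It also shows how NR selects a range by line number. The five sample lines are in /etc/passwd format and are only illustrative.

```shell
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin'

# Pattern range: from the first line matching /^b/ through the next
# line matching /^h/.
printf '%s\n' "$sample" | awk -F: '/^b/,/^h/ {print $1}'
# prints:
# bin
# daemon
# halt

# Line-number range expressed through NR.
printf '%s\n' "$sample" | awk -F: 'NR>=2 && NR<=3 {print $1}'
# prints:
# bin
# daemon
```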

Conditional judgment if-else

[root@rocky8 ~]# awk 'BEGIN{score=88;if(score>=80){print "good"}else if(score>=60){print "pass"}else{print "no pass"}}'
good
[root@rocky8 ~]# awk 'BEGIN{score=78;if(score>=80){print "good"}else if(score>=60){print "pass"}else{print "no pass"}}'
pass
[root@rocky8 ~]# awk 'BEGIN{score=58;if(score>=80){print "good"}else if(score>=60){print "pass"}else{print "no pass"}}'
no pass
[root@rocky8 ~]# 


[root@rocky8 ~]# cat score.txt 
name       score
zhangsan   100
lisi       90
zhaoliu    70
wangwu     55
[root@rocky8 ~]# awk 'NR!=1{score=$2;if(score>=80){print $1,"good"}else if(score>=60){print $1, "pass"}else{print $1, "no pass"}}' score.txt
zhangsan good
lisi good
zhaoliu pass
wangwu no pass
[root@rocky8 ~]#

while loop

[root@rocky8 ~]# awk 'BEGIN{i=1;sum=0;while(i<=100){sum+=i;i++};print sum}'
5050
[root@rocky8 ~]# sum=0;for((i=1;i<=100;i++));do let sum+=i;done;echo $sum
5050
[root@rocky8 ~]#


Linux-Text processing three musketeers awk
