12 Efficient Text Filtering Commands on Linux

In this article, we'll take a look at some filter command line tools in Linux. A filter is a program that reads data from standard input, performs operations on the data, and writes the result to standard output.

As such, it can be used to process information in powerful ways, such as re-structuring output to generate useful reports, modifying text in files, and many other system administration tasks.
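To make the idea of chaining filters concrete, here is a minimal sketch. The request log is made-up sample data, and the variable names are only for illustration; each stage reads standard input and writes standard output, so the stages can be joined with pipes.

```shell
# Hypothetical sample data standing in for a real log file
log='GET /index.html
POST /login
GET /about.html
GET /index.html'

# grep keeps the GET lines, sort groups identical lines together,
# and uniq -c prefixes each distinct line with its number of occurrences
report=$(printf '%s\n' "$log" | grep '^GET' | LC_ALL=C sort | uniq -c)
printf '%s\n' "$report"
```

LC_ALL=C is set only to make the sort order independent of the local language settings.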

Below are some useful file or text filters on Linux.

1. The awk command
awk is an excellent pattern scanning and processing language that can be used to build useful filters in Linux. You can get started with it by reading parts 1 through 13 of our awk series.

Alternatively, read the awk man page for more information and options.

$ man awk
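As a quick illustration of awk used as a filter, the sketch below prints the first and last fields of colon-separated records. The records are invented for the example and only mimic the layout of /etc/passwd.

```shell
# Hypothetical sample records in the style of /etc/passwd
records='root:x:0:0:root:/root:/bin/bash
aaronkilik:x:1001:1001::/home/aaronkilik:/bin/bash'

# -F: sets the field separator to a colon; $1 is the first field, $NF the last
result=$(printf '%s\n' "$records" | awk -F: '{ print $1 " uses " $NF }')
printf '%s\n' "$result"
```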
2. The sed command
sed is a powerful stream editor for filtering and transforming text. We've written two useful articles about sed, which you can read here:

How to Use GNU sed Commands to Create, Edit, and Manipulate Files under Linux
15 Helpful sed Command Tips for Everyday Linux System Administrator Tasks
The sed man page describes additional control options and usage instructions:

$ man sed
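The two most common uses of sed as a filter are substitution and line selection. The following is a small sketch on made-up input, not a full tour of the editor:

```shell
# s/old/new/ replaces the first match on each line; the trailing g replaces every match
replaced=$(printf 'linux is fun, linux is free\n' | sed 's/linux/Linux/g')
printf '%s\n' "$replaced"

# -n suppresses automatic printing, so 2p prints only the second line
second=$(printf 'one\ntwo\nthree\n' | sed -n '2p')
printf '%s\n' "$second"
```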
3. The grep, egrep, fgrep, and rgrep commands
These filters output lines matching a specified pattern. They read lines from a file or standard input and, by default, print all matching lines to standard output.

NOTE: The main program is grep. The variants are equivalent to grep with specific options, as shown below (they are still provided for backward compatibility):

egrep = grep -E
fgrep = grep -F
rgrep = grep -r
Here are some basic grep commands:

tecmint@TecMint ~ $ grep "aaronkilik" /etc/passwd
aaronkilik:x:1001:1001::/home/aaronkilik:
tecmint@TecMint ~ $ cat /etc/passwd | grep "aaronkilik"
aaronkilik:x:1001:1001::/home/aaronkilik:
To learn more, read "Difference between grep, egrep and fgrep under Linux?".
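The difference between the -E and -F options is easiest to see on a small input. The sketch below uses invented sample lines; the variable names are only for illustration.

```shell
# Hypothetical sample input
hosts='tecmint.com
news.tecmint.com
linuxsay.com'

# -E enables extended regular expressions, so (a|b) means "a or b"
ere=$(printf '%s\n' "$hosts" | grep -E '^(news|linuxsay)\.')
printf '%s\n' "$ere"

# -F treats the pattern as a fixed string: the dot matches only a literal dot
fixed=$(printf 'a.b\naxb\n' | grep -F 'a.b')
# Without -F the dot matches any character; -c counts the matching lines
regex_count=$(printf 'a.b\naxb\n' | grep -c 'a.b')
printf '%s %s\n' "$fixed" "$regex_count"
```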

4. The head command
head displays the first part of a file; by default, it outputs the first 10 lines. Use the -n flag to specify the number of lines to display:

tecmint@TecMint ~ $ head /var/log/auth.log
Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session closed for user root
Jan 2 10:51:34 TecMint sudo: tecmint : TTY=unknown ; PWD=/home/tecmint ; USER=root ; COMMAND=/usr/lib/linuxmint/mintUpdate/checkAPT.py
Jan 2 10:51:34 TecMint sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 2 10:51:39 TecMint sudo: pam_unix(sudo:session): session closed for user root
Jan 2 10:55:01 TecMint CRON[4099]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 2 10:55:01 TecMint CRON[4099]: pam_unix(cron:session): session closed for user root
Jan 2 11:05:01 TecMint CRON[4138]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 2 11:05:01 TecMint CRON[4138]: pam_unix(cron:session): session closed for user root
Jan 2 11:09:01 TecMint CRON[4146]: pam_unix(cron:session): session opened for user root by (uid=0)
tecmint@TecMint ~ $ head -n 5 /var/log/auth.log
Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session closed for user root
Jan 2 10:51:34 TecMint sudo: tecmint : TTY=unknown ; PWD=/home/tecmint ; USER=root ; COMMAND=/usr/lib/linuxmint/mintUpdate/checkAPT.py
Jan 2 10:51:34 TecMint sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
Jan 2 10:51:39 TecMint sudo: pam_unix(sudo:session): session closed for user root
Learn how to use the head command together with the tail and cat commands for more effective use in Linux.
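One common way to combine head with tail is to cut a range of lines out of the middle of a file. In this sketch, ten numbered lines stand in for a real file:

```shell
# Ten numbered lines stand in for a real file
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# head keeps lines 1 to 5, then tail keeps the last 3 of those: lines 3, 4 and 5
middle=$(printf '%s\n' "$numbers" | head -n 5 | tail -n 3)
printf '%s\n' "$middle"
```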

5. The tail command
tail outputs the last part of a file (the last 10 lines by default). Use the -n option to specify the number of lines to display.

The following command will output the last 5 lines of the specified file:

tecmint@TecMint ~ $ tail -n 5 /var/log/auth.log
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
In addition, tail has a special option, -f, for watching changes to a file in real time (especially useful for log files).

The following command will enable you to monitor changes to the specified file:

tecmint@TecMint ~ $ tail -f /var/log/auth.log
Jan 6 12:58:01 TecMint sshd[1269]: Server listening on :: port 22.
Jan 6 12:58:11 TecMint sshd[1269]: Received SIGHUP; restarting.
Jan 6 12:58:12 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22.
Jan 6 12:58:12 TecMint sshd[1269]: Server listening on :: port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22.
Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
Read the tail man page for a complete list of options and usage instructions:

$ man tail
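Besides counting from the end, -n also accepts a number prefixed with a plus sign, which means "start at this line". A minimal sketch, using an invented two-column listing with a header row:

```shell
# Hypothetical sample data with a header row
csv=$(printf 'domain,visits\ntecmint.com,120\nlinuxsay.com,80\n')

# -n +2 starts the output at line 2, which drops the header row
body=$(printf '%s\n' "$csv" | tail -n +2)
printf '%s\n' "$body"
```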
6. The sort command
sort sorts the lines of a text file or standard input.

Here is the content of a file called domains.list:

tecmint@TecMint ~ $ cat domains.list
tecmint.com
tecmint.com
news.tecmint.com
news.tecmint.com
linuxsay.com
linuxsay.com
windowsmint.com
windowsmint.com
You can run a simple sort command like this to sort the file contents:

tecmint@TecMint ~ $ sort domains.list
linuxsay.com
linuxsay.com
news.tecmint.com
news.tecmint.com
tecmint.com
tecmint.com
windowsmint.com
windowsmint.com
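Two options that come up often are -r (reverse the order) and -u (keep only one copy of each line). The sketch below feeds sort from a variable instead of a file; LC_ALL=C is an assumption added here so that the ordering does not depend on the local language settings.

```shell
# Sample lines, including one duplicate
domains=$(printf '%s\n' tecmint.com tecmint.com news.tecmint.com linuxsay.com windowsmint.com)

# -u drops duplicate lines while sorting
unique=$(printf '%s\n' "$domains" | LC_ALL=C sort -u)
printf '%s\n' "$unique"

# -r reverses the sort order
reversed=$(printf '%s\n' "$domains" | LC_ALL=C sort -ru)
printf '%s\n' "$reversed"
```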
You can use the sort command in many ways. See the following useful articles on the sort command:

14 Useful Examples of Linux's 'sort' Command (1)
Seven Interesting Examples of Linux's 'sort' Command (2)
How to Find and Sort Files Based on Modification Date and Time
7. The uniq command
The uniq command reports or omits repeated lines. It filters lines from standard input and writes the result to standard output.

After running sort on an input stream, you can use uniq to remove duplicate lines, as shown in the following example.

To display the number of line occurrences, use the -c option. To ignore case differences when comparing, use the -i option:

tecmint@TecMint ~ $ cat domains.list
tecmint.com
tecmint.com
news.tecmint.com
news.tecmint.com
linuxsay.com
linuxsay.com
windowsmint.com
tecmint@TecMint ~ $ sort domains.list | uniq -c
2 linuxsay.com
2 news.tecmint.com
2 tecmint.com
1 windowsmint.com
Read the uniq man page for further usage information and options:

$ man uniq
8. The fmt command
fmt is a simple optimized text formatter that reformats paragraphs of a specified file and prints the result to standard output.

The following is extracted from the file domain-list.txt:

1.tecmint.com 2.news.tecmint.com 3.linuxsay.com 4.windowsmint.com
To reformat the above content into a standard list, run the following command, using the -w option to define the maximum line width:

tecmint@TecMint ~ $ cat domain-list.txt
1.tecmint.com 2.news.tecmint.com 3.linuxsay.com 4.windowsmint.com
tecmint@TecMint ~ $ fmt -w 1 domain-list.txt
1.tecmint.com
2.news.tecmint.com
3.linuxsay.com
4.windowsmint.com
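The same -w option is more commonly used to re-wrap long paragraphs. The following sketch wraps an invented sentence at 30 characters and then measures the longest resulting line with awk; the exact break points are chosen by fmt, so only the width limit is checked here.

```shell
# A made-up sentence that is too long for a 30-character line
text='The quick brown fox jumps over the lazy dog near the quiet river bank'

# -w 30 caps each output line at 30 characters; fmt breaks only between words
wrapped=$(printf '%s\n' "$text" | fmt -w 30)
printf '%s\n' "$wrapped"

# Length of the longest output line, computed with awk
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max }')
printf 'longest line: %s characters\n' "$longest"
```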
9. The pr command
The pr command converts text files or standard input for printing. For example, on a Debian system you can list all installed packages like this:

$ dpkg -l
To organize the printed list into pages and columns, use the following command.

tecmint@TecMint ~ $ dpkg -l | pr --columns 3 -l 20
2017-01-06 13:19 Page 1
Desired=Unknown/Install ii adduser ii apg
| Status=Not/Inst/Conf- ii adwaita-icon-theme ii app-install-data
|/ Err?=(none)/Reinst-r ii adwaita-icon-theme- ii apparmor
||/ Name ii alsa-base ii apt
+++-=================== ii alsa-utils ii apt-clone
ii accountsservice ii anacron ii apt-transport-https
ii acl ii apache2 ii apt-utils
ii acpi-support ii apache2-bin ii apt-xapian-index
ii acpid ii apache2-data ii aptdaemon
ii add-apt-key ii apache2-utils ii aptdaemon-data
2017-01-06 13:19 Page 2
ii aptitude ii avahi-daemon ii bind9-host
ii aptitude-common ii avahi-utils ii binfmt-support
ii apturl ii aview ii binutils
ii apturl-common ii banshee ii bison
ii archdetect-deb ii baobab ii blt
ii aspell ii base-files ii blueberry
ii aspell-en ii base-passwd ii bluetooth
ii at-spi2-core ii bash ii bluez
ii attr ii bash-completion ii bluez-cups
ii avahi-autoipd ii bc ii bluez-obexd
.....
The flags used here are as follows:

--columns defines the number of columns to create in the output.
-l specifies the page length (the default is 66 lines).
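For short inputs, it can be easier to see how the columns are filled when the page header is switched off. The sketch below assumes GNU pr and uses six single-letter lines as stand-in data; -t and -a are options not used in the example above.

```shell
# -t omits the page header and trailer; --columns=2 fills two columns top to bottom
down=$(printf '%s\n' a b c d e f | pr -t --columns=2)
printf '%s\n' "$down"

# -a fills the rows left to right ("across") instead
across=$(printf '%s\n' a b c d e f | pr -t -a --columns=2)
printf '%s\n' "$across"
```

Filling down keeps a sorted list readable column by column, while -a keeps neighbouring input lines on the same row.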
10. The tr command
This command translates or deletes characters from standard input and writes the result to standard output.

The syntax for using tr is as follows:

$ tr options set1 set2
Take a look at the following examples. tr also accepts escape sequences in its sets (for example, \n stands for a newline). In the first command, set1 ([:upper:]) specifies the case of the input characters (all uppercase), and set2 ([:lower:]) specifies the case of the resulting characters. The second command does the reverse:

tecmint@TecMint ~ $ echo "WWW.TECMINT.COM" | tr '[:upper:]' '[:lower:]'
www.tecmint.com
tecmint@TecMint ~ $ echo "news.tecmint.com" | tr '[:lower:]' '[:upper:]'
NEWS.TECMINT.COM
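Beyond changing case, tr can delete characters, turn one character into another (including a newline), and squeeze repeated characters. A short sketch on made-up strings; the sets are quoted so the shell passes them to tr unchanged:

```shell
# -d deletes every character that appears in the set
no_digits=$(printf 'tec123mint\n' | tr -d '0-9')
printf '%s\n' "$no_digits"

# Translating each space to \n puts every word on its own line
words=$(printf 'one two three\n' | tr ' ' '\n')
printf '%s\n' "$words"

# -s squeezes each run of the listed character down to a single one
squeezed=$(printf 'a   b    c\n' | tr -s ' ')
printf '%s\n' "$squeezed"
```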
11. The more command
The more command is a useful file perusal filter, originally built for viewing text on CRT terminals. It displays the contents of a file one screen at a time; press Enter to advance one line or Space to advance one screen.

You can use it like this to display large files:

tecmint@TecMint ~ $ dmesg | more
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.4.0-21-generic (buildd@lgw01-21) (gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2) ) #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 (Ubuntu 4.4.0-21.37-generic 4.4.6)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-21-generic root=UUID=bb29dda3-bdaa-4b39-86cf-4a6dc9634a1b ro quiet splash vt.handoff=7
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] x86/fpu: Using 'eager' FPU context switches.
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000a56affff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000a56b0000-0x00000000a5eaffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000a5eb0000-0x00000000aaabefff] usable
--More--
12. The less command
less is the opposite of the more command above, but it provides extra features and is a little faster with large files.

Use it the same way as the more command:

tecmint@TecMint ~ $ dmesg | less
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.4.0-21-generic (buildd@lgw01-21) (gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2) ) #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 (Ubuntu 4.4.0-21.37-generic 4.4.6)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-21-generic root=UUID=bb29dda3-bdaa-4b39-86cf-4a6dc9634a1b ro quiet splash vt.handoff=7
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] x86/fpu: Using 'eager' FPU context switches.
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000a56affff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000a56b0000-0x00000000a5eaffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000a5eb0000-0x00000000aaabefff] usable
:
Learn why the less command is faster than the more command in Linux.
