Linux troubleshooting commands

1. Overall system status

【1】top

Use this command to view the system's CPU, load, etc.

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
2014 root      20   0 2653m  28m  10m S 100.1  1.5   8:44.59 java

Look at the process with the highest CPU utilization; here it is a Java program.

load average: 1.00, 0.80, 0.42

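The load average is shown in the upper right corner of the top output. The three values are the average system load over the last 1, 5, and 15 minutes. If the mean of the three values, that is (1 min + 5 min + 15 min) / 3 × 100%, exceeds 60%, the system is under load pressure. By this rule, (1.00 + 0.80 + 0.42) / 3 = 74%, which exceeds the threshold. Load averages are not normalized by the number of CPU cores, so on multi-core machines the value is usually interpreted relative to the core count.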
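The 60% check described above can be sketched in a few lines of Python. This is only an illustration: the function name load_pressure and its threshold parameter are made up for this example, and the 60% value is simply the one quoted in the text.

```python
def load_pressure(load_1m, load_5m, load_15m, threshold=0.60):
    """Return (mean_load, under_pressure) for the three load averages."""
    mean_load = (load_1m + load_5m + load_15m) / 3
    return mean_load, mean_load > threshold


# Values taken from the top output above: load average: 1.00, 0.80, 0.42
mean_load, under_pressure = load_pressure(1.00, 0.80, 0.42)
print(f"mean load = {mean_load:.0%}, under pressure: {under_pressure}")
```

On a live Linux system, the same three values can be read with os.getloadavg() from the Python standard library instead of being typed in by hand.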

【2】uptime

A shortened version of the top command.

 21:09:18 up 15 min,  2 users,  load average: 1.00, 0.89, 0.53

This command is mainly used to view the load of the machine.

2. CPU status

【1】vmstat

Syntax: vmstat -n <sampling interval> <number of samples>

For example, vmstat -n 2 3 samples once every 2 seconds, for a total of 3 samples:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 1645680  18376 153000    0    0    68     2  413   44 39  0 61  0  0	
 1  0      0 1645600  18376 153000    0    0     0     0 1031   56 50  0 50  0  0	
 1  0      0 1645600  18376 153000    0    0     0     0 1030   58 50  0 50  0  0

This command shows system information, including CPU statistics.

procs:

  • r: the number of processes running or waiting for a CPU time slice. It should not exceed 2 times the total number of cores; otherwise the system is under high pressure
  • b: the number of processes blocked waiting for resources, such as disk I/O or network I/O

cpu:

  • us: percentage of CPU time used by user processes
  • sy: percentage of CPU time used by system (kernel) processes
  • id: percentage of CPU time spent idle
  • wa: percentage of CPU time spent waiting for I/O

If us + sy is greater than 80%, CPU usage may be too high.
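The two thresholds above (r no more than twice the core count, us + sy no more than 80%) can be applied to captured vmstat output with a short script. The sketch below assumes the column layout shown in the sample; the function name check_vmstat and the cores argument are illustrative, not part of vmstat.

```python
def check_vmstat(output, cores):
    """Flag vmstat samples that break the r and us + sy rules of thumb."""
    lines = [line.split() for line in output.strip().splitlines()]
    header = lines[1]  # the second line of vmstat output holds the column names
    warnings = []
    for index, fields in enumerate(lines[2:], start=1):
        row = dict(zip(header, map(int, fields)))
        if row["r"] > 2 * cores:
            warnings.append(f"sample {index}: r={row['r']} exceeds 2 x {cores} cores")
        if row["us"] + row["sy"] > 80:
            warnings.append(f"sample {index}: us + sy = {row['us'] + row['sy']}% exceeds 80%")
    return warnings


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 1645680  18376 153000    0    0    68     2  413   44 39  0 61  0  0
 1  0      0 1645600  18376 153000    0    0     0     0 1031   56 50  0 50  0  0
 1  0      0 1645600  18376 153000    0    0     0     0 1030   58 50  0 50  0  0
"""

print(check_vmstat(SAMPLE, cores=2))
```

For the sample above no warning is produced, because r stays at 1 and us + sy never exceeds 50%. Note that these are machine-wide averages: a single saturated core on a two-core machine shows up as only about 50% here.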

【2】mpstat

Syntax: mpstat -P ALL <sampling interval> <number of samples>

For example, mpstat -P ALL 2 3 samples once every 2 seconds, for a total of 3 samples:

Linux 2.6.32-642.el6.x86_64 (???)   2019-??-??   _x86_64_   (2 CPU)

21:30:07      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
21:30:09      all   50.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   50.00
21:30:09        0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
21:30:09        1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

21:30:09      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
21:30:11      all   50.12    0.00    0.00    0.00    0.00    0.00    0.00    0.00   49.88
21:30:11        0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
21:30:11        1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

21:30:11      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
21:30:13      all   50.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   50.00
21:30:13        0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
21:30:13        1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Average:      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:      all   50.04    0.00    0.00    0.00    0.00    0.00    0.00    0.00   49.96
Average:        0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
Average:        1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The output shows that this machine has two CPU cores. One of them (CPU 0) has a user utilization (%usr) of 100.00% and an idle rate (%idle) of 0.00%, so that core is fully saturated and the machine is under significant CPU pressure, even though the overall average is only about 50%.
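Spotting a saturated core by eye gets harder as the core count grows, so the per-core check can be scripted. The following is a minimal sketch that assumes the mpstat column layout shown above, with one timestamp token per row; the function name saturated_cpus and the 5% idle threshold are assumptions made for illustration.

```python
def saturated_cpus(output, idle_threshold=5.0):
    """Return the CPU ids whose %idle is below idle_threshold in any sample."""
    header = None
    saturated = set()
    for line in output.splitlines():
        fields = line.split()
        if "CPU" in fields and "%idle" in fields:
            header = fields  # column names; the first column is the timestamp
            continue
        if header is None or len(fields) != len(header):
            continue  # blank lines and the banner line
        row = dict(zip(header, fields))
        if row["CPU"] != "all" and float(row["%idle"]) < idle_threshold:
            saturated.add(row["CPU"])
    return sorted(saturated)


# First sample of the mpstat output above, timestamps written as HH:MM:SS
SAMPLE = """\
21:30:07      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
21:30:09      all   50.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   50.00
21:30:09        0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
21:30:09        1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
"""

print(saturated_cpus(SAMPLE))
```

The "all" row is skipped on purpose, since the aggregate hides exactly the per-core imbalance this check is meant to reveal.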

【3】pidstat

Syntax: pidstat -p <PID> -u <sampling interval> <number of samples>

For example:

First, run ps -ef | grep java to find the process ID of the Java program, 2014:

root       2014   1981 99 20:58 pts/0    00:44:03 java A

Then run pidstat -p 2014 -u 2 3 to get the CPU usage of the program:

Linux 2.6.32-642.el6.x86_64 (???)   2019-??-??   _x86_64_   (2 CPU)

21:44:03           PID    %usr %system  %guest    %CPU   CPU  Command
21:44:05          2014  100.00    0.00    0.00  100.00     1  java
21:44:07          2014  100.00    0.00    0.00  100.00     1  java
21:44:09          2014  100.00    0.00    0.00  100.00     1  java
Average:          2014  100.00    0.00    0.00  100.00     -  java

The program's CPU usage has reached 100.00%, all of it in user space (%usr).

【4】Troubleshooting high CPU usage in practice

  1. Use top to find the PID of the process with the highest %CPU
  2. Run ps -mp <PID> -o THREAD,tid,time to find the TID of the specific thread
    • -m: display all threads
    • -o: use a user-defined output format
    • -p <PID>: select the process with the given process ID
  3. Convert the decimal thread TID to hexadecimal; the letters must be lowercase
  4. Run jstack <PID> | grep <hexadecimal TID> -A<number of lines> to view the stack information of the specified thread; -A sets how many lines after the match are displayed
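Steps 3 and 4 are easy to get wrong by hand, so here is a small sketch of the conversion. The helper names hex_tid and jstack_grep_command are hypothetical, and the PID and TID values are the ones used in the example that follows.

```python
def hex_tid(tid):
    """Convert a decimal thread ID to lowercase hexadecimal, as in step 3."""
    return format(tid, "x")


def jstack_grep_command(pid, tid, context_lines=50):
    """Build the pipeline from step 4 as a string (it is not executed here)."""
    return f"jstack {pid} | grep {hex_tid(tid)} -A{context_lines}"


print(hex_tid(3620))
print(jstack_grep_command(3619, 3620))
```

The "x" format code always produces lowercase digits, which matters because jstack prints thread IDs in its nid field as lowercase hexadecimal.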

Example:

public class HighCpu {
    public static void main(String[] args) {
        for (;;) {
            // Empty busy loop: spins forever and keeps one CPU core at 100%
        }
    }
}

The code above is the problem program.

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
3619 root      20   0 2653m  22m  10m S 100.1  1.1   0:10.39 java

[1] top shows that the process ID is 3619

ps -mp 3619 -o THREAD,tid,time

USER     %CPU PRI SCNT WCHAN  USER SYSTEM    TID     TIME
root     98.9   -    - -         -      -      - 00:00:25
root      0.0  19    - futex_    -      -   3619 00:00:00
root     98.8  19    - -         -      -   3620 00:00:25
root      0.0  19    - futex_    -      -   3621 00:00:00
root      0.0  19    - futex_    -      -   3622 00:00:00
root      0.0  19    - futex_    -      -   3623 00:00:00
root      0.0  19    - futex_    -      -   3624 00:00:00
root      0.0  19    - futex_    -      -   3625 00:00:00
root      0.0  19    - futex_    -      -   3626 00:00:00
root      0.0  19    - futex_    -      -   3627 00:00:00
root      0.0  19    - futex_    -      -   3628 00:00:00
root      0.0  19    - futex_    -      -   3629 00:00:00
root      0.0  19    - futex_    -      -   3630 00:00:00

[2] The thread consuming the CPU has TID 3620

[3] Convert 3620 to lowercase hexadecimal: e24

jstack 3619 | grep e24 -A50

"main" #1 prio=5 os_prio=0 tid=0x00007f7f24009000 nid=0xe8f runnable [0x00007f7f28d08000]
   java.lang.Thread.State: RUNNABLE
	at ???.HighCpu.main(HighCpu.java:3)

"VM Thread" os_prio=0 tid=0x00007f7f24073000 nid=0xe92 runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f7f2401e000 nid=0xe90 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f7f24020000 nid=0xe91 runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f7f240d7000 nid=0xe99 waiting on condition 

JNI global references: 5

[4] The jstack command, with grep -A50 printing the 50 lines that follow the match, shows the code being executed by thread e24; the problem code is at line 3 of the HighCpu.java file
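Picking the hot thread out of the ps -mp output in step [2] can also be automated. This is a rough sketch that assumes the column layout shown above, where the process summary row has "-" in the TID column; the function name busiest_thread is illustrative.

```python
def busiest_thread(ps_output):
    """Return (tid, cpu_percent) of the thread with the highest %CPU."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    cpu_col, tid_col = header.index("%CPU"), header.index("TID")
    best = None
    for line in lines[1:]:
        fields = line.split()
        if fields[tid_col] == "-":
            continue  # process-level summary row, not a thread
        cpu = float(fields[cpu_col])
        if best is None or cpu > best[1]:
            best = (int(fields[tid_col]), cpu)
    return best


# Abridged from the ps -mp 3619 output above
SAMPLE = """\
USER     %CPU PRI SCNT WCHAN  USER SYSTEM    TID     TIME
root     98.9   -    - -         -      -      - 00:00:25
root      0.0  19    - futex_    -      -   3619 00:00:00
root     98.8  19    - -         -      -   3620 00:00:25
root      0.0  19    - futex_    -      -   3621 00:00:00
"""

tid, cpu = busiest_thread(SAMPLE)
print(tid, format(tid, "x"), cpu)
```

Columns are looked up by position because the header contains USER twice, which rules out a simple name-to-value dictionary.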

3. Memory status

【1】free

Syntax: free -m

For example, free -m shows memory usage in MB:

             total       used       free     shared    buffers     cached
Mem:          1990        562       1427          1         51        276
-/+ buffers/cache:        234       1756
Swap:         2047          0       2047

The physical memory is about 2 GB, of which 562 MB is used (234 MB once buffers and cache are excluded). The 2 GB of swap is unused.

The ratio of application memory to physical memory is reasonable when it falls between 20% and 70%.
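As a worked example of that ratio, the sketch below parses the free -m output shown above. It assumes that "application memory" means the used value on the "-/+ buffers/cache" row, which is an interpretation rather than something the output states, and the function name application_memory_ratio is made up for this example.

```python
def application_memory_ratio(free_output):
    """Return application-used memory divided by total physical memory."""
    total = app_used = None
    for line in free_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Mem:":
            total = int(fields[1])
        elif fields[0] == "-/+":
            # "-/+ buffers/cache:" row: the first number is used minus buffers and cache
            app_used = int(fields[2])
    if total is None or app_used is None:
        raise ValueError("expected a free layout with a '-/+ buffers/cache' row")
    return app_used / total


SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:          1990        562       1427          1         51        276
-/+ buffers/cache:        234       1756
Swap:         2047          0       2047
"""

ratio = application_memory_ratio(SAMPLE)
print(f"application memory ratio: {ratio:.1%}")
```

Newer versions of free no longer print the "-/+ buffers/cache" row and report an "available" column instead, so this parser only fits the older layout shown here.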

【2】pidstat

Syntax: pidstat -p <PID> -r <sampling interval> <number of samples>

For example, pidstat -p 4526 -r 2 3 shows the memory usage of process 4526:

Linux 2.6.32-642.el6.x86_64 (???)   2019-??-??   _x86_64_   (2 CPU)

22:31:18           PID  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
22:31:20          4526      0.00      0.00 2716884 262204  12.86  java
22:31:22          4526      0.00      0.00 2716884 262204  12.86  java
22:31:24          4526    271.50      0.00 2716884 299048  14.67  java
Average:          4526     90.50      0.00 2716884 274485  13.47  java

The memory usage of the program (%MEM) has reached 14.67% in the last sample.

4. Disk status

【1】df

Use the df -h command to view disk usage and remaining space.

For example:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  4.3G   13G  26% /
tmpfs           996M   72K  996M   1% /dev/shm
/dev/sda1       190M   39M  142M  22% /boot

The boot partition (mounted at /boot) has 190 MB in total, with 39 MB (about 22%) used and 142 MB remaining.

The root file system (mounted at /) has 18 GB in total, with 4.3 GB (about 26%) used and 13 GB remaining.
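Reading Use% for every mount point can be scripted when there are many file systems. The sketch below parses df -h output of the shape shown above; the function name mounts_over and the threshold values are assumptions chosen for illustration, not recommended limits.

```python
def mounts_over(df_output, threshold_percent):
    """Return (mount_point, use_percent) pairs whose Use% exceeds the threshold."""
    result = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        use_percent = int(fields[4].rstrip("%"))
        if use_percent > threshold_percent:
            result.append((fields[5], use_percent))
    return result


SAMPLE = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  4.3G   13G  26% /
tmpfs           996M   72K  996M   1% /dev/shm
/dev/sda1       190M   39M  142M  22% /boot
"""

print(mounts_over(SAMPLE, 25))
print(mounts_over(SAMPLE, 80))
```

The parser assumes one row per file system. Long device names can make df wrap a row onto two lines; running df with the -P option keeps each entry on a single line.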

【2】iostat

Syntax: iostat -dkx <sampling interval> <number of samples>. This command shows the disk I/O status.

  • -d: display device (disk) utilization
  • -k: display statistics in kilobytes
  • -x: include extended disk metrics in the output

This command can be used together with the lsblk command, which lists all block devices and shows the dependencies between them.

For example, lsblk:

NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1 1024M  0 rom  
sda      8:0    0   20G  0 disk 
├─sda1   8:1    0  200M  0 part /boot
├─sda2   8:2    0    2G  0 part [SWAP]
└─sda3   8:3    0 17.8G  0 part /

For example, iostat -dkx 1 3 shows the current disk I/O status:

Linux 2.6.32-642.el6.x86_64 (???)   2019-??-??   _x86_64_   (2 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.88     0.52    2.20    0.36    41.71     3.52    35.35     0.00    0.96    0.90    1.32   0.78   0.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Several metrics deserve attention:

  • rkB/s: the number of kilobytes read per second
  • wkB/s: the number of kilobytes written per second
  • await: disk response time, the average wait time in milliseconds for each device I/O operation. The shorter, the better
  • %util: the percentage of the sampling interval during which the device was busy with I/O, that is, device I/O time / sampling interval. It indicates how busy the device is. A value of 100% means the device is running at full capacity (with multiple disks, which serve requests concurrently, 100% does not necessarily mean the bottleneck has been reached)
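To watch %util and await across several devices, the iostat output can be filtered with a short script. This sketch assumes the header layout shown above, starting with "Device:" and containing an await column; the function name busy_devices and the 80% default threshold are illustrative choices.

```python
def busy_devices(iostat_output, util_threshold=80.0):
    """Return (device, await_ms, util_percent) for rows whose %util exceeds the threshold."""
    header = None
    busy = []
    for line in iostat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Device:":
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # banner line or unexpected row
        row = dict(zip(header, fields))
        util = float(row["%util"])
        if util > util_threshold:
            busy.append((row["Device:"], float(row["await"]), util))
    return busy


# First report of the iostat -dkx 1 3 output above
SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.88     0.52    2.20    0.36    41.71     3.52    35.35     0.00    0.96    0.90    1.32   0.78   0.20
"""

print(busy_devices(SAMPLE))
print(busy_devices(SAMPLE, util_threshold=0.1))
```

Column names differ between sysstat versions, so the header check would need adjusting for output that does not match the layout shown in this article.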

【3】pidstat

Syntax: pidstat -p <PID> -d <sampling interval> <number of samples>

Note: the -d option is only available on kernel 2.6.20 and later. You can check the kernel release with uname -r.

For example:

2.6.32-642.el6.x86_64

The kernel version here is 2.6.32.

For example, pidstat -p 4969 -d 1 3:

Linux 2.6.32-642.el6.x86_64 (???)   2019-??-??   _x86_64_   (2 CPU)

23:30:41           PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
23:30:42          4969      0.00      0.00      0.00  java
23:30:43          4969      0.00      0.00      0.00  java
23:30:44          4969      0.00      0.00      0.00  java
Average:          4969      0.00      0.00      0.00  java

This shows the number of kilobytes the process reads from disk per second (kB_rd/s) and the number of kilobytes it has written or will write to disk per second (kB_wr/s).

5. Network status

ifstat

Note that this command does not ship with the Linux system; it has to be downloaded from the official website, compiled, and installed.

The latest version is v1.1-01/01/2004 "The Happy New Year Release"

-rwxr-xr-x. 1 root root     67920 Oct  5 23:57 ifstat-1.1.tar.gz

Syntax: ifstat -a <sampling interval> <number of samples>

The -a option monitors all network interfaces, including the loopback interface (lo).

For example, ifstat -a 1 3:

        lo                 eth0       
 KB/s in  KB/s out   KB/s in  KB/s out
    0.00      0.00      0.06      0.18
    0.00      0.00      0.06      0.13
    0.00      0.00      0.06      0.13

This shows the inbound traffic (KB/s in) and outbound traffic (KB/s out) of the eth0 network card per second.
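When more samples are collected, averaging them per interface gives a steadier picture than reading individual rows. The sketch below assumes the two-row header layout shown above (interface names, then an in/out unit row); the function name average_traffic is hypothetical.

```python
def average_traffic(ifstat_output):
    """Return {interface: (avg_in, avg_out)} in KB/s from ifstat output."""
    lines = [line for line in ifstat_output.splitlines() if line.strip()]
    interfaces = lines[0].split()  # first row: interface names
    # Skip the "KB/s in  KB/s out" unit row; the remaining rows are samples
    samples = [[float(value) for value in line.split()] for line in lines[2:]]
    averages = {}
    for i, name in enumerate(interfaces):
        avg_in = sum(sample[2 * i] for sample in samples) / len(samples)
        avg_out = sum(sample[2 * i + 1] for sample in samples) / len(samples)
        averages[name] = (avg_in, avg_out)
    return averages


SAMPLE = """\
        lo                 eth0
 KB/s in  KB/s out   KB/s in  KB/s out
    0.00      0.00      0.06      0.18
    0.00      0.00      0.06      0.13
    0.00      0.00      0.06      0.13
"""

for name, (avg_in, avg_out) in average_traffic(SAMPLE).items():
    print(f"{name}: in {avg_in:.2f} KB/s, out {avg_out:.2f} KB/s")
```

Each interface contributes two columns per sample row, which is why the in and out values are read at positions 2i and 2i + 1.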

Origin blog.csdn.net/adsl624153/article/details/103865696