Linux troubleshooting commands
1. Overall system status
【1】top
Use this command to view the system's CPU usage, load, and related information.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2014 root 20 0 2653m 28m 10m S 100.1 1.5 8:44.59 java
Look for the process with the highest CPU utilization; here it is a Java program.
load average: 1.00, 0.80, 0.42
The load averages appear in the upper right corner of the top output. The three values are the average system load over the last 1, 5, and 15 minutes. As a rule of thumb, if (sum of the three values / 3) × 100% exceeds 60%, the system is under load pressure.
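The rule of thumb above can be checked with a few lines of shell. This is only a sketch: load_pressure is a hypothetical helper name, and it applies the 60% rule exactly as stated, without taking the number of CPU cores into account.

```shell
# load_pressure: hypothetical helper, not a standard command.
# $1 is a line containing "load average: a, b, c" (as printed by top or uptime).
load_pressure() {
    printf '%s\n' "$1" | awk -F'load average: ' '{
        split($2, v, ", ")                    # v[1..3] = 1, 5 and 15 minute values
        avg = (v[1] + v[2] + v[3]) / 3 * 100  # the rule of thumb from the text
        printf "%.1f%% %s\n", avg, (avg > 60 ? "under pressure" : "ok")
    }'
}

# Values taken from the top output above
load_pressure "load average: 1.00, 0.80, 0.42"
```

For the sample values, (1.00 + 0.80 + 0.42) / 3 × 100% is 74%, which is above the 60% threshold.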
【2】uptime
A shortened version of the top command
21:09:18 up 15 min, 2 users, load average: 1.00, 0.89, 0.53
This command is mainly used to view the load on the machine
2. CPU status
【1】vmstat
Syntax: vmstat -n <sampling interval> <number of samples>
For example, vmstat -n 2 3 samples once every 2 seconds, 3 samples in total
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 1645680 18376 153000 0 0 68 2 413 44 39 0 61 0 0
1 0 0 1645600 18376 153000 0 0 0 0 1031 56 50 0 50 0 0
1 0 0 1645600 18376 153000 0 0 0 0 1030 58 50 0 50 0 0
This command shows system information, including CPU status
procs:
- r: The number of processes running or waiting for a CPU time slice. It should not exceed 2 times the total number of cores; otherwise the system is under high pressure
- b: The number of processes blocked waiting for resources, such as disk I/O or network I/O
cpu:
- us: Percentage of CPU time used by user processes
- sy: Percentage of CPU time used by system (kernel) processes
- id: Percentage of CPU time that is idle
- wa: Percentage of CPU time spent waiting for I/O
If us + sy is greater than 80%, CPU usage may be too high
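The two thresholds above (r above 2 × cores, us + sy above 80%) can be applied to vmstat output automatically. A minimal sketch, assuming the column layout shown above, where us is column 13 and sy is column 14; check_vmstat is a hypothetical helper name.

```shell
# check_vmstat: hypothetical helper. $1 = number of CPU cores.
# Reads vmstat output on stdin and flags samples that break either rule.
check_vmstat() {
    awk -v cores="$1" '
        $1 ~ /^[0-9]+$/ {                 # data rows only; skips both header lines
            r = $1; busy = $13 + $14      # us + sy
            status = ""
            if (r > 2 * cores) status = status " run-queue-high"
            if (busy > 80)     status = status " cpu-high"
            if (status == "")  status = " ok"
            printf "r=%d us+sy=%d%s\n", r, busy, status
        }'
}

# Live usage would look like: vmstat -n 2 3 | check_vmstat 2
# Here the first sample from the output above is used instead:
check_vmstat 2 <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 1 0 0 1645680 18376 153000 0 0 68 2 413 44 39 0 61 0 0
EOF
```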
【2】mpstat
Syntax: mpstat -P ALL <sampling interval> <number of samples>
For example, mpstat -P ALL 2 3 samples once every 2 seconds, 3 samples in total
Linux 2.6.32-642.el6.x86_64 (???) 2019-??-?? _x86_64_ (2 CPU)
21:30:07 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
21:30:09 all 50.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 50.00
21:30:09 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
21:30:09 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
21:30:09 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
21:30:11 all 50.12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 49.88
21:30:11 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
21:30:11 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
21:30:11 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
21:30:13 all 50.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 50.00
21:30:13 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
21:30:13 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
Average: all 50.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00 49.96
Average: 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
The output shows that this machine has two CPU cores. CPU 0 has a user-mode usage (%usr) of 100.00% and an idle rate (%idle) of 0.00%, so that core is fully loaded, while CPU 1 is completely idle.
【3】pidstat
Syntax: pidstat -p <process ID> -u <sampling interval> <number of samples>
For example, first run ps -ef|grep java to get the Java program's process ID, 2014:
root 2014 1981 99 20:58 pts/0 00:44:03 java A
Then run pidstat -p 2014 -u 2 3 to get the CPU usage of the program:
Linux 2.6.32-642.el6.x86_64 (???) 2019-??-?? _x86_64_ (2 CPU)
21:44:03 PID %usr %system %guest %CPU CPU Command
21:44:05 2014 100.00 0.00 0.00 100.00 1 java
21:44:07 2014 100.00 0.00 0.00 100.00 1 java
21:44:09 2014 100.00 0.00 0.00 100.00 1 java
Average: 2014 100.00 0.00 0.00 100.00 - java
The program's CPU usage (%usr) has reached 100.00%
【4】Troubleshooting high CPU usage in practice
[1] Run top and note the PID of the process with the highest %CPU
[2] Run ps -mp <process ID> -o THREAD,tid,time to find the TID of the busy thread
- -m: display all threads
- -o: use the output format specified by the user
- -p <process ID>: select the process to inspect by its process ID
[3] Convert the thread TID from decimal to hexadecimal; the hexadecimal letters must be lowercase
[4] Run jstack <process ID> | grep <hexadecimal thread ID> -A<number of lines> to view the thread's stack information; -A prints the given number of lines after the matching line
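The decimal-to-hexadecimal conversion can be done in the shell itself, since the %x format of printf produces lowercase hexadecimal. A small sketch; tid_to_hex is a hypothetical helper name.

```shell
# tid_to_hex: hypothetical helper that prints a decimal thread ID
# as lowercase hexadecimal, the form that jstack uses in its nid= field.
tid_to_hex() {
    printf '%x\n' "$1"
}

tid_to_hex 3620   # prints e24
```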
Example, using the following problem code:
public class HighCpu {
    public static void main(String[] args) {
        for (;;) { // busy loop that never exits, keeping one core at 100%
        }
    }
}
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3619 root 20 0 2653m 22m 10m S 100.1 1.1 0:10.39 java
[1] top shows that the process ID is 3619
ps -mp 3619 -o THREAD,tid,time
USER %CPU PRI SCNT WCHAN USER SYSTEM TID TIME
root 98.9 - - - - - - 00:00:25
root 0.0 19 - futex_ - - 3619 00:00:00
root 98.8 19 - - - - 3620 00:00:25
root 0.0 19 - futex_ - - 3621 00:00:00
root 0.0 19 - futex_ - - 3622 00:00:00
root 0.0 19 - futex_ - - 3623 00:00:00
root 0.0 19 - futex_ - - 3624 00:00:00
root 0.0 19 - futex_ - - 3625 00:00:00
root 0.0 19 - futex_ - - 3626 00:00:00
root 0.0 19 - futex_ - - 3627 00:00:00
root 0.0 19 - futex_ - - 3628 00:00:00
root 0.0 19 - futex_ - - 3629 00:00:00
root 0.0 19 - futex_ - - 3630 00:00:00
[2] The thread using the CPU has TID 3620
[3] Convert 3620 to lowercase hexadecimal: e24
jstack 3619 | grep e24 -A50
"main" #1 prio=5 os_prio=0 tid=0x00007f7f24009000 nid=0xe8f runnable [0x00007f7f28d08000]
java.lang.Thread.State: RUNNABLE
at ???.HighCpu.main(HighCpu.java:3)
"VM Thread" os_prio=0 tid=0x00007f7f24073000 nid=0xe92 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f7f2401e000 nid=0xe90 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f7f24020000 nid=0xe91 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f7f240d7000 nid=0xe99 waiting on condition
JNI global references: 5
[4] jstack with grep -A50 prints the 50 lines that follow the match for thread e24, and the stack trace shows that the problem code is on line 3 of HighCpu.java
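Steps [2] and [3] can be automated by picking the thread with the highest %CPU out of the ps output. A minimal sketch, assuming the column layout shown above (%CPU in column 2, TID in column 8); hottest_thread is a hypothetical helper name.

```shell
# hottest_thread: hypothetical helper. Reads the output of
#   ps -mp <process ID> -o THREAD,tid,time
# on stdin and prints the busiest thread as "<decimal TID> <hex TID>".
hottest_thread() {
    awk 'NR > 1 && $8 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0; tid = $8 }
         END { if (tid != "") printf "%d %x\n", tid, tid }'
}

# Live usage would look like:
#   ps -mp 3619 -o THREAD,tid,time | hottest_thread
# Here a few rows from the output above are used instead:
hottest_thread <<'EOF'
USER %CPU PRI SCNT WCHAN USER SYSTEM TID TIME
root 98.9 - - - - - - 00:00:25
root 0.0 19 - futex_ - - 3619 00:00:00
root 98.8 19 - - - - 3620 00:00:25
root 0.0 19 - futex_ - - 3621 00:00:00
EOF
```

The process summary row has "-" in the TID column, so it is skipped and only real threads are compared.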
3. Memory status
【1】free
Syntax: free -m
For example, free -m shows memory usage in MB:
total used free shared buffers cached
Mem: 1990 562 1427 1 51 276
-/+ buffers/cache: 234 1756
Swap: 2047 0 2047
The physical memory is about 2GB (1990MB), of which 562MB is used. The swap space is 2GB (2047MB) and is unused
The ratio of memory used by applications to total physical memory is reasonable between 20% and 70%
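The application memory ratio can be computed from the free -m output. This sketch assumes the older layout shown above, where the "-/+ buffers/cache" row gives the memory used by applications; newer versions of free print a different layout and would need a different parser. app_mem_percent is a hypothetical helper name.

```shell
# app_mem_percent: hypothetical helper. Reads free -m output (old layout
# with a "-/+ buffers/cache:" row) on stdin and prints application memory
# as a percentage of total physical memory.
app_mem_percent() {
    awk '$1 == "Mem:" { total = $2 }   # total physical memory in MB
         $1 == "-/+"  { used = $3 }    # used minus buffers and cache
         END { if (total > 0) printf "%.1f\n", used / total * 100 }'
}

# Live usage would look like: free -m | app_mem_percent
app_mem_percent <<'EOF'
total used free shared buffers cached
Mem: 1990 562 1427 1 51 276
-/+ buffers/cache: 234 1756
Swap: 2047 0 2047
EOF
```

For the sample output, 234 / 1990 is about 11.8%, which is below the 20% lower bound given above.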
【2】pidstat
Syntax: pidstat -p <process ID> -r <sampling interval> <number of samples>
For example, pidstat -p 4526 -r 2 3 shows the memory usage of process 4526
Linux 2.6.32-642.el6.x86_64 (???) 2019-??-?? _x86_64_ (2 CPU)
22:31:18 PID minflt/s majflt/s VSZ RSS %MEM Command
22:31:20 4526 0.00 0.00 2716884 262204 12.86 java
22:31:22 4526 0.00 0.00 2716884 262204 12.86 java
22:31:24 4526 271.50 0.00 2716884 299048 14.67 java
Average: 4526 90.50 0.00 2716884 274485 13.47 java
The program's memory usage (%MEM) reached 14.67% in the last sample
4. Disk status
【1】df
Use the df -h command to view disk usage and remaining space
For example:
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 18G 4.3G 13G 26% /
tmpfs 996M 72K 996M 1% /dev/shm
/dev/sda1 190M 39M 142M 22% /boot
The boot partition (mounted on /boot) has 190MB in total, with 39MB (about 22%) used and 142MB remaining
The root partition (mounted on /) has 18GB in total, with 4.3GB (about 26%) used and 13GB remaining
【2】iostat
Syntax: iostat -dkx <sampling interval> <number of samples>
Shows the disk I/O status
- -d: display the utilization of devices (disks)
- -k: display values in KB
- -x: include extended disk metrics in the output
This command can be used together with lsblk, which lists all block devices and shows the dependencies between them
For example, lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 200M 0 part /boot
├─sda2 8:2 0 2G 0 part [SWAP]
└─sda3 8:3 0 17.8G 0 part /
For example, iostat -dkx 1 3 shows the current disk I/O status:
Linux 2.6.32-642.el6.x86_64 (???) 2019-??-?? _x86_64_ (2 CPU)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.88 0.52 2.20 0.36 41.71 3.52 35.35 0.00 0.96 0.90 1.32 0.78 0.20
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
The parameters to pay attention to are:
- rkB/s: The number of kilobytes read per second
- wkB/s: The number of kilobytes written per second
- await: Disk response time, that is, the average waiting time (in milliseconds) for each device I/O operation. The shorter, the better
- %util: The percentage of the sampling interval during which the device was performing I/O, that is, device I/O time / sampling interval. It indicates how busy the device is. A value of 100% means the device is running at full capacity (with multiple disks, which can serve requests concurrently, the disks may not have reached their bottleneck even at 100%)
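To watch only await and %util, the two columns can be pulled out of the iostat output by name. A minimal sketch, assuming the header starts with "Device:" and contains both an await and a %util column, as in the output above; other sysstat versions name their columns differently. io_summary is a hypothetical helper name.

```shell
# io_summary: hypothetical helper. Reads iostat -dkx output on stdin and
# prints "<device> await=<ms> util=<percent>" for every device row.
io_summary() {
    awk '
        # Header row: remember the position of every column by name
        $1 == "Device:" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Device rows: any later line wide enough to hold the %util column
        ("%util" in col) && NF >= col["%util"] {
            printf "%s await=%s util=%s\n", $1, $(col["await"]), $(col["%util"])
        }'
}

# Live usage would look like: iostat -dkx 1 3 | io_summary
io_summary <<'EOF'
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.88 0.52 2.20 0.36 41.71 3.52 35.35 0.00 0.96 0.90 1.32 0.78 0.20
EOF
```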
【3】pidstat
Syntax: pidstat -p <process ID> -d <sampling interval> <number of samples>
Note: the -d option is only available on kernel 2.6.20 and later. Use uname -r to view the kernel release
For example:
2.6.32-642.el6.x86_64
Here the kernel version is 2.6.32, so the -d option is available
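The version requirement can be checked in a script by comparing the first three numeric parts of the release string. A minimal sketch; kernel_at_least is a hypothetical helper name, and it only checks the kernel version, not whether the installed pidstat supports -d.

```shell
# kernel_at_least: hypothetical helper.
# $1 = release string as printed by uname -r, $2 = minimum version "a.b.c".
# Returns 0 (success) when the release is at least the minimum version.
kernel_at_least() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, have, /[.-]/)      # 2.6.32-642.el6... -> 2, 6, 32, ...
        split($2, want, /[.]/)
        for (i = 1; i <= 3; i++) {
            if (have[i] + 0 > want[i] + 0) exit 0
            if (have[i] + 0 < want[i] + 0) exit 1
        }
        exit 0                       # all three parts are equal
    }'
}

# Live usage would look like: kernel_at_least "$(uname -r)" 2.6.20
if kernel_at_least "2.6.32-642.el6.x86_64" 2.6.20; then
    echo "kernel is new enough for pidstat -d"
fi
```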
For example, pidstat -p 4969 -d 1 3:
Linux 2.6.32-642.el6.x86_64 (???) 2019-??-?? _x86_64_ (2 CPU)
23:30:41 PID kB_rd/s kB_wr/s kB_ccwr/s Command
23:30:42 4969 0.00 0.00 0.00 java
23:30:43 4969 0.00 0.00 0.00 java
23:30:44 4969 0.00 0.00 0.00 java
Average: 4969 0.00 0.00 0.00 java
kB_rd/s is the number of kilobytes the process reads per second, and kB_wr/s is the number of kilobytes it has written, or is about to write, per second
5. Network status
ifstat
Note that this command does not ship with Linux by default; download the source from the official website, then compile and install it
At the time of writing, the latest version is v1.1-01/01/2004 "The Happy New Year Release"
-rwxr-xr-x. 1 root root 67920 Oct 5 23:57 ifstat-1.1.tar.gz
Syntax: ifstat -a <sampling interval> <number of samples>
The -a option monitors all network interfaces, including the loopback interface (lo)
For example, ifstat -a 1 3:
lo eth0
KB/s in KB/s out KB/s in KB/s out
0.00 0.00 0.06 0.18
0.00 0.00 0.06 0.13
0.00 0.00 0.06 0.13
The output shows the inbound traffic (KB/s in) and outbound traffic (KB/s out) per second for the eth0 network card