Zabbix 3.0 Monitoring Explained

运维我生哥

 


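Chapter 1 Preface

1.1 Our responsibilities

1.    Keep company data safe and reliable.

2.    Provide customers with 7*24 service.

3.    Keep improving the user experience.

    http://blog.csdn.net/pan_tian/article/details/23270119

Website availability

Website availability is the percentage of time a site runs normally. The industry quantifies it as "N nines"; the most commonly quoted level is "four nines", that is, 99.99%.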

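    Description                                  Common name   Availability   Downtime per year
    Basic availability                           two nines     99%            87.6 hours
    Higher availability                          three nines   99.9%          8.8 hours
    Availability with automatic fault recovery   four nines    99.99%         53 minutes
    Very high availability                       five nines    99.999%        5 minutes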
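The downtime column follows directly from the availability percentage. As a rough sketch (the function name `annual_downtime_minutes` is made up for this illustration, and a 365-day year is assumed), the figures can be recomputed like this:

```shell
#!/usr/bin/env bash
# annual_downtime_minutes AVAILABILITY_PERCENT
# Prints the yearly downtime budget in minutes, assuming a 365-day year.
# The function name is hypothetical; it is not part of any tool in this article.
annual_downtime_minutes() {
    awk -v a="$1" 'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}

# Print the budget for the four levels listed in the table.
for level in 99 99.9 99.99 99.999; do
    printf '%-8s %s minutes/year\n' "$level%" "$(annual_downtime_minutes "$level")"
done
```

For 99% this gives 5256 minutes, which is the 87.6 hours quoted in the table.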

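1.2 Monitoring a server from the command line

        Remote server management relies on a remote management card, for example Dell iDRAC, HP iLO, or IBM IMM.

        To check hardware temperature and fan speed, a desktop PC has tools such as Ludashi (鲁大师); a server has ipmitool, which also provides command-line remote management of the server.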

<span style="color:#333333"><span style="color:black"><code class="language-bash">       yum -y <span style="color:#dd4a68">install</span> OpenIPMI ipmitool  <span style="color:slategray"># IPMI works on physical machines but not inside virtual machines</span>
 
        <span style="color:#999999">[</span>root@KVM ~<span style="color:#999999">]</span><span style="color:slategray"># ipmitool sdr type Temperature</span>
        Temp             <span style="color:#9a6e3a">|</span> 01h <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  3.1 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 02h <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  3.2 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 05h <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span> 10.1 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 06h <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span> 10.2 <span style="color:#9a6e3a">|</span> Disabled
        Ambient Temp     <span style="color:#9a6e3a">|</span> 0Eh <span style="color:#9a6e3a">|</span> ok  <span style="color:#9a6e3a">|</span>  7.1 <span style="color:#9a6e3a">|</span> 22 degrees C
        Planar Temp      <span style="color:#9a6e3a">|</span> 0Fh <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  7.1 <span style="color:#9a6e3a">|</span> Disabled
        IOH THERMTRIP    <span style="color:#9a6e3a">|</span> 5Dh <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  7.1 <span style="color:#9a6e3a">|</span> Disabled
        CPU Temp Interf  <span style="color:#9a6e3a">|</span> 76h <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  7.1 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 0Ah <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  8.1 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 0Bh <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  8.1 <span style="color:#9a6e3a">|</span> Disabled
        Temp             <span style="color:#9a6e3a">|</span> 0Ch <span style="color:#9a6e3a">|</span> ns  <span style="color:#9a6e3a">|</span>  8.1 <span style="color:#9a6e3a">|</span> Disabled</code></span></span>
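Most sensors in the listing above are disabled, so in practice only the rows with status `ok` matter. A minimal sketch of filtering them, assuming the pipe-separated column layout shown above (the helper name `ok_sensors` is invented for this example):

```shell
#!/usr/bin/env bash
# ok_sensors: read `ipmitool sdr type Temperature` output on stdin and print
# "name=reading" for every sensor whose status column is "ok".
# Assumes the five pipe-separated columns shown in the listing above.
ok_sensors() {
    awk -F'|' '
        {
            # Trim the padding around every column.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($3 == "ok") print $1 "=" $5
        }'
}
```

On a physical server it could be used as `ipmitool sdr type Temperature | ok_sensors`.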

1.2.1 Viewing CPU information

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># lscpu</span>
Architecture:          x86_64
CPU op-mode<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:        32-bit, 64-bit
Byte Order:            Little Endian
CPU<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:                2
On-line CPU<span style="color:#999999">(</span>s<span style="color:#999999">)</span> list:   0,1
Thread<span style="color:#999999">(</span>s<span style="color:#999999">)</span> per core:    1
Core<span style="color:#999999">(</span>s<span style="color:#999999">)</span> per socket:    1
Socket<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:             2
NUMA node<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 78
Model name:            Intel<span style="color:#999999">(</span>R<span style="color:#999999">)</span> Core<span style="color:#999999">(</span>TM<span style="color:#999999">)</span> i5-6200U CPU @ 2.30GHz
Stepping:              3
CPU MHz:               2400.001
BogoMIPS:              4800.00
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              3072K
NUMA node0 CPU<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:     0,1
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>

1.2.2 Checking system load

 

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># uptime</span>
 03:17:33 up  1:03,  2 users,  load average: 0.18, 0.05, 0.01
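# Fields: current system time; up = time since boot; users = number of logged-in users; load average = averages over the last 1, 5, and 15 minutes</code></span></span>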

 

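Ideal load: a 1-minute load average equal to the number of CPU cores (or up to twice that).

To judge whether a server is overloaded, check whether the 1-minute load average exceeds the number of CPU cores (or twice that number).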
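The rule of thumb above is easy to script. The sketch below uses the stricter threshold (load greater than the core count); the function name `load_status` is an assumption made for this example:

```shell
#!/usr/bin/env bash
# load_status LOAD1 CORES
# Prints "ok" when the 1-minute load is at most the core count, "high" otherwise.
# awk is used because the shell itself cannot compare decimal numbers.
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        status = (load > cores) ? "high" : "ok"
        print status
    }'
}
```

On a live Linux system it could be called as `load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"`. Pass twice the core count as the second argument to apply the more tolerant limit.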

1.2.3 Real-time view with top

 

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#dd4a68">top</span> - 03:27:14 up  1:13,  2 users,  load average: 0.00, 0.02, 0.00
<span style="color:slategray"># Line 1: same as the output of uptime</span>
Tasks:  90 total,   1 running,  89 sleeping,   0 stopped,   0 zombie
<span style="color:slategray"># Line 2: process counts by state</span>
Cpu<span style="color:#999999">(</span>s<span style="color:#999999">)</span>:  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,
<span style="color:slategray"># Line 3: CPU statistics; %us = CPU used by user processes, %sy = CPU used by the kernel, %id = idle CPU</span>
Mem:   1004112k total,   395132k used,   608980k free,    26712k buffer
<span style="color:slategray"># Line 4: memory statistics</span>
Swap:   786428k total,        0k used,   786428k free,   256904k cached
<span style="color:slategray"># Line 5: swap statistics</span>
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
     1 root      20   0 19352 1528 1228 S  0.0  0.2   0:01.70 init     
     2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd 
     3 root      RT   0     0    0    0 S  0.0  0.0   0:00.87 migration/
     4 root      20   0     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/
     5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/0
     6 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
     7 root      RT   0     0    0    0 S  0.0  0.0   0:00.53 migration/
     8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/1
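     9 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/
<span style="color:slategray"># Shortcuts: z toggles color, x highlights the sort column,</span> <span style="color:#9a6e3a">></span> moves the sort column right, <span style="color:#9a6e3a"><</span> moves it left</code></span></span>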

 

1.2.4 Showing memory information (free, vmstat)

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># free -h</span>
             total       used       <span style="color:#dd4a68">free</span>     shared    buffers     cached
Mem:          980M       480M       500M       228K        26M       341M
-/+ buffers/cache:       112M       868M
Swap:         767M         0B       767M
# Note: releases before CentOS 6.5 have no -h option for free; use -m instead</code></span></span>

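vmstat usage

Options:

•  -a: show active and inactive memory.

•  -f: show the number of forks since boot.

•  -m: show slabinfo.

•  -n: print the column header only once, at the start.

•  -s: show memory statistics and counters for various system activities.

•  delay: refresh interval in seconds. Without it, only one report is printed.

•  count: number of reports. If delay is given without count, vmstat keeps reporting until interrupted.

•  -d: show disk statistics.

•  -p: show statistics for the given disk partition.

•  -S: display in the given unit. Valid values are k, K, m, and M, meaning 1000, 1024, 1000000, and 1048576 bytes. The default is K (1024 bytes).

•  -V: show the vmstat version.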

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># vmstat 2    # print system statistics every 2 seconds</span>
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   <span style="color:#dd4a68">free</span>   buff  cache   si   so    bi    bo   <span style="color:#0077aa">in</span>   cs us sy <span style="color:#dd4a68">id</span> wa st
 0  0   5748   7808  24796 168440    0    1    88   243   74  143  1  1 95  4  0   
 0  0   5748   7644  24828 168444    0    0     2    42   68  123  3  1 93  4  0   
 0  0   5748   7644  24828 168444    0    0     0     0   28   56  0  0 100  0  0  
 0  0   5748   7620  24852 168444    0    0     0   138   41  126  0  0 99  0  0   
^C
Notes:
Group    Field    Meaning                                              Remarks
Procs    r        tasks waiting to run                                 Counts tasks that are running or waiting for CPU. When it exceeds the number of CPUs, the CPU is a bottleneck.
         b        processes in uninterruptible sleep

Memory   swpd     swap space in use (KB)
         free     idle memory
         buff     memory used as buffers (buffering for block-device reads and writes)
         cache    memory used as cache (filesystem cache)
         inact    inactive memory (shown with -a)
         active   active memory (shown with -a)

Swap     si       memory swapped in from disk
         so       memory swapped out to disk

IO       bi       data read from block devices (disk reads, KB/s)
         bo       data written to block devices (disk writes, KB/s)

System   in       interrupts per second
         cs       context switches per second                          The larger these two values, the more CPU time the kernel consumes.

CPU      us       percentage of CPU time spent in user processes       A high us means user processes use a lot of CPU; if it stays above 50% for long, consider optimizing the program or speeding it up.
         sy       percentage of CPU time spent in the kernel           A high sy means the kernel uses a lot of CPU, which is not healthy and should be investigated.
         id       idle
         wa       percentage of CPU time spent waiting for IO          A high wa means heavy IO wait, possibly caused by lots of random disk access or by saturated disk bandwidth (block operations).
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># vmstat 2 5   # print statistics every 2 seconds, 5 times in total</span>
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   <span style="color:#dd4a68">free</span>   buff  cache   si   so    bi    bo   <span style="color:#0077aa">in</span>   cs us sy <span style="color:#dd4a68">id</span> wa st
 0  0   5748   8108  25668 168456    0    1    86   239   73  142  1  1 95  3  0   
 0  0   5748   8108  25724 168472    0    0     0   180   48  108  0  0 99  1  0   
 0  0   5748   8108  25740 168456    0    0     0    16   33   76  0  1 100  0  0  
 0  0   5748   8108  25788 168480    0    0     2    68   77  143  3  0 96  1  0   
 0  0   5748   8108  25788 168476    0    0     0     0   26   57  0  1 100  0  0  
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
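The notes above say that a run queue (column r) larger than the number of CPUs points to a CPU bottleneck. A small sketch of counting such samples in captured vmstat output; the helper name `busy_samples` is hypothetical:

```shell
#!/usr/bin/env bash
# busy_samples CORES: read vmstat output on stdin and print how many samples
# have a run queue (first column, r) larger than the given core count.
# Header lines are skipped because their first field is not a number.
busy_samples() {
    awk -v cores="$1" '
        $1 ~ /^[0-9]+$/ { if ($1 + 0 > cores + 0) n++ }
        END { print n + 0 }'
}
```

For example, `vmstat 2 5 | busy_samples "$(nproc)"` would report how many of the five samples were CPU-bound by this criterion.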

1.2.5 htop

<span style="color:#333333"><span style="color:black"><code class="language-bash"># Install htop
<span style="color:#0077aa">echo</span> <span style="color:#669900">"192.168.12.200 mirrors.aliyun.com"</span> <span style="color:#9a6e3a">>></span> /etc/hosts
<span style="color:#dd4a68">wget</span> -O /etc/yum.repos.d/CentOS-Base.repo http://192.168.12.200/repo/Centos-6.repo
<span style="color:#dd4a68">wget</span> -O /etc/yum.repos.d/epel.repo http://192.168.12.200/repo/epel-6.repo
yum clean all
yum -y <span style="color:#dd4a68">install</span> <span style="color:#dd4a68">htop</span></code></span></span>


1.2.6 Showing disk information

<span style="color:#333333"><span style="color:black"><code class="language-cf">[root@m01 tools]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       8.8G  1.1G  7.4G  13% /
tmpfs           491M     0  491M   0% /dev/shm
/dev/sda1       190M   35M  146M  19% /boot
[root@m01 tools]#
[root@m01 tools]# dd if=/dev/zero of=test.data bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0389443 s, 269 MB/s
[root@m01 tools]#
# if    = input file; /dev/zero is a special device that produces an endless stream of zero bytes
# of    = output file
# bs    = block size
# count = number of blocks
# Summary: the resulting test.data is bs * count bytes in size.
# Tip: this is a handy way to create a test file for measuring disk write speed.</code></span></span>
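The "bs * count" relationship can be checked with a few lines of shell. This is only an illustration: `dd_output_bytes` is a made-up helper, and it handles just the binary suffixes K, M, and G as GNU dd interprets them.

```shell
#!/usr/bin/env bash
# dd_output_bytes BS COUNT
# Prints bs * count in bytes. BS is a plain number or carries one of the
# binary suffixes K, M, G (1K = 1024 bytes), as GNU dd interprets them.
dd_output_bytes() {
    local bs=$1 count=$2 unit=1
    case $bs in
        *K) unit=1024;                  bs=${bs%K} ;;
        *M) unit=$((1024 * 1024));      bs=${bs%M} ;;
        *G) unit=$((1024 * 1024 * 1024)); bs=${bs%G} ;;
    esac
    echo $((bs * unit * count))
}
```

`dd_output_bytes 1M 10` prints 10485760, matching the byte count dd reported above. When using dd as a write benchmark, note that without an option such as `conv=fdatasync` or `oflag=direct` the reported speed largely reflects the page cache rather than the disk.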

 

 

1.2.7 Watching system I/O (input/output) load in real time with iotop

<span style="color:#333333"><span style="color:black"><code class="language-bash">Total DISK READ: 0.00 B/s <span style="color:#9a6e3a">|</span> Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO   COMMAND<span style="color:#9a6e3a"><</span> 
 1308 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % -bash
 2488 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % -bash
   50 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>aio/0<span style="color:#999999">]</span>
   51 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>aio/1<span style="color:#999999">]</span>
   22 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>async/mgr<span style="color:#999999">]</span>
   33 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>ata_aux<span style="color:#999999">]</span>
   34 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>ata_sff/0<span style="color:#999999">]</span>
   35 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>ata_sff/1<span style="color:#999999">]</span>
   25 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>bdi-~fault<span style="color:#999999">]</span>
  773 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>bluetooth<span style="color:#999999">]</span>
   19 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>cgroup<span style="color:#999999">]</span>
   52 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>crypto/0<span style="color:#999999">]</span>
   53 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>crypto/1<span style="color:#999999">]</span>
   66 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % <span style="color:#999999">[</span>deferwq<span style="color:#999999">]</span></code></span></span>

1.2.8 Diagnosing a slow network with iftop and nethogs

<span style="color:#333333"><span style="color:black"><code class="language-bash">yum -y <span style="color:#dd4a68">install</span> iftop nethogs
# iftop: show network interface traffic (monitors eth0 by default)
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># iftop</span>
interface: eth0
IP address is: 10.0.0.61
MAC address is: 00:0c:29:ab:6f:34
              12.5Kb         25.0Kb        37.5Kb         50.0Kb   62.5Kb
└─────────────┴──────────────┴─────────────┴──────────────┴──────────────
10.0.0.61              <span style="color:#9a6e3a">=</span><span style="color:#9a6e3a">></span> 10.0.0.253               992b   1.17Kb  1.50Kb
                       <span style="color:#9a6e3a"><=</span>                          160b    160b    187b
10.0.0.255             <span style="color:#9a6e3a">=</span><span style="color:#9a6e3a">></span> 10.0.0.253                 0b      0b      0b
                       <span style="color:#9a6e3a"><=</span>                            0b    187b     78b
10.0.0.61              <span style="color:#9a6e3a">=</span><span style="color:#9a6e3a">></span> public1.alidns.com         0b     55b     91b
                       <span style="color:#9a6e3a"><=</span>                            0b    117b    179b
 
 
 
 
─────────────────────────────────────────────────────────────────────────
TX:             cum:   4.77KB   peak:   4rates:    992b   1.22Kb  1.59Kb
RX:                    1.30KB           1.84Kb     160b    464b    444b
TOTAL:                 6.07KB           6.33Kb    1.12Kb  1.68Kb  2.02Kb
# Monitor the eth1 interface
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># iftop -i eth1</span>
interface: eth1
IP address is: 172.16.1.61
MAC address is: 00:0c:29:ab:6f:3e
              12.5Kb         25.0Kb        37.5Kb         50.0Kb   62.5Kb
└─────────────┴──────────────┴─────────────┴──────────────┴──────────────
 
 
 
 
 
 
 
 
 
 
─────────────────────────────────────────────────────────────────────────
TX:             cum:      0B    peak:    rates:      0b      0b      0b
RX:                       0B               0b        0b      0b      0b
TOTAL:                    0B               0b        0b      0b      0b
# nethogs: show traffic per process
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># nethogs</span>
Waiting <span style="color:#0077aa">for</span> first packet to arrive <span style="color:#999999">(</span>see sourceforge.net bug 1019381<span style="color:#999999">)</span>
NetHogs version 0.8.5
 
    PID USER     PROGRAM             DEV        SENT      RECEIVED      
   2486 root     sshd: root@pts/2    eth0        0.142       0.047 KB/sec
      ? root     unknown TCP                     0.000       0.000 KB/sec
 
  TOTAL                                          0.142       0.047 KB/sec</code></span></span>

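1.2.9 Real-time CPU monitoring

mpstat (Multiprocessor Statistics) is a real-time system monitoring tool. It reports CPU statistics, which it reads from /proc/stat. On a multi-CPU system it shows both the average across all CPUs and the figures for a specific CPU. Its distinguishing feature is per-core statistics on multi-core CPUs, whereas the similar tool vmstat only reports the system-wide CPU picture.

Syntax

mpstat [-P {cpu|ALL}] [interval [count]]

Options

-P {cpu|ALL}: which CPU to monitor; cpu takes a value from 0 to (number of CPUs - 1).

interval: time in seconds between two consecutive samples.

count: number of samples; count can only be used together with interval.

Without arguments, mpstat prints averages since boot. With an interval, the first line reports averages since boot, and each following line reports averages over the preceding interval.

Example 1-1: show the current state of every CPU core, refreshing every 2 seconds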

 

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># mpstat -P ALL 2</span>
Linux 2.6.32-696.el6.x86_64 <span style="color:#999999">(</span>m01<span style="color:#999999">)</span>   10/10/2017  _x86_64_    <span style="color:#999999">(</span>1 CPU<span style="color:#999999">)</span>
 
04:25:04 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
04:25:06 PM  all    3.02    0.00    0.00    2.51    0.00    0.00    0.00    0.00   94.47
04:25:06 PM    0    3.02    0.00    0.00    2.51    0.00    0.00    0.00    0.00   94.47
 
04:25:06 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
04:25:08 PM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
04:25:08 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 
04:25:08 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
04:25:10 PM  all    0.00    0.00    0.00    1.50    0.00    0.00    0.00    0.00   98.50
04:25:10 PM    0    0.00    0.00    0.00    1.50    0.00    0.00    0.00    0.00   98.50
 
04:25:10 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
04:25:12 PM  all    0.00    0.00    0.50    7.00    0.00    0.00    0.00    0.00   92.50
04:25:12 PM    0    0.00    0.00    0.50    7.00    0.00    0.00    0.00    0.00   92.50
Notes:
%usr       share of CPU time in user mode during the interval, excluding niced processes          (usr/total)*100
%nice      share of CPU time in user mode spent on niced (positive nice value) processes          (nice/total)*100
%sys       share of CPU time in kernel mode during the interval                                   (system/total)*100
%iowait    share of time spent waiting for disk IO during the interval                            (iowait/total)*100
%irq       share of time spent on hardware interrupts during the interval                         (irq/total)*100
%soft      share of time spent on software interrupts during the interval                         (softirq/total)*100
%idle      share of time the CPU was idle for any reason other than waiting for disk IO           (idle/total)*100</code></span></span>
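Each percentage above is a counter delta divided by the total delta over the interval. The sketch below applies that formula to two samples of the aggregate `cpu` line from /proc/stat. The function name `cpu_percent` is invented here, and only the first eight counters (user nice system idle iowait irq softirq steal) are assumed to be passed in:

```shell
#!/usr/bin/env bash
# cpu_percent FIELD "SAMPLE1" "SAMPLE2"
# Each sample is the counter list from the "cpu" line of /proc/stat in the
# order: user nice system idle iowait irq softirq steal
# FIELD is the 1-based position of the counter to report (4 = idle).
cpu_percent() {
    awk -v f="$1" -v a="$2" -v b="$3" 'BEGIN {
        n = split(a, s1, " "); split(b, s2, " ")
        # total = sum of all counter deltas between the two samples
        for (i = 1; i <= n; i++) total += s2[i] - s1[i]
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", (s2[f] - s1[f]) / total * 100
    }'
}
```

With samples "100 0 50 800 50 0 0 0" and "130 0 60 1000 60 0 0 0", field 4 (idle) accounts for 200 of 250 ticks, so the function prints 80.00.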

 

1.3 What to monitor

•  Memory

•  Disk

•  Network

•  CPU load

•  Hardware, temperature, fans

•  Software services

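Chapter 2 Introducing Zabbix

2.1 About Zabbix

One piece of software can cover about 99% of monitoring needs.

Zabbix is an enterprise-class, open-source, distributed monitoring suite.

Zabbix monitors the status of networks and services. Its flexible alerting mechanism lets users configure email alerts for events, so problems get a fast response. Zabbix also builds reports and graphs from the data it stores, which helps with capacity planning.

Zabbix supports both polling and trapping. All reports and configuration parameters are accessible through the web front end, so you can check the state of your network and services from anywhere. With suitable configuration, Zabbix can monitor the IT infrastructure of a small organization or a large company alike.

Zabbix costs nothing: it is written and distributed under the GPL v2, so the source code is freely available.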

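2.2 Common monitoring software: nagios+cacti, zabbix

1.      nagios+cacti

Nagios has a plugin-based architecture: it has no monitoring capability of its own, and all checks are performed by plugins, which makes it highly modular and flexible. Nagios monitors two kinds of objects: hosts and services. A host is usually a physical machine such as a server, router, workstation, or printer, but it can also be a virtual device, for example a Linux system virtualized by Xen. A service is a specific function, such as the httpd process that provides HTTP. For easier management, hosts and services can be organized into host groups and service groups.

Nagios does not track concrete numeric metrics (such as the number of processes on a system). It describes the state of a monitored object with just four abstract states: OK, WARNING, CRITICAL, and UNKNOWN. Administrators therefore only need to define the WARNING and CRITICAL thresholds for each monitored object. Nagios passes those thresholds to a plugin, the plugin performs the check and evaluates the result, and its output is a state (OK, WARNING, CRITICAL, or UNKNOWN) plus optional detail text.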
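To make the threshold idea concrete, here is a minimal sketch of a Nagios-style check written as a shell function. The exit codes 0, 1, 2, and 3 are the standard plugin codes for OK, WARNING, CRITICAL, and UNKNOWN; the function name `check_threshold` and the integer-only input are simplifying assumptions for this example.

```shell
#!/usr/bin/env bash
# check_threshold VALUE WARN CRIT
# Nagios-style check: prints a status line and returns the matching plugin
# exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
# Only non-negative integers are accepted, to keep the sketch short.
check_threshold() {
    local value=$1 warn=$2 crit=$3
    case $value in
        ''|*[!0-9]*) echo "UNKNOWN - value '$value' is not an integer"; return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"; return 1
    fi
    echo "OK - value is $value"; return 0
}
```

A real plugin would first measure something (for example the process count) and then pass the measurement to a function like this one together with the thresholds Nagios supplies.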

 

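Cacti is a graphical network traffic monitoring and analysis tool built on PHP, MySQL, SNMP, and RRDTool.

Cacti collects data with snmpget and draws graphs with RRDtool, without requiring you to understand RRDtool's complex parameters. It offers powerful data and user management: you can specify which tree views, hosts, and individual graphs each user may see, authenticate users against LDAP, and add your own templates. The interface is friendly. Cacti was developed to make RRDTool easier to use; besides basic SNMP traffic and system information monitoring, it can be extended with scripts and templates to produce many kinds of monitoring graphs.

Cacti is written in PHP. Its main job is to fetch data over SNMP, store and update it with rrdtool, and have rrdtool render graphs when a user wants to view the data. SNMP and rrdtool are therefore the core of Cacti: SNMP handles data collection, and rrdtool handles data storage and graph generation.

MySQL, together with the PHP code, stores and retrieves settings such as host names, host IPs, SNMP community names, port numbers, and template information.

Data collected over SNMP is not stored in MySQL but in rrd files generated by rrdtool (in the rra folder under the Cacti root directory). When rrdtool updates and stores data, it is operating on these rrd files. An rrd file is a fixed-size round-robin file (its archives are called Round Robin Archives), and the number of data points it can hold is defined when it is created.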

 

Chapter 3 Deploying Zabbix

3.1 Zabbix server

1. Preparing the environment

 

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># hostname -I</span>
10.0.0.61 172.16.1.61
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/iptables status</span>
iptables: Firewall is not running.
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># ll -d /tmp/</span>
drwxrwxrwt. 3 root root 4096 Sep 23 02:20 /tmp/
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray">#  cat /etc/redhat-release</span>
CentOS release 6.9 <span style="color:#999999">(</span>Final<span style="color:#999999">)</span>
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray">#</span>
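# Choosing the installation environment
# LAMP or LNMP?
#     LAMP is the quickest platform for a test setup, and much open-source software is built on the LAMP stack.
 
 
#     On a freshly installed physical machine:
#         install LAMP + zabbix with yum, or
#         compile nginx and php from source and use the MySQL binary package
#           apache ==> cp -R /usr/share/zabbix/ /var/www/html/
#           nginx  ==> cp -R /usr/share/zabbix/ /application/nginx/html
         
#           The zabbix package adjusts the apache configuration for you,
#           but the nginx configuration has to be edited by hand.
         
#           php: same steps as with LAMP; run the sed command below to edit the configuration file. Note that php runs as a separate service here, and check that all required modules were compiled in.
                <span style="color:#dd4a68">sed</span> -i.ori <span style="color:#669900">'s#max_execution_time = 30#max_execution_time = 300#;s#max_input_time = 60#max_input_time = 300#;s#post_max_size = 8M#post_max_size = 16M#;910a date.timezone = Asia/Shanghai'</span> /application/php/lib/php.ini
 
#           MySQL: create the database, grant privileges, import the data
 
#     On a cluttered physical machine that already runs other software:
#     first ask whether the system can be reinstalled.
#     If not, compile from source: yum resolves dependencies, but it cannot resolve dependency conflicts.
#         compile and install LNMP
#         compile and install zabbix server
# 2. Download the zabbix package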
<span style="color:#dd4a68">wget</span> http://192.168.12.200/zabbix/zabbix3.0.9_yum.tar.gz</code></span></span>

 

3. Installing zabbix

<span style="color:#333333"><span style="color:black"><code class="language-bash"># Offline installation
# Step 1: download the package
<span style="color:#dd4a68">wget</span> http://192.168.12.200/zabbix/zabbix3.0.9_yum.tar.gz
# Step 2: unpack it
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># tar xfP zabbix3.0.9_yum.tar.gz</span>
# Step 3: install everything in one go
<span style="color:#999999">[</span>root@m01 tools<span style="color:#999999">]</span><span style="color:slategray"># yum -y --nogpgcheck -C install httpd mysql-server php55w php55w-mysql php55w-common php55w-gd php55w-mbstring php55w-mcrypt php55w-devel php55w-xml php55w-bcmath</span>
yum -y --nogpgcheck -C <span style="color:#dd4a68">install</span> zabbix-web zabbix-server-mysql zabbix-web-mysql zabbix-get zabbix-java-gateway wqy-microhei-fonts net-snmp net-snmp-utils
    # (1) Install LAMP: Apache, MySQL, and PHP 5.5 via yum
        # Install httpd:
        yum -y <span style="color:#dd4a68">install</span> httpd
        # Install MySQL:
        yum -y <span style="color:#dd4a68">install</span> mysql-server
        # Install PHP 5.5:
        rpm -ivh http://repo.webtatic.com/yum/el6/x86_64/webtatic-release-6-9.noarch.rpm
        yum -y <span style="color:#dd4a68">install</span> php55w php55w-mysql php55w-common php55w-gd php55w-mbstring php55w-mcrypt php55w-devel php55w-xml php55w-bcmath
   
 
    # (2) Install Zabbix Server
        rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/6/x86_64/zabbix-release-3.0-1.el6.noarch.rpm
        yum -y <span style="color:#dd4a68">install</span> zabbix-web zabbix-server-mysql zabbix-web-mysql
   
   
    # Install and configure Zabbix Agent
        yum -y localinstall http://mirrors.aliyun.com/zabbix/zabbix/3.0/rhel/6/x86_64/zabbix-agent-3.0.9-1.el6.x86_64.rpm
       
   
    # (3) Configure the services
        # MySQL configuration
        \cp /usr/share/mysql/my-medium.cnf /etc/my.cnf
        # Start MySQL
        /etc/init.d/mysqld start
        # Create the database and user, and grant privileges
        mysql
        create database zabbix character <span style="color:#0077aa">set</span> utf8 collate utf8_bin<span style="color:#999999">;</span>
        grant all on zabbix.* to zabbix@<span style="color:#669900">'localhost'</span> identified by <span style="color:#669900">'zabbix'</span><span style="color:#999999">;</span>
        flush privileges<span style="color:#999999">;</span>
        <span style="color:#0077aa">exit</span>
 
        # Import the schema and data
        <span style="color:#dd4a68">cd</span> /usr/share/doc/zabbix-server-mysql-3.0.9
        zcat create.sql.gz <span style="color:#9a6e3a">|</span>mysql -uzabbix -pzabbix zabbix
 
        # Adjust the related configuration
        <span style="color:slategray"># Edit the PHP configuration file</span>
        <span style="color:#dd4a68">egrep</span> -n <span style="color:#669900">"^post_max_size|^max_execution_time|^max_input_time|^date.timezone"</span> /etc/php.ini
 
        <span style="color:#dd4a68">sed</span> -i <span style="color:#669900">'s#max_execution_time = 30#max_execution_time = 300#;s#max_input_time = 60#max_input_time = 300#;s#post_max_size = 8M#post_max_size = 16M#;910a date.timezone = Asia/Shanghai'</span> /etc/php.ini
 
        <span style="color:slategray"># Edit the zabbix_server configuration file</span>
        <span style="color:#dd4a68">sed</span> -i <span style="color:#669900">'115a DBPassword=zabbix'</span> /etc/zabbix/zabbix_server.conf
 
        <span style="color:slategray"># Web front-end files</span>
        <span style="color:#dd4a68">cp</span> -R /usr/share/zabbix/ /var/www/html/
 
        <span style="color:slategray"># File permissions and ownership</span>
        <span style="color:#dd4a68">chmod</span> -R 755 /etc/zabbix/web
        <span style="color:#dd4a68">chown</span> -R apache:apache /etc/zabbix/web
 
        <span style="color:slategray"># Start apache and zabbix</span>
        <span style="color:#0077aa">echo</span> <span style="color:#669900">"ServerName 127.0.0.1:80"</span><span style="color:#9a6e3a">>></span>/etc/httpd/conf/httpd.conf
        /etc/init.d/httpd start
        /etc/init.d/zabbix-server start
   
    # Boot order: mysql must start before zabbix server
    <span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># tail -4 /etc/rc.local</span>
    /etc/init.d/mysqld start
    /etc/init.d/zabbix-server start
    /etc/init.d/httpd start
    /etc/init.d/zabbix-agent start
   
    <span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># tail -1 /etc/rc.local</span>
    /etc/init.d/zabbix-agent start</code></span></span>
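The php.ini edit above appends `date.timezone` after a fixed line number (910), which only works if the file layout matches. As a hedged alternative, the sketch below sets a key by name instead. `set_ini_value` is a hypothetical helper, and it relies on GNU sed's `-i` option, which the commands above already use.

```shell
#!/usr/bin/env bash
# set_ini_value FILE KEY VALUE
# Replaces an existing "KEY = ..." line (commented out with ";" or not), or
# appends the setting when the key is absent. Hypothetical helper; it does
# not look at ini sections, which is enough for the php.ini keys used here.
set_ini_value() {
    local file=$1 key=$2 value=$3
    if grep -q "^;\{0,1\}[[:space:]]*${key}[[:space:]]*=" "$file"; then
        sed -i "s|^;\{0,1\}[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

Usage would look like `set_ini_value /etc/php.ini date.timezone Asia/Shanghai`, repeated for `max_execution_time`, `max_input_time`, and `post_max_size`.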

3.2 Zabbix client

 

<span style="color:#333333"><span style="color:black"><code class="language-bash"># Both the monitored clients and the server need the agent installed
        rpm -ivh http://mirrors.aliyun.com/zabbix/zabbix/3.0/rhel/6/x86_64/zabbix-agent-3.0.9-1.el6.x86_64.rpm
# Client configuration
    # Edit the configuration file
    <span style="color:#dd4a68">sed</span> -i <span style="color:#669900">'s#Server=127.0.0.1#Server=172.16.1.61#'</span> /etc/zabbix/zabbix_agentd.conf
   
    # Start the agent
    /etc/init.d/zabbix-agent start
   
    # Run the check from the server
    <span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.61 -p 10050 -k "system.cpu.load[all,avg1]"</span>
    0.000000
    <span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k "system.cpu.load[all,avg1]"</span>
    0.000000   
</code></span></span>

[Figure: summary diagram]

3.3 Installing through the web interface

<span style="color:#333333"><span style="color:black"><code class="language-bash">Step 1: open the setup page
http://10.0.0.61/zabbix/setup.php</code></span></span>

Step 2: click Next

Step 3: click Next

Step 4: click Next

Step 5

Step 6: click Finish

Step 7: clicking Finish completes the installation

To get a Chinese interface, set the display language to Chinese.

After that, the interface is shown in Chinese.


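3.4 Adding a monitored host

Host name: the name the Zabbix software itself uses to identify the host

Visible name: the name shown to people on the web pages

Groups: hosts belong to groups the way students belong to study groups, which makes them easier to manage

Agent interfaces: the IP address of the monitored client

Command to check whether the client can be monitored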

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.61 -p 10050 -k "system.cpu.load[all,avg1]"</span>
0.090000
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k "system.cpu.load[all,avg1]"</span>
0.000000
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
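With more than a couple of hosts, the same check can be looped. The following is a sketch under stated assumptions: `check_agents` is a made-up function name, the `ZABBIX_GET` variable is introduced only so the binary can be swapped out, and any non-empty reply with a zero exit status is treated as success, which may be too lenient if the agent answers with an error string.

```shell
#!/usr/bin/env bash
# check_agents HOST...
# Queries each agent for the 1-minute CPU load and prints "HOST OK value" or
# "HOST FAIL". Returns 1 if any host failed.
# ZABBIX_GET is a hypothetical override; it defaults to zabbix_get, the tool
# used in the commands above.
check_agents() {
    local host value rc=0
    for host in "$@"; do
        if value=$("${ZABBIX_GET:-zabbix_get}" -s "$host" -p 10050 \
                       -k 'system.cpu.load[all,avg1]' 2>/dev/null) \
           && [ -n "$value" ]; then
            echo "$host OK $value"
        else
            echo "$host FAIL"
            rc=1
        fi
    done
    return $rc
}
```

On the server from this article it would be called as `check_agents 172.16.1.61 172.16.1.8`.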

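3.5 Linking a template to the monitored host

3.6 Viewing the latest data

Graphs now appear, but Chinese text in them may be rendered as garbled characters.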

<span style="color:#333333"><span style="color:black"><code class="language-bash"># Fix garbled Chinese characters
    <span style="color:#dd4a68">wget</span> -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-6.repo
    yum -y <span style="color:#dd4a68">install</span> wqy-microhei-fonts
    \cp /usr/share/fonts/wqy-microhei/wqy-microhei.ttc /usr/share/fonts/dejavu/DejaVuSans.ttf</code></span></span>


Chapter 4 Custom monitoring in Zabbix

4.1 Editing the configuration file on the monitored host

<span style="color:#333333"><span style="color:black"><code class="language-bash"># Run on the web01 client
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># sed -i '293a UserParameter=login-user,who|wc -l' /etc/zabbix/zabbix_agentd.conf</span>
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
<span style="color:#333333"><span style="color:black"><code class="language-bash"> 
Run on the server
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k "login-user"</span>
1
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
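Inserting at a fixed line number (`293a`) only works while the packaged config file keeps that exact layout. A less fragile variant is sketched below. It assumes the agent config includes the drop-in directory `/etc/zabbix/zabbix_agentd.d/`; `add_userparameter` and the file name in the example are hypothetical.

```shell
# Append "UserParameter=<key>,<command>" to a drop-in file unless that key
# is already defined there. add_userparameter is a hypothetical helper.
add_userparameter() {
    file=$1 key=$2 cmd=$3
    if grep -qF "UserParameter=$key," "$file" 2>/dev/null; then
        echo "key $key already defined in $file"
        return 0
    fi
    printf 'UserParameter=%s,%s\n' "$key" "$cmd" >> "$file"
}

# Example (restart the agent afterwards):
# add_userparameter /etc/zabbix/zabbix_agentd.d/userparameter_login.conf login-user 'who|wc -l'
```

Running it twice leaves a single definition in the file. That matters because the agent may refuse to start when the same key is defined more than once.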

4.2 Adding custom monitoring in the web interface

4.2.1 Adding a template

What templates are for: define once, use everywhere

 

image.png

image.png

4.2.2 Adding an application: a folder that groups related items

image.png

image.png

image.png

4.2.3 Adding an item

An item tells the server where to collect data and what kind of data to collect

 

image.png

image.png

image.png

 

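4.2.4 Adding a trigger

Set up triggers for the items that should raise alerts

Severity:

•  Warning alerts go to junior operations engineers

•  Average alerts go to junior and mid-level operations engineers

•  High alerts go to junior, mid-level, and senior operations engineers

•  Disaster alerts go to junior, mid-level, and senior operations engineers and the director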
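The escalation list above can be read as a simple lookup table. The sketch below only illustrates that mapping. The group names are placeholders, the fallback branch is an assumption (the text does not say who receives lower severities), and in Zabbix itself this routing is normally configured through action conditions on trigger severity rather than in a script.

```shell
# Map a trigger severity name to the groups that should be notified,
# following the escalation list above. Group names are placeholders.
recipients_for() {
    case $1 in
        Warning)  echo "junior" ;;
        Average)  echo "junior mid" ;;
        High)     echo "junior mid senior" ;;
        Disaster) echo "junior mid senior director" ;;
        *)        echo "" ;;   # assumption: lower severities notify nobody
    esac
}

# Example: recipients_for Disaster
```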

 

image.png

image.png

image.png

image.png

image.png

 

4.2.5 Adding a graph

image.png

image.png

image.png

image.png

4.2.6 Using the template

image.png

image.png

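Chapter 5 Alerting

5.1 Alert channels

Email: messages may fail to arrive

WeChat: notifications arrive promptly

SMS: does not depend on an internet connection

Phone call: works wherever there is mobile signal

App push notifications

5.2 Installing the alerting client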

<span style="color:#333333"><span style="color:black"><code class="language-bash">2. Install the agent
1. Change to the Zabbix alert script directory (to find it, look up AlertScriptsPath in the server config):
<span style="color:#dd4a68">vi</span> /etc/zabbix/zabbix_server.conf
Look up AlertScriptsPath
<span style="color:#dd4a68">cd</span> /usr/lib/zabbix/alertscripts/
2. Download and unpack the OneITSM agent package:
<span style="color:#dd4a68">wget</span> http://www.onealert.com/agent/release/oneitsm_zabbix_release-1.0.0.tar.gz
<span style="color:#dd4a68">tar</span> -zxf oneitsm_zabbix_release-1.0.0.tar.gz
<span style="color:#dd4a68">cd</span> oneitsm/bin
<span style="color:#dd4a68">bash</span> install.sh 7145112f-7f7a-8cfb-cd26-810036d6d479
start to create config file<span style="color:#999999">..</span>.
Zabbix management URL: 10.0.0.61/zabbix
Zabbix admin account: Admin
Zabbix admin password:
start to auth by zabbix admin user and password<span style="color:#999999">..</span>.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    70    0    70    0   125    847   1514 --:--:-- --:--:-- --:--:--     0
auth success<span style="color:#9a6e3a">!</span>
start to create mediatype<span style="color:#999999">..</span>.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
397   130  130   130    0   267   2706   5559 --:--:-- --:--:-- --:--:--     0
media <span style="color:#dd4a68">type</span> resonse:<span style="color:#999999">{</span><span style="color:#669900">"jsonrpc"</span><span style="color:#0077aa">:</span><span style="color:#669900">"2.0"</span>,<span style="color:#669900">"error"</span>:<span style="color:#999999">{</span><span style="color:#669900">"code"</span>:-32602,<span style="color:#669900">"message"</span><span style="color:#0077aa">:</span><span style="color:#669900">"Invalid params."</span>,<span style="color:#669900">"data"</span><span style="color:#0077aa">:</span><span style="color:#669900">"Media type \"oneitsm media\" already exists."</span><span style="color:#999999">}</span>,<span style="color:#669900">"id"</span>:1<span style="color:#999999">}</span>
create media <span style="color:#dd4a68">type</span> failed<span style="color:#9a6e3a">!</span> error message:<span style="color:#999999">{</span><span style="color:#669900">"jsonrpc"</span><span style="color:#0077aa">:</span><span style="color:#669900">"2.0"</span>,<span style="color:#669900">"error"</span>:<span style="color:#999999">{</span><span style="color:#669900">"code"</span>:-32602,<span style="color:#669900">"message"</span><span style="color:#0077aa">:</span><span style="color:#669900">"Invalid params."</span>,<span style="color:#669900">"data"</span><span style="color:#0077aa">:</span><span style="color:#669900">"Media type \"oneitsm media\" already exists."</span><span style="color:#999999">}</span>,<span style="color:#669900">"id"</span>:1<span style="color:#999999">}</span>
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>

image.png

Chapter 6 Visualization

6.1 Screens

Screens: put graphs of the same type side by side so they are easy to compare and analyze

 

image.png

image.png

image.png

image.png

image.png 

6.2 Slide shows

<span style="color:#333333"><span style="color:black"><code class="language-bash">  Slide show: cycles through the screens in turn
       Shared templates
       https://github.com/zhangyao8/zabbix-community-repos
       https://share.zabbix.com/</code></span></span>

image.png

image.png

Chapter 7 Monitoring application services

7.1 Monitoring the rsync service port

1.      Create a template for monitoring service ports

image.png

image.png

2.      Add an application

image.png

image.png

3.      Create an item

image.png

4.      Add a trigger

 image.png

image.png

7.2 Monitoring the NFS server by checking that the NFS processes are running


How do we monitor a process?

<span style="color:#333333"><span style="color:black"><code class="language-bash">        proc.num<span style="color:#999999">[</span><span style="color:#9a6e3a"><</span>name<span style="color:#9a6e3a">></span>,<span style="color:#9a6e3a"><</span>user<span style="color:#9a6e3a">></span>,<span style="color:#9a6e3a"><</span>state<span style="color:#9a6e3a">></span>,<span style="color:#9a6e3a"><</span>cmdline<span style="color:#9a6e3a">></span><span style="color:#999999">]</span>   Number of processes. Returns an integer
        zabbix_get -s 172.16.1.8 -p 10050 -k <span style="color:#669900">'proc.num[nginx]'</span>
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.31 -p 10050 -k "proc.num[nfsd]"</span>
8
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>

How do we monitor a port?

<span style="color:#333333"><span style="color:black"><code class="language-bash">        net.tcp.listen<span style="color:#999999">[</span>port<span style="color:#999999">]</span>    Checks whether the TCP port is in the LISTEN state. Returns 0 (not listening) or 1 (listening)
        net.tcp.port<span style="color:#999999">[</span><span style="color:#9a6e3a"><</span>ip<span style="color:#9a6e3a">></span>,port<span style="color:#999999">]</span> Checks whether a TCP connection to the given port can be made. Returns 0 (cannot connect) or 1 (can connect)
        <span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k 'net.tcp.port[,80]'</span>
        1
        <span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k 'net.tcp.port[,873]'</span>
        1
        <span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k 'net.tcp.port[873]'</span>
        ZBX_NOTSUPPORTED: Invalid second parameter.</code></span></span>
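To confirm by hand what `net.tcp.listen` should report, the listening sockets on the agent host can be inspected directly. This is a rough manual equivalent, not what the agent runs internally. It assumes the `ss` tool is installed, and `is_listening` is a hypothetical helper name.

```shell
# Print 1 if a TCP socket is listening on the given port, otherwise 0.
# Reads `ss -ltn` style output on stdin; the local address is column 4 and
# the port is whatever follows the last colon in that column.
is_listening() {
    awk -v port="$1" '
        NR > 1 { n = split($4, a, ":"); if (a[n] == port) found = 1 }
        END    { print found ? 1 : 0 }
    '
}

# Example: ss -ltn | is_listening 873
```

If this prints 1 on the client while the Zabbix item returns 0, the problem is more likely in the item key or its parameters than in the service.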

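7.3 Monitoring three web servers

<span style="color:#333333"><span style="color:black"><code class="language-bash">Create a template in the web interface to monitor port 80
Run a page check on each of the three web servers
Write the check script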
<span style="color:#999999">[</span>root@web01 conf<span style="color:#999999">]</span><span style="color:slategray"># cat  /server/scripts/nginx_check.sh</span>
char<span style="color:#9a6e3a">=</span><span style="color:#ee9900"><span style="color:#ee9900">`</span><span style="color:#dd4a68">curl</span> -s http://10.0.0.8/test.html<span style="color:#ee9900">`</span></span>
<span style="color:#999999">[</span> <span style="color:#669900">"<span style="color:#ee9900">$char</span>"</span> <span style="color:#9a6e3a">==</span> <span style="color:#669900">"oldboy"</span> <span style="color:#999999">]</span> <span style="color:#9a6e3a">&&</span> <span style="color:#0077aa">echo</span> 1 <span style="color:#9a6e3a">||</span><span style="color:#0077aa">echo</span> 0
<span style="color:#999999">[</span>root@web01 conf<span style="color:#999999">]</span><span style="color:slategray">#</span>
Create the UserParameter file
<span style="color:#999999">[</span>root@web01 conf<span style="color:#999999">]</span><span style="color:slategray"># cat  /etc/zabbix/zabbix_agentd.d/userparameter_nginx.conf</span>
UserParameter<span style="color:#9a6e3a">=</span>nginx_check,/bin/sh /server/scripts/nginx_check.sh
<span style="color:#999999">[</span>root@web01 conf<span style="color:#999999">]</span><span style="color:slategray">#</span>
Restart the agent
<span style="color:#999999">[</span>root@web01 conf<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Test from the monitoring server
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8 -p 10050 -k "nginx_check"</span>
1
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.7 -p 10050 -k "nginx_check"</span>
1
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.9 -p 10050 -k "nginx_check"</span>
1
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray">#</span>
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.7 -p 10050 -k "nginx_check"</span>
0  <span style="color:slategray"># 0 means abnormal, 1 means normal</span>
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
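A page check like the one above can hang if the web server accepts the connection but never answers, and the agent gives up on an item after its own timeout (3 seconds by default). Below is a sketch of the same check with an explicit curl timeout. `page_check` is a hypothetical name; the URL and the `oldboy` marker are the ones used in the script above.

```shell
# Print 1 when the page body equals the expected marker, 0 otherwise.
# --max-time bounds the whole request so that the check cannot hang.
page_check() {
    url=$1 expected=$2
    body=$(curl -s --max-time 3 "$url")
    if [ "$body" = "$expected" ]; then
        echo 1
    else
        echo 0
    fi
}

# Example: page_check http://10.0.0.8/test.html oldboy
```

Taking the URL as a parameter also lets the same script serve all three web servers instead of hard-coding one address per host.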

7.4 Monitoring MySQL

How to troubleshoot a failing custom item:

<span style="color:#333333"><span style="color:black"><code class="language-bash">UserParameter<span style="color:#9a6e3a">=</span>key,shell <span style="color:#dd4a68">command</span>
1. First test the shell command on the command line and confirm that the result is what you expect
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># mysqladmin -uroot -poldboy123 ping 2>/dev/null|grep -c alive</span>
1
2. Write the shell command that behaves as expected into the custom UserParameter file
UserParameter<span style="color:#9a6e3a">=</span>mysql.ping,HOME<span style="color:#9a6e3a">=</span>/var/lib/zabbix mysqladmin <span style="color:#dd4a68">ping</span> <span style="color:#9a6e3a">|</span> <span style="color:#dd4a68">grep</span> -c alive
change it to
UserParameter<span style="color:#9a6e3a">=</span>mysql.ping,HOME<span style="color:#9a6e3a">=</span>/var/lib/zabbix mysqladmin -uroot -poldboy123 <span style="color:#dd4a68">ping</span> 2<span style="color:#9a6e3a">></span>/dev/null<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">grep</span> -c alive
 
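3. After changing the configuration file, remember to restart the agent so that the change takes effect
4. Test the key from the server with zabbix_get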
 
<span style="color:#999999">[</span>root@db01 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># tail -2 userparameter_mysql.conf</span>
UserParameter<span style="color:#9a6e3a">=</span>mysql.ping,/application/mysql/bin/mysqladmin -uroot -p123456  <span style="color:#dd4a68">ping</span> 2<span style="color:#9a6e3a">></span>/dev/null <span style="color:#9a6e3a">|</span> <span style="color:#dd4a68">grep</span> -c alive
UserParameter<span style="color:#9a6e3a">=</span>mysql.version,mysql -V
Restart the agent
 <span style="color:#999999">[</span>root@db01 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Test from the server
<span style="color:#999999">[</span>root@m01 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.51 -p 10050 -k 'mysql.ping'</span>
0<span style="color:slategray"># the mysql command could not be found; use the absolute path to the command in the agent's UserParameter file</span>
<span style="color:#999999">[</span>root@m01 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.51 -p 10050 -k 'mysql.ping'</span>
1
<span style="color:#999999">[</span>root@m01 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
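The original template line sets `HOME=/var/lib/zabbix` so that `mysqladmin` can read its credentials from a `.my.cnf` file in that directory instead of taking them on the command line, where they show up in the process list. Below is a sketch of creating such a file. `write_mycnf` is a hypothetical helper, and the user and password in the example are the ones used above.

```shell
# Write a client option file that mysql and mysqladmin read from $HOME/.my.cnf.
# The subshell keeps the restrictive umask from leaking into the caller.
write_mycnf() {
    dir=$1 user=$2 password=$3
    mkdir -p "$dir"
    (
        umask 077
        printf '[client]\nuser=%s\npassword=%s\n' "$user" "$password" > "$dir/.my.cnf"
    )
}

# Example: write_mycnf /var/lib/zabbix root oldboy123
# The file must be readable by the account the agent runs as (typically zabbix).
```

With the file in place, the UserParameter can stay in the form `HOME=/var/lib/zabbix mysqladmin ping | grep -c alive`, with `mysqladmin` written as an absolute path if it is not on the agent's PATH.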

7.5 Monitoring a URL for a more precise check that the site is working

image.png

image.png

image.png

image.png

image.png

 

Finally, view the results under Monitoring > Web

image.png

7.6 Monitoring the seven Nginx connection status values

<span style="color:#333333"><span style="color:black"><code class="language-bash">Edit the configuration and add a server block for the status page
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray"># cat /application/nginx/conf/nginx.conf</span>
worker_processes  1<span style="color:#999999">;</span>
events <span style="color:#999999">{</span>
    worker_connections  1024<span style="color:#999999">;</span>
<span style="color:#999999">}</span>
http <span style="color:#999999">{</span>
    include       mime.types<span style="color:#999999">;</span>
    default_type  application/octet-stream<span style="color:#999999">;</span>
    sendfile        on<span style="color:#999999">;</span>
    keepalive_timeout  65<span style="color:#999999">;</span>
<span style="color:slategray">######status#########</span>
    server <span style="color:#999999">{</span>
        listen       127.0.0.1:80<span style="color:#999999">;</span>
            stub_status on<span style="color:#999999">;</span>
            access_log off<span style="color:#999999">;</span>
        <span style="color:#999999">}</span>
server <span style="color:#999999">{</span>
        listen       10.0.0.9<span style="color:#999999">;</span>
        location / <span style="color:#999999">{</span>
            root   html<span style="color:#999999">;</span>
            index  index.html index.htm<span style="color:#999999">;</span>
                   <span style="color:#999999">}</span>
      <span style="color:#999999">}</span>
include extra/www.conf<span style="color:#999999">;</span>
include extra/bbs.conf<span style="color:#999999">;</span>
include extra/blog.conf<span style="color:#999999">;</span>
 <span style="color:#999999">}</span>
Restart nginx
Check the status page
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray"># curl 127.0.0.1/nginx_status</span>
Active connections: 1
server accepts handled requests
 8 8 8
Reading: 0 Writing: 1 Waiting: 0
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray">#</span>
Create the UserParameter file (the heredoc delimiter is quoted so that the shell does not expand $NF, $1 and the other awk fields before they reach the file)
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray"># cat >> /etc/zabbix/zabbix_agentd.d/userparameter_nginx_status.conf <<'END'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_active,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'/Active/ {print <span style="color:#ee9900">$NF</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_accepts,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==3 {print <span style="color:#ee9900">$1</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_handled,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==3 {print <span style="color:#ee9900">$2</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_requests,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==3 {print <span style="color:#ee9900">$3</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_reading,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==4 {print <span style="color:#ee9900">$2</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_writing,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==4 {print <span style="color:#ee9900">$4</span>}'</span>
<span style="color:#9a6e3a">></span> UserParameter<span style="color:#9a6e3a">=</span>nginx_waiting,curl -s  127.0.0.1/nginx_status<span style="color:#9a6e3a">|</span><span style="color:#dd4a68">awk</span> <span style="color:#669900">'NR==4 {print <span style="color:#ee9900">$6</span>}'</span>
<span style="color:#9a6e3a">></span> END
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@web03 ~<span style="color:#999999">]</span><span style="color:slategray">#</span>
Test from the server
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8  -p 10050 -k "nginx_waiting"</span>
0
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray"># zabbix_get -s 172.16.1.8  -p 10050 -k "nginx_accepts"</span>
58
<span style="color:#999999">[</span>root@m01 bin<span style="color:#999999">]</span><span style="color:slategray">#</span>
Configure the custom items in the web interface</code></span></span>
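Each of the seven keys picks one field out of the four-line `stub_status` page. The sketch below puts the field positions in one place so they can be checked against sample output without a running nginx. `nginx_stat` is a hypothetical helper name.

```shell
# Extract one counter from stub_status output read on stdin.
# The awk programs are single-quoted, so the shell leaves $1, $NF and the
# other field references untouched.
nginx_stat() {
    case $1 in
        active)   awk '/^Active/ {print $NF}' ;;
        accepts)  awk 'NR==3 {print $1}' ;;
        handled)  awk 'NR==3 {print $2}' ;;
        requests) awk 'NR==3 {print $3}' ;;
        reading)  awk 'NR==4 {print $2}' ;;
        writing)  awk 'NR==4 {print $4}' ;;
        waiting)  awk 'NR==4 {print $6}' ;;
        *)        echo "unknown counter: $1" >&2; return 1 ;;
    esac
}

# Example: curl -s 127.0.0.1/nginx_status | nginx_stat waiting
```

A key that returns a whole line such as `Reading: 0 Writing: 1 Waiting: 0` instead of a single number usually means the `$N` field reference was lost before it reached awk, for example through shell expansion while the config file was being written.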

image.png

image.png

image.png

image.png

image.png

image.png

 

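Chapter 8 Auto discovery and auto registration

8.1 Auto discovery: the server finds the clients

The server scans the LAN and discovers all client agents by itself (the server takes the initiative; the agents only answer)

Pros: every client is found, so none is overlooked

       Cons: with many agents the load on the server is high, because the server rescans every machine on the LAN at regular intervals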

image.png

image.png

image.png

image.png

image.png

image.png

image.png

image.png

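8.2 Auto registration: clients register themselves with the server

Every client agent contacts the server and registers itself (like a newcomer knocking on the door and asking to be taken in). The server only waits for registrations; on the agent side this relies on active checks (ServerActive)

       Pros: the lowest load on the server.

       Cons: slightly more configuration work

Use case: when auto discovery puts too much load on the server, use auto registration instead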

<span style="color:#333333"><span style="color:black"><code class="language-bash">Edit the agent configuration file
<span style="color:#999999">[</span>root@web03 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># sed -i 's#Hostname=Zabbix server#Hostname=web03#' /etc/zabbix/zabbix_agentd.conf</span>
<span style="color:#999999">[</span>root@web03 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># sed -i 's#ServerActive=127.0.0.1#ServerActive=172.16.1.61#' /etc/zabbix/zabbix_agentd.conf</span>
<span style="color:#999999">[</span>root@web03 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray">#  sed -i '176a HostMetadataItem=system.uname' /etc/zabbix/zabbix_agentd.conf</span>
Restart the agent
<span style="color:#999999">[</span>root@web03 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@web03 zabbix_agentd.d<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>
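The `sed` commands above assume that the file still holds the packaged default values and that line 176 is the right place for the new key. Below is a sketch that does not depend on either assumption. `set_conf_key` is a hypothetical helper, and it expects values that do not contain `#`, which it uses as the sed delimiter.

```shell
# Set key=value in a Zabbix-style config file: rewrite every active
# "key=" line if one exists, otherwise append the setting at the end.
set_conf_key() {
    file=$1 key=$2 value=$3
    if grep -q "^$key=" "$file"; then
        # Write through a temporary copy so the original file keeps its inode.
        sed "s#^$key=.*#$key=$value#" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example:
# set_conf_key /etc/zabbix/zabbix_agentd.conf Hostname web03
# set_conf_key /etc/zabbix/zabbix_agentd.conf ServerActive 172.16.1.61
# set_conf_key /etc/zabbix/zabbix_agentd.conf HostMetadataItem system.uname
```

Commented-out lines such as `# Hostname=` are left alone because the pattern is anchored at the start of the line.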

image.png

image.png

image.png

image.png

image.png

image.png

image.png

image.png

image.png

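Chapter 9 Distributed monitoring

By default, a server can only monitor machines on its own LAN

The number of hosts a single server can monitor is limited

What a proxy provides:

It reduces the load on the server

It lets the server monitor hosts outside its own LAN, which the server cannot do alone

9.1 Standard installation and deployment

9.1.1 Downloading and installing on the client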

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@web01 tools<span style="color:#999999">]</span><span style="color:slategray"># wget http://192.168.12.200/zabbix/zabbix-proxy-mysql-3.0.9-1.el6.x86_64.rpm</span>
<span style="color:#0077aa">echo</span> <span style="color:#669900">"192.168.12.200 mirrors.aliyun.com"</span> <span style="color:#9a6e3a">>></span> /etc/hosts
<span style="color:#dd4a68">wget</span> -O /etc/yum.repos.d/CentOS-Base.repo http://192.168.12.200/repo/Centos-6.repo
<span style="color:#dd4a68">wget</span> -O /etc/yum.repos.d/epel.repo http://192.168.12.200/repo/epel-6.repo
 
yum clean all</code></span></span>
<span style="color:#333333"><span style="color:black"><code class="language-bash">yum -y  localinstall zabbix-proxy-mysql-3.0.9-1.el6.x86_64.rpm</code></span></span>

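9.1.2 Installing and configuring the database

<span style="color:#333333"><span style="color:black"><code class="language-bash">The Zabbix proxy needs a database to hold its configuration and to buffer collected values temporarily; it does not keep monitoring data long term
Production environment: on the proxy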
 
1  
mysql<span style="color:#9a6e3a">></span> create database zabbix_proxy character <span style="color:#0077aa">set</span> utf8 collate utf8_bin<span style="color:#999999">;</span>
Query OK, 1 row affected <span style="color:#999999">(</span>0.06 sec<span style="color:#999999">)</span>
 
mysql<span style="color:#9a6e3a">></span> grant all privileges on zabbix_proxy.* to zabbix@<span style="color:#669900">'172.16.1.%'</span> identified by <span style="color:#669900">'zabbix'</span><span style="color:#999999">;</span>
Query OK, 0 rows affected <span style="color:#999999">(</span>0.15 sec<span style="color:#999999">)</span>
 
mysql<span style="color:#9a6e3a">></span></code></span></span>

9.1.3 Importing the SQL schema

<span style="color:#333333"><span style="color:black"><code class="language-bash">Copy the schema file from the web client to the server
<span style="color:#dd4a68">scp</span> /usr/share/doc/zabbix-proxy-mysql-3.0.9/schema.sql.gz 10.0.0.61:/server/
Import the file on m01
zcat /server/schema.sql.gz <span style="color:#9a6e3a">|</span>mysql -uroot   zabbix_proxy</code></span></span>

9.1.4 Editing the configuration file

<span style="color:#333333"><span style="color:black"><code class="language-bash">vim /etc/zabbix/zabbix_proxy.conf
Server<span style="color:#9a6e3a">=</span>10.0.0.61
Hostname<span style="color:#9a6e3a">=</span>web01
DBName<span style="color:#9a6e3a">=</span>zabbix_proxy
DBUser<span style="color:#9a6e3a">=</span>zabbix
DBPassword<span style="color:#9a6e3a">=</span>zabbix
DBHost<span style="color:#9a6e3a">=</span>172.16.1.61</code></span></span>

 

9.1.5 Starting the proxy and checking the log

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@web01 tools<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-proxy start</span>
Starting Zabbix proxy:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@web01 tools<span style="color:#999999">]</span><span style="color:slategray"># tailf /var/log/zabbix/zabbix_proxy.log</span>
 39637:20171013:113558.590 cannot send heartbeat message to server at <span style="color:#669900">"172.16.1.61"</span><span style="color:#0077aa">:</span> proxy <span style="color:#669900">"web01"</span> not found
 39651:20171013:113558.608 proxy <span style="color:slategray">#16 started [housekeeper #1]</span>
 39652:20171013:113558.618 proxy <span style="color:slategray">#17 started [http poller #1]</span>
 39654:20171013:113558.619 proxy <span style="color:slategray">#19 started [history syncer #1]</span>
 39650:20171013:113558.621 proxy <span style="color:slategray">#15 started [icmp pinger #1]</span>
 39653:20171013:113558.621 proxy <span style="color:slategray">#18 started [discoverer #1]</span>
 39655:20171013:113558.626 proxy <span style="color:slategray">#20 started [history syncer #2]</span>
 39658:20171013:113558.639 proxy <span style="color:slategray">#23 started [self-monitoring #1]</span>
 39657:20171013:113558.642 proxy <span style="color:slategray">#22 started [history syncer #4]</span>
 39656:20171013:113558.648 proxy <span style="color:slategray">#21 started [history syncer #3]</span></code></span></span>
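The first log line, `proxy "web01" not found`, means the server does not yet know a proxy by that name. It should stop appearing once the proxy has been added in the web interface (section 9.2) under a name that matches `Hostname` in `zabbix_proxy.conf`. Below is a sketch for pulling such names out of the log; `unregistered_proxies` is a hypothetical helper name.

```shell
# Print each proxy name that the server rejected, based on the heartbeat
# errors found in zabbix_proxy.log content read on stdin.
unregistered_proxies() {
    sed -n 's/.*cannot send heartbeat message.*proxy "\([^"]*\)" not found.*/\1/p' | sort -u
}

# Example: unregistered_proxies < /var/log/zabbix/zabbix_proxy.log
```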

9.1.6 Configuring the Zabbix agent

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># sed -i 's#172.16.1.61#172.16.1.8#g' /etc/zabbix/zabbix_agentd.conf</span>
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/zabbix-agent restart</span>
Shutting down Zabbix agent:                                <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
Starting Zabbix agent:                                     <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@web01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span></code></span></span>

9.2 Setting up the proxy in the web interface

Register the proxy node with the Zabbix server

 

 image.png

image.png

image.png

image.png

image.png

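Chapter 10 SNMP monitoring (switch monitoring)

Use the Zabbix agent whenever the Zabbix software can be installed on the system; smart devices such as switches and printers cannot run it.

SNMP is designed specifically for monitoring devices

Pros: lightweight, so virtually every device supports it

Cons: limited functionality

Recommendation for production: use the agent where possible, and fall back to SNMP where the agent cannot be installed.

10.1 Installing and starting the service on Linux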

<span style="color:#333333"><span style="color:black"><code class="language-bash"><span style="color:#999999">[</span>root@m01 server<span style="color:#999999">]</span><span style="color:slategray"># rpm -qa |grep snmp</span>
net-snmp-5.5-60.el6.x86_64
net-snmp-utils-5.5-60.el6.x86_64
net-snmp-libs-5.5-60.el6.x86_64
<span style="color:#999999">[</span>root@m01 server<span style="color:#999999">]</span><span style="color:slategray">#</span>
Install the packages if they are missing
    yum -y <span style="color:#dd4a68">install</span> net-snmp net-snmp-utils
Configure snmpd
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># sed -i.ori '57a view systemview   included  .1' /etc/snmp/snmpd.conf</span>
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># /etc/init.d/snmpd start</span>
Starting snmpd:                                            <span style="color:#999999">[</span>  OK  <span style="color:#999999">]</span>
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray">#</span>
Query it
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># snmpwalk -v 2c -c public 127.0.0.1 sysname</span>
SNMPv2-MIB::sysName.0 <span style="color:#9a6e3a">=</span> STRING: m01
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># snmpwalk -v 2c -c public 127.0.0.1 sysContact</span>
SNMPv2-MIB::sysContact.0 <span style="color:#9a6e3a">=</span> STRING: Root <span style="color:#9a6e3a"><</span>root@localhost<span style="color:#9a6e3a">></span> <span style="color:#999999">(</span>configure /etc/snmp/snmp.local.conf<span style="color:#999999">)</span>
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># snmpwalk -v 2c -c public 127.0.0.1 SysService</span>
SNMPv2-MIB::sysServices <span style="color:#9a6e3a">=</span> No Such Instance currently exists at this OID
<span style="color:#999999">[</span>root@m01 ~<span style="color:#999999">]</span><span style="color:slategray"># snmpwalk -v 2c -c public 127.0.0.1 hrSWRunName</span>
HOST-RESOURCES-MIB::hrSWRunName.1 <span style="color:#9a6e3a">=</span> STRING: <span style="color:#669900">"init"</span>
HOST-RESOURCES-MIB::hrSWRunName.2 <span style="color:#9a6e3a">=</span> STRING: <span style="color:#669900">"kthreadd"</span>
HOST-RESOURCES-MIB::hrSWRunName.3 <span style="color:#9a6e3a">=</span> STRING: <span style="color:#669900">"migration/0"</span>
HOST-RESOURCES-MIB::hrSWRunName.4 <span style="color:#9a6e3a">=</span> STRING: <span style="color:#669900">"ksoftirqd/0"</span>
HOST-RESOURCES-MIB::hrSWRunName.5 <span style="color:#9a6e3a">=</span> STRING: <span style="color:#669900">"stopper/0"</span>
OID reference: http://www.ttlsa.com/monitor/snmp-oid/</code></span></span>
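When the output of `snmpwalk` feeds a script, only the value after the type tag is usually of interest. Below is a sketch of stripping the `OID = TYPE: ` prefix from lines like the ones above. `snmp_value` is a hypothetical helper name; net-snmp's own `-Oqv` output option is another way to get a similar result.

```shell
# Strip the "OID = TYPE: " prefix from snmpwalk or snmpget output on stdin.
# Lines without a "TYPE: " part, such as "No Such Instance" errors, are dropped.
snmp_value() {
    sed -n 's/^[^=]*= [A-Za-z0-9-]*: //p'
}

# Example: snmpwalk -v 2c -c public 127.0.0.1 sysName | snmp_value
```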

10.2 Configuring SNMP monitoring in the web interface

image.png

image.png

image.png

image.png

Original article: http://blog.51cto.com/shengge520/2054879


Reposted from blog.csdn.net/qq_37960324/article/details/82182298