day14 - Deploying Nagios on CentOS 7 (client and server installation) (ob16)

Reference: https://www.cnblogs.com/benjamin77/p/8565798.html "centos7部署nagios" (Deploying Nagios on CentOS 7)
Environment: CentOS 7
Server IP: 192.168.26.136 (Nagios monitoring server: Apache, PHP, Nagios, nagios-plugins, nginx)
Client IP: 192.168.26.137 (LAMP server, the monitored client: nagios-plugins, nrpe)

I. Server installation

1.1 How Nagios monitoring works

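Nagios monitors services on remote hosts through NRPE:
1. Nagios runs its locally installed check_nrpe plugin and tells check_nrpe which service to check.
2. check_nrpe connects over SSL to the NRPE daemon on the remote machine.
3. NRPE runs the local plugins (check_disk, etc.) to check local services and status.
4. NRPE returns the results to check_nrpe on the monitoring host, and check_nrpe passes them to the Nagios status queue.
5. Nagios reads the queue in order and displays the results.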
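
Steps 3 and 4 rely on the standard Nagios plugin convention: a plugin reports its result through its exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus a line of text. The following is only a minimal sketch of that convention, not code from Nagios or NRPE; the helper names nagios_state and run_check are made up for illustration.

```shell
# Map a plugin exit status to its Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper: run any plugin-like command, capture its
# output and exit status, and print "<STATE>: <output>".
run_check() {
    if output=$("$@"); then
        status=0
    else
        status=$?
    fi
    echo "$(nagios_state "$status"): $output"
}

# Demo with a fake plugin that prints a message and exits with 1.
run_check sh -c 'echo "disk usage 42%"; exit 1'
```

On a real client the argument would be a plugin such as /usr/local/nagios/libexec/check_disk with its options.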

1.2 Preparation before installing Nagios on the server

1. Work around the Perl compilation problem

echo 'export LC_ALL=C'>> /etc/profile
source /etc/profile

2. Stop the firewall

systemctl stop firewalld

3. Synchronize the system time

ntpdate pool.ntp.org
echo '*/10 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1'>>/var/spool/cron/root
or
run crontab -e and paste the scheduled sync line above into it.
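
Appending with >> adds a duplicate entry every time the command is rerun. Below is a minimal sketch of an idempotent variant. add_cron_line is a hypothetical helper, and the demo writes to a temporary file instead of the real root crontab (assumed to be /var/spool/cron/root on CentOS 7).

```shell
# Append a line to a crontab-style file only if that exact line is missing.
add_cron_line() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

entry='*/10 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1'

# Demo against a temp file; on the server the target would be the root crontab.
demo_file=$(mktemp)
add_cron_line "$demo_file" "$entry"
add_cron_line "$demo_file" "$entry"   # second call adds nothing
cat "$demo_file"
rm -f "$demo_file"
```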

4. Required base packages

yum install gcc glibc glibc-common -y
yum install gd gd-devel -y
yum install httpd php php-gd -y
yum -y install httpd httpd-devel gcc glibc glibc-common gd gd-devel perl-devel perl-CPAN fcgi perl-FCGI perl-FCGI-ProcManager

At this point the installed versions are httpd-2.4.6 and php-5.4.16.

1.3 Create the users and groups that Nagios needs

/usr/sbin/useradd -m nagios
/usr/sbin/useradd apache
/usr/sbin/useradd nagcmd

/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache 

# Verify
id -n -G nagios
id -n -G apache 
groups nagios
groups apache 
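
The same check can be scripted instead of read by eye. This is a small sketch, assuming only the standard id command; in_group is a hypothetical helper name.

```shell
# Succeed if user $1 is a member of group $2.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Report membership in nagcmd for the two accounts used in this section.
for u in nagios apache; do
    if in_group "$u" nagcmd; then
        echo "$u: in nagcmd"
    else
        echo "$u: NOT in nagcmd"
    fi
done
```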

1.4 Install Nagios

#nagios-4.3.1.tar.gz  nagios-plugins-2.2.1.tar.gz  nrpe-3.1.0.tar.gz
cd /server/tools
wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.3.1.tar.gz
wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz 
wget https://sourceforge.net/projects/nagios/files/nrpe-3.x/nrpe-3.1.0.tar.gz

tar -zxf nagios-4.3.1.tar.gz
cd nagios-4.3.1/
./configure --with-command-group=nagcmd

make all

# Then:
make install
make install-init
make install-config
make install-commandmode
make install-webconf

cat /etc/httpd/conf.d/nagios.conf

1.5 Create the login account and password for the Nagios web interface

# oldboy / 123456
htpasswd -c /usr/local/nagios/etc/htpasswd.users oldboy
[root@lb-136 nagios-4.3.1]# cat /usr/local/nagios/etc/htpasswd.users
oldboy:$apr1$78E5XW/o$KWeQOuJlOwuem.y9y4Df8/

The password file path used here corresponds to the AuthUserFile setting in nagios.conf.

1.6 Set the email address that receives monitoring alerts (around line 35 of contacts.cfg)

sed -i 's#nagios@localhost#[email protected]#g' /usr/local/nagios/etc/objects/contacts.cfg

# Reference: https://blog.51cto.com/13043516/2139030
yum -y install sendmail
yum -y install mailx
systemctl start sendmail
netstat -lnt | grep 25

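1.7 Configure and start Apache

Main configuration file:
For a source build it is conf/httpd.conf under the install directory you specified; for a yum install it is /etc/httpd/conf/httpd.conf.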

systemctl start httpd
systemctl enable httpd
apachectl graceful
lsof -i:80

1.8 Install the Nagios plugins package

tar zxf nagios-plugins-2.2.1.tar.gz
cd nagios-plugins-2.2.1/
./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules
make
make install

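# Count the installed plugins (plain ls is used because ll also prints a "total" line, which would inflate the count by one)
ls /usr/local/nagios/libexec/ | wc -l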

1.9 Start Nagios

chkconfig nagios on
/etc/init.d/nagios start
# Check the configuration syntax; 0 errors means it is valid
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios checkconfig

Visit: http://192.168.26.136/nagios/ (oldboy / 123456)

1.10 Install NRPE

yum -y install openssl-devel
tar -zxf nrpe-3.1.0.tar.gz
cd nrpe-3.1.0/
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config

ls 

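With nrpe 3.1.0, make install-daemon-config reports an error. The likely cause is that the new version no longer has this target (NRPE 3.x appears to provide make install-config instead), so the error is ignored here.
This completes the Nagios server installation.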

II. Client installation

ntpdate time.windows.com
yum install perl-devel perl-CPAN -y

/usr/sbin/adduser nagios -M -s /sbin/nologin
# Install nagios-plugins
tar -zxf nagios-plugins-2.2.1.tar.gz
cd nagios-plugins-2.2.1/
./configure \
--with-nagios-user=nagios --with-nagios-group=nagios --enable-perl-modules
make
make install
# Count the plugins
ls /usr/local/nagios/libexec/ | wc -l
55

# Install nrpe
yum -y install openssl-devel 
tar zxf nrpe-3.1.0.tar.gz
cd nrpe-3.1.0/
./configure
make all
make install-plugin
make install-daemon
# Generate nrpe.cfg (if this target fails on nrpe 3.1.0, the cp below installs the sample config instead)
make install-daemon-config
cp sample-config/nrpe.cfg /usr/local/nagios/etc/nrpe.cfg

yum install sysstat -y

2.1 Test-start nrpe (client)

# -d starts it as a background daemon, -c specifies the config file
 /usr/local/nagios/bin/nrpe -d -c /usr/local/nagios/etc/nrpe.cfg 
[root@nagios-client nrpe-3.1.0]# echo "/usr/local/nagios/bin/nrpe -d -c /usr/local/nagios/etc/nrpe.cfg" >> /etc/rc.local
[root@nagios-client nrpe-3.1.0]# chmod +x /etc/rc.d/rc.local           # required on CentOS 7; otherwise the contents of /etc/rc.local may not run at boot
[root@nagios-client nrpe-3.1.0]# netstat -lnput|grep 5666
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      28296/nrpe          
tcp6       0      0 :::5666                 :::*                    LISTEN      28296/nrpe 
[root@nagios-client nrpe-3.1.0]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v3.1.0-rc1

2.2 Edit the configuration file

Allow access from the server IP and from the local machine; 192.168.26.136 is the IP address of the Nagios server.

cd /usr/local/nagios/etc/
vi nrpe.cfg

allowed_hosts=127.0.0.1,::1,192.168.26.136
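
If you prefer not to edit the file by hand in vi, the same change can be scripted. This is a rough sketch under the assumption that nrpe.cfg holds a single allowed_hosts= line; set_allowed_hosts is a hypothetical helper, and the demo runs on a temporary file rather than the real nrpe.cfg.

```shell
# Set (or append) the allowed_hosts line in an nrpe.cfg-style file.
set_allowed_hosts() {
    cfg=$1
    hosts=$2
    if grep -q '^allowed_hosts=' "$cfg"; then
        # Rewrite via a temporary file so the edit does not depend on sed -i.
        sed "s|^allowed_hosts=.*|allowed_hosts=$hosts|" "$cfg" > "$cfg.new" &&
            mv "$cfg.new" "$cfg"
    else
        printf 'allowed_hosts=%s\n' "$hosts" >> "$cfg"
    fi
}

# Demo on a throwaway file with made-up contents.
demo_cfg=$(mktemp)
printf 'server_port=5666\nallowed_hosts=127.0.0.1,::1\n' > "$demo_cfg"
set_allowed_hosts "$demo_cfg" '127.0.0.1,::1,192.168.26.136'
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Because the file is replaced through mv, check its ownership and permissions afterwards if you run this against the real configuration.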

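Comment out lines 304~308 of nrpe.cfg and replace them with the content below.
Note: in command[ ], the name inside the brackets is the command name; -w is the warning threshold and -c is the critical threshold. An alert is raised once these values are crossed.
Paths such as /usr/local/nagios/libexec/check_users are the monitoring plugins/scripts that nrpe runs.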

# my custom monitor items
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -r -w .15,.10,.05 -c .30,.25,.20
# -p is followed by the disk partition to check
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_mem]=/usr/local/nagios/libexec/check_mem.pl -w 90% -c 95%
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%

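Note*: the commands above monitor, in order, logged-in users, load, disk, memory, and swap. These are all local resources of the client (this post calls it "passive monitoring", although in Nagios terminology checks that the server initiates through NRPE are active checks). The Nagios server uses the check_nrpe plugin to query the nrpe service on the client at regular intervals.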
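
To make the -w / -c semantics concrete, here is a minimal sketch of the comparison that check_mem.pl performs on a usage percentage (strictly greater than the threshold). It only illustrates the "higher is worse" case; plugins such as check_disk compare free space the other way round. classify is a hypothetical helper name.

```shell
# Classify an integer usage value against warning and critical thresholds.
# Prints the state name and returns the matching Nagios exit status.
classify() {
    value=$1
    warn=$2
    crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# 92% usage with -w 90 -c 95 falls between the two thresholds.
classify 92 90 95 || echo "exit status: $?"
```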

Create check_mem.pl

vi /usr/local/nagios/libexec/check_mem.pl
chmod 755 /usr/local/nagios/libexec/check_mem.pl

#! /usr/bin/perl -w
#
# $Id: check_mem.pl 8 2008-08-23 08:59:52Z rhomann $
#
# check_mem v1.7 plugin for nagios
#
# uses the output of `free` to find the percentage of memory used
#
# Copyright Notice: GPL
#
# History:
# v1.8 Rouven Homann - [email protected]
# + added findbin patch from Duane Toler
# + added backward compatibility patch from Timour Ezeev
#
# v1.7 Ingo Lantschner - ingo AT boxbe DOT com
# + adapted for systems with no swap (avoiding divison through 0)
#
# v1.6 Cedric Temple - cedric DOT temple AT cedrictemple DOT info
# + add swap monitoring
#       + if warning and critical threshold are 0, exit with OK
#       + add a directive to exclude/include buffers
#
# v1.5 Rouven Homann - [email protected]
# + perfomance tweak with free -mt (just one sub process started instead of 7)
# + more code cleanup
#
# v1.4 Garrett Honeycutt - [email protected]
# + Fixed PerfData output to adhere to standards and show crit/warn values
#
# v1.3 Rouven Homann - [email protected]
#   + Memory installed, used and free displayed in verbose mode
# + Bit Code Cleanup
#
# v1.2 Rouven Homann - [email protected]
# + Bug fixed where verbose output was required (nrpe2)
#       + Bug fixed where perfomance data was not displayed at verbose output
# + FindBin Module used for the nagios plugin path of the utils.pm
#
# v1.1 Rouven Homann - [email protected]
#     + Status Support (-c, -w)
# + Syntax Help Informations (-h)
#       + Version Informations Output (-V)
# + Verbose Output (-v)
#       + Better Error Code Output (as described in plugin guideline)
#
# v1.0 Garrett Honeycutt - [email protected]
#   + Initial Release
#
use strict;
use FindBin;
FindBin::again();
use lib $FindBin::Bin;
use utils qw($TIMEOUT %ERRORS &print_revision &support);
use vars qw($PROGNAME $PROGVER);
use Getopt::Long;
use vars qw($opt_V $opt_h $verbose $opt_w $opt_c);

$PROGNAME = "check_mem";
$PROGVER = "1.8";

# add a directive to exclude buffers:
my $DONT_INCLUDE_BUFFERS = 0;

sub print_help ();
sub print_usage ();

Getopt::Long::Configure('bundling');
GetOptions ("V"   => \$opt_V, "version"    => \$opt_V,
  "h"   => \$opt_h, "help"       => \$opt_h,
        "v" => \$verbose, "verbose"  => \$verbose,
  "w=s" => \$opt_w, "warning=s"  => \$opt_w,
  "c=s" => \$opt_c, "critical=s" => \$opt_c);

if ($opt_V) {
  print_revision($PROGNAME,'$Revision: '.$PROGVER.' $');
  exit $ERRORS{'UNKNOWN'};
}

if ($opt_h) {
  print_help();
  exit $ERRORS{'UNKNOWN'};
}

print_usage() unless (($opt_c) && ($opt_w));

my ($mem_critical, $swap_critical);
my ($mem_warning, $swap_warning);
($mem_critical, $swap_critical) = ($1,$2) if ($opt_c =~ /([0-9]+)[%]?(?:,([0-9]+)[%]?)?/);
($mem_warning, $swap_warning)   = ($1,$2) if ($opt_w =~ /([0-9]+)[%]?(?:,([0-9]+)[%]?)?/);

# Check if swap params were supplied
$swap_critical ||= 100;
$swap_warning  ||= 100;

# print threshold in output message
my $mem_threshold_output = " (";
my $swap_threshold_output = " (";

if ( $mem_warning > 0 && $mem_critical > 0) {
  $mem_threshold_output .= "W> $mem_warning, C> $mem_critical";
}
elsif ( $mem_warning > 0 ) {
  $mem_threshold_output .= "W> $mem_warning";
}
elsif ( $mem_critical > 0 ) {
  $mem_threshold_output .= "C> $mem_critical";
}

if ( $swap_warning > 0 && $swap_critical > 0) {
  $swap_threshold_output .= "W> $swap_warning, C> $swap_critical";
}
elsif ( $swap_warning > 0 ) {
  $swap_threshold_output .= "W> $swap_warning";
}
elsif ( $swap_critical > 0 )  {
  $swap_threshold_output .= "C> $swap_critical";
}

$mem_threshold_output .= ")";
$swap_threshold_output .= ")";

my $verbose = $verbose;

my ($mem_percent, $mem_total, $mem_used, $swap_percent, $swap_total, $swap_used) = &sys_stats();
my $free_mem = $mem_total - $mem_used;
my $free_swap = $swap_total - $swap_used;

# set output message
my $output = "Memory Usage".$mem_threshold_output.": ". $mem_percent.'% <br>';
$output .= "Swap Usage".$swap_threshold_output.": ". $swap_percent.'%';

# set verbose output message
my $verbose_output = "Memory Usage:".$mem_threshold_output.": ". $mem_percent.'% '."- Total: $mem_total MB, used: $mem_used MB, free: $free_mem MB<br>";
$verbose_output .= "Swap Usage:".$swap_threshold_output.": ". $swap_percent.'% '."- Total: $swap_total MB, used: $swap_used MB, free: $free_swap MB<br>";

# set perfdata message
my $perfdata_output = "MemUsed=$mem_percent\%;$mem_warning;$mem_critical";
$perfdata_output .= " SwapUsed=$swap_percent\%;$swap_warning;$swap_critical";


# if threshold are 0, exit with OK
if ( $mem_warning == 0 ) { $mem_warning = 101 };
if ( $swap_warning == 0 ) { $swap_warning = 101 };
if ( $mem_critical == 0 ) { $mem_critical = 101 };
if ( $swap_critical == 0 ) { $swap_critical = 101 };


if ($mem_percent>$mem_critical || $swap_percent>$swap_critical) {
    if ($verbose) { print "<b>CRITICAL: ".$verbose_output."</b>|".$perfdata_output."\n";}
    else { print "<b>CRITICAL: ".$output."</b>|".$perfdata_output."\n";}
    exit $ERRORS{'CRITICAL'};
} elsif ($mem_percent>$mem_warning || $swap_percent>$swap_warning) {
    if ($verbose) { print "<b>WARNING: ".$verbose_output."</b>|".$perfdata_output."\n";}
    else { print "<b>WARNING: ".$output."</b>|".$perfdata_output."\n";}
    exit $ERRORS{'WARNING'};
} else {
    if ($verbose) { print "OK: ".$verbose_output."|".$perfdata_output."\n";}
    else { print "OK: ".$output."|".$perfdata_output."\n";}
    exit $ERRORS{'OK'};
}

sub sys_stats {
    my @memory = split(" ", `free -mt`);
    my $mem_total = $memory[7];
    my $mem_used;
    if ( $DONT_INCLUDE_BUFFERS) { $mem_used = $memory[15]; }
    else { $mem_used = $memory[8];}
    my $swap_total = $memory[18];
    my $swap_used = $memory[19];
    my $mem_percent = ($mem_used / $mem_total) * 100;
    my $swap_percent;
    if ($swap_total == 0) {
        $swap_percent = 0;
    } else {
        $swap_percent = ($swap_used / $swap_total) * 100;
    }
    return (sprintf("%.0f",$mem_percent),$mem_total,$mem_used, sprintf("%.0f",$swap_percent),$swap_total,$swap_used);
}

sub print_usage () {
    print "Usage: $PROGNAME -w <warn> -c <crit> [-v] [-h]\n";
    exit $ERRORS{'UNKNOWN'} unless ($opt_h);
}

sub print_help () {
    print_revision($PROGNAME,'$Revision: '.$PROGVER.' $');
    print "Copyright (c) 2005 Garrett Honeycutt/Rouven Homann/Cedric Temple\n";
    print "\n";
    print_usage();
    print "\n";
    print "-w <MemoryWarn>,<SwapWarn> = Memory and Swap usage to activate a warning message (eg: -w 90,25 ) .\n";
    print "-c <MemoryCrit>,<SwapCrit> = Memory and Swap usage to activate a critical message (eg: -c 95,50 ).\n";
    print "-v = Verbose Output.\n";
    print "-h = This screen.\n\n";
    support();
}
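
One thing worth verifying on your own system: sys_stats picks values out of the `free -mt` output by position ($memory[8], $memory[18], $memory[19]). Those positions match the older `free` layout that has a "-/+ buffers/cache" row. With the newer procps-ng layout, which drops that row and adds an "available" column, positions 18 and 19 would fall on the "Total:" row instead of "Swap:". The sketch below is not part of the plugin; it shows one way to read the same numbers by row label, so that the result does not depend on the layout. mem_swap_percent is a hypothetical name and the sample text is made-up data.

```shell
# Read `free -m` style output on stdin and print "<mem%> <swap%>".
mem_swap_percent() {
    awk '
        $1 == "Mem:"  { mem_total = $2;  mem_used = $3 }
        $1 == "Swap:" { swap_total = $2; swap_used = $3 }
        END {
            mem_pct  = (mem_total  > 0) ? mem_used  * 100 / mem_total  : 0
            swap_pct = (swap_total > 0) ? swap_used * 100 / swap_total : 0
            printf "%.0f %.0f\n", mem_pct, swap_pct
        }
    '
}

# Made-up sample in the newer layout (no "-/+ buffers/cache" row).
sample='              total        used        free      shared  buff/cache   available
Mem:           4000        1000        2000          10        1000        2800
Swap:          2000         500        1500
Total:         6000        1500        3500'

printf '%s\n' "$sample" | mem_swap_percent   # prints "25 25"
```

On the client the function would be fed with `free -m | mem_swap_percent`.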

Restart nrpe and test locally

# Method 1
[root@client1 etc]# killall nrpe               
[root@client1 etc]# /usr/local/nagios/bin/nrpe -d -c /usr/local/nagios/etc/nrpe.cfg
# Method 2
[root@client1 etc]# kill -HUP `ps -ef|grep nrpe|awk 'NR==1{print $2}'`

netstat -lnt | grep 5666
Run these two commands on the client itself to see the result:
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_mem

/usr/local/nagios/libexec/check_nrpe -H localhost -c check_disk

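Tip:
The last column of ps -ef shows the command line that started the process. The next time you stop it with killall nagios, killall nrpe, or pkill nrpe and want to start it again, you can copy and paste that command line.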

[root@memcache137 etc]# ps -ef | grep nagios
nagios    79948      1  0 11:07 ?        00:00:00 /usr/local/nagios/bin/nrpe -d -c /usr/local/nagios/etc/nrpe.cfg

END
The client section was copied and pasted from several articles, so it is somewhat stitched together.

Reposted from blog.csdn.net/Nightwish5/article/details/114262388