Storm: Monitoring Storm with Monit

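Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission. https://blog.csdn.net/l1028386804/article/details/89165346

When reposting, please credit the source: https://blog.csdn.net/l1028386804/article/details/89165346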

I. Download and install Monit

cd ~/Downloads
wget https://mmonit.com/monit/dist/monit-5.20.0.tar.gz
tar -zxvf monit-5.20.0.tar.gz
cd monit-5.20.0
./configure --without-pam --without-ssl
make
make install
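
The scripts later in this article call jps and storm, so before continuing it may help to confirm that those commands, along with the freshly installed monit, can be found on the PATH. The following is only a minimal sketch; the function name missing_cmds is made up for this example and is not part of Monit or Storm.

```shell
# missing_cmds CMD...: print the name of every command that cannot be
# found on the current PATH (hypothetical helper for illustration).
missing_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || echo "$c"
    done
}

# Example call; an empty result means every command was found:
# missing_cmds monit jps storm
```

If anything is printed, install the missing tool or extend PATH before setting up the checks below.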

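II. Configure Monit

1. Copy the configuration file and set its permissions

First, copy the monitrc configuration file from the Monit source directory to /etc and restrict its access permissions. Switch to the root user and run the following commands: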

cp -f monitrc /etc/monitrc
chown root:root /etc/monitrc
chmod 0700 /etc/monitrc
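
Monit refuses to use a control file that is accessible to group or others, which is why the mode is set to 0700 above. As a rough sketch, the check below verifies that a file grants nothing beyond the owner bits. The function name perms_ok is hypothetical, and the stat -c option assumes GNU or BusyBox stat as found on Linux.

```shell
# perms_ok FILE: succeed if FILE grants no permissions to group or
# others, i.e. its mode is no more permissive than 0700.
# Assumes a stat that supports -c %a (GNU coreutils or BusyBox).
perms_ok() {
    mode=$(stat -c %a "$1") || return 2
    case "$mode" in
        0|?00) return 0 ;;   # only owner bits (or no bits) are set
        *)     return 1 ;;   # group/other or special bits are set
    esac
}

# Example call:
# perms_ok /etc/monitrc || echo "fix the permissions of /etc/monitrc"
```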

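2. Edit the configuration file

Open /etc/monitrc with the command vim /etc/monitrc and make the changes below.
2-1. Change Monit's check interval
Find the line "set daemon 30" and adjust the check interval. The default is 30 seconds and can be changed as needed.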
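
For anyone who prefers not to edit the file by hand, the interval can also be rewritten with sed. This is a sketch under the assumption that the file contains a single "set daemon N" line; with_cycle is a hypothetical helper name. It prints the modified file instead of changing it in place, so the result can be reviewed first.

```shell
# with_cycle FILE SECONDS: print FILE with the interval on its
# "set daemon N" line replaced by SECONDS. FILE itself is not modified.
with_cycle() {
    sed -E "s/^([[:space:]]*set[[:space:]]+daemon)[[:space:]]+[0-9]+/\1 $2/" "$1"
}

# Example call: write a copy with a 60 second interval for review.
# with_cycle /etc/monitrc 60 > /tmp/monitrc.new
```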

2-2. Change the HTTP service settings
Find the line "set httpd port 2812 and" and adjust the HTTP service settings. The original content is:

set httpd port 2812 and
    use address localhost  # only accept connection from localhost
    allow localhost        # allow localhost to connect to the server and
    allow admin:monit      # require user 'admin' with password 'monit'

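In the first line, 2812 is the port number. The localhost in the second line is the address the HTTP interface listens on, the localhost in the third line means that only the local machine may connect, and the fourth line sets the username and password to admin and monit.
To allow remote access, change localhost in the second line to the host's IP address and change localhost in the third line to 0.0.0.0/0.0.0.0, which allows any host to connect. Since the interface is then reachable from any host, it is worth changing the username and password as well.
After the change, the block looks like this: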

set httpd port 2812 and
    use address 192.168.175.11  # bind the HTTP interface to this address
    allow 0.0.0.0/0.0.0.0       # allow any host to connect to the server and
    allow admin:monit           # require user 'admin' with password 'monit'

2-3. Add monitoring for the Storm host
Add the following to the monitrc configuration file:

check host node11 with address 192.168.175.11
  if failed icmp type echo count 3 with timeout 3 seconds then alert 

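Note: if Storm and Monit are installed on the same server, the name after check host must not be the local hostname, otherwise Monit reports an error. This is most likely because Monit already registers a System service under the local hostname, and service names must be unique.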
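
A quick way to avoid that clash is to compare the intended service name with the machine's own hostname before adding the entry. The sketch below assumes that the name Monit uses for the local system matches what uname -n reports; is_local_name is a hypothetical helper.

```shell
# is_local_name NAME: succeed if NAME equals this machine's hostname as
# reported by uname -n, either in full or up to the first dot.
is_local_name() {
    host=$(uname -n)
    [ "$1" = "$host" ] || [ "$1" = "${host%%.*}" ]
}

# Example call:
# is_local_name node11 && echo "choose a different name for check host"
```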

2-4. Add monitoring for the Storm process
Add the following to the monitrc configuration file:

check program storm-nimbus with path "/opt/storm/storm-nimbus-exist.sh"
  start program = "/opt/storm/storm-nimbus-start.sh"
  stop program = "/opt/storm/storm-nimbus-stop.sh"
  if status = 0 then restart
  if 3 restarts within 5 cycles then alert

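The contents of storm-nimbus-exist.sh are shown below. It exits with 1 when nimbus is running and with 0 when it is not, which is why the rule above restarts on status 0.

#!/bin/sh
# Print and exit with 1 if nimbus is running, 0 otherwise.
id=$(/opt/storm/storm-deamon-exist.sh nimbus)
echo "$id"
exit "$id"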

The contents of storm-deamon-exist.sh are as follows:

#!/bin/sh
# Print 1 if jps lists a process matching the given name, otherwise 0.
process=$1
id=0
if [ $# -ge 1 ] && [ -n "$process" ] ; then
  cmd=$(jps | grep "$process" | awk '{print $1}')
  if [ -n "$cmd" ]; then
     id=1
  fi
fi
echo $id
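
The check above boils down to filtering the output of jps. To try the matching logic without a running JVM, the variant below reads jps-style lines from standard input. It is a sketch rather than a replacement: daemon_listed is a hypothetical name, and comparing the whole second column (instead of using grep) assumes that jps prints each daemon as a pid followed by a short name such as nimbus.

```shell
# daemon_listed NAME: read jps-style lines ("<pid> <name>") on stdin and
# print 1 if some line has NAME as its second field, otherwise print 0.
daemon_listed() {
    if [ -n "$1" ] && awk -v n="$1" '$2 == n { found = 1 } END { exit !found }'; then
        echo 1
    else
        echo 0
    fi
}

# Example call against the real process list:
# jps | daemon_listed nimbus
```

Matching the full field avoids false positives from other Java processes whose names merely contain the search string.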

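The contents of storm-nimbus-start.sh are as follows:

#!/bin/sh
# Call the start script directly. Capturing its output and then running
# that text as a command would fail.
/opt/storm/storm-deamon-start.sh nimbus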

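The contents of storm-deamon-start.sh are as follows:

#!/bin/sh
# Start the given Storm daemon in the background.
# Monit runs programs with a minimal environment, so STORM_HOME and PATH
# may need to be set explicitly in this script.
if [ $# -ge 1 ] && [ -n "$1" ] ; then
    nohup storm "$1" > "$STORM_HOME/logs/nohup.out" 2>&1 &
    echo "start java process [$1]"
fi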

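The contents of storm-nimbus-stop.sh are as follows:

#!/bin/sh
# Call the stop script directly. Capturing its output and then running
# that text as a command would fail.
/opt/storm/storm-deamon-stop.sh nimbus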

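The contents of storm-deamon-stop.sh are as follows:

#!/bin/sh
# Force-kill every JVM whose jps entry matches the given name.
if [ $# -ge 1 ] && [ -n "$1" ] ; then
    pids=$(jps | grep "$1" | awk '{print $1}')
    if [ -n "$pids" ]; then
        kill -s 9 $pids
        echo "java process [$1] killed."
    fi
fi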
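
Signal 9 gives the JVM no chance to shut down cleanly. One possible refinement, offered here as an assumption and not as something the original scripts do, is to send TERM first and fall back to KILL only if the process is still alive after a short wait. The function stop_pid and its timeout argument are hypothetical.

```shell
# stop_pid PID [SECONDS]: send TERM to PID, wait up to SECONDS (default 5)
# for it to disappear, then send KILL if it is still there.
stop_pid() {
    pid=$1
    wait_secs=${2:-5}
    kill -s TERM "$pid" 2>/dev/null || return 0   # process already gone
    i=0
    while [ "$i" -lt "$wait_secs" ]; do
        kill -0 "$pid" 2>/dev/null || return 0    # exited after TERM
        sleep 1
        i=$((i + 1))
    done
    kill -s KILL "$pid" 2>/dev/null               # last resort
    return 0
}

# Example call: stop every process that jps reports as nimbus.
# for p in $(jps | awk '$2 == "nimbus" {print $1}'); do stop_pid "$p" 10; done
```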

III. Start Monit

# Check that the syntax of the monitrc file is valid

monit -t

If the file is valid, output like the following is printed:

New Monit id: f9b0a61c9cb9ced6a65d56324f4d61c5
Stored in '/root/.monit.id'
Control file syntax OK
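
The same syntax check is useful whenever monitrc is edited while the daemon is already running. As a small sketch, the wrapper below only asks Monit to re-read its control file when the check passes. Both monit -t and monit reload appear in the help text in section IV; the wrapper name reload_if_valid is made up for this example.

```shell
# reload_if_valid: run the syntax check first and only ask the running
# daemon to re-read its control file when the check passes.
reload_if_valid() {
    if monit -t; then
        monit reload
    else
        echo "monitrc has errors, not reloading" >&2
        return 1
    fi
}
```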

# Start the Monit daemon

monit
Starting Monit 5.20.0 daemon with http interface at [192.168.175.11]:2812

# Check whether the daemon started successfully

ps -ef | grep monit
root      10069      1  0 11:01 ?        00:00:00 monit
root      10076   2013  0 11:02 pts/0    00:00:00 grep monit

# Start all services managed by Monit (restart all stops and then starts each service)

monit restart all

# View the status of all services managed by Monit

monit status

If everything is configured and started correctly, output like the following is printed:

Monit 5.20.0 uptime: 8m

Remote Host 'node11'
  status                       Online with all services
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  ping response time           0.016 ms
  data collected               Mon, 08 Apr 2019 11:36:19

Program 'storm-nimbus'
  status                       Status ok
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  last exit value              1
  last output                  1
  data collected               Mon, 08 Apr 2019 11:36:19

System 'liuyazhuang11'
  status                       Running
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  load average                 [0.21] [0.35] [0.29]
  cpu                          0.6%us 0.5%sy 0.0%wa
  memory usage                 858.7 MB [22.5%]
  swap usage                   0 B [0.0%]
  uptime                       2h 36m
  boot time                    Mon, 08 Apr 2019 08:59:45
  data collected               Mon, 08 Apr 2019 11:36:19
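
When many services are monitored, the full status listing gets long. The sketch below condenses output of the shape shown above into one line per service. It assumes the layout stays as printed here (an unindented header with the service name in single quotes, followed by an indented status row); summarize_status is a hypothetical helper, not a Monit command.

```shell
# summarize_status: read "monit status" text on stdin and print one
# "<service>: <status>" line per service.
summarize_status() {
    awk -v q="'" '
        # A header line starts in column one and carries the quoted service name
        /^[^ \t]/ && index($0, q) > 0 {
            split($0, parts, q)
            name = parts[2]
            next
        }
        # The indented "status" row belonging to the current service
        $1 == "status" {
            sub(/^[ \t]*status[ \t]+/, "")
            print name ": " $0
        }
    '
}

# Example call:
# monit status | summarize_status
```

Monit's own summary command, listed in the help text in section IV, gives a similar short view without any post-processing.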

You can also open http://192.168.175.11:2812/ in a browser and log in with the username admin and the password monit to view the status on the web page.

IV. Get Monit help

On the command line, run:

monit -h
Usage: monit [options]+ [command]
Options are as follows:
 -c file       Use this control file
 -d n          Run as a daemon once per n seconds
 -g name       Set group name for monit commands
 -l logfile    Print log information to this file
 -p pidfile    Use this lock file in daemon mode
 -s statefile  Set the file monit should write state information to
 -I            Do not run in background (needed when run from init)
 --id          Print Monit's unique ID
 --resetid     Reset Monit's unique ID. Use with caution
 -B            Batch command line mode (do not output tables or colors)
 -t            Run syntax check for the control file
 -v            Verbose mode, work noisy (diagnostic output)
 -vv           Very verbose mode, same as -v plus log stacktrace on error
 -H [filename] Print SHA1 and MD5 hashes of the file or of stdin if the
               filename is omited; monit will exit afterwards
 -V            Print version number and patchlevel
 -h            Print this text
Optional commands are as follows:
 start all             - Start all services
 start <name>          - Only start the named service
 stop all              - Stop all services
 stop <name>           - Stop the named service
 restart all           - Stop and start all services
 restart <name>        - Only restart the named service
 monitor all           - Enable monitoring of all services
 monitor <name>        - Only enable monitoring of the named service
 unmonitor all         - Disable monitoring of all services
 unmonitor <name>      - Only disable monitoring of the named service
 reload                - Reinitialize monit
 status [name]         - Print full status information for service(s)
 summary [name]        - Print short status information for service(s)
 report [up|down|..]   - Report state of services. See manual for options
 quit                  - Kill the monit daemon process
 validate              - Check all services and start if not running
 procmatch <pattern>   - Test process matching pattern 
