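About BGP Monitoring with Zabbix
Zabbix is a well-known software tool for monitoring IT infrastructure such as networks, servers, virtual machines, and cloud services. It can also monitor BGP, to confirm that routing information is exchanged correctly and that the network behaves as expected.
To monitor BGP with Zabbix, we can use SNMP to collect data from BGP-capable devices such as routers or switches. Zabbix is easy to configure to query these devices over SNMP and collect BGP-related data, including the number of BGP peers, the number of BGP routes, and the BGP session state.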
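Today we cover two approaches:
1. Adding items manually
2. Using a Zabbix template
The principle is the same for both. The second is more convenient because it discovers the BGP peers automatically.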
Prerequisites for monitoring BGP state with Zabbix:
1. Look up the relevant OID for your switch, router, or firewall
2. Enable the SNMP agent on the device
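- Huawei OID reference, for example for the S7700:
https://support.huawei.com/enterprise/zh/doc/EDOC1100126910/5dad6178
Cisco OID:
https://oidref.com/1.3.6.1.2.1.15.3.1.2
H3C appears to use the same OID, which is expected because 1.3.6.1.2.1.15.3.1.2 is bgpPeerState from the standard BGP4-MIB. Check the vendor's site to confirm.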
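- A quick primer on the six BGP neighbor states
The numeric codes of these six states (1 through 6, in the order listed) are the values we read in Zabbix, which is how we monitor BGP status.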
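1. Idle state:
BGP always starts in the Idle state, in which all inbound connections are refused. Only after a start event does the BGP process initialize all BGP resources, initiate a TCP connection to the neighbor, listen for TCP connections from the neighbor, and move to the Connect state. The start event is usually the configuration of the BGP process.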
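2. Connect state
In the Connect state, the BGP process waits for the TCP connection (three-way handshake) to complete. When the TCP connection succeeds, BGP sends an Open message to the neighbor and enters the OpenSent state. If the TCP session is not established, BGP keeps listening for connections initiated by the neighbor, restarts the ConnectRetry timer, and moves to the Active state. Connect→OpenSent (TCP connection established); Connect→Active (TCP connection not yet established)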
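3. Active state
In this state, the BGP process keeps trying to establish a TCP connection with the neighbor. If the TCP connection succeeds, the BGP process clears the ConnectRetry timer, completes initialization, sends an Open message to the neighbor, and moves to the OpenSent state.
If the ConnectRetry timer expires while the process is still in the Active state, the process returns to the Connect state and listens for a TCP session initiated by the neighbor. This cycle repeats until a TCP session is established.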
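4. OpenSent state
In this state the Open message has already been sent, and BGP waits for an Open message from the neighbor. Once an Open message arrives, every field in it is checked. If there is an error, BGP sends a Notification message and moves to the Idle state. If the received Open message is valid, BGP sends a Keepalive message, negotiates the hold time and the Keepalive interval, and moves to the OpenConfirm state.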
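5. OpenConfirm state
In this state, the BGP process waits for a Keepalive or Notification message from the neighbor. If a Keepalive message arrives, it moves to the Established state. If a Notification message arrives, it moves to the Idle state.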
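6. Established state
Reaching this state means the BGP peering is fully established, and the peers exchange Update, Keepalive, and Notification messages. If a Notification message is sent or received, the session moves back to the Idle state.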
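The six states above are what an SNMP poll of bgpPeerState returns as integers. As a minimal sketch, the mapping can be written as a lookup table; the values 1 to 6 come from the BGP4-MIB, while the helper names `peerStateName` and `isSessionUp` are made up for illustration.

```javascript
// bgpPeerState values as defined in the BGP4-MIB.
const BGP_PEER_STATES = {
  1: "idle",
  2: "connect",
  3: "active",
  4: "opensent",
  5: "openconfirm",
  6: "established",
};

// Hypothetical helper: turn a polled integer into a readable state name.
function peerStateName(code) {
  return BGP_PEER_STATES[code] || "unknown(" + code + ")";
}

// Hypothetical helper: only state 6 (established) means the session is up.
function isSessionUp(code) {
  return Number(code) === 6;
}

console.log(peerStateName(6)); // prints "established"
console.log(isSessionUp(3));   // prints false
```

Any value other than 6 means the session is somewhere in the setup process or has fallen back to Idle, which is why the monitoring below only needs to compare against 6.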
- Let's get started. First, check how many BGP peers the switch has
In my case there are 4 BGP peers
1. Add an item in Zabbix. The SNMP OID is the base OID followed by the BGP peer IP, as follows:
1.3.6.1.2.1.15.3.1.2.10.0.0.250
1.3.6.1.2.1.15.3.1.2. #base OID
10.0.0.250 #BGP peer IP
Note: item keys must not conflict.
Because there are 4 BGP peers, we create 4 items, each with a different key.
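The rule for step 1 (base OID plus peer IP, one unique key per peer) can be sketched as follows. Only 10.0.0.250 comes from this article; the other three addresses are placeholders from the documentation range, and the key format `bgpPeerState[<peer IP>]` is an assumption modeled on the key used by the template later in this article.

```javascript
// Base OID of the bgpPeerState column, as used in step 1.
const BGP_PEER_STATE_OID = "1.3.6.1.2.1.15.3.1.2";

// Basic IPv4 check so a typo does not produce a bogus OID.
function isIPv4(addr) {
  const parts = String(addr).split(".");
  return parts.length === 4 && parts.every(function (p) {
    return /^\d{1,3}$/.test(p) && Number(p) <= 255;
  });
}

// Hypothetical helper: build the key and SNMP OID for one peer.
function buildPeerItem(peerIp) {
  if (!isIPv4(peerIp)) {
    throw new Error("not an IPv4 address: " + peerIp);
  }
  return {
    key: "bgpPeerState[" + peerIp + "]",          // assumed key naming
    snmp_oid: BGP_PEER_STATE_OID + "." + peerIp,  // base OID + peer IP
  };
}

// 10.0.0.250 is from the article; the rest are placeholder addresses.
const peers = ["10.0.0.250", "192.0.2.1", "192.0.2.2", "192.0.2.3"];
const items = peers.map(buildPeerItem);

items.forEach(function (item) {
  console.log(item.key + " -> " + item.snmp_oid);
});
```

Including the peer IP in the key is one simple way to guarantee that the four keys never collide.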
2. Create a trigger
3. Create a graph
You can also create 4 separate graphs.
4. Zabbix alert action
Depending on your own Zabbix configuration, this can be email, WeChat, or DingTalk.
That completes the manual approach.
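The article does not spell out the trigger condition for step 2 of the manual approach. As a rough sketch, the logic can mirror the trigger prototype in the template used in the second approach: alert when the peer state is not 6 (established) while the peer is administratively started (bgpPeerAdminStatus = 2). This is a simplification, since the template looks at an older sample via `last(...,#3)` while the function below checks a single pair of values; the function name `peerIsDown` is hypothetical.

```javascript
const ESTABLISHED = 6; // bgpPeerState: established
const ADMIN_START = 2; // bgpPeerAdminStatus: start

// Hypothetical helper: true when the peer should be up but is not.
function peerIsDown(peerState, adminStatus) {
  return Number(peerState) !== ESTABLISHED &&
         Number(adminStatus) === ADMIN_START;
}

console.log(peerIsDown(3, 2)); // prints true: active, but admin started
console.log(peerIsDown(6, 2)); // prints false: session established
console.log(peerIsDown(1, 1)); // prints false: peer was stopped on purpose
```

Checking the admin status as well avoids alerts for peers that an operator shut down deliberately.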
**Next, the second approach: monitoring BGP with a Zabbix template**
1. Import the Zabbix template. I am using the Zabbix 6.0 LTS release.
zabbix_export:
  version: '6.0'
  date: '2021-11-21T22:04:29Z'
  groups:
    -
      uuid: e0242822f07b4633a283b8dc935d3dd9
      name: Templates_Created
  templates:
    -
      uuid: 6bf016389ccf49c8b491ac73c41a4fd0
      template: 'BGPv4 SNMP'
      name: 'BGPv4 SNMP'
      description: |
        ## Description
        Template BGPv4 Sessions -Admin Status -Established Time -Operation Status -AS Name -BGP last Error By: Flavio Gomes Figueira Camacho Junior Require: ValueMaps .BGP4-MIB::bgpPeerAdminStatus .BGP4-MIB::bgpPeerState ExternalScript .as_name Base: BGP4-MIB.mib https://www.iana.org/assignments/bgp-parameters/bgp-parameters.xhtml
        ## Overview
        Template BGPv4 Sessions
        -Admin Status
        -Established Time
        -Operation Status
        -AS Name
        -BGP last Error
        By: Flavio Gomes Figueira Camacho Junior
        Require:
        ValueMaps
        .BGP4-MIB::bgpPeerAdminStatus
        .BGP4-MIB::bgpPeerState
        ExternalScript
        .as\_name
        Base:
        BGP4-MIB.mib
        <https://www.iana.org/assignments/bgp-parameters/bgp-parameters.xhtml>
        External Scripts and Value Mapping on my Github:
        https://github.com/flaviojunior1995/Zabbix-Templates
        ## Author
        Flavio Camacho Junior
      groups:
        -
          name: Templates_Created
      discovery_rules:
        -
          uuid: 860002e447bb45bca37b3c75527d38ee
          name: 'BGPv4 Peers Discovery'
          type: SNMP_AGENT
          snmp_oid: 'discovery[{#SNMPVALUE},.1.3.6.1.2.1.15.3.1.7,{#SNMPASN},.1.3.6.1.2.1.15.3.1.9]'
          key: BgpPeerDiscovery
          delay: 1h
          lifetime: 7d
          item_prototypes:
            -
              uuid: 9b1be237207d4e45bdd916f3b7af5a48
              name: 'AS Name for IPv4 peer $1'
              type: EXTERNAL
              key: 'as_name["{#SNMPVALUE}","{#SNMPASN}"]'
              delay: 1d
              history: 7d
              trends: '0'
              value_type: CHAR
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
            -
              uuid: 44dbf3e1fac84ff8a70047810ee08a62
              name: 'Administrative status for peer $1'
              type: SNMP_AGENT
              snmp_oid: '1.3.6.1.2.1.15.3.1.3.{#SNMPVALUE}'
              key: 'bgpPeerAdminStatus[{#SNMPVALUE}]'
              delay: 10m
              history: 7d
              trends: '0'
              valuemap:
                name: 'BGP4-MIB::bgpPeerAdminStatus'
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
            -
              uuid: 28010bead7854f5c871557f5b6dd9681
              name: 'Established time for peer $1'
              type: SNMP_AGENT
              snmp_oid: '1.3.6.1.2.1.15.3.1.16.{#SNMPVALUE}'
              key: 'bgpPeerFsmEstablishedTime[{#SNMPVALUE}]'
              delay: 10m
              history: 30d
              trends: '0'
              units: uptime
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
              trigger_prototypes:
                -
                  uuid: bca14f8136ae4df9ad04c56c7399ac53
                  expression: 'last(/BGPv4 SNMP/bgpPeerFsmEstablishedTime[{#SNMPVALUE}])<{$PEER_LOW_TIME}'
                  name: 'BGP peer up time low {#SNMPVALUE} ASN {#SNMPASN}'
                  priority: INFO
                  dependencies:
                    -
                      name: 'BGP peer {#SNMPVALUE} ASN {#SNMPASN} is DOWN'
                      expression: 'last(/BGPv4 SNMP/bgpPeerState[{#SNMPVALUE}],#3)<>6 and last(/BGPv4 SNMP/bgpPeerAdminStatus[{#SNMPVALUE}])=2'
            -
              uuid: fb9b74ba11ee4b62a38cfde81f0593f4
              name: 'BGP peer last error {#SNMPVALUE}'
              type: SNMP_AGENT
              snmp_oid: '1.3.6.1.2.1.15.3.1.14.{#SNMPVALUE}'
              key: 'bgpPeerLastError[{#SNMPVALUE}]'
              delay: 10m
              history: 30d
              trends: '0'
              value_type: CHAR
              preprocessing:
                -
                  type: JAVASCRIPT
                  parameters:
                    - |
                      value = (value.replace(/\s+/g, ''));
                      if (value === "0000") {
                          return "NO last error"
                      }
                      if (value === "0100") {
                          return "Message Header Error - Unspecific"
                      }
                      if (value === "0101") {
                          return "Message Header Error - Connection Not Synchronized"
                      }
                      if (value === "0102") {
                          return "Message Header Error - Bad Message Length"
                      }
                      if (value === "0103") {
                          return "Message Header Error - Bad Message Type"
                      }
                      if (value === "0200") {
                          return "OPEN Message Error - Unspecific"
                      }
                      if (value === "0201") {
                          return "OPEN Message Error - Unsupported Version Number"
                      }
                      if (value === "0202") {
                          return "OPEN Message Error - Bad Peer AS"
                      }
                      if (value === "0203") {
                          return "OPEN Message Error - Bad BGP Identifier"
                      }
                      if (value === "0204") {
                          return "OPEN Message Error - Unsupported Optional Parameter"
                      }
                      if (value === "0205") {
                          return "OPEN Message Error - [Deprecated]"
                      }
                      if (value === "0206") {
                          return "OPEN Message Error - Unacceptable Hold Time"
                      }
                      if (value === "0207") {
                          return "OPEN Message Error - Unsupported Capability";
                      }
                      if (value === "0208") {
                          return "OPEN Message Error - Role Mismatch (Expire on 2021-03-29)"
                      }
                      if (value === "0300") {
                          return "UPDATE Message Error - Unspecific"
                      }
                      if (value === "0301") {
                          return "UPDATE Message Error - Malformed Attribute List"
                      }
                      if (value === "0302") {
                          return "UPDATE Message Error - Unrecognized Well-known Attibute"
                      }
                      if (value === "0303") {
                          return "UPDATE Message Error - Missing Well-know Attribute"
                      }
                      if (value === "0304") {
                          return "UPDATE Message Error - Attribute Flags Error"
                      }
                      if (value === "0305") {
                          return "UPDATE Message Error - Attribute Length Error"
                      }
                      if (value === "0306") {
                          return "UPDATE Message Error - Invalid ORIGIN Attribute"
                      }
                      if (value === "0307") {
                          return "UPDATE Message Error - [Deprecated]"
                      }
                      if (value === "0308") {
                          return "UPDATE Message Error - Invalid NEXT_HOP Attribute"
                      }
                      if (value === "0309") {
                          return "UPDATE Message Error - Optional Attribute Error"
                      }
                      if (value === "0310") {
                          return "UPDATE Message Error - Invalid Network Field"
                      }
                      if (value === "0311") {
                          return "UPDATE Message Error - Malformed AS_PATH"
                      }
                      if (value === "0400") {
                          return "Hold Timer Expired"
                      }
                      if (value === "0500") {
                          return "Finite State Machine Error - Unspecified Error"
                      }
                      if (value === "0501") {
                          return "Finite State Machine Error - Receive Unexpected Message in OpenSent State"
                      }
                      if (value === "0502") {
                          return "Finite State Machine Error - Receive Unexpected Message in OpenConfirm State"
                      }
                      if (value === "0503") {
                          return "Finite State Machine Error - Receive Unexpected Message in Established State"
                      }
                      if (value === "0600") {
                          return "Cease NOTIFICATION - Reserved"
                      }
                      if (value === "0601") {
                          return "Cease NOTIFICATION - Maximum Number of Prefixes Reached"
                      }
                      if (value === "0602") {
                          return "Cease NOTIFICATION - Administrative Shutdown"
                      }
                      if (value === "0603") {
                          return "Cease NOTIFICATION - Peer De-configured"
                      }
                      if (value === "0604") {
                          return "Cease NOTIFICATION - Administrative Reset"
                      }
                      if (value === "0605") {
                          return "Cease NOTIFICATION - Connection Rejected"
                      }
                      if (value === "0606") {
                          return "Cease NOTIFICATION - Other Configuration Change"
                      }
                      if (value === "0607") {
                          return "Cease NOTIFICATION - Connection Collision Resolution"
                      }
                      if (value === "0608") {
                          return "Cease NOTIFICATION - Out of Resources"
                      }
                      if (value === "0609") {
                          return "Cease NOTIFICATION - Hard Reset"
                      }
                      if (value === "0700") {
                          return "ROUTE-REFRESH Message Error - Reserded"
                      }
                      if (value === "0701") {
                          return "ROUTE-REFRESH Message Error - Invalid Message Length"
                      }
                      return value
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
            -
              uuid: 32ffca92bd1644e1b99ba8f8b6b2cf65
              name: 'Remote AS for peer $1'
              type: SNMP_AGENT
              snmp_oid: '1.3.6.1.2.1.15.3.1.9.{#SNMPVALUE}'
              key: 'bgpPeerRemoteAs[{#SNMPVALUE}]'
              delay: 1d
              history: 7d
              trends: '0'
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
            -
              uuid: af084bc48aed4b189ef4c8a5caf21405
              name: 'Operational status for peer $1'
              type: SNMP_AGENT
              snmp_oid: '1.3.6.1.2.1.15.3.1.2.{#SNMPVALUE}'
              key: 'bgpPeerState[{#SNMPVALUE}]'
              history: 30d
              trends: '0'
              valuemap:
                name: 'BGP4-MIB::bgpPeerState'
              tags:
                -
                  tag: Application
                  value: BGPv4
                -
                  tag: Application
                  value: 'BGPv4 {#SNMPVALUE}'
              trigger_prototypes:
                -
                  uuid: 454e93c4d273469ba4e9baac37a5ff5e
                  expression: 'last(/BGPv4 SNMP/bgpPeerState[{#SNMPVALUE}],#3)<>6 and last(/BGPv4 SNMP/bgpPeerAdminStatus[{#SNMPVALUE}])=2'
                  name: 'BGP peer {#SNMPVALUE} ASN {#SNMPASN} is DOWN'
                  priority: HIGH
                  description: 'Trigger for peer that has a remote AS matching {$BGP_PEER_AS} macro.'
      macros:
        -
          macro: '{$PEER_LOW_TIME}'
          value: '14400'
          description: 'time in sec alarm for low uptime bgp session'
      valuemaps:
        -
          uuid: f94f32d1110545879431b948674e0be4
          name: 'BGP4-MIB::bgpPeerAdminStatus'
          mappings:
            -
              value: '1'
              newvalue: stoped
            -
              value: '2'
              newvalue: started
        -
          uuid: 292fe22b2d484c27a5fe38ff1f95cfa5
          name: 'BGP4-MIB::bgpPeerState'
          mappings:
            -
              value: '1'
              newvalue: idle
            -
              value: '2'
              newvalue: connect
            -
              value: '3'
              newvalue: active
            -
              value: '4'
              newvalue: opensent
            -
              value: '5'
              newvalue: openconfirm
            -
              value: '6'
              newvalue: established
Save the code above as template_bgpv4_snmp.yaml and import it directly into Zabbix 6.0.
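The template's JavaScript preprocessing step matches the bgpPeerLastError value against a long list of four-character strings. In the BGP4-MIB this object is a two-byte value: the first byte is the error code and the second is the subcode. As an alternative sketch, the two bytes can be parsed instead of compared as strings. It assumes the value arrives as hex text such as `06 02`, which is what the template's whitespace stripping suggests but is not confirmed for every device. The function names are hypothetical, and only the top-level error code names already present in the template are listed.

```javascript
// Top-level BGP error codes, taken from the names used in the template.
const ERROR_CODES = {
  1: "Message Header Error",
  2: "OPEN Message Error",
  3: "UPDATE Message Error",
  4: "Hold Timer Expired",
  5: "Finite State Machine Error",
  6: "Cease",
  7: "ROUTE-REFRESH Message Error",
};

// Hypothetical helper: split "06 02" into { code: 6, subcode: 2 }.
// Returns null when the input is not two hex bytes.
function parseLastError(raw) {
  const hex = String(raw).replace(/\s+/g, "");
  if (!/^[0-9a-fA-F]{4}$/.test(hex)) {
    return null;
  }
  return {
    code: parseInt(hex.slice(0, 2), 16),
    subcode: parseInt(hex.slice(2, 4), 16),
  };
}

// Hypothetical helper: produce a short readable description.
function describeLastError(raw) {
  const err = parseLastError(raw);
  if (err === null) {
    return String(raw); // pass unknown formats through unchanged
  }
  if (err.code === 0 && err.subcode === 0) {
    return "No last error";
  }
  const name = ERROR_CODES[err.code] || "Unknown error code " + err.code;
  return name + " (subcode " + err.subcode + ")";
}

console.log(describeLastError("06 02")); // prints "Cease (subcode 2)"
console.log(describeLastError("00 00")); // prints "No last error"
```

Under the hex assumption, subcode 10 would arrive as `0A` rather than `10`, so it is worth checking what your device actually returns before relying on either version.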
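2. Find the imported template and open its discovery rules
Apply the template to the switch
Click Discovery rules
Click Test
The BGP peers, their AS numbers, and related information are now discovered automatically
Then click Execute now
The items now show up under Items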
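The discovery rule in the template, `discovery[{#SNMPVALUE},.1.3.6.1.2.1.15.3.1.7,{#SNMPASN},.1.3.6.1.2.1.15.3.1.9]`, walks two columns (peer remote address and peer remote AS) and joins them on the SNMP index, so each peer becomes one row of macros. The sketch below imitates that join on in-memory data to show the shape of the result. It is not Zabbix's implementation; the function name `buildLldRows` and the AS number 65001 are made up for illustration.

```javascript
// Hypothetical helper: join several SNMP walks on their index.
// columns maps a macro name to { snmpIndex: value } for one walked OID.
function buildLldRows(columns) {
  const rows = {};
  Object.keys(columns).forEach(function (macro) {
    const walk = columns[macro];
    Object.keys(walk).forEach(function (index) {
      if (!rows[index]) {
        rows[index] = { "{#SNMPINDEX}": index };
      }
      rows[index][macro] = walk[index];
    });
  });
  return Object.keys(rows).map(function (index) {
    return rows[index];
  });
}

// For the BGP peer table the index is the peer IP itself.
const rows = buildLldRows({
  "{#SNMPVALUE}": { "10.0.0.250": "10.0.0.250" }, // column .7, remote address
  "{#SNMPASN}": { "10.0.0.250": "65001" },        // column .9, placeholder AS
});

console.log(JSON.stringify(rows));
```

Each row then feeds the item prototypes, which is why `{#SNMPVALUE}` can be appended to an OID such as `1.3.6.1.2.1.15.3.1.2.{#SNMPVALUE}` to address one specific peer.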
The alert actions that follow are much the same for the template and manual approaches, so I won't go into more detail here.