Log collection tools
- 日志易 (commercial)
- Splunk (foreign product, priced by data volume)
Introduction
History: written in Java as a layer on top of Lucene, exposing a RESTful API
Search principle: the inverted index
Features: easy horizontal scaling, built-in high availability, distributed storage, simple to use
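An inverted index maps each term to the list of documents containing it, so a full-text lookup is a single key lookup rather than a scan of every document. A toy sketch in shell (the awk script and the two sample "documents" are illustrative only, not part of Elasticsearch):

```shell
# Toy inverted index: map each term to the ids of the "documents" containing it.
printf 'doc1 I love rock climbing\ndoc2 I like swimming\n' |
awk '{
  for (i = 2; i <= NF; i++) {                    # field 1 is the document id
    idx[tolower($i)] = idx[tolower($i)] " " $1   # append the id to the term posting list
  }
}
END { for (w in idx) print w ":" idx[w] }' | sort
```

A search for the term i now resolves to doc1 and doc2 in one lookup, which is exactly the property that makes Elasticsearch queries fast.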
Configuration files
/etc/elasticsearch/elasticsearch.yml # main ES configuration file
/etc/elasticsearch/jvm.options # JVM memory settings
/etc/sysconfig/elasticsearch # service environment variables
/usr/lib/sysctl.d/elasticsearch.conf # kernel (sysctl) settings
/usr/lib/systemd/system/elasticsearch.service # systemd unit for the ES service
[root@es01 elasticsearch]# grep -Ev '^$|#' /etc/elasticsearch/elasticsearch.yml
node.name: oldboy01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true # lock the heap in memory (disable swapping)
network.host: 10.0.0.240
http.port: 9200
systemctl edit elasticsearch # systemd override so the service may lock unlimited memory
[Service]
LimitMEMLOCK=infinity
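After restarting the service, memory locking can be verified through the nodes API; a node that failed to lock its heap reports mlockall: false (host and port taken from the config above):

```shell
# Check the per-node mlockall flag
curl -s '10.0.0.240:9200/_nodes?filter_path=**.mlockall&pretty'
```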
Concepts
1. Index: analogous to a database in MySQL
2. Type: analogous to a table in that database
3. Document (doc): a record in the table; its fields carry the data
Interacting via curl on the command line
Create an index:
curl -XPUT http://10.0.0.240:9200/oldboy
Index a document with an explicit id:
curl -XPUT '10.0.0.240:9200/oldboy/student/1?pretty' -H 'Content-Type: application/json' -d'
{
"first_name" : "zhang",
"last_name" : "san",
"age" : 28,
"about" : "I love to go rock climbing",
"interests" : [ "sports" ]
}'
Index a document with an auto-generated id:
curl -XPOST '10.0.0.240:9200/oldboy/student/?pretty' -H 'Content-Type: application/json' -d'
{
"first_name" : "li",
"last_name" : "mingming",
"age" : 45,
"about" : "I like to swim",
"interests" : [ "reading" ]
}'
Get a document by id:
curl -XGET '10.0.0.240:9200/oldboy/student/1?pretty'
Retrieve all documents in an index:
curl -XGET '10.0.0.240:9200/oldboy/_search/?pretty'
Delete a document by id:
curl -XDELETE '10.0.0.240:9200/oldboy/student/1?pretty'
Delete the index:
curl -XDELETE '10.0.0.240:9200/oldboy/?pretty'
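Beyond fetching by id, the _search endpoint also accepts a query DSL body. For example, a match query against the documents indexed above (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/oldboy/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "about": "rock climbing" } }
}'
```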
Interacting via Kibana
Configuration file
[root@es01 es-software]# grep -Ev '^$|#' /etc/kibana/kibana.yml
server.port: 5601
server.host: "10.0.0.240"
elasticsearch.hosts: ["http://10.0.0.240:9200"]
kibana.index: ".kibana"
Change the default shard and replica counts via an index template (Kibana Dev Tools):
PUT _template/template_http_request_record
{
"index_patterns": ["*"],
"settings": {
"number_of_shards" : 5,
"number_of_replicas" : 1
}
}
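The template can be checked afterwards from the shell (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/_template/template_http_request_record?pretty'
```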
ES cluster (elasticsearch.yml on each node)
cluster.name: oldboy-cluster
node.name: oldboy01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 10.0.0.240
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10.0.0.240", "10.0.0.241"] # seed hosts; nodes only need to reach each other
discovery.zen.minimum_master_nodes: 2 # quorum of master-eligible nodes required to elect a master
/var/log/elasticsearch/oldboy-cluster.log # cluster log file (named after cluster.name)
Role of shards
Primary shard (thick border in the Kibana view): handles index-modifying (write) requests and also serves queries
Replica shard (thin border): a copy of a primary shard; serves queries and provides redundancy
Redundancy
With 3 nodes in the cluster, how many can fail?
1. With 1 replica, failing one node at a time: up to 2 nodes can fail (the configuration must be adjusted manually along the way)
2. With 1 replica, failing one node at a time and repairing each before the next failure: cluster health stays green throughout
3. With 2 replicas: even if 2 nodes fail at once, the cluster can keep serving after the ES configuration is adjusted manually
Notes
Once an index is created its shard count cannot be changed; the replica count can be changed at any time
Pick the shard count as a multiple of the node count (e.g. 3, 6, or 9 shards on 3 nodes), sized to your needs
Monitoring cluster health
curl '10.0.0.240:9200/_cluster/health?pretty'
{
"cluster_name" : "oldboy-cluster",
"status" : "green", # overall cluster health value
"timed_out" : false,
"number_of_nodes" : 3, # number of nodes in the cluster
"number_of_data_nodes" : 3,
"active_primary_shards" : 14,
"active_shards" : 33,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Backing up indices with elasticdump
- Put node on the PATH:
vim /etc/profile
export PATH=/opt/node/bin:$PATH
- Switch to a faster npm registry:
npm install -g cnpm --registry=https://registry.npm.taobao.org
- Install elasticdump:
cnpm install elasticdump -g
- Back up an index:
mkdir /data
elasticdump \
--input=http://10.0.0.240:9200/oldboy03 \
--output=/data/oldboy03.json \
--type=data
- Restore data:
elasticdump \
--input=/data/oldboy.json \
--output=http://10.0.0.240:9200/oldboy
- Compressed backup:
elasticdump \
--input=http://10.0.0.240:9200/oldboy03 \
--output=$ | gzip > /data/oldboy03.json.gz
Chinese analyzer: IK (install on every node)
Installation
cd /usr/share/elasticsearch
./bin/elasticsearch-plugin install file:///opt/es-software/elasticsearch-analysis-ik-6.6.0.zip
systemctl restart elasticsearch
Apply the analyzer to a test index's mapping:
curl -XPOST http://10.0.0.240:9200/news/text/_mapping -H 'Content-Type: application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
}
}
}'
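Before indexing anything, the analyzer itself can be exercised with the _analyze API to see how IK tokenizes a phrase (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "analyzer": "ik_smart",
  "text": "贵宾成犬粮"
}'
```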
Index a document (Kibana Dev Tools):
POST /news/text/3
{"content":"贵宾成犬粮"}
Verify with a highlighted match query (Kibana Dev Tools):
GET /news/text/_search
{
"query" : { "match" : { "content" : "贵宾成犬" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}
Dynamically adding dictionary entries (serve a word list over HTTP with nginx)
server {
listen 80;
server_name elk.oldboy.com;
location / {
root /usr/share/nginx/html/download;
charset utf-8,gbk;
autoindex on;
autoindex_localtime on;
autoindex_exact_size off;
}
}
cd /etc/elasticsearch/analysis-ik/
vim IKAnalyzer.cfg.xml
<entry key="remote_ext_dict">http://10.0.0.240/download/dic.txt</entry>
systemctl restart elasticsearch
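For the remote_ext_dict URL above to serve anything, the word list has to exist under the nginx root; a minimal sketch (one dictionary word per line, UTF-8; the sample words are illustrative):

```shell
mkdir -p /usr/share/nginx/html/download
printf '贵宾\n成犬粮\n' > /usr/share/nginx/html/download/dic.txt  # one word per line
```

IK polls this URL periodically, so new words take effect without restarting Elasticsearch.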
filebeat
vim /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/nginx/access.log
json.keys_under_root: true # parse each line as JSON and promote its keys to the top level
overwrite_keys: true
- type: log
enabled: true
paths:
- /var/log/nginx/error.log
- type: log
enabled: true
paths:
- /opt/tomcat/logs/localhost_access_log.*.txt
json.keys_under_root: true
overwrite_keys: true
tags: ["tomcat"]
- type: log
enabled: true
paths:
- /var/log/elasticsearch/elasticsearch.log
multiline.pattern: '^\[' # Java logs: a line starting with [ begins a new event
multiline.negate: true
multiline.match: after
output.elasticsearch:
hosts: ["10.0.0.240:9200","10.0.0.241:9200"] # multiple hosts for redundancy
indices: # route different logs to different indices
- index: "nginx-access-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/nginx/access.log" # route by source file path
- index: "nginx-error-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/nginx/error.log"
- index: "tomcat-access-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
tags: "tomcat" # route by tag
- index: "mariadb-slow-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/mariadb/slow.log"
- index: "mariadb-error-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/mariadb/mariadb.log"
setup.template.name: "nginx" # template name
setup.template.pattern: "nginx-*" # index pattern the template matches
filebeat.config.modules: # enable the modules feature
path: ${path.config}/modules.d/*.yml
reload.enabled: true
reload.period: 10s
# filebeat modules enable mysql — enable the mysql module
# filebeat modules list — list available and enabled modules
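filebeat also ships self-check subcommands that are handy before restarting the service:

```shell
filebeat test config -c /etc/filebeat/filebeat.yml   # validate the YAML and settings
filebeat test output -c /etc/filebeat/filebeat.yml   # verify connectivity to the configured output
```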
nginx: emit access logs as JSON (nginx.conf):
log_format json '{ "time_local": "$time_local", '
'"remote_addr": "$remote_addr", '
'"referer": "$http_referer", '
'"request": "$request", '
'"status": $status, '
'"bytes": $body_bytes_sent, '
'"agent": "$http_user_agent", '
'"x_forwarded": "$http_x_forwarded_for", '
'"up_addr": "$upstream_addr",'
'"up_host": "$upstream_http_host",'
'"upstream_time": "$upstream_response_time",'
'"request_time": "$request_time"'
' }';
access_log /var/log/nginx/access.log json;
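Once nginx is reloaded with this format, each access-log line should parse as valid JSON; a quick check (python3 assumed present on the host):

```shell
# Exit status is non-zero if the latest log line is not valid JSON
tail -n 1 /var/log/nginx/access.log | python3 -m json.tool
```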
Tomcat: JSON access-log pattern (an attribute of the AccessLogValve element in server.xml; inner quotes must be escaped as &quot;):
pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;method&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;AgentVersion&quot;:&quot;%{User-Agent}i&quot;}"/>
cat /etc/filebeat/modules.d/mysql.yml
- module: mysql
error:
enabled: true
var.paths: ["/var/log/mariadb/mariadb.log"]
slowlog:
enabled: true
var.paths: ["/var/log/mariadb/slow.log"]
The filebeat nginx module (despite its built-in JSON handling) cannot be used until these ingest plugins are installed in Elasticsearch:
./bin/elasticsearch-plugin install ingest-user-agent
./bin/elasticsearch-plugin install ingest-geoip
filebeat+redis+logstash
filebeat
output.redis:
hosts: ["10.0.0.240"]
keys:
- key: "nginx_access" # redis list key (acts like an index name)
when.contains:
tags: "access"
- key: "nginx_error"
when.contains:
tags: "error"
Inspect the buffered logs in redis (redis-cli):
info
keys *
llen nginx_access # list length = number of buffered events
lrange nginx_access 0 1 # view the first two entries
logstash
cd /etc/logstash/conf.d
vim redis_nginx.conf
input {
redis {
host => "10.0.0.240"
port => "6379"
db => "0" # redis database number
key => "nginx_access"
data_type => "list"
}
redis {
host => "10.0.0.240"
port => "6379"
db => "0"
key => "nginx_error"
data_type => "list"
}
}
filter {
mutate {
convert => ["upstream_time", "float"]
convert => ["request_time", "float"]
}
}
output {
stdout {}
if "access" in [tags] {
elasticsearch {
hosts => "http://10.0.0.240:9200"
manage_template => false
index => "nginx_access-%{+yyyy.MM}"
}
}
if "error" in [tags] {
elasticsearch {
hosts => "http://10.0.0.240:9200"
manage_template => false
index => "nginx_error-%{+yyyy.MM}"
}
}
}
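The pipeline file can be syntax-checked before (re)starting the service (binary path assumed to be the package default):

```shell
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/redis_nginx.conf --config.test_and_exit
```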
filebeat+kafka+zookeeper+logstash+elasticsearch
Install ZooKeeper
1. Upload the package
cd /opt/es-software
2. Unpack it
tar xf zookeeper-3.4.11.tar.gz -C /opt
ln -s /opt/zookeeper-3.4.11/ /opt/zookeeper
3. Create the data directory
mkdir -p /data/zookeeper
4. Copy the sample configuration
cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
5. Edit zoo.cfg
dataDir=/data/zookeeper
# append at the end
server.1=10.0.0.240:2888:3888
server.2=10.0.0.241:2888:3888
server.3=10.0.0.242:2888:3888
6. Give each server a myid; it must be unique per server
echo "1" > /data/zookeeper/myid # on node 1
echo "2" > /data/zookeeper/myid # on node 2
echo "3" > /data/zookeeper/myid # on node 3
7. Start the ZooKeeper service
/opt/zookeeper/bin/zkServer.sh start
8. Check its running status
/opt/zookeeper/bin/zkServer.sh status
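Each node's role (leader or follower) can also be queried with ZooKeeper's stat four-letter command (nc assumed installed):

```shell
echo stat | nc 10.0.0.240 2181 | grep Mode
```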
Install Kafka
1. Upload the package
cd /opt/es-software
2. Unpack it
tar zxf kafka_2.11-1.0.0.tgz -C /opt/
ln -s /opt/kafka_2.11-1.0.0/ /opt/kafka
3. Create a log directory
mkdir /opt/kafka/logs
4. Edit the configuration
vim /opt/kafka/config/server.properties
# broker id: an integer, unique within the cluster
broker.id=1
listeners=PLAINTEXT://10.0.0.240:9092
# threads handling network requests (default 3)
num.network.threads=3
# threads performing disk I/O (default 8)
num.io.threads=8
# socket send buffer size (default 100 KB)
socket.send.buffer.bytes=102400
# socket receive buffer size (default 100 KB)
socket.receive.buffer.bytes=102400
# maximum size of a single request (default 100 MB)
socket.request.max.bytes=104857600
# directory where Kafka stores message data
log.dirs=/opt/kafka/logs
# default number of partitions per topic
num.partitions=1
# threads per data directory for startup recovery and shutdown flushing
num.recovery.threads.per.data.dir=1
# replication factor of the offsets topic; raise it for higher availability
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
# how long log segments are kept before deletion
log.retention.hours=24
# maximum size of a segment file; a new file is created when this is reached
log.segment.bytes=1073741824
# how often to check segment files against the deletion policy
log.retention.check.interval.ms=300000
# ZooKeeper connection string; comma-separate the members of a ZooKeeper cluster
zookeeper.connect=10.0.0.240:2181,10.0.0.241:2181,10.0.0.242:2181
# ZooKeeper connection timeout: 6 s
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
5. Start in the foreground first to confirm it comes up cleanly
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
6. Test creating a topic
/opt/kafka/bin/kafka-topics.sh --create --zookeeper 10.0.0.241:2181,10.0.0.242:2181,10.0.0.240:2181 --partitions 3 --replication-factor 3 --topic kafkatest
7. Test describing the topic
/opt/kafka/bin/kafka-topics.sh --describe --zookeeper 10.0.0.241:2181,10.0.0.242:2181,10.0.0.240:2181 --topic kafkatest
8. Once the tests pass, start it in the background
/opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
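An end-to-end check with the console producer and consumer that ship with Kafka (broker addresses taken from the configuration above):

```shell
# Publish one test message, then read the topic from the beginning
echo "hello kafka" | /opt/kafka/bin/kafka-console-producer.sh --broker-list 10.0.0.240:9092 --topic kafkatest
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server 10.0.0.240:9092 --topic kafkatest --from-beginning
```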
Modify the filebeat configuration (output to Kafka)
output.kafka:
hosts: ["10.0.0.240:9092", "10.0.0.241:9092", "10.0.0.242:9092"]
topic: 'filebeat'
Modify the logstash configuration (consume from Kafka)
vim /etc/logstash/conf.d/kafka.conf
input {
kafka {
bootstrap_servers => "10.0.0.240:9092"
topics => ["filebeat"]
group_id => "logstash"
codec => "json"
}
}
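The input block above has no matching output; a minimal sketch of one (the index name is an assumption, the host is taken from the earlier examples):

```
output {
  elasticsearch {
    hosts => "http://10.0.0.240:9200"
    manage_template => false
    index => "filebeat-%{+yyyy.MM}"    # hypothetical index name
  }
}
```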