Log collection tools
- 日志易 (commercial)
- Splunk (foreign product, priced by data volume)
Introduction
History: written in Java as a layer on top of Lucene, exposing a RESTful API
Search principle: the inverted index
Features: easy horizontal scaling, built-in high availability, distributed storage, simple to use
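An inverted index maps each term to the list of documents containing it, so a full-text lookup is a single key lookup rather than a scan of every document. A toy sketch in shell (the awk script and the two sample "documents" are illustrative only, not part of Elasticsearch):

```shell
# Toy inverted index: map each term to the ids of the "documents" containing it.
printf 'doc1 I love rock climbing\ndoc2 I like swimming\n' |
awk '{
  for (i = 2; i <= NF; i++) {                    # field 1 is the document id
    idx[tolower($i)] = idx[tolower($i)] " " $1   # append the id to the term posting list
  }
}
END { for (w in idx) print w ":" idx[w] }' | sort
```

A search for the term i now resolves to doc1 and doc2 in one lookup, which is exactly the property that makes Elasticsearch queries fast.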
Configuration files
/etc/elasticsearch/elasticsearch.yml # main ES configuration file
/etc/elasticsearch/jvm.options # JVM memory settings
/etc/sysconfig/elasticsearch # service environment variables
/usr/lib/sysctl.d/elasticsearch.conf # kernel (sysctl) settings
/usr/lib/systemd/system/elasticsearch.service # systemd unit for the ES service
[root@es01 elasticsearch]# grep -Ev '^$|#' /etc/elasticsearch/elasticsearch.yml
node.name: oldboy01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true # lock the heap in memory (disable swapping)
network.host: 10.0.0.240
http.port: 9200
systemctl edit elasticsearch # systemd override so the service may lock unlimited memory
[Service]
LimitMEMLOCK=infinity
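After restarting the service, memory locking can be verified through the nodes API; a node that failed to lock its heap reports mlockall: false (host and port taken from the config above):

```shell
# Check the per-node mlockall flag
curl -s '10.0.0.240:9200/_nodes?filter_path=**.mlockall&pretty'
```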
Concepts
1. Index: analogous to a database in MySQL
2. Type: analogous to a table in that database
3. Document (doc): a record in the table; its fields carry the data
Interacting via curl on the command line
Create an index:
curl -XPUT http://10.0.0.240:9200/oldboy
Index a document with an explicit id:
curl -XPUT '10.0.0.240:9200/oldboy/student/1?pretty' -H 'Content-Type: application/json' -d'
{
"first_name" : "zhang",
"last_name" : "san",
"age" : 28,
"about" : "I love to go rock climbing",
"interests" : [ "sports" ]
}'
Index a document with an auto-generated id:
curl -XPOST '10.0.0.240:9200/oldboy/student/?pretty' -H 'Content-Type: application/json' -d'
{
"first_name" : "li",
"last_name" : "mingming",
"age" : 45,
"about" : "I like to swim",
"interests" : [ "reading" ]
}'
Get a document by id:
curl -XGET '10.0.0.240:9200/oldboy/student/1?pretty'
Retrieve all documents in an index:
curl -XGET '10.0.0.240:9200/oldboy/_search/?pretty'
Delete a document by id:
curl -XDELETE '10.0.0.240:9200/oldboy/student/1?pretty'
Delete the index:
curl -XDELETE '10.0.0.240:9200/oldboy/?pretty'
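Beyond fetching by id, the _search endpoint also accepts a query DSL body. For example, a match query against the documents indexed above (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/oldboy/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "about": "rock climbing" } }
}'
```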
Interacting via Kibana
Configuration file
[root@es01 es-software]# grep -Ev '^$|#' /etc/kibana/kibana.yml
server.port: 5601
server.host: "10.0.0.240"
elasticsearch.hosts: ["http://10.0.0.240:9200"]
kibana.index: ".kibana"
Change the default shard and replica counts via an index template (Kibana Dev Tools):
PUT _template/template_http_request_record
{
"index_patterns": ["*"],
"settings": {
"number_of_shards" : 5,
"number_of_replicas" : 1
}
}
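The template can be checked afterwards from the shell (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/_template/template_http_request_record?pretty'
```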
ES cluster (elasticsearch.yml on each node)
cluster.name: oldboy-cluster
node.name: oldboy01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 10.0.0.240
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10.0.0.240", "10.0.0.241"] # seed hosts; nodes only need to reach each other
discovery.zen.minimum_master_nodes: 2 # quorum of master-eligible nodes required to elect a master
/var/log/elasticsearch/oldboy-cluster.log # cluster log file (named after cluster.name)
Role of shards
Primary shard (thick border in the Kibana view): handles index-modifying (write) requests and also serves queries
Replica shard (thin border): a copy of a primary shard; serves queries and provides redundancy
Redundancy
With 3 nodes in the cluster, how many can fail?
1. With 1 replica, failing one node at a time: up to 2 nodes can fail (the configuration must be adjusted manually along the way)
2. With 1 replica, failing one node at a time and repairing each before the next failure: cluster health stays green throughout
3. With 2 replicas: even if 2 nodes fail at once, the cluster can keep serving after the ES configuration is adjusted manually
Notes
Once an index is created its shard count cannot be changed; the replica count can be changed at any time
Pick the shard count as a multiple of the node count (e.g. 3, 6, or 9 shards on 3 nodes), sized to your needs
Monitoring cluster health
curl '10.0.0.240:9200/_cluster/health?pretty'
{
"cluster_name" : "oldboy-cluster",
"status" : "green", # overall cluster health value
"timed_out" : false,
"number_of_nodes" : 3, # number of nodes in the cluster
"number_of_data_nodes" : 3,
"active_primary_shards" : 14,
"active_shards" : 33,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Backing up indices with elasticdump
- Put node on the PATH:
vim /etc/profile
export PATH=/opt/node/bin:$PATH
- Switch to a faster npm registry:
npm install -g cnpm --registry=https://registry.npm.taobao.org
- Install elasticdump:
cnpm install elasticdump -g
- Back up an index:
mkdir /data
elasticdump \
--input=http://10.0.0.240:9200/oldboy03 \
--output=/data/oldboy03.json \
--type=data
- Restore data:
elasticdump \
--input=/data/oldboy.json \
--output=http://10.0.0.240:9200/oldboy
- Compressed backup:
elasticdump \
--input=http://10.0.0.240:9200/oldboy03 \
--output=$ | gzip > /data/oldboy03.json.gz
Chinese analyzer: IK (install on every node)
Installation
cd /usr/share/elasticsearch
./bin/elasticsearch-plugin install file:///opt/es-software/elasticsearch-analysis-ik-6.6.0.zip
systemctl restart elasticsearch
Apply the analyzer to a test index's mapping:
curl -XPOST http://10.0.0.240:9200/news/text/_mapping -H 'Content-Type: application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
}
}
}'
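Before indexing anything, the analyzer itself can be exercised with the _analyze API to see how IK tokenizes a phrase (same cluster address assumed):

```shell
curl -XGET '10.0.0.240:9200/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "analyzer": "ik_smart",
  "text": "贵宾成犬粮"
}'
```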
Index a document (Kibana Dev Tools):
POST /news/text/3
{"content":"贵宾成犬粮"}
Verify with a highlighted match query (Kibana Dev Tools):
GET /news/text/_search
{
"query" : { "match" : { "content" : "贵宾成犬" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}
Dynamically adding dictionary entries (serve a word list over HTTP with nginx)
server {
listen 80;
server_name elk.oldboy.com;
location / {
root /usr/share/nginx/html/download;
charset utf-8,gbk;
autoindex on;
autoindex_localtime on;
autoindex_exact_size off;
}
}
cd /etc/elasticsearch/analysis-ik/
vim IKAnalyzer.cfg.xml
<entry key="remote_ext_dict">http://10.0.0.240/download/dic.txt</entry>
systemctl restart elasticsearch
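For the remote_ext_dict URL above to serve anything, the word list has to exist under the nginx root; a minimal sketch (one dictionary word per line, UTF-8; the sample words are illustrative):

```shell
mkdir -p /usr/share/nginx/html/download
printf '贵宾\n成犬粮\n' > /usr/share/nginx/html/download/dic.txt  # one word per line
```

IK polls this URL periodically, so new words take effect without restarting Elasticsearch.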
filebeat
vim /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/nginx/access.log
json.keys_under_root: true # parse each line as JSON and promote its keys to the top level
overwrite_keys: true
- type: log
enabled: true
paths:
- /var/log/nginx/error.log
- type: log
enabled: true
paths:
- /opt/tomcat/logs/localhost_access_log.*.txt
json.keys_under_root: true
overwrite_keys: true
tags: ["tomcat"]
- type: log
enabled: true
paths:
- /var/log/elasticsearch/elasticsearch.log
multiline.pattern: '^\[' # Java logs: a line starting with [ begins a new event
multiline.negate: true
multiline.match: after
output.elasticsearch:
hosts: ["10.0.0.240:9200","10.0.0.241:9200"] # multiple hosts for redundancy
indices: # route different logs to different indices
- index: "nginx-access-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/nginx/access.log" # route by source file path
- index: "nginx-error-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/nginx/error.log"
- index: "tomcat-access-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
tags: "tomcat" # route by tag
- index: "mariadb-slow-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/mariadb/slow.log"
- index: "mariadb-error-%{[beat.version]}-%{+yyyy.MM}"
when.contains:
source: "/var/log/mariadb/mariadb.log"
setup.template.name: "nginx" # template name
setup.template.pattern: "nginx-*" # index pattern the template matches
filebeat.config.modules: # enable the modules feature
path: ${path.config}/modules.d/*.yml
reload.enabled: true
reload.period: 10s
# filebeat modules enable mysql — enable the mysql module
# filebeat modules list — list available and enabled modules
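filebeat also ships self-check subcommands that are handy before restarting the service:

```shell
filebeat test config -c /etc/filebeat/filebeat.yml   # validate the YAML and settings
filebeat test output -c /etc/filebeat/filebeat.yml   # verify connectivity to the configured output
```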
nginx: emit access logs as JSON (nginx.conf):
log_format json '{ "time_local": "$time_local", '
'"remote_addr": "$remote_addr", '
'"referer": "$http_referer", '
'"request": "$request", '
'"status": $status, '
'"bytes": $body_bytes_sent, '
'"agent": "$http_user_agent", '
'"x_forwarded": "$http_x_forwarded_for", '
'"up_addr": "$upstream_addr",'
'"up_host": "$upstream_http_host",'
'"upstream_time": "$upstream_response_time",'
'"request_time": "$request_time"'
' }';
access_log /var/log/nginx/access.log json;
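Once nginx is reloaded with this format, each access-log line should parse as valid JSON; a quick check (python3 assumed present on the host):

```shell
# Exit status is non-zero if the latest log line is not valid JSON
tail -n 1 /var/log/nginx/access.log | python3 -m json.tool
```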
Tomcat: JSON access-log pattern (an attribute of the AccessLogValve element in server.xml; inner quotes must be escaped as &quot;):
pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;method&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;AgentVersion&quot;:&quot;%{User-Agent}i&quot;}"/>
cat /etc/filebeat/modules.d/mysql.yml
- module: mysql
error:
enabled: true
var.paths: ["/var/log/mariadb/mariadb.log"]
slowlog:
enabled: true
var.paths: ["/var/log/mariadb/slow.log"]
The filebeat nginx module (despite its built-in JSON handling) cannot be used until these ingest plugins are installed in Elasticsearch:
./bin/elasticsearch-plugin install ingest-user-agent
./bin/elasticsearch-plugin install ingest-geoip
filebeat+redis+logstash
filebeat
output.redis:
hosts: ["10.0.0.240"]
keys:
- key: "nginx_access" # redis list key (acts like an index name)
when.contains:
tags: "access"
- key: "nginx_error"
when.contains:
tags: "error"
Inspect the buffered logs in redis (redis-cli):
info
keys *
llen nginx_access # list length = number of buffered events
lrange nginx_access 0 1 # view the first two entries
logstash
cd /etc/logstash/conf.d
vim redis_nginx.conf
input {
redis {
host => "10.0.0.240"
port => "6379"
db => "0" # redis database number
key => "nginx_access"
data_type => "list"
}
redis {
host => "10.0.0.240"
port => "6379"
db => "0"
key => "nginx_error"
data_type => "list"
}
}
filter {
mutate {
convert => ["upstream_time", "float"]
convert => ["request_time", "float"]
}
}
output {
stdout {}
if "access" in [tags] {
elasticsearch {
hosts => "http://10.0.0.240:9200"
manage_template => false
index => "nginx_access-%{+yyyy.MM}"
}
}
if "error" in [tags] {
elasticsearch {
hosts => "http://10.0.0.240:9200"
manage_template => false
index => "nginx_error-%{+yyyy.MM}"
}
}
}
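The pipeline file can be syntax-checked before (re)starting the service (binary path assumed to be the package default):

```shell
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/redis_nginx.conf --config.test_and_exit
```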
filebeat+kafka+zookeeper+logstash+elasticsearch
Install ZooKeeper
1. Upload the package
cd /opt/es-software
2. Unpack it
tar xf zookeeper-3.4.11.tar.gz -C /opt
ln -s /opt/zookeeper-3.4.11/ /opt/zookeeper
3. Create the data directory
mkdir -p /data/zookeeper
4. Copy the sample configuration
cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
5. Edit zoo.cfg
dataDir=/data/zookeeper
# append at the end
server.1=10.0.0.240:2888:3888
server.2=10.0.0.241:2888:3888
server.3=10.0.0.242:2888:3888
6. Give each server a myid; it must be unique per server
echo "1" > /data/zookeeper/myid # on node 1
echo "2" > /data/zookeeper/myid # on node 2
echo "3" > /data/zookeeper/myid # on node 3
7. Start the ZooKeeper service
/opt/zookeeper/bin/zkServer.sh start
8. Check its running status
/opt/zookeeper/bin/zkServer.sh status
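Each node's role (leader or follower) can also be queried with ZooKeeper's stat four-letter command (nc assumed installed):

```shell
echo stat | nc 10.0.0.240 2181 | grep Mode
```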
Install Kafka
1. Upload the package
cd /opt/es-software
2. Unpack it
tar zxf kafka_2.11-1.0.0.tgz -C /opt/
ln -s /opt/kafka_2.11-1.0.0/ /opt/kafka
3. Create a log directory
mkdir /opt/kafka/logs
4. Edit the configuration
vim /opt/kafka/config/server.properties
# broker id: an integer, unique within the cluster
broker.id=1
listeners=PLAINTEXT://10.0.0.240:9092
# threads handling network requests (default 3)
num.network.threads=3
# threads performing disk I/O (default 8)
num.io.threads=8
# socket send buffer size (default 100 KB)
socket.send.buffer.bytes=102400
# socket receive buffer size (default 100 KB)
socket.receive.buffer.bytes=102400
# maximum size of a single request (default 100 MB)
socket.request.max.bytes=104857600
# directory where Kafka stores message data
log.dirs=/opt/kafka/logs
# default number of partitions per topic
num.partitions=1
# threads per data directory for startup recovery and shutdown flushing
num.recovery.threads.per.data.dir=1
# replication factor of the offsets topic; raise it for higher availability
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
# how long log segments are kept before deletion
log.retention.hours=24
# maximum size of a segment file; a new file is created when this is reached
log.segment.bytes=1073741824
# how often to check segment files against the deletion policy
log.retention.check.interval.ms=300000
# ZooKeeper connection string; comma-separate the members of a ZooKeeper cluster
zookeeper.connect=10.0.0.240:2181,10.0.0.241:2181,10.0.0.242:2181
# ZooKeeper connection timeout: 6 s
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
5. Start in the foreground first to confirm it comes up cleanly
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
6. Test creating a topic
/opt/kafka/bin/kafka-topics.sh --create --zookeeper 10.0.0.241:2181,10.0.0.242:2181,10.0.0.240:2181 --partitions 3 --replication-factor 3 --topic kafkatest
7. Test describing the topic
/opt/kafka/bin/kafka-topics.sh --describe --zookeeper 10.0.0.241:2181,10.0.0.242:2181,10.0.0.240:2181 --topic kafkatest
8. Once the tests pass, start it in the background
/opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
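An end-to-end check with the console producer and consumer that ship with Kafka (broker addresses taken from the configuration above):

```shell
# Publish one test message, then read the topic from the beginning
echo "hello kafka" | /opt/kafka/bin/kafka-console-producer.sh --broker-list 10.0.0.240:9092 --topic kafkatest
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server 10.0.0.240:9092 --topic kafkatest --from-beginning
```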
Modify the filebeat configuration (output to Kafka)
output.kafka:
hosts: ["10.0.0.240:9092", "10.0.0.241:9092", "10.0.0.242:9092"]
topic: 'filebeat'
Modify the logstash configuration (consume from Kafka)
vim /etc/logstash/conf.d/kafka.conf
input {
kafka {
bootstrap_servers => "10.0.0.240:9092"
topics => ["filebeat"]
group_id => "logstash"
codec => "json"
}
}
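The input block above has no matching output; a minimal sketch of one (the index name is an assumption, the host is taken from the earlier examples):

```
output {
  elasticsearch {
    hosts => "http://10.0.0.240:9200"
    manage_template => false
    index => "filebeat-%{+yyyy.MM}"    # hypothetical index name
  }
}
```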