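Fluentd is a log collection system: it accepts logs from many kinds of sources, applies configurable processing rules, and writes the results to a variety of storage backends. EFK is the typical use case, where logs are collected and shipped to ElasticSearch.
This article sets up a pipeline that collects the logs Nginx produces: td-agent is configured with a source of type tail so that it reads the log file in real time, processes each entry, and sends it to Microsoft's EventHub message broker for downstream data processing.
Fluentd's built-in http input is not used, because Nginx offers richer functionality and keeps request handling decoupled from log collection.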
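1. Nginx environment setup and parameter configuration
Environment setup
Update: nginx is now installed with yum.
First prepare the prerequisites for installing nginx.
Create an nginx.repo file in the /etc/yum.repos.d directory.
Run: touch nginx.repo
From the official site, http://nginx.org/en/linux_packages.html#stable, copy the yum repository definition that matches your Linux version.
This server runs CentOS 7.4, so the matching definition is the one below; copy it into the nginx.repo file you created.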
[nginx]
name=nginx repo
baseurl=http://nginx.org/packages/mainline/centos/7/$basearch/
gpgcheck=0
enabled=1
Run yum list | grep nginx to see the nginx versions the repository offers.
Run yum install nginx to complete the installation.
# Earlier approach: build nginx from source instead of installing it with yum
wget -c https://nginx.org/download/nginx-1.10.1.tar.gz
yum install gcc-c++
yum install -y pcre pcre-devel
yum install -y zlib zlib-devel
yum install -y openssl openssl-devel
tar -zxvf nginx-1.10.1.tar.gz
cd nginx-1.10.1
./configure
make
make install
Parameter configuration
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Accept underscores in request header names; hyphens in header names become underscores in $http_* variables
underscores_in_headers on;
# A request body larger than this buffer (1m here) is written to a temporary file; the default is two memory pages (4096*2)
client_body_buffer_size 1m;
client_max_body_size 1m;
#client_body_in_single_buffer on;
#client_body_in_file_only on;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
log_format unimod '$remote_addr [$time_local] "$request" $request_length $status $http_content_type $http_content_encoding "$request_body" '
'1:$http_row_priority 2:$http_column_priority';
access_log /var/log/nginx/access.log main;
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
#include /etc/nginx/conf.d/default.conf;
server {
listen 80;
server_name 10.0.0.8;
#charset koi8-r;
#if ($request_method !~* POST) {
# return 403;
#}
proxy_ignore_client_abort on;
access_log /var/log/nginx/host.access.log unimod;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
location ~ ^/v[1-9]*/ingest/(qa|weibo|weixin)/(article|document)$ {
if ($request_method !~* POST) {
return 403;
}
#ngx_http_read_client_request_body();
proxy_pass http://127.0.0.1:10086;
}
#error_page 404 /404.html;
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 =200 /index.html;
location = /index.html {
root /usr/share/nginx/html;
}
}
}
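To make the behaviour of the server block easier to reason about, here is a rough Python model of the status code a client would see. It is only a sketch, not a replacement for testing against Nginx itself. The function name expected_status is hypothetical, and the model assumes that a path outside the ingest location does not correspond to a static file under /usr/share/nginx/html and that the upstream on port 10086 answers normally.

```python
import re

# Mirrors: location ~ ^/v[1-9]*/ingest/(qa|weibo|weixin)/(article|document)$
INGEST_PATH = re.compile(r"^/v[1-9]*/ingest/(qa|weibo|weixin)/(article|document)$")

# Mirrors: client_max_body_size 1m;
MAX_BODY_BYTES = 1024 * 1024


def expected_status(method, path, body_bytes):
    """Rough model of the status code for one request (hypothetical helper).

    Assumptions: non-ingest paths do not exist as static files, and the
    proxied upstream responds successfully.
    """
    if body_bytes > MAX_BODY_BYTES:
        return 413  # body larger than client_max_body_size
    if not INGEST_PATH.search(path):
        return 404  # falls through to "location /", file assumed missing
    # Mirrors: if ($request_method !~* POST) { return 403; }
    if not re.search("POST", method, re.IGNORECASE):
        return 403
    return 200


if __name__ == "__main__":
    print(expected_status("POST", "/v1/ingest/qa/document", 1785))
    print(expected_status("GET", "/v1/ingest/qa/document", 0))
```

Keeping the location pattern in one place like this makes it easy to check which paths a client is allowed to post to before touching the live configuration.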
Set the permissions for rotated log files. The default mode is 640, which leaves td-agent without permission to read the logs.
[root@vmforplatformjobs nginx]# cat /etc/logrotate.d/nginx
/var/log/nginx/*.log {
daily
missingok
rotate 52
compress
delaycompress
notifempty
create 644 nginx adm
sharedscripts
postrotate
if [ -f /var/run/nginx.pid ]; then
kill -USR1 `cat /var/run/nginx.pid`
fi
endscript
}
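The difference between mode 640 and mode 644 can be checked with a few lines of Python. This is a minimal sketch that assumes td-agent runs as a user that is neither the file owner (nginx) nor a member of the adm group, so only the "others" read bit matters; the helper name others_can_read is hypothetical.

```python
import os
import stat


def others_can_read(mode):
    """Return True when the 'others' read bit is set in a file mode."""
    return bool(mode & stat.S_IROTH)


def log_is_readable(path):
    """Check the 'others' read bit of an existing file (hypothetical helper)."""
    return others_can_read(os.stat(path).st_mode)


if __name__ == "__main__":
    print(others_can_read(0o640))  # mode created by the default logrotate rule
    print(others_can_read(0o644))  # mode created by "create 644 nginx adm"
```

On the server, log_is_readable("/var/log/nginx/host.access.log") could be run after the next rotation to confirm the new mode took effect.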
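2. Fluentd environment setup and parameter configuration
Environment setup
Fluentd is written in Ruby and C and needs Ruby to run. td-agent is an easy-to-install distribution of Fluentd that spares you from managing most of the dependencies.
- Check the system's file descriptor limit with ulimit -n. If it is too low, raise it by editing /etc/security/limits.conf as follows and then rebooting: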
root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536
- Install td-agent:
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh
- Optionally, start the UI to manage td-agent and install plugins:
sudo /etc/td-agent/td-agent-ui start
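The file descriptor check from the first step can also be done from Python, which is handy in a provisioning script. This is a small sketch for Unix-like systems only, since it relies on the standard resource module; the threshold of 65536 is taken from the limits.conf lines, and the helper names are hypothetical.

```python
import resource

WANTED_NOFILE = 65536  # value used in the limits.conf lines


def is_too_low(soft_limit, wanted=WANTED_NOFILE):
    """Return True when the soft limit is finite and below the wanted value."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return soft_limit < wanted


def current_nofile():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = current_nofile()
    status = "too low" if is_too_low(soft) else "ok"
    print(f"soft={soft} hard={hard} -> {status}")
```

Note that this reports the limit of the process running the script, which may differ from the limit the td-agent service ends up with.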
Parameter configuration
<source>
@type tail
encoding utf-8
from_encoding utf-8
path /var/log/nginx/host.access.log
pos_file /var/log/nginx/host.access.log.pos
tag host.document
format /^(?<remote>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>\S+) (?<protocol>\S+)" (?<size>[^ ]*) (?<code>[^ ]*) (?<contentType>[^ ]*) (?<contentEncoding>[^ ]*) "(?<content>[^"]*)" 1:(?<rowPriority>[^ ]*) 2:(?<columnPriority>[^ ]*)$/
time_format %d/%b/%Y:%H:%M:%S %z
</source>
<filter host.document>
@type grep
<and>
<regexp>
key code
pattern 200
</regexp>
<regexp>
key method
pattern POST
</regexp>
</and>
</filter>
<match host.document>
@type rewrite_tag_filter
<rule>
key size
pattern /^(\d{1,5}|1\d{1,5}|2[0-4]\d{1,4})$/
tag toolarge.document
invert true
</rule>
<rule>
key path
pattern /^/v[1-9]*/ingest/qa/document$/
tag qa.document
</rule>
<rule>
key path
pattern /.+/
tag unmatched.document
</rule>
</match>
<filter qa.document>
@type record_transformer
renew_record true
keep_keys ["rowPriority","columnPriority","contentType","contentEncoding","content"]
</filter>
<match qa.document>
@type azureeventhubs_buffered
connection_string Endpoint=xxx
hub_name qaarticle
batch true
max_batch_size 10000
print_records false
<buffer>
@type memory
flush_interval 60
chunk_limit_size 255KB
</buffer>
</match>
<match toolarge.document>
@type file
path /var/log/td-agent/toolarge
</match>
<match unmatched.document>
@type file
path /var/log/td-agent/unmatched
</match>
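- Configure the source to collect the Nginx log file. A pos_file must be specified to record the read position, so that td-agent resumes from the right place after a crash or restart. When choosing its path, take user permissions into account.
- Configure the filter to keep only the useful fields. For keep_keys to take effect, renew_record must be set to true.
- Configure the match to send the filtered data to Event Hub. Note that the plugin must be installed before it can be used:
sudo td-agent-gem install fluent-plugin-azureeventhubs
- When setting connection_string, use a connection string that carries the Manage permission; otherwise an error is reported at startup.
- Installing fluent-plugin-azureeventhubs only gets you version 0.0.6. Its code must be replaced with the master branch from GitHub before multiple records can be written into a single event data.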
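To see what the source, grep filter, and record_transformer stages do to one log line, the following Python sketch reproduces them outside Fluentd. It is an illustration under assumptions, not the plugin code: the expression is adapted from the format setting, with Ruby's (?<name>...) groups written as Python's (?P<name>...) and the bracketed timestamp captured as time. The sample line uses a shortened content value ("e30=", the base64 form of "{}"), and the function names are hypothetical.

```python
import re

# Adapted from the tail source's format expression (Python group syntax).
LINE_RE = re.compile(
    r'^(?P<remote>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<size>[^ ]*) (?P<code>[^ ]*) (?P<contentType>[^ ]*) '
    r'(?P<contentEncoding>[^ ]*) "(?P<content>[^"]*)" '
    r'1:(?P<rowPriority>[^ ]*) 2:(?P<columnPriority>[^ ]*)$'
)

# Same list as keep_keys in the record_transformer filter.
KEEP_KEYS = ["rowPriority", "columnPriority", "contentType",
             "contentEncoding", "content"]


def parse_line(line):
    """Return the named fields of one access-log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def passes_grep(record):
    """Model of the grep filter: both regexp rules inside <and> must match."""
    return bool(re.search("200", record["code"])
                and re.search("POST", record["method"]))


def keep_only(record):
    """Model of renew_record true plus keep_keys: drop every other field."""
    return {key: record[key] for key in KEEP_KEYS if key in record}


if __name__ == "__main__":
    sample = ('140.206.187.194 [10/Jan/2019:09:37:28 +0000] '
              '"POST /v1/ingest/qa/document HTTP/1.1" 1785 200 '
              'application/json gzip "e30=" 1:100 2:mainContent=11;title=21')
    record = parse_line(sample)
    if record and passes_grep(record):
        print(keep_only(record))
```

Running a real log line through parse_line is a quick way to find out why td-agent reports a pattern mismatch before restarting the service.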
End-to-end test
- Send a POST request
POST /v1/ingest/qa/document HTTP/1.1
Host: 42.159.89.124
Content-Type: application/json
Row-Priority: 100
Column-Priority: mainContent=11;title=21
Content-Encoding: gzip
body:
ewoJImNvbW1vbiI6Cgl7CgkJImlkIjogIjEyMzQ1NDEiLAoJCSJkb21haW4iOiAieHh4Lnh4eC54eHgiLAoJCSJ1cmwiOiAiaHR0cChzKTovL3h4eC54eHgueHh4L2FydGljbGUtNTI1NjUtMS5odG1sIiwgCgkJInRpdGxlIjogIuagh+mimCIsCgkJIm1haW5Db250ZW50IjogIuato+aWh+WGheWuuSIsCgkJImFjY291bnRJZCI6ICIiLAoJCSJjaGFubmVsVHlwZSI6ICIiLAoJCSJwcm92aWRlciI6ICIiLAoJCSJhcnRpY2xlVHlwZSI6ICIiLAoJCSJ0YWdzIjogIiIsCgkJInJlYWROdW0iOiAxLAoJCSJjb21tZW50TnVtIjogMSwKCQkicHJhaXNlTnVtIjogMSwKCQkib3Bwb3NlTnVtIjogMSwKCQkicmV0d2VldE51bSI6IDEsCgkJImZvbGxvd051bSI6IG51bGwsCgkJInBhcmVudElkIjogIiIsCgkJInBhcmVudFR5cGUiOiAiIiwKCQkicGFyZW50YWdlIjogIiIsCgkJImFuY2VzdG9ySWQiOiAiIiwKCQkiYW5jZXN0b3JUeXBlIjogIiIsCgkJImRlcHRoIjoyLAoJCSJzdWJzY3JpcHRpb25JZHMiOiBbXSwKCQkic2VlZHBhY2thZ2VJZHMiOiBbXSwKCQkicHVibGlzaFRpbWUiOiAiMjAxOC0xMS0yOVQwMzozNDowMFoiLAoJCSJsYXN0RG9jVXBkYXRlVGltZSI6ICIyMDE4LTEyLTAyVDAwOjUxOjU0KzA4OjAwIiwKCQkibGFzdE51bVVwZGF0ZVRpbWUiOiAiMjAxOC0xMi0wMlQwMDo1MTo1NCswODowMCIsCgkJImNyYXdsVGltZSI6ICIyMDE4LTEyLTAxVDE2OjUxOjQ1WiIgICAgICAgICAgICAgICAgCgl9LAoJInNwZWNpYWwiOgoJewoJCSJmbG9vciI6IDEsCgkJIlFBdGFncyI6WyJhYmMiLCJiY2QiXSwKCQkiYW5zd2VyTnVtIjogMTIzNCwKCQkiaW1hZ2VMaXN0IjogWyJodHRwOi8vYWJjLmNvbS9pbWcxLmdpZiIsImh0dHA6Ly9hYmMuY29tL2ltZzIuZ2lmIl0sCgkJIkBwZW9wbGVMaXN0IjogWyJ6aGFuZ3NhbiIsImxpc2kiXQoJfSwKCSJkeW5hbWljIjoKCXsgICAgICAKCQkicXVlc3Rpb25JZCI6ICI0MzYwMDA2MiIsCgkJInByb2plY3ROYW1lIjogIueOm+awj+iIhuaDhemhueebriIsCgkJImNpdHlJZCI6IDAsCgkJInByb3ZpbmNlSWQiOiAwCgl9Cn0=
- Nginx log entry
140.206.187.194 [10/Jan/2019:09:37:28 +0000] "POST /v1/ingest/qa/document HTTP/1.1" 1785 200 application/json gzip "ewoJImNvbW1vbiI6Cgl7CgkJImlkIjogIjEyMzQ1NDEiLAoJCSJkb21haW4iOiAieHh4Lnh4eC54eHgiLAoJCSJ1cmwiOiAiaHR0cChzKTovL3h4eC54eHgueHh4L2FydGljbGUtNTI1NjUtMS5odG1sIiwgCgkJInRpdGxlIjogIuagh+mimCIsCgkJIm1haW5Db250ZW50IjogIuato+aWh+WGheWuuSIsCgkJImFjY291bnRJZCI6ICIiLAoJCSJjaGFubmVsVHlwZSI6ICIiLAoJCSJwcm92aWRlciI6ICIiLAoJCSJhcnRpY2xlVHlwZSI6ICIiLAoJCSJ0YWdzIjogIiIsCgkJInJlYWROdW0iOiAxLAoJCSJjb21tZW50TnVtIjogMSwKCQkicHJhaXNlTnVtIjogMSwKCQkib3Bwb3NlTnVtIjogMSwKCQkicmV0d2VldE51bSI6IDEsCgkJImZvbGxvd051bSI6IG51bGwsCgkJInBhcmVudElkIjogIiIsCgkJInBhcmVudFR5cGUiOiAiIiwKCQkicGFyZW50YWdlIjogIiIsCgkJImFuY2VzdG9ySWQiOiAiIiwKCQkiYW5jZXN0b3JUeXBlIjogIiIsCgkJImRlcHRoIjoyLAoJCSJzdWJzY3JpcHRpb25JZHMiOiBbXSwKCQkic2VlZHBhY2thZ2VJZHMiOiBbXSwKCQkicHVibGlzaFRpbWUiOiAiMjAxOC0xMS0yOVQwMzozNDowMFoiLAoJCSJsYXN0RG9jVXBkYXRlVGltZSI6ICIyMDE4LTEyLTAyVDAwOjUxOjU0KzA4OjAwIiwKCQkibGFzdE51bVVwZGF0ZVRpbWUiOiAiMjAxOC0xMi0wMlQwMDo1MTo1NCswODowMCIsCgkJImNyYXdsVGltZSI6ICIyMDE4LTEyLTAxVDE2OjUxOjQ1WiIgICAgICAgICAgICAgICAgCgl9LAoJInNwZWNpYWwiOgoJewoJCSJmbG9vciI6IDEsCgkJIlFBdGFncyI6WyJhYmMiLCJiY2QiXSwKCQkiYW5zd2VyTnVtIjogMTIzNCwKCQkiaW1hZ2VMaXN0IjogWyJodHRwOi8vYWJjLmNvbS9pbWcxLmdpZiIsImh0dHA6Ly9hYmMuY29tL2ltZzIuZ2lmIl0sCgkJIkBwZW9wbGVMaXN0IjogWyJ6aGFuZ3NhbiIsImxpc2kiXQoJfSwKCSJkeW5hbWljIjoKCXsgICAgICAKCQkicXVlc3Rpb25JZCI6ICI0MzYwMDA2MiIsCgkJInByb2plY3ROYW1lIjogIueOm+awj+iIhuaDhemhueebriIsCgkJImNpdHlJZCI6IDAsCgkJInByb3ZpbmNlSWQiOiAwCgl9Cn0=" 1:100 2:mainContent=11;title=21
- Event data output
{"records":[{"rowPriority":"100","columnPriority":"mainContent=11;title=21","contentType":"application/json","contentEncoding":"gzip","content":"ewoJImNvbW1vbiI6Cgl7CgkJImlkIjogIjEyMzQ1NDEiLAoJCSJkb21haW4iOiAieHh4Lnh4eC54eHgiLAoJCSJ1cmwiOiAiaHR0cChzKTovL3h4eC54eHgueHh4L2FydGljbGUtNTI1NjUtMS5odG1sIiwgCgkJInRpdGxlIjogIuagh+mimCIsCgkJIm1haW5Db250ZW50IjogIuato+aWh+WGheWuuSIsCgkJImFjY291bnRJZCI6ICIiLAoJCSJjaGFubmVsVHlwZSI6ICIiLAoJCSJwcm92aWRlciI6ICIiLAoJCSJhcnRpY2xlVHlwZSI6ICIiLAoJCSJ0YWdzIjogIiIsCgkJInJlYWROdW0iOiAxLAoJCSJjb21tZW50TnVtIjogMSwKCQkicHJhaXNlTnVtIjogMSwKCQkib3Bwb3NlTnVtIjogMSwKCQkicmV0d2VldE51bSI6IDEsCgkJImZvbGxvd051bSI6IG51bGwsCgkJInBhcmVudElkIjogIiIsCgkJInBhcmVudFR5cGUiOiAiIiwKCQkicGFyZW50YWdlIjogIiIsCgkJImFuY2VzdG9ySWQiOiAiIiwKCQkiYW5jZXN0b3JUeXBlIjogIiIsCgkJImRlcHRoIjoyLAoJCSJzdWJzY3JpcHRpb25JZHMiOiBbXSwKCQkic2VlZHBhY2thZ2VJZHMiOiBbXSwKCQkicHVibGlzaFRpbWUiOiAiMjAxOC0xMS0yOVQwMzozNDowMFoiLAoJCSJsYXN0RG9jVXBkYXRlVGltZSI6ICIyMDE4LTEyLTAyVDAwOjUxOjU0KzA4OjAwIiwKCQkibGFzdE51bVVwZGF0ZVRpbWUiOiAiMjAxOC0xMi0wMlQwMDo1MTo1NCswODowMCIsCgkJImNyYXdsVGltZSI6ICIyMDE4LTEyLTAxVDE2OjUxOjQ1WiIgICAgICAgICAgICAgICAgCgl9LAoJInNwZWNpYWwiOgoJewoJCSJmbG9vciI6IDEsCgkJIlFBdGFncyI6WyJhYmMiLCJiY2QiXSwKCQkiYW5zd2VyTnVtIjogMTIzNCwKCQkiaW1hZ2VMaXN0IjogWyJodHRwOi8vYWJjLmNvbS9pbWcxLmdpZiIsImh0dHA6Ly9hYmMuY29tL2ltZzIuZ2lmIl0sCgkJIkBwZW9wbGVMaXN0IjogWyJ6aGFuZ3NhbiIsImxpc2kiXQoJfSwKCSJkeW5hbWljIjoKCXsgICAgICAKCQkicXVlc3Rpb25JZCI6ICI0MzYwMDA2MiIsCgkJInByb2plY3ROYW1lIjogIueOm+awj+iIhuaDhemhueebriIsCgkJImNpdHlJZCI6IDAsCgkJInByb3ZpbmNlSWQiOiAwCgl9Cn0="},
{"rowPriority":"100","columnPriority":"mainContent=11;title=21","contentType":"application/json","contentEncoding":"gzip","content":"ewoJImNvbW1vbiI6Cgl7CgkJImlkIjogIjEyMzQ1NDEiLAoJCSJkb21haW4iOiAieHh4Lnh4eC54eHgiLAoJCSJ1cmwiOiAiaHR0cChzKTovL3h4eC54eHgueHh4L2FydGljbGUtNTI1NjUtMS5odG1sIiwgCgkJInRpdGxlIjogIuagh+mimCIsCgkJIm1haW5Db250ZW50IjogIuato+aWh+WGheWuuSIsCgkJImFjY291bnRJZCI6ICIiLAoJCSJjaGFubmVsVHlwZSI6ICIiLAoJCSJwcm92aWRlciI6ICIiLAoJCSJhcnRpY2xlVHlwZSI6ICIiLAoJCSJ0YWdzIjogIiIsCgkJInJlYWROdW0iOiAxLAoJCSJjb21tZW50TnVtIjogMSwKCQkicHJhaXNlTnVtIjogMSwKCQkib3Bwb3NlTnVtIjogMSwKCQkicmV0d2VldE51bSI6IDEsCgkJImZvbGxvd051bSI6IG51bGwsCgkJInBhcmVudElkIjogIiIsCgkJInBhcmVudFR5cGUiOiAiIiwKCQkicGFyZW50YWdlIjogIiIsCgkJImFuY2VzdG9ySWQiOiAiIiwKCQkiYW5jZXN0b3JUeXBlIjogIiIsCgkJImRlcHRoIjoyLAoJCSJzdWJzY3JpcHRpb25JZHMiOiBbXSwKCQkic2VlZHBhY2thZ2VJZHMiOiBbXSwKCQkicHVibGlzaFRpbWUiOiAiMjAxOC0xMS0yOVQwMzozNDowMFoiLAoJCSJsYXN0RG9jVXBkYXRlVGltZSI6ICIyMDE4LTEyLTAyVDAwOjUxOjU0KzA4OjAwIiwKCQkibGFzdE51bVVwZGF0ZVRpbWUiOiAiMjAxOC0xMi0wMlQwMDo1MTo1NCswODowMCIsCgkJImNyYXdsVGltZSI6ICIyMDE4LTEyLTAxVDE2OjUxOjQ1WiIgICAgICAgICAgICAgICAgCgl9LAoJInNwZWNpYWwiOgoJewoJCSJmbG9vciI6IDEsCgkJIlFBdGFncyI6WyJhYmMiLCJiY2QiXSwKCQkiYW5zd2VyTnVtIjogMTIzNCwKCQkiaW1hZ2VMaXN0IjogWyJodHRwOi8vYWJjLmNvbS9pbWcxLmdpZiIsImh0dHA6Ly9hYmMuY29tL2ltZzIuZ2lmIl0sCgkJIkBwZW9wbGVMaXN0IjogWyJ6aGFuZ3NhbiIsImxpc2kiXQoJfSwKCSJkeW5hbWljIjoKCXsgICAgICAKCQkicXVlc3Rpb25JZCI6ICI0MzYwMDA2MiIsCgkJInByb2plY3ROYW1lIjogIueOm+awj+iIhuaDhemhueebriIsCgkJImNpdHlJZCI6IDAsCgkJInByb3ZpbmNlSWQiOiAwCgl9Cn0="}]}
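A downstream consumer has to unpack this event data again. The sketch below shows one way that could look in Python, assuming the {"records": [...]} envelope shown in the sample output and a content field that holds base64-encoded JSON text. Whether the decoded bytes are additionally gzip-compressed depends on the producer, so the code only decompresses when it sees the gzip magic bytes. The function names and the tiny payload are made up for illustration.

```python
import base64
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def decode_content(content_b64):
    """Decode one 'content' field: base64, optional gzip, then JSON."""
    raw = base64.b64decode(content_b64)
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def decode_event(event_json):
    """Return the decoded document of every record in one event data string."""
    event = json.loads(event_json)
    return [decode_content(record["content"]) for record in event["records"]]


if __name__ == "__main__":
    # Hypothetical payload, much smaller than the sample in this article.
    document = {"common": {"id": "1234541"}}
    content = base64.b64encode(json.dumps(document).encode("utf-8")).decode("ascii")
    event_json = json.dumps({"records": [{"rowPriority": "100", "content": content}]})
    print(decode_event(event_json))
```

Checking the magic bytes instead of trusting the contentEncoding field keeps the consumer working for both compressed and uncompressed producers.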
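The test above covers the normal case. The following abnormal cases were also considered:
- If the requested URL does not exist, the request returns 404.
- If the URL exists but the method is not POST, the request returns 403.
- If the body of the POST request exceeds 1m, the request returns 413.
- If the data reaches Nginx normally but is larger than 250k, Fluentd writes the record under the toolarge path instead of sending it to Event Hub.
- If the data reaches Nginx normally but Fluentd has no Event Hub output configured for it, the record is written under the unmatched path.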
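The last two cases are decided by the rules of the rewrite_tag_filter match, which are evaluated in order. The following Python sketch models that routing so the 250k boundary can be checked by hand. It is an approximation of the plugin's behaviour under the configuration shown earlier, and the function name route is hypothetical.

```python
import re

# First rule: sizes 0..249999 match; "invert true" retags everything else.
SIZE_OK = re.compile(r"^(\d{1,5}|1\d{1,5}|2[0-4]\d{1,4})$")

# Second rule: the only path that has an Event Hub output configured.
QA_PATH = re.compile(r"^/v[1-9]*/ingest/qa/document$")

# Third rule: catch-all for any non-empty path.
ANY_PATH = re.compile(r".+")


def route(record):
    """Return the tag the first matching rule would assign, or None."""
    if not SIZE_OK.search(record["size"]):
        return "toolarge.document"
    if QA_PATH.search(record["path"]):
        return "qa.document"
    if ANY_PATH.search(record["path"]):
        return "unmatched.document"
    return None


if __name__ == "__main__":
    for size in ("1785", "249999", "250000"):
        print(size, route({"size": size, "path": "/v1/ingest/qa/document"}))
    print(route({"size": "1785", "path": "/v1/ingest/weibo/article"}))
```

Because the size rule comes first, an oversized request to a path without an Event Hub output is tagged toolarge.document rather than unmatched.document.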
Setting up Fluentd with Docker
docker run -d -p 8888:8888 --name fluentd -v D:\docker\docker-test\fluentd/fluentd-conf:/fluentd/etc -v D:\docker\docker-test\fluentd:/fluentd/log fluentd -c /fluentd/etc/td-agent.conf -v
docker run -d -p 8888:8888 -v D:\docker\docker-test\fluentd/fluentd-conf:/fluentd/etc -v D:\docker\docker-test\fluentd\data:/data --name fluentd linclaus/fluentd-eventhub:v1.7-1 -c /fluentd/etc/td-agent.conf -v
Dockerfile
FROM fluent/fluentd:v1.7-1
# Use root account to use apk
USER root
# The RUN below installs plugins as examples; azureeventhubs is not required
# You may customize the list of plugins as you wish
RUN apk add --no-cache --update --virtual .build-deps \
sudo build-base ruby-dev \
&& sudo gem install fluent-plugin-azureeventhubs \
&& sudo gem install fluent-plugin-rewrite-tag-filter \
&& sudo gem sources --clear-all \
&& apk del .build-deps \
&& rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem
COPY fluent-plugin-azureeventhubs-0.0.6 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-azureeventhubs-0.0.6
# COPY td-agent.conf /fluentd/etc/
# COPY entrypoint.sh /bin/
USER fluent