ELK Log Analysis Platform (2): Logstash Data Collection

1. Introduction to logstash

  • Logstash is an open-source server-side data processing pipeline.
    Logstash has more than 200 plugins that can collect data from multiple sources at the same time, transform it, and then send it to your favorite "stash" (most often Elasticsearch).
    A Logstash pipeline has two required elements, input and output, and one optional element, filter.


  • Input: collect data of all shapes, sizes, and sources

    • Logstash supports a variety of inputs and can capture events from many common sources simultaneously.
    • It can continuously ingest data from your logs, metrics, web applications, data stores, and various AWS services in a streaming fashion.
  • Filter: parse and transform data in real time

    • As data travels from source to stash, Logstash filters parse each event, identify named fields to build structure, and transform the events into a common format for easier, faster analysis and business value.
      • Use grok to derive structure from unstructured data
      • Derive geographic coordinates from IP addresses
      • Anonymize PII and exclude sensitive fields entirely
      • Simplify overall processing, independent of data source, format, or schema
  • Output: choose your stash and ship your data

    • Although Elasticsearch is our preferred output destination and opens up endless possibilities for search and analysis, it is not the only choice.
    • Logstash offers many outputs, so you can route data wherever you want and flexibly unlock a wide range of downstream use cases.
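Putting the three stages together, a minimal pipeline skeleton looks like the sketch below (the plugin choices here are just illustrative examples; the same input and output plugins appear in the exercises later in this article):

```conf
input {
  stdin { }                                    # stage 1: where events come from
}

filter {
  mutate { add_field => { "note" => "demo" } } # stage 2 (optional): transform events
}

output {
  stdout { }                                   # stage 3: where events go
}
```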

2. Logstash installation and configuration (on a new virtual machine)

2.1 Software download

Official website

[root@server16 ~]# ll  ## downloaded from the official website
total 334812
-rw-r--r-- 1 root root 170023183 Sep  5  2018 jdk-8u181-linux-x64.rpm
-rw-r--r-- 1 root root 172821011 Mar 12  2020 logstash-7.6.1.rpm


2.2 Software installation

[root@server16 ~]# rpm -ivh jdk-8u181-linux-x64.rpm
[root@server16 ~]# rpm -ivh logstash-7.6.1.rpm


2.3 Configure environment variables

[root@server16 opt]# cd /usr/share/logstash/
[root@server16 logstash]# ls
bin           Gemfile       LICENSE.txt               modules     vendor
CONTRIBUTORS  Gemfile.lock  logstash-core             NOTICE.TXT  x-pack
data          lib           logstash-core-plugin-api  tools
[root@server16 logstash]# cd bin/
[root@server16 bin]# ls
benchmark.sh         logstash               logstash.lib.sh      pqrepair
cpdump               logstash.bat           logstash-plugin      ruby
dependencies-report  logstash-keystore      logstash-plugin.bat  setup.bat
ingest-convert.sh    logstash-keystore.bat  pqcheck              system-install
[root@server16 bin]# pwd
/usr/share/logstash/bin
[root@server16 bin]# cd

[root@server16 ~]# vim .bash_profile
[root@server16 ~]# cat .bash_profile | grep bin
PATH=$PATH:$HOME/bin:/usr/share/logstash/bin
[root@server16 ~]# source .bash_profile
[root@server16 ~]# which logstash
/usr/share/logstash/bin/logstash


2.4 Standard input to standard output (command)

[root@server16 ~]# logstash -e 'input { stdin { }} output { stdout {} }'  ## standard input to standard output: type on the keyboard, see events on the screen; press Ctrl+C to exit
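After typing `hello`, the stdout output looks roughly like this (an illustrative sketch; exact field order and timestamp vary by version):

```
{
       "message" => "hello",
      "@version" => "1",
    "@timestamp" => 2021-03-10T08:30:00.000Z,
          "host" => "server16"
}
```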



3. Logstash plugins

3.1 Standard input and standard output (file)

[root@server16 ~]# cd /etc/logstash/
[root@server16 logstash]# ls
conf.d       log4j2.properties     logstash.yml   startup.options
jvm.options  logstash-sample.conf  pipelines.yml
[root@server16 logstash]# cd conf.d/   ## any .conf file in this directory can be run
[root@server16 conf.d]# ls
[root@server16 conf.d]# vim test.conf
[root@server16 conf.d]# cat test.conf 
input {
  stdin { }
}

output {
  stdout { }
}
[root@server16 conf.d]# logstash -f /etc/logstash/conf.d/test.conf   ## run the config file


3.2 Standard input to file

[root@server16 conf.d]# vim test.conf 
[root@server16 conf.d]# cat test.conf   ## write output to a file
input {
  stdin { }
}

output {
  stdout { }
  file {
    path => "/tmp/testfile"
    codec => line { format => "custom format: %{message}" }
  }
}
[root@server16 conf.d]# logstash -f /etc/logstash/conf.d/test.conf   ## run
[root@server16 conf.d]# cat /tmp/testfile    ## view the file contents
custom format: hello


3.3 View the content entered into the file

3.3.1 View content

## make sure the file is readable, e.g. chmod 644 /var/log/messages
[root@server16 conf.d]# vim test.conf 
[root@server16 conf.d]# cat test.conf 
input {
  file {
    path => "/tmp/testfile"
    start_position => "beginning"    ## read from the beginning; "end" starts at the end of the file and is the default
  }
}

output {
  stdout { }
  #file {
  #  path => "/tmp/testfile"
  #  codec => line { format => "custom format: %{message}" }
  #}
}
[root@server16 conf.d]# logstash -f /etc/logstash/conf.d/test.conf 





3.3.2 Supplement

- Explanation of the sincedb file
  # cat .sincedb_*
	A sincedb record has 6 fields:
		1. inode number
		2. major device number of the file system
		3. minor device number of the file system
		4. current byte offset within the file
		5. last active timestamp (a floating-point number)
		6. last known path matched by this record
## This is how Logstash tells devices, file names, and file versions apart: it saves its read progress in the sincedb file
[root@server16 conf.d]# cd /usr/share/logstash/
[root@server16 logstash]# ls
bin           Gemfile       LICENSE.txt               modules     vendor
CONTRIBUTORS  Gemfile.lock  logstash-core             NOTICE.TXT  x-pack
data          lib           logstash-core-plugin-api  tools
[root@server16 logstash]# cd data/plugins/inputs/file/
[root@server16 file]# ls
[root@server16 file]# l.
.  ..  .sincedb_7fccf970aa3e421aa73e89ca5c260abf
[root@server16 file]# cat .sincedb_*
17747495 0 64768 21 1615281055.0049129 /tmp/testfile   ## the file that has been read
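The six fields can be pulled apart with a small shell sketch (assumption: plain POSIX shell; the record is the sample shown above):

```shell
# Split one sincedb record into its six documented fields.
record="17747495 0 64768 21 1615281055.0049129 /tmp/testfile"
set -- $record   # word-split the record (safe here: no spaces inside fields)
inode=$1; major=$2; minor=$3; offset=$4; last_active=$5; path=$6
echo "inode=$inode offset=$offset path=$path"
# -> inode=17747495 offset=21 path=/tmp/testfile
```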


3.4 Output file contents to the ES host

See the plugin documentation on the official website

[root@server2 ~]# systemctl start elasticsearch.service   ## first start the ES service on each host
[root@server2 ~]# cd elasticsearch-head-master/
[root@server2 elasticsearch-head-master]# cnpm run start &   ## start the web interface


[root@server6 ~]# cd /etc/logstash/
[root@server6 logstash]# cd conf.d/
[root@server6 conf.d]# ls
test.conf
[root@server6 conf.d]# vim test.conf 
[root@server6 conf.d]# cat test.conf 
input {
  file {
    path => "/var/log/messages"
    start_position => "beginning"
  }
}

output {
  stdout { }
  #file {
  #  path => "/tmp/testfile"
  #  codec => line { format => "custom format: %{message}" }
  #}
  elasticsearch {
    hosts => ["172.25.13.2:9200"]
    index => "syslog-%{+yyyy.MM.dd}"
  }
}
[root@server6 conf.d]# pwd
/etc/logstash/conf.d
[root@server6 conf.d]# logstash -f /etc/logstash/conf.d/test.conf   ## run, then view the collected logs in the web UI


3.5 Syslog input plugin

- Logstash can act as a log server and receive remote logs directly.
[root@server4 ~]# vim /etc/rsyslog.conf     
[root@server4 ~]# cat /etc/rsyslog.conf  | grep  @@172.25.13.6:514  ## forward system logs to host 172.25.13.6 (@@ means TCP, a single @ means UDP)
*.* @@172.25.13.6:514
[root@server5 ~]# vim /etc/rsyslog.conf
[root@server5 ~]# cat /etc/rsyslog.conf  | grep  @@172.25.13.6:514  ## forward system logs to host 172.25.13.6
*.* @@172.25.13.6:514
[root@server4 ~]# systemctl restart rsyslog.service    ## restart
[root@server5 ~]# systemctl restart rsyslog.service    ## restart


[root@server6 conf.d]# cat syslog.conf 
input {
  syslog { }    ## listens on TCP/UDP port 514 by default
}

output {
  stdout { }
  elasticsearch {
    hosts => ["172.25.13.2:9200"]
    index => "message-%{+yyyy.MM.dd}"
  }
}


[root@server6 conf.d]# logstash -f /etc/logstash/conf.d/syslog.conf  ## run
[root@server4 ~]# logger hello  ## generate log entries on server4 and server5, then check the results in the web UI



3.6 Multiline codec plugin

- The multiline codec merges multi-line log records into a single event.
[root@server4 elasticsearch]# pwd
/var/log/elasticsearch
[root@server4 elasticsearch]# scp my-es.log server6:/var/log/  ## copy an ES log file over

[root@server6 conf.d]# vim test.conf 
[root@server6 conf.d]# cat test.conf    
input {
  file {
    path => "/var/log/my-es.log"     ## the file to read; it must be readable
    start_position => "beginning"
    codec => multiline {
      pattern => "^\["
      negate => "true"
      what => "previous"
    }
  }
}

output {
  #stdout { }
  #file {
  #  path => "/tmp/testfile"
  #  codec => line { format => "custom format: %{message}" }
  #}
  elasticsearch {
    hosts => ["172.25.13.2:9200"]
    index => "message-%{+yyyy.MM.dd}"
  }
}
[root@server6 conf.d]# logstash -f /etc/logstash/conf.d/test.conf  ## run
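What this codec does can be imitated with a small awk sketch (an illustration of the rule only, not Logstash internals): with `pattern => "^\["`, `negate => "true"` and `what => "previous"`, any line that does NOT start with `[` is appended to the previous event. The sample log lines below are hypothetical.

```shell
# Merge continuation lines (those not starting with "[") into the previous event.
merged=$(printf '%s\n' \
  '[2021-03-10T10:00:00] caught exception' \
  'java.lang.Exception: boom' \
  '    at com.example.Main.run' \
  '[2021-03-10T10:00:01] next event' |
  awk '/^\[/ { if (buf != "") print buf; buf = $0; next }
             { buf = buf " " $0 }
       END   { if (buf != "") print buf }')
echo "$merged"   # two merged events, one per line
```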



3.7 Fixing the problem of no new output on repeated runs

[root@server6 conf.d]# cd /usr/share/logstash/data/plugins/inputs/file/
[root@server6 file]# ls
[root@server6 file]# ls -a 
.  ..  .sincedb_13f094911fdac7ab3fa6f4c93fee6639  .sincedb_452905a167cf4509fd08acb964fdb20c
[root@server6 file]# cat .sincedb_*
51333969 0 64768 139573 1615344160.0945382 /var/log/my-es.log
51020273 0 64768 178545 1615339859.532701 /var/log/messages

Logstash has already recorded the read offsets of these files, so re-running the same config produces no new output. Delete the corresponding .sincedb_* file (or set sincedb_path => "/dev/null" in the file input) to make Logstash read the file from the beginning again.


3.8 grok filter plugin

[root@server6 conf.d]# yum install httpd -y   ## install httpd to generate logs to filter
[root@server6 conf.d]# systemctl start httpd.service 
[root@server6 conf.d]# cd /var/www/html/
[root@server6 html]# echo server6 > index.html
[root@server6 conf.d]# vim /etc/httpd/conf/httpd.conf   ## check the log collection format


[root@westos ~]# ab -c1 -n 100  http://172.25.13.6/index.html  ## stress-test from the host machine to generate access data

[root@server6 conf.d]# pwd 
/etc/logstash/conf.d
[root@server6 conf.d]# cat apache.conf    ## write the config file
input {
  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }   ## parse the access log
  }
}

output {
  elasticsearch {
    hosts => ["172.25.13.2:9200"]
    index => "apache-%{+yyyy.MM.dd}"
  }
}
[root@server6 conf.d]# logstash -f /etc/logstash/conf.d/apache.conf  ## run; if it fails, check the error messages
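For reference, `%{HTTPD_COMBINEDLOG}` matches the Apache combined log format; from a line like the hypothetical one below it extracts named fields including `clientip`, `ident`, `auth`, `timestamp`, `verb`, `request`, `httpversion`, `response`, `bytes`, `referrer`, and `agent`:

```
# sample access_log line (hypothetical values):
172.25.13.250 - - [10/Mar/2021:10:00:00 +0800] "GET /index.html HTTP/1.0" 200 8 "-" "ApacheBench/2.3"

# a few of the resulting fields:
#   clientip => 172.25.13.250
#   verb     => GET
#   request  => /index.html
#   response => 200
```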



Origin blog.csdn.net/qwerty1372431588/article/details/114584688