ELK - Logstash - Transforming Data

Performing Core Operations

-----------------------------------------------------------

date filter

Parses a date from a field to use as the Logstash timestamp for the event.

filter {
  date {
    match => [ "logdate", "MMM dd yyyy HH:mm:ss" ]
  }
}

-----------------------------------------------------------

drop filter

Drops events. Typically used in combination with a conditional.

filter {
  if [loglevel] == "debug" {
    drop { }
  }
}

-----------------------------------------------------------

fingerprint filter

Fingerprints fields by applying a consistent hash.

filter {
  fingerprint {
    source => ["IP", "@timestamp", "message"]
    method => "SHA1"
    key => "0123"
    target => "[@metadata][generated_id]"
  }
}

-----------------------------------------------------------

mutate filter

Performs general mutations on fields. You can rename, remove, replace, and modify fields in your events.

filter {
  mutate {
    # rename the HOSTORIP field to client_ip
    rename => { "HOSTORIP" => "client_ip" }
  }
}
filter {
  mutate {
    # strip leading and trailing whitespace from field1 and field2
    strip => ["field1", "field2"]
  }
}

-----------------------------------------------------------

ruby filter

Executes Ruby code.

filter {
  ruby {
    # randomly cancel (drop) ~90% of events
    code => "event.cancel if rand <= 0.90"
  }
}

----------------------------------------------------------------------------------------------------------------------

Deserializing Data

Deserializes data into a Logstash event. The following plugins handle common serialization formats (a csv filter sketch follows the list):

avro codec

csv filter

fluent codec

json codec

protobuf codec

xml filter
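
As an illustration, a minimal csv filter sketch that parses a comma-separated message into named fields (the column names here are assumptions for illustration, not from the logs above):

filter {
  csv {
    # split the message on commas and assign each value to a named field
    separator => ","
    columns => ["id", "hostname", "ip"]
  }
}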

----------------------------------------------------------------------------------------------------------------------

Extracting Fields and Wrangling Data

Extracts fields, parsing unstructured data into structured fields.

-----------------------------------------------------------

dissect filter

Extracts unstructured event data into fields using delimiters. The dissect filter does not use regular expressions and is very fast. However, if the structure of the data varies from line to line, the grok filter is more suitable.

Example:

Log format:

2018-12-26 02:00:38,220 [Test-ELK] INFO id=54, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=68, c2=a, c3=f, c4=8jaufH

Logstash command:

./logstash -e 'input { beats { port => 5044 } } filter { dissect { mapping => { "message" => "%{ts} %{+ts},%{+ts} [%{logname}] %{loglevel} id=%{id}, myname=%{hostname}, myaddr=%{ip}, c1=%{c1}, c2=%{c2}, c3=%{c3}, c4=%{c4}" } } } output { stdout { } }'

Result:

{
    "prospector" => {
        "type" => "log"
    },
            "c4" => "8jaufH",
            "ip" => "192.168.4.17",
            "ts" => "2018-12-26 02:00:38,220",
            "c1" => "68",
      "@version" => "1",
    "@timestamp" => 2018-12-26T07:00:40.425Z,
        "source" => "/root/test_elk/test_elk.log",
        "offset" => 305858,
            "id" => "54",
            "c2" => "a",
       "logname" => "Test-ELK",
          "host" => {
                 "name" => "dbwtest03bc.daodao.com",
                   "id" => "6787d9310dd84654ab8871f64df6f6d7",
         "architecture" => "x86_64",
                   "os" => {
            "platform" => "centos",
            "codename" => "Core",
              "family" => "redhat",
             "version" => "7 (Core)"
        },
        "containerized" => true
    },
          "beat" => {
            "name" => "dbwtest03bc.daodao.com",
         "version" => "6.5.4",
        "hostname" => "dbwtest03bc.daodao.com"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
       "message" => "2018-12-26 02:00:38,220 [Test-ELK] INFO id=54, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=68, c2=a, c3=f, c4=8jaufH",
      "hostname" => "dbwtest03bc.daodao.com",
            "c3" => "f",
      "loglevel" => "INFO",
         "input" => {
        "type" => "log"
    }
}

-----------------------------------------------------------

kv filter

Parses key-value pairs.

Example:

Log:

2018-12-26 02:08:08,946 [Test-ELK] INFO id=99, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=97, c2=c, c3=t, c4=mJPsIQ

Logstash command:

./logstash -e 'input { beats { port => 5044 } } filter { kv { } } output { stdout { } }'

Output:

{
          "beat" => {
         "version" => "6.5.4",
        "hostname" => "dbwtest03bc.daodao.com",
            "name" => "dbwtest03bc.daodao.com"
    },
            "c1" => "97,",
       "message" => "2018-12-26 02:08:08,946 [Test-ELK] INFO id=99, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=97, c2=c, c3=t, c4=mJPsIQ",
    "@timestamp" => 2018-12-26T07:08:11.513Z,
         "input" => {
        "type" => "log"
    },
            "id" => "99,",
        "myaddr" => "192.168.4.17,",
          "host" => {
         "architecture" => "x86_64",
                   "os" => {
             "version" => "7 (Core)",
              "family" => "redhat",
            "codename" => "Core",
            "platform" => "centos"
        },
        "containerized" => true,
                   "id" => "6787d9310dd84654ab8871f64df6f6d7",
                 "name" => "dbwtest03bc.daodao.com"
    },
            "c4" => "mJPsIQ",
        "offset" => 311614,
            "c2" => "c,",
    "prospector" => {
        "type" => "log"
    },
        "source" => "/root/test_elk/test_elk.log",
        "myname" => "dbwtest03bc.daodao.com,",
      "@version" => "1",
            "c3" => "t,",
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ]
}
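
Note that the parsed values keep trailing commas (e.g. "c1" => "97,"), because kv splits fields on whitespace by default and does not strip punctuation. A minimal sketch that strips them, assuming the filter's trim_value option:

filter {
  kv {
    # trim leading/trailing commas from every parsed value
    trim_value => ","
  }
}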

-----------------------------------------------------------

grok filter

Parses unstructured event data into fields. This tool is perfect for syslog logs, Apache and other web server logs, MySQL logs, and in general, any log format written for humans rather than for computer consumption. Grok works by combining text patterns into something that matches your logs.

Log:

2018-12-27 05:17:17,829 [Test-ELK] INFO id=19, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=10, c2=c, c3=f, c4=V6DSLg

Command:

./logstash -e 'input { beats { port => 5044 } } filter { grok { match => { "message" => "%{DATESTAMP:time}%{SPACE}\[%{DATA:title}\]%{SPACE}%{LOGLEVEL:level}%{SPACE}%{GREEDYDATA:log_message}" } } } output { stdout { } }'

Output:

{
           "time" => "18-12-27 05:20:18,117",
           "host" => {
         "architecture" => "x86_64",
                 "name" => "dbwtest03bc.daodao.com",
                   "id" => "6787d9310dd84654ab8871f64df6f6d7",
        "containerized" => true,
                   "os" => {
            "platform" => "centos",
            "codename" => "Core",
             "version" => "7 (Core)",
              "family" => "redhat"
        }
    },
     "prospector" => {
        "type" => "log"
    },
          "input" => {
        "type" => "log"
    },
          "title" => "Test-ELK",
        "message" => "2018-12-27 05:20:18,117 [Test-ELK] INFO id=37, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=37, c2=b, c3=f, c4=SrIzYL",
         "offset" => 351361,
     "@timestamp" => 2018-12-27T10:20:20.233Z,
           "beat" => {
         "version" => "6.5.4",
            "name" => "dbwtest03bc.daodao.com",
        "hostname" => "dbwtest03bc.daodao.com"
    },
           "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
          "level" => "INFO",
       "@version" => "1",
         "source" => "/root/test_elk/test_elk.log",
    "log_message" => "id=37, myname=dbwtest03bc.daodao.com, myaddr=192.168.4.17, c1=37, c2=b, c3=f, c4=SrIzYL"
}

----------------------------------------------------------------------------------------------------------------------

Enriching Data with Lookups

dns filter

Performs a standard or reverse DNS lookup.
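
A minimal sketch, assuming a source_host field that holds an IP address to reverse-resolve (the field name is illustrative):

filter {
  dns {
    # replace the IP in source_host with the hostname it resolves to
    reverse => ["source_host"]
    action => "replace"
  }
}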

elasticsearch filter

Copies fields from previous log events in Elasticsearch to current events.

The configuration below shows a complete example of how this filter is used. Whenever Logstash receives an "end" event, it uses this elasticsearch filter to find the matching "start" event based on some operation identifier. Then it copies the @timestamp field from the "start" event into a new field on the "end" event. Finally, using a combination of the date filter and the ruby filter, the code in the example calculates the time duration in hours between the two events.
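
A sketch of that configuration, modeled on the example in the Elastic documentation (the es-server host, the opid field, and the type values are placeholders):

filter {
  if [type] == "end" {
    # look up the matching "start" event by operation id
    elasticsearch {
      hosts => ["es-server"]
      query => "type:start AND operation:%{[opid]}"
      fields => { "@timestamp" => "started" }
    }
    # parse the copied timestamp into a real date field
    date {
      match => ["[started]", "ISO8601"]
      target => "[started]"
    }
    # compute the duration in hours between the two events
    ruby {
      code => 'event.set("duration_hrs", (event.get("@timestamp") - event.get("started")) / 3600) rescue nil'
    }
  }
}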

geoip filter

Adds geographical information about the location of IP addresses.

After the geoip filter is applied, the event is enriched with geoip fields.
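
A minimal sketch, assuming the client address is in a field named ip (as produced by the dissect example above):

filter {
  geoip {
    # look up the address in the GeoIP database and add geoip.* subfields
    source => "ip"
  }
}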

jdbc_static filter

Enriches events with data preloaded from a remote database.

It fetches data from a remote database, caches it in a local database, and uses lookups to enrich events with the locally cached data; see the sketch below.
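
A sketch of the three parts (loaders, local_db_objects, local_lookups), modeled on the Elastic documentation's example; the table, column, and connection details are placeholders:

filter {
  jdbc_static {
    # fetch the reference data from the remote database
    loaders => [
      {
        id => "remote-servers"
        query => "select ip, descr from ref.local_ips order by ip"
        local_table => "servers"
      }
    ]
    # define the local cache table that holds the fetched data
    local_db_objects => [
      {
        name => "servers"
        index_columns => ["ip"]
        columns => [
          ["ip", "varchar(15)"],
          ["descr", "varchar(255)"]
        ]
      }
    ]
    # enrich each event from the local cache
    local_lookups => [
      {
        query => "select descr as description from servers WHERE ip = :ip"
        parameters => { ip => "[from_ip]" }
        target => "server"
      }
    ]
    jdbc_user => "logstash"
    jdbc_password => "example"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_driver_library => "/tmp/postgresql-42.1.4.jar"
    jdbc_connection_string => "jdbc:postgresql://remotedb:5432/ls_test"
  }
}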

jdbc_streaming filter

Enriches events with database data, queried per event.
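
A sketch modeled on the Elastic documentation's example; the driver path, credentials, and the country_code field are placeholders:

filter {
  jdbc_streaming {
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydatabase"
    jdbc_user => "me"
    jdbc_password => "secret"
    # run this query for each event, binding :code from the event's country_code field
    statement => "select * from WORLD.COUNTRY WHERE Code = :code"
    parameters => { "code" => "country_code" }
    # store the result rows under this field
    target => "country_details"
  }
}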

translate filter

Replaces field contents based on replacement values specified in a hash or file. Currently supported file types are YAML, JSON, and CSV.
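
A minimal sketch mapping HTTP status codes to text (the field names and dictionary entries are illustrative):

filter {
  translate {
    field => "response_code"
    destination => "http_response"
    # replacement values looked up by the value of response_code
    dictionary => {
      "200" => "OK"
      "404" => "Not Found"
      "500" => "Server Error"
    }
  }
}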

useragent filter

Parses user agent strings into fields.
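
A minimal sketch, assuming the raw user agent string is in a field named agent (a common field name in web access log patterns):

filter {
  useragent {
    # parse the raw UA string into structured subfields (name, os, version, ...)
    source => "agent"
    target => "user_agent"
  }
}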

Reposted from blog.csdn.net/chuckchen1222/article/details/85294700