Author email: [email protected] Location: Huizhou, Guangdong
▲ Chapter objectives
⚪ Understand the concept of an Interceptor and its configuration parameters;
⚪ Learn how to use Interceptors;
⚪ Master the Host Interceptor;
⚪ Master the Static Interceptor;
⚪ Master the UUID Interceptor;
⚪ Master the Search And Replace Interceptor;
⚪ Master the Regex Filtering Interceptor;
⚪ Master the Custom Interceptor;
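I. Timestamp Interceptor
1. Overview
1. The Timestamp Interceptor adds a timestamp field to the event headers, recording the time at which the data was collected (in milliseconds since the epoch).
2. Combined with the HDFS Sink, the Timestamp Interceptor makes it possible to store data in per-day directories, because the sink resolves date escapes such as %Y-%m-%d in its path from this header.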
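To make the link between the timestamp header and the per-day directory concrete, here is a minimal Python sketch of the idea. It is not Flume's actual implementation (Flume does this internally in Java); the function name resolve_hdfs_path is hypothetical, and the sketch assumes UTC so that the result is deterministic, whereas Flume uses the local time zone unless configured otherwise.

```python
from datetime import datetime, timezone


def resolve_hdfs_path(path_pattern: str, headers: dict) -> str:
    """Expand date escapes in a path using the event's timestamp header.

    Assumption: the header holds epoch milliseconds as a string, which is
    what the Timestamp Interceptor writes. UTC is used here for clarity.
    """
    millis = int(headers["timestamp"])
    event_time = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    # strftime understands the same %Y, %m, %d escapes used in the path.
    return event_time.strftime(path_pattern)


# Hypothetical event headers, as the interceptor might produce them.
headers = {"timestamp": "1700000000000"}
print(resolve_hdfs_path("hdfs://hadoop01:9000/flumedata/date=%Y-%m-%d", headers))
```

Two events whose timestamps fall on different days resolve to different directories, which is what gives the per-day layout.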
2. Configuration properties
Property | Description
type | The component type; must be timestamp
3. Example
1. Write a configuration file with the following content:
a1.sources = s1
a1.channels = c1
a1.sinks = k1
a1.sources.s1.type = netcat
a1.sources.s1.bind = 0.0.0.0
a1.sources.s1.port = 8090
# Give the interceptor a name
a1.sources.s1.interceptors = i1
# Use the Timestamp Interceptor
a1.sources.s1.interceptors.i1.type = timestamp
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
2. Start Flume:
../bin/flume-ng agent -n a1 -c ../conf -f in.conf -Dflume.root.logger=INFO,console
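Once the agent is running, one way to check the interceptor is to send a line of text to the netcat source. The sketch below is a small Python client for that purpose. The helper name send_line is hypothetical, and it assumes the agent listens on port 8090 of the local machine, as configured above, and that the source acknowledges each line (the netcat source's default behaviour).

```python
import socket


def send_line(host: str, port: int, text: str, timeout: float = 5.0) -> str:
    """Send one newline-terminated line and return the server's reply.

    Assumption: the peer answers each line with a short acknowledgement,
    as Flume's netcat source does by default.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((text + "\n").encode("utf-8"))
        return conn.recv(1024).decode("utf-8").strip()


# Example call (requires the agent above to be running):
#   send_line("localhost", 8090, "hello flume")
```

After the call, the logger sink's console output should show the event with a timestamp entry in its headers alongside the body text.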
4. Storing data by day
1. Write a configuration file with the following content:
a1.sources = s1
a1.channels = c1
a1.sinks = k1
a1.sources.s1.type = netcat
a1.sources.s1.bind = hadoop01
a1.sources.s1.port = 8090
a1.sources.s1.interceptors = i1
a1.sources.s1.interceptors.i1.type = timestamp
a1.channels.c1.type = memory
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop01:9000/flumedata/date=%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 3600
a1.so