Big Data Course E7: Flume Interceptors

Author email: [email protected]              Location: Huizhou, Guangdong

 ▲ Chapter objectives

⚪ Understand the concept of an Interceptor and its configuration parameters;

⚪ Learn how to use an Interceptor;

⚪ Learn the Host Interceptor;

⚪ Learn the Static Interceptor;

⚪ Learn the UUID Interceptor;

⚪ Learn the Search And Replace Interceptor;

⚪ Learn the Regex Filtering Interceptor;

⚪ Learn the Custom Interceptor;

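I. Timestamp Interceptor

1. Overview

1. The Timestamp Interceptor adds a timestamp field to the event headers to record the time at which the data was collected.

2. Combined with the HDFS Sink, the Timestamp Interceptor makes it possible to store data by day.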
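To make the header behaviour concrete, here is a minimal Python sketch of what the interceptor does to each event. It is only a model, not Flume's implementation: the event is represented as a plain dict, and the function name `timestamp_interceptor` and the `now_ms` parameter are hypothetical names introduced for illustration. The `preserve_existing` flag mirrors the interceptor's optional `preserveExisting` property.

```python
import time


def timestamp_interceptor(event, preserve_existing=False, now_ms=None):
    """Add a 'timestamp' header (epoch milliseconds, as a string) to an event.

    `event` is a dict with 'headers' and 'body' keys; this is a stand-in for
    a Flume event, not the real Flume API.
    """
    headers = event.setdefault("headers", {})
    # Keep a timestamp set upstream if the caller asked for that.
    if preserve_existing and "timestamp" in headers:
        return event
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    headers["timestamp"] = str(now_ms)
    return event


event = {"headers": {}, "body": b"hello"}
timestamp_interceptor(event, now_ms=1690848000000)
print(event["headers"])  # {'timestamp': '1690848000000'}
```

The value is stored as a string because event headers are string-to-string maps; downstream components such as the HDFS Sink parse it back into a number.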

2. Configuration properties

Property    Description

type        Must be timestamp

3. Example

1. Write the configuration file and add the following content:

a1.sources = s1

a1.channels = c1

a1.sinks = k1

a1.sources.s1.type = netcat

a1.sources.s1.bind = 0.0.0.0

a1.sources.s1.port = 8090

# Name the interceptor

a1.sources.s1.interceptors = i1

# Use the Timestamp Interceptor

a1.sources.s1.interceptors.i1.type = timestamp

a1.channels.c1.type = memory

a1.sinks.k1.type = logger

a1.sources.s1.channels = c1

a1.sinks.k1.channel = c1

2. Start Flume:

../bin/flume-ng agent -n a1 -c ../conf -f in.conf -Dflume.root.logger=INFO,console
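Once the agent is running, a test event can be sent to the netcat source on port 8090. The sketch below is one way to do that from Python instead of a telnet or nc client; the helper name `send_line` is hypothetical, and it assumes the source keeps its default behaviour of acknowledging each line with OK.

```python
import socket


def send_line(host, port, text, timeout=5.0):
    """Send one newline-terminated line to a TCP listener and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # The netcat source treats each newline-terminated line as one event.
        conn.sendall(text.encode("utf-8") + b"\n")
        return conn.recv(1024).decode("utf-8").strip()


# Example call, assuming the agent from the configuration above is running locally:
# reply = send_line("localhost", 8090, "hello flume")
```

With the logger sink configured above, the event printed on the agent's console should then carry a timestamp entry in its headers.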

4. Storing data by day

1. Write the configuration file and add the following content:

a1.sources = s1

a1.channels = c1

a1.sinks = k1

a1.sources.s1.type = netcat

a1.sources.s1.bind = hadoop01

a1.sources.s1.port = 8090

a1.sources.s1.interceptors = i1

a1.sources.s1.interceptors.i1.type = timestamp

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs

a1.sinks.k1.hdfs.path = hdfs://hadoop01:9000/flumedata/date=%Y-%m-%d

a1.sinks.k1.hdfs.fileType = DataStream

a1.sinks.k1.hdfs.rollInterval = 3600

a1.so
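The date=%Y-%m-%d part of hdfs.path above is filled in from each event's timestamp header, which is why the Timestamp Interceptor is needed for daily directories. The following Python sketch imitates that substitution for the three escapes used here. The function name `resolve_hdfs_path` is hypothetical, and UTC is used only to keep the example deterministic; the HDFS Sink itself uses local time unless hdfs.timeZone is set.

```python
from datetime import datetime, timezone


def resolve_hdfs_path(template, headers, tz=timezone.utc):
    """Expand %Y, %m and %d in `template` from the 'timestamp' header (epoch ms)."""
    if "timestamp" not in headers:
        # Without the interceptor there is no timestamp header to expand from.
        raise KeyError("timestamp header is missing")
    moment = datetime.fromtimestamp(int(headers["timestamp"]) / 1000, tz=tz)
    return (
        template.replace("%Y", f"{moment.year:04d}")
        .replace("%m", f"{moment.month:02d}")
        .replace("%d", f"{moment.day:02d}")
    )


headers = {"timestamp": "1690848000000"}  # 2023-08-01 00:00:00 UTC
print(resolve_hdfs_path("hdfs://hadoop01:9000/flumedata/date=%Y-%m-%d", headers))
# hdfs://hadoop01:9000/flumedata/date=2023-08-01
```

Events collected on different days therefore resolve to different directories, which gives one directory per day.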


Reposted from blog.csdn.net/u013955758/article/details/131987317