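The previous post covered installing and using Flume and Ganglia. Today we look at writing a custom Flume source and a custom sink.
I. Custom source
1. Simulating a data source that emits data
1.1. Edit the pom and add the Flume dependency (${flume.version} must be defined in the pom's properties)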
<dependency>
<groupId>org.apache.flume</groupId>
<artifactId>flume-ng-core</artifactId>
<version>${flume.version}</version>
</dependency>
1.2. Writing the custom source
A custom source extends AbstractSource and implements two interfaces, Configurable and PollableSource.
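package com.lf.test;

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.apache.flume.Context;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.PollableSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.SimpleEvent;
import org.apache.flume.source.AbstractSource;

public class MySource extends AbstractSource implements Configurable, PollableSource {
    // Fields read from the agent configuration
    private long delay;
    private String field;

    @Override
    public Status process() throws EventDeliveryException {
        try {
            // Build five events and hand each one to the channel processor
            for (int i = 0; i < 5; i++) {
                SimpleEvent event = new SimpleEvent();
                Map<String, String> header = new HashMap<>();
                event.setHeaders(header);
                event.setBody((field + i).getBytes(StandardCharsets.UTF_8));
                getChannelProcessor().processEvent(event);
            }
            // Pause for the configured delay so the source does not flood the channel
            Thread.sleep(delay);
            return Status.READY;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the agent can shut down cleanly
            Thread.currentThread().interrupt();
            return Status.BACKOFF;
        } catch (Exception e) {
            return Status.BACKOFF;
        }
    }

    @Override
    public void configure(Context context) {
        // Both keys are optional; the second argument is the default value
        delay = context.getLong("delay", 2000L);
        field = context.getString("field", "test");
    }

    // Extra sleep (ms) added per consecutive BACKOFF; 0 means retry immediately
    @Override
    public long getBackOffSleepIncrement() {
        return 0;
    }

    // Upper bound (ms) for the back-off sleep; 0 disables the sleep
    @Override
    public long getMaxBackOffSleepInterval() {
        return 0;
    }
}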
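Both back-off methods in MySource return 0. As a rough sketch of what that implies, and assuming the runner that polls a PollableSource behaves as in Flume 1.x, the sleep after a BACKOFF status is about min(consecutive back-offs × increment, max interval), so with 0 and 0 the source is retried at once. The class and method below are hypothetical and model only that rule; they are not part of Flume's API.

```java
// Hypothetical helper, not part of Flume: models the sleep after a BACKOFF status.
class BackoffSketch {
    /** Sleep time in ms after the given number of consecutive BACKOFF results. */
    static long sleepMillis(long consecutiveBackoffs, long increment, long maxInterval) {
        return Math.min(consecutiveBackoffs * increment, maxInterval);
    }

    public static void main(String[] args) {
        // With the values MySource returns (0 and 0) there is never any sleep.
        System.out.println(sleepMillis(3, 0, 0));
        // With an assumed increment of 1000 ms capped at 5000 ms the sleep grows, then levels off.
        for (long n = 1; n <= 7; n++) {
            System.out.println(n + " -> " + sleepMillis(n, 1000, 5000));
        }
    }
}
```

Returning non-zero values from the two methods is therefore one way to keep a failing source from spinning.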
1.3. Testing
Package the program as a jar, put it in the flume/lib directory, then create a new conf file with the following configuration:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = com.lf.test.MySource
a1.sources.r1.delay = 5000
a1.sources.r1.field = xxxxx
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
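The delay and field lines in this configuration reach configure(Context) with the a1.sources.r1. prefix stripped, which is why the source looks them up as plain "delay" and "field". The following is a minimal stdlib-only sketch of that lookup. ContextSketch and its methods are hypothetical names, and no Flume classes are involved.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration, not Flume code: maps per-component keys to context keys.
class ContextSketch {
    /** Collects every property under the prefix, keyed by the part after the prefix. */
    static Map<String, String> subProperties(Properties props, String prefix) {
        Map<String, String> result = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                result.put(name.substring(prefix.length()), props.getProperty(name));
            }
        }
        return result;
    }

    /** Parses properties text, wrapping the checked IOException from load(). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "a1.sources.r1.type = com.lf.test.MySource\n"
            + "a1.sources.r1.delay = 5000\n"
            + "a1.sources.r1.field = xxxxx\n");
        Map<String, String> ctx = subProperties(props, "a1.sources.r1.");
        // Same idea as context.getLong("delay", 2000L): use the default when the key is absent.
        long delay = Long.parseLong(ctx.getOrDefault("delay", "2000"));
        String field = ctx.getOrDefault("field", "test");
        System.out.println(delay + " " + field);
    }
}
```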
Then run the start command:
flume-ng agent -c conf/ -n a1 -f jobs/flume-mysource.conf -Dflume.root.logger=INFO,console
Result:
The generated data is printed to the console.
II. Custom sink
1. Simulating console output
1.1. Writing the custom sink
A custom sink extends AbstractSink and implements the Configurable interface.
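package com.lf.test;

import java.nio.charset.StandardCharsets;
import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MySink extends AbstractSink implements Configurable {
    // Logger named after this class so log lines are attributed to MySink
    private static final Logger LOG = LoggerFactory.getLogger(MySink.class);
    private String prefix;
    private String suffix;

    @Override
    public Status process() throws EventDeliveryException {
        Status status;
        // Channel bound to this sink
        Channel channel = getChannel();
        Transaction transaction = channel.getTransaction();
        transaction.begin();
        try {
            // take() returns null when the channel is empty; back off instead of spinning
            Event event = channel.take();
            if (event != null) {
                // Handle the event (here: log it)
                LOG.info(prefix + new String(event.getBody(), StandardCharsets.UTF_8) + suffix);
                status = Status.READY;
            } else {
                status = Status.BACKOFF;
            }
            transaction.commit();
        } catch (Exception e) {
            // Roll back so the event stays in the channel for a later attempt
            transaction.rollback();
            status = Status.BACKOFF;
        } finally {
            transaction.close();
        }
        return status;
    }

    @Override
    public void configure(Context context) {
        // Read from the configuration, with a default value
        prefix = context.getString("prefix", "hello:");
        // Read from the configuration, no default (null if the key is missing)
        suffix = context.getString("suffix");
    }
}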
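The commit and rollback calls in process() matter because a rolled-back take leaves the event on the channel for a later attempt, while a commit removes it for good. The toy model below is a sketch of that idea only. ToyChannel is a hypothetical in-memory stand-in, not Flume's memory channel, and it ignores capacity and concurrency.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a channel; illustrates take + commit/rollback only.
class ToyChannel {
    private final Deque<String> queue = new ArrayDeque<>();
    // Events taken inside the currently open transaction
    private final Deque<String> taken = new ArrayDeque<>();

    void put(String event) {
        queue.addLast(event);
    }

    /** Returns the next event, or null when the channel is empty. */
    String take() {
        String event = queue.pollFirst();
        if (event != null) {
            taken.addLast(event);
        }
        return event;
    }

    /** Commit: the taken events are removed permanently. */
    void commit() {
        taken.clear();
    }

    /** Rollback: taken events return to the head of the queue in their original order. */
    void rollback() {
        while (!taken.isEmpty()) {
            queue.addFirst(taken.pollLast());
        }
    }

    int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        channel.put("a");
        channel.put("b");
        channel.take();                      // takes "a"
        channel.rollback();                  // "a" is back at the head
        System.out.println(channel.take());  // prints a
        channel.commit();
        System.out.println(channel.size());  // prints 1
    }
}
```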
1.2. Testing
Package the program as a jar, put it in the flume/lib directory, then create a new conf file with the following configuration:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = com.lf.test.MySink
a1.sinks.k1.suffix = :world
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
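Start the agent first, then connect with nc from a second terminal and type some text:
flume-ng agent -c conf/ -f jobs/flume-mysink.conf -n a1 -Dflume.root.logger=INFO,console
nc localhost 44444
Success: the sink logs each line you type, wrapped in the configured prefix and suffix.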
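If nc is not installed, a small Java client can stand in for it. This is a sketch that assumes the netcat source replies with one acknowledgement line (OK by default, as far as I know) for every line it receives. NetcatClientSketch and sendLines are hypothetical names, and the host and port are taken from the configuration above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client, not part of Flume: sends text lines to a netcat source.
class NetcatClientSketch {
    /** Writes each line followed by '\n' and reads one reply line per line sent. */
    static List<String> sendLines(Writer out, BufferedReader in, List<String> lines) {
        List<String> replies = new ArrayList<>();
        try {
            for (String line : lines) {
                out.write(line + "\n");
                out.flush();
                replies.add(in.readLine());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return replies;
    }

    public static void main(String[] args) {
        // Host and port match a1.sources.r1.bind and a1.sources.r1.port.
        try (Socket socket = new Socket("localhost", 44444);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            System.out.println(sendLines(out, in, Arrays.asList("flume", "custom sink")));
        } catch (IOException e) {
            System.err.println("Could not reach the agent: " + e.getMessage());
        }
    }
}
```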