1. Create a Maven project in IDEA and configure the pom file.
pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.zpark</groupId>
    <artifactId>HelloStorm</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <!-- Storm core dependency -->
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-core</artifactId>
            <version>1.1.1</version>
            <!-- Comment out the next line when running in local mode;
                 on a cluster, "provided" keeps the Storm jar out of your package -->
            <scope>provided</scope>
        </dependency>
    </dependencies>
</project>
2. The Spout extends BaseRichSpout
package com.zpark;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

import java.util.Map;
import java.util.Random;

public class Spout extends BaseRichSpout {
    private SpoutOutputCollector spoutOutputCollector;
    // One Random for the lifetime of the spout, rather than one per tuple
    private final Random random = new Random();
    private static final String[] words = {"Hadoop", "Storm", "Apache", "Linux", "Nginx", "Tomcat", "Spark"};

    @Override
    public void open(Map map, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
        this.spoutOutputCollector = spoutOutputCollector;
    }

    @Override
    public void nextTuple() {
        // Pick a random word and emit it as a one-field tuple
        String word = words[random.nextInt(words.length)];
        spoutOutputCollector.emit(new Values(word));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        System.out.println("declaring output fields...");
        declarer.declare(new Fields("test"));
    }
}
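Note that Storm calls nextTuple() in a tight loop, so this spout will flood the console. A common tweak (my addition, not in the original post) is to sleep briefly between emits using Storm's own Utils helper; a minimal sketch:

import org.apache.storm.utils.Utils;

@Override
public void nextTuple() {
    String word = words[random.nextInt(words.length)];
    spoutOutputCollector.emit(new Values(word));
    Utils.sleep(100); // throttle to roughly ten tuples per second
}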
3. The Bolt extends BaseRichBolt
package com.zpark;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

import java.util.Map;

public class Bolt extends BaseRichBolt {
    @Override
    public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
        System.out.println("========== Bolt prepared ==========");
    }

    @Override
    public void execute(Tuple tuple) {
        // Read the single field declared by the spout and print a greeting
        String word = tuple.getString(0);
        System.out.println("Hello " + word + "!");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This bolt is a sink: it emits nothing downstream, so there is nothing to declare
    }
}
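An aside (my addition, not from the original post): because the spout emits tuples without a message ID, there is nothing to ack here. If you later emit with message IDs, a BaseRichBolt must ack each tuple itself; extending BaseBasicBolt instead gives you automatic acking. A minimal sketch, with the class name BasicBolt chosen here for illustration:

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

public class BasicBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // The tuple is acked automatically when this method returns normally
        System.out.println("Hello " + tuple.getString(0) + "!");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Still a sink: nothing emitted, nothing declared
    }
}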
4. The Topology class
package com.zpark;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class App {
    public static void main(String[] args) {
        // Define the topology: spout "send" feeds bolt "deal"
        TopologyBuilder builder = new TopologyBuilder();
        // One executor (thread) for the spout, the default
        builder.setSpout("send", new Spout());
        // One executor and one task for the bolt, reading from the spout via shuffle grouping
        builder.setBolt("deal", new Bolt(), 1).setNumTasks(1).shuffleGrouping("send");

        Config conf = new Config();
        try {
            if (args != null && args.length > 0) {
                // With arguments: submit to the cluster, using the first argument as the topology name
                System.out.println("Remote mode");
                StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
            } else {
                // Without arguments: run in local mode
                System.out.println("Local mode");
                LocalCluster cluster = new LocalCluster();
                cluster.submitTopology("work", conf, builder.createTopology());
                // Let the topology run for 60 seconds, then shut the local cluster down
                Thread.sleep(60000);
                cluster.shutdown();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
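One note on the wiring above: setSpout and setBolt take a parallelism hint, and Config carries cluster-level knobs. Purely as an illustration (these numbers are my own, not from the original post), scaling the topology out could look like:

// Two executors for the spout; four tasks spread over two executors for the bolt
builder.setSpout("send", new Spout(), 2);
builder.setBolt("deal", new Bolt(), 2).setNumTasks(4).shuffleGrouping("send");

Config conf = new Config();
conf.setDebug(true);   // log every emitted tuple (verbose; local testing only)
conf.setNumWorkers(2); // worker processes to request when running on a cluster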
5. Run the program (local mode, since no arguments are passed); the console prints output like the following.
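Since the spout picks words at random, the exact lines vary from run to run; mixed in with Storm's own logging you should see something like:

Local mode
declaring output fields...
========== Bolt prepared ==========
Hello Storm!
Hello Linux!
Hello Hadoop!
...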
6. Next we run the project on the Storm cluster. Package it without bundling the Storm jar; the provided scope in the pom takes care of this, as shown below.
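With the provided scope already in place, a plain Maven build is enough; by Maven's artifactId-version convention the jar lands in target/HelloStorm-1.0-SNAPSHOT.jar:

mvn clean package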
7. Upload the jar to the Storm installation directory on Linux. Then start the cluster: on the master node, run bin/storm nimbus & from the Storm installation directory, and on each worker node run bin/storm supervisor &. You can also run bin/storm ui & to bring up the web UI for a more visual view of the results. (On a single-machine setup, this works whether or not the Storm daemons are running.) Then run the project's program with the following command.
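Reconstructed here from the pom coordinates and the App class (the trailing topology name is arbitrary and becomes args[0]), the submit command looks like this, assuming the jar sits in the current directory:

bin/storm jar HelloStorm-1.0-SNAPSHOT.jar com.zpark.App helloTopology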
This invokes the main method of the App class; since the program handles its arguments, you can append them after the class name. Once you hit Enter, the system initializes the cluster work, and a few seconds later the job starts executing, scrolling the same Hello greetings as in local mode.
That completes the development and test run of a first Storm starter project. More complex computation follows the same basic pattern: the Maven project grows more modules and more wiring between them, but the overall execution flow stays much the same. With this you have stepped through the door into Storm stream processing; the rest of the fun comes with practice.