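Storm Study Notes (Part 1: Introduction)

I. Introduction

Storm is a free, open-source distributed real-time computation system. It makes it easy to process unbounded streams of data reliably: what Hadoop does for batch processing of large data sets, Storm does for real-time processing. Storm is simple and can be used with any programming language.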

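Storm has the following characteristics:

  1. Simple programming model: developers only need to focus on application logic and, as with Hadoop, the programming primitives Storm provides are simple.
  2. High performance, low latency: suitable for scenarios such as ad search engines, where an advertiser's operations must take effect in real time.
  3. Distributed: handles data volumes that are too large for a single machine.
  4. Scalable: as the business grows and data and compute volumes increase, the system can scale horizontally.
  5. Fault tolerant: the failure of a single node does not bring down the application.
  6. No message loss: Storm guarantees that messages are processed (at least once, when tuple acking is used).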

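Storm's computation model:

Topology – an implementation of a DAG (directed acyclic graph):

A topology encapsulates the logic of a Storm real-time computation. It is a graph of Spouts and Bolts connected to one another by data streams.

Lifecycle: once started, a topology keeps running in the cluster and does not terminate until it is killed manually.

(This differs from a Job in MapReduce, which terminates as soon as its computation completes.)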
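To make the DAG idea concrete, here is a minimal plain-Java sketch with no Storm dependency. It models a topology as a map from each component id to the ids it subscribes to, and derives an order in which data flows through the components. The class name TopologyDagSketch and its methods are hypothetical names chosen for illustration; the component ids are the ones used in the word-count example in section II. This is only a model of the structure, not how Storm schedules components internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopologyDagSketch {

    // Returns component ids ordered so that every component appears after all
    // of its upstream components. Throws if no such order exists, which means
    // the wiring has a cycle or refers to an unknown component.
    static List<String> flowOrder(Map<String, List<String>> upstreams) {
        List<String> order = new ArrayList<>();
        while (order.size() < upstreams.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : upstreams.entrySet()) {
                if (!order.contains(e.getKey()) && order.containsAll(e.getValue())) {
                    order.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) {
                throw new IllegalStateException("not a DAG: cycle or unknown upstream");
            }
        }
        return order;
    }

    // Wiring of the word-count example: spout -> split bolt -> count bolt.
    static Map<String, List<String>> wordCountWiring() {
        Map<String, List<String>> upstreams = new LinkedHashMap<>();
        upstreams.put("wcountbolt", Arrays.asList("wspiltbolt"));
        upstreams.put("wspiltbolt", Arrays.asList("wcspout"));
        upstreams.put("wcspout", Collections.<String>emptyList());
        return upstreams;
    }

    public static void main(String[] args) {
        // prints [wcspout, wspiltbolt, wcountbolt]
        System.out.println(flowOrder(wordCountWiring()));
    }
}
```

A spout has no upstream components, so it always comes first; a wiring in which two bolts subscribe to each other would be rejected by this sketch.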

Tuple – the smallest unit of data in a Stream
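A tuple can be thought of as a list of values whose positions are named by the fields the emitting component declares. The plain-Java sketch below illustrates that idea only; TupleSketch and its methods are hypothetical names and are not Storm's Tuple class, although they mirror the positional and by-name access used in the examples in section II.

```java
import java.util.Arrays;
import java.util.List;

class TupleSketch {
    private final List<String> fields;  // names declared by the emitting component
    private final List<Object> values;  // emitted values, in the same order

    TupleSketch(List<String> fields, List<Object> values) {
        if (fields.size() != values.size()) {
            throw new IllegalArgumentException("fields and values must have the same length");
        }
        this.fields = fields;
        this.values = values;
    }

    // Positional access, analogous to tuple.getString(0)
    Object getValue(int index) {
        return values.get(index);
    }

    // Access by declared name, analogous to tuple.getIntegerByField("num")
    Object getValueByField(String name) {
        int index = fields.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        return values.get(index);
    }

    public static void main(String[] args) {
        TupleSketch t = new TupleSketch(Arrays.asList("num"), Arrays.<Object>asList(7));
        System.out.println(t.getValueByField("num")); // prints 7
        System.out.println(t.getValue(0));            // prints 7
    }
}
```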

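Stream – data stream

A Spout continuously passes data to a Bolt, and each Bolt passes data on to the next Bolt. The data channels formed this way are called Streams.

A Stream is given an id when it is declared (the default id is "default"). In practice most topologies use a single stream, in which case no StreamId needs to be specified explicitly.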

II. Examples

Compute sum = 1 + 2 + 3 + ...

Add the dependency:

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>1.2.2</version>
    <scope>provided</scope>
</dependency>
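
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;

public class Test {
    /**
     * Builds the topology and submits it to a cluster.
     */
    public static void main(String[] args) {
        // Build the topology
        TopologyBuilder tb = new TopologyBuilder();

        // The spout is the data source
        tb.setSpout("wsspout", new WordSumSpout());

        // The bolt subscribes to the spout's stream
        tb.setBolt("wsbolt", new WordSumBolt()).shuffleGrouping("wsspout");

        // Create an in-process local cluster
        LocalCluster lc = new LocalCluster();
        // Submit the topology to the cluster
        lc.submitTopology("wordsum", new Config(), tb.createTopology());
    }
}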
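
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class WordSumBolt extends BaseRichBolt {
    Map map;
    TopologyContext context;
    OutputCollector collector;

    // Running total kept by this bolt task
    int sum = 0;

    @Override
    public void prepare(Map map, TopologyContext context, OutputCollector collector) {
        this.map = map;
        this.collector = collector;
        this.context = context;
    }

    /**
     * Receives a tuple (and, if needed, emits data further downstream).
     */
    @Override
    public void execute(Tuple tuple) {
        // Positional alternative: tuple.getInteger(0)
        int num = tuple.getIntegerByField("num");
        sum += num;

        System.out.println("sum: ------" + sum);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        // This bolt is the end of the topology and emits nothing
    }
}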
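
import java.util.Map;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class WordSumSpout extends BaseRichSpout {
    Map map;
    TopologyContext context;
    SpoutOutputCollector collector;
    int i = 0;

    /**
     * Initializes the spout with its configuration.
     */
    @Override
    public void open(Map map, TopologyContext context, SpoutOutputCollector collector) {
        this.map = map;
        this.context = context;
        this.collector = collector;
    }

    /**
     * Produces data and pushes it downstream.
     */
    @Override
    public void nextTuple() {
        i++;
        this.collector.emit(new Values(i));

        System.err.println("Spout:-------- " + i);
        Utils.sleep(1000);
    }

    /**
     * Declares the field names of the tuples sent to downstream components.
     */
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("num"));
    }
}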
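The arithmetic of the sum topology above can be checked without a cluster. The following plain-Java sketch (no Storm dependency; RunningSumSketch and runningSums are hypothetical names) replays what the spout emits and what the bolt accumulates, assuming a single bolt task, which is what the example's setBolt call without a parallelism argument produces.

```java
import java.util.ArrayList;
import java.util.List;

class RunningSumSketch {

    // Simulates the spout emitting 1..n and the bolt adding each value,
    // returning the sum the bolt would print after each tuple.
    static List<Integer> runningSums(int n) {
        List<Integer> printed = new ArrayList<>();
        int sum = 0;                    // plays the role of WordSumBolt.sum
        for (int i = 1; i <= n; i++) {  // plays the role of WordSumSpout.nextTuple()
            sum += i;
            printed.add(sum);
        }
        return printed;
    }

    public static void main(String[] args) {
        System.out.println(runningSums(5)); // prints [1, 3, 6, 10, 15]
    }
}
```

If the bolt were given a parallelism greater than 1 with shuffleGrouping, each task would keep its own sum field, so each printed value would be a partial sum rather than the global total.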

Count word occurrences:

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class Test {
    public static void main(String[] args) {
        TopologyBuilder tb = new TopologyBuilder();
        tb.setSpout("wcspout", new WcSpout());
        tb.setBolt("wspiltbolt", new WspiltBolt()).shuffleGrouping("wcspout");
        // fieldsGrouping: tuples with the same "word" value always go to the same bolt task
        tb.setBolt("wcountbolt", new WcountBolt(), 3).fieldsGrouping("wspiltbolt", new Fields("word"));

        LocalCluster lc = new LocalCluster();
        lc.submitTopology("wordcount", new Config(), tb.createTopology());
    }
}
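
import java.util.HashMap;
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class WcountBolt extends BaseRichBolt {

    // Word -> number of occurrences seen by this bolt task
    Map<String, Integer> counts = new HashMap<>();

    @Override
    public void prepare(Map conf, TopologyContext topologyContext, OutputCollector collector) {
        // Nothing to initialize: this bolt emits no tuples
    }

    /**
     * Reads the word from each tuple and counts its occurrences.
     */
    @Override
    public void execute(Tuple tuple) {
        String word = tuple.getStringByField("word");

        // Insert 1 for a new word, otherwise add 1 to the existing count
        counts.merge(word, 1, Integer::sum);

        System.out.println(word + "--------" + counts.get(word));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This bolt is the end of the topology and emits nothing
    }
}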
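
import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class WcSpout extends BaseRichSpout {

    SpoutOutputCollector collector;
    // Mock input data
    String[] text = {
            "hello Sam", "hello Tom", "hello Jetty"
    };
    Random r = new Random();

    @Override
    public void open(Map map, TopologyContext topologyContext, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    // Emits a randomly chosen line downstream
    @Override
    public void nextTuple() {
        Values line = new Values(text[r.nextInt(text.length)]);
        this.collector.emit(line);
        System.out.println("spout emit: -------" + line);
        Utils.sleep(1000);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("line"));
    }
}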
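
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WspiltBolt extends BaseRichBolt {

    OutputCollector collector;

    @Override
    public void prepare(Map map, TopologyContext topologyContext, OutputCollector collector) {
        this.collector = collector;
    }

    /**
     * Reads each line and splits it into words.
     */
    @Override
    public void execute(Tuple tuple) {
        String line = tuple.getString(0);
        String[] words = line.split(" ");

        // Emit one tuple per word
        for (String word : words) {
            this.collector.emit(new Values(word));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}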
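
The word-count topology above runs WcountBolt with 3 tasks and relies on fieldsGrouping to send every occurrence of a word to the same task, so each per-task map holds the complete count for its words. The plain-Java sketch below models that routing with a simple hash of the word modulo the number of tasks. This is a simplified assumption for illustration: Storm's actual task selection differs in detail, and FieldsGroupingSketch, taskFor and countPerTask are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FieldsGroupingSketch {

    // Picks a task index for a word; the same word always yields the same index.
    static int taskFor(String word, int numTasks) {
        return Math.floorMod(word.hashCode(), numTasks);
    }

    // Splits each line into words, routes every word to its task,
    // and returns one word-count map per task.
    static List<Map<String, Integer>> countPerTask(List<String> lines, int numTasks) {
        List<Map<String, Integer>> perTask = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            perTask.add(new HashMap<>());
        }
        for (String line : lines) {
            for (String word : line.split(" ")) {
                perTask.get(taskFor(word, numTasks)).merge(word, 1, Integer::sum);
            }
        }
        return perTask;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello Sam", "hello Tom", "hello Jetty");
        List<Map<String, Integer>> perTask = countPerTask(lines, 3);
        for (int i = 0; i < perTask.size(); i++) {
            System.out.println("task " + i + ": " + perTask.get(i));
        }
    }
}
```

With shuffleGrouping instead, occurrences of "hello" would be spread across the tasks, and each task would print only a partial count for it.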

Reposted from blog.csdn.net/qq_33283652/article/details/86242285