Sliding window operations in Spark Streaming and a hot search term statistics case

Spark Streaming supports sliding window operations, which let us run computations over the data that falls within a sliding window. Each time the window slides, the RDDs that fall inside it are aggregated and computed on, and the resulting RDD becomes one RDD of the windowed DStream. For example, in the figure below, the window covers three seconds of data: the three RDDs from those three seconds are aggregated and processed together, and two seconds later the window slides and the most recent three seconds of data are processed again. Every sliding window operation therefore takes two parameters, the window length and the sliding interval, and both must be integer multiples of the batch interval. (Spark Streaming's support for sliding windows is more complete and powerful than Storm's.)
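The relationship between batch interval, window length and sliding interval can be sketched in plain Scala without Spark. This is only a simplified model, not Spark's implementation: the object name WindowSketch and the function windows are made up for illustration, and it assumes a window fires at every multiple of the sliding interval and covers the batches that ended within the preceding window length.

```scala
object WindowSketch {
  /** For each window fire time (in seconds), list the end times of the batches it covers.
    * Hypothetical helper: a simplified model of the windowing described above. */
  def windows(batchSec: Int, windowSec: Int, slideSec: Int, totalSec: Int): Seq[(Int, Seq[Int])] = {
    // Both parameters must be integer multiples of the batch interval
    require(windowSec % batchSec == 0, "window length must be a multiple of the batch interval")
    require(slideSec % batchSec == 0, "sliding interval must be a multiple of the batch interval")
    // A window fires at every multiple of slideSec and covers the batches
    // that ended in the range (fireTime - windowSec, fireTime]
    for (fireTime <- slideSec to totalSec by slideSec) yield {
      val batchEndTimes = (batchSec to totalSec by batchSec)
        .filter(t => t > fireTime - windowSec && t <= fireTime)
      fireTime -> batchEndTimes
    }
  }

  def main(args: Array[String]): Unit =
    // 1-second batches, 3-second window, 2-second slide, observed for 6 seconds
    windows(1, 3, 2, 6).foreach { case (t, batches) =>
      println(s"window at t=${t}s covers batches ending at ${batches.mkString(", ")}")
    }
}
```

With a 1-second batch interval, a 3-second window and a 2-second slide, consecutive windows in this model share one batch, and the first window is only partly filled because fewer than three batches exist when it fires.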
Transformation: Meaning
window: perform custom calculations on the data of each sliding window
countByWindow: perform a count operation on the data of each sliding window
reduceByWindow: perform a reduce operation on the data of each sliding window
reduceByKeyAndWindow: perform a reduceByKey operation on the data of each sliding window
countByValueAndWindow: perform a countByValue operation on the data of each sliding window
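The operations listed above could be exercised side by side roughly as follows. This is a sketch rather than a tuned job: the host hadoop01 and port 8888 are taken from the case in this article, while the object name WindowOpsSketch and the checkpoint path are placeholders. A checkpoint directory is set because countByWindow and countByValueAndWindow are computed incrementally and Spark requires one for them.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowOpsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WindowOpsSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    // Placeholder path; needed by countByWindow and countByValueAndWindow
    ssc.checkpoint("/tmp/window-ops-checkpoint")

    val words = ssc.socketTextStream("hadoop01", 8888).flatMap(_.split(" "))
    val windowLen = Seconds(3) // window length
    val slide = Seconds(2)     // sliding interval

    // window: expose the windowed data for custom downstream logic
    words.window(windowLen, slide).map(_.toUpperCase).print()
    // countByWindow: number of elements in each window
    words.countByWindow(windowLen, slide).print()
    // reduceByWindow: combine all elements of a window into one value (total characters here)
    words.map(_.length).reduceByWindow((a: Int, b: Int) => a + b, windowLen, slide).print()
    // reduceByKeyAndWindow: per-key reduction within each window
    words.map((_, 1)).reduceByKeyAndWindow((a: Int, b: Int) => a + b, windowLen, slide).print()
    // countByValueAndWindow: occurrences of each distinct element in each window
    words.countByValueAndWindow(windowLen, slide).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

To try it, something must be writing lines to the socket, for example a netcat listener on the host and port above.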
Case:
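import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowDemo {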
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org").setLevel(Level.WARN)
    val config = new SparkConf().setAppName("WindowDemo").setMaster("local[2]")
    // Seconds(1) is the batch interval: one RDD is created per second
    val ssc = new StreamingContext(config, Seconds(1))
    // (a: Int, b: Int) => a + b: a is the result accumulated so far, b is the next element to add
    // Seconds(3) is the window length
    // Seconds(2) is the sliding interval
    ssc.socketTextStream("hadoop01", 8888).flatMap(_.split(" ")).map((_, 1)).reduceByKeyAndWindow(
      (a: Int, b: Int) => a + b, Seconds(3), Seconds(2)).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
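To turn the windowed word count above into a hot search term ranking, one possible approach is to sort each window's counts and keep the first few entries. The sketch below assumes every whitespace-separated token on the socket is a search term and that the top 3 are wanted. The object name HotSearchSketch and the value 3 are illustrative choices, not part of the original case.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HotSearchSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HotSearchSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Assumption: each whitespace-separated token is one search term
    val termCounts = ssc.socketTextStream("hadoop01", 8888)
      .flatMap(_.split(" "))
      .filter(_.nonEmpty)
      .map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(3), Seconds(2))

    // For every window, sort by count in descending order and keep the top 3 terms
    termCounts.foreachRDD { rdd =>
      val top3 = rdd.sortBy(_._2, ascending = false).take(3)
      println("Hot search terms in this window:")
      top3.foreach { case (term, count) => println(s"$term: $count") }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Sorting inside foreachRDD keeps the ranking logic on the driver side of each window's result. For large vocabularies, a different strategy than a full sort may be preferable.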

