How to Gracefully Stop Spark Streaming

There are plenty of write-ups online about how to stop Spark Streaming gracefully. I tested one simple approach and am sharing it here.

First, a very simple Spark Streaming example: it reads text lines from a monitored path and saves the aggregated results to a specified output directory.

package SparkStream

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
  * Created by admin on 2019/3/21.
  * Purpose: demonstrates a plain Spark Streaming job.
  */
object SparkStreaming {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("stop spark streaming").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
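    // textFileStream monitors a directory on a Hadoop-compatible filesystem and reads new text files as they appear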
    val file = ssc.textFileStream("C://test//stu1.txt")
    val res = file.map { line =>
      val arr = line.split("\\|")
      //arr(0) + "888888" + arr(2)
      (arr(0)+"88",1)
    }.reduceByKey(_+_)
    res.saveAsTextFiles("c://test//result")
    ssc.start()
    ssc.awaitTermination()

  }
}
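
To make the per-batch transformation concrete, here is a minimal sketch of the same map/reduceByKey logic applied to a plain Scala collection, using hypothetical pipe-delimited sample lines (the sample values are mine, not from the original job):

object TransformSketch extends App {
  // hypothetical sample input, pipe-delimited as name|age|city
  val lines = Seq("zhangsan|20|beijing", "zhangsan|21|shanghai", "lisi|30|shenzhen")

  // same per-line logic as the streaming job: key on the first field plus "88"
  val pairs = lines.map { line =>
    val arr = line.split("\\|")
    (arr(0) + "88", 1)
  }

  // local stand-in for reduceByKey(_ + _): sum the counts per key
  val counts = pairs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }

  println(counts) // e.g. Map(zhangsan88 -> 2, lisi88 -> 1)
}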

How do we stop Spark Streaming gracefully? That is the focus of this post. The rough idea:

In the driver program, add a piece of code that scans a marker directory every 10 seconds (a directory on HDFS in a cluster deployment; a plain local path in this demo). Once the directory is found to exist, the code calls the StreamingContext's stop method, so the application shuts itself down gracefully.

package SparkStream



import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
  * Created by admin on 2019/3/21.
  * Purpose: demonstrates how to stop Spark Streaming gracefully.
  */
object StopSparkStreaming {
  val shutdownMarker = "c://test//source1"
  var stopFlag: Boolean = false

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("stop spark streaming").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
    val file = ssc.textFileStream("C://test//stu1.txt")
    val res = file.map { line =>
      val arr = line.split("\\|")
      //arr(0) + "888888" + arr(2)
      (arr(0)+"88",1)
    }.reduceByKey(_+_)
    res.saveAsTextFiles("c://test//result")
    ssc.start()
    // check interval in milliseconds
    val checkIntervalMillis = 10000
    var isStopped = false
    while (!isStopped) {
      println("calling awaitTerminationOrTimeout")
      // Wait for the streaming context to stop, for at most checkIntervalMillis.
      // Any exception raised during execution is re-thrown in this thread.
      // Returns true if the context has stopped, false if the timeout elapsed first.
      isStopped = ssc.awaitTerminationOrTimeout(checkIntervalMillis)
      if (isStopped) {
        println("confirmed! The streaming context is stopped. Exiting application...")
      } else {
        println("Streaming App is still running. Timeout...")
      }
      // check whether the marker directory exists
      checkShutdownMarker
      if (!isStopped && stopFlag) {
        println("stopping ssc right now")
        // First true: also stop the underlying SparkContext, whether or not
        // this StreamingContext has been started.
        // Second true: stop gracefully by waiting for all received data to finish processing.
        ssc.stop(true, true)
        println("ssc is stopped!!!!!!!")
      }
    }
  }

  def checkShutdownMarker: Unit = {
    if (!stopFlag) {
      // check whether the shutdown marker directory exists on the filesystem
      val fs = FileSystem.get(new Configuration())
      // exists returns true if the path is present, false otherwise
      stopFlag = fs.exists(new Path(shutdownMarker))
    }
  }
}
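
To trigger the shutdown, all you have to do is create the marker directory, either by hand or with hdfs dfs -mkdir on a real cluster. Below is a minimal sketch (my own helper, not part of the original post) that creates it programmatically with the same Hadoop FileSystem API the checker uses; the path matches the local demo:

package SparkStream

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CreateShutdownMarker {
  def main(args: Array[String]): Unit = {
    // same marker path as StopSparkStreaming; on a cluster this would be an HDFS URI
    val shutdownMarker = "c://test//source1"
    val fs = FileSystem.get(new Configuration())
    // mkdirs creates the directory (and any parents) and returns true if it exists afterwards
    val ok = fs.mkdirs(new Path(shutdownMarker))
    println(s"marker $shutdownMarker present: $ok")
  }
}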

Let the streaming job run normally for a while, then create the directory c://test//source1. The driver log looks like the following (only part of it is kept):

Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:48:40 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:48:50 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:00 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:10 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:20 INFO InputInfoTracker: remove old batch metadata: 
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:30 INFO InputInfoTracker: remove old batch metadata: 1553154505000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:40 INFO InputInfoTracker: remove old batch metadata: 1553154515000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:49:50 INFO InputInfoTracker: remove old batch metadata: 1553154525000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:00 INFO InputInfoTracker: remove old batch metadata: 1553154535000 ms
19/03/21 15:50:00 INFO BlockManager: Removing RDD 35
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:10 INFO InputInfoTracker: remove old batch metadata: 1553154545000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:20 INFO InputInfoTracker: remove old batch metadata: 1553154555000 ms
Streaming App is still running. Timeout...
calling awaitTerminationOrTimeout
19/03/21 15:50:30 INFO InputInfoTracker: remove old batch metadata: 1553154565000 ms
Streaming App is still running. Timeout...
stopping ssc right now
19/03/21 15:50:32 INFO JobGenerator: Stopping JobGenerator gracefully
19/03/21 15:50:32 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation
19/03/21 15:50:32 INFO JobGenerator: Waited for all received blocks to be consumed for job generation
19/03/21 15:50:35 INFO FileInputDStream: Finding new files took 2 ms
19/03/21 15:50:35 INFO FileInputDStream: New files at time 1553154635000 ms:

19/03/21 15:50:35 INFO JobScheduler: Added jobs for time 1553154635000 ms
19/03/21 15:50:35 INFO JobScheduler: Starting job streaming job 1553154635000 ms.0 from job set of time 1553154635000 ms
19/03/21 15:50:35 INFO SparkContext: Starting job: saveAsTextFiles at StopSparkStreaming.scala:28
19/03/21 15:50:35 INFO DAGScheduler: Registering RDD 132 (map at StopSparkStreaming.scala:23)
19/03/21 15:50:35 INFO DAGScheduler: Got job 26 (saveAsTextFiles at StopSparkStreaming.scala:28) with 2 output partitions
19/03/21 15:50:35 INFO DAGScheduler: Final stage: ResultStage 53 (saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:35 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 52)
19/03/21 15:50:35 INFO DAGScheduler: Missing parents: List()
19/03/21 15:50:35 INFO DAGScheduler: Submitting ResultStage 53 (MapPartitionsRDD[134] at saveAsTextFiles at StopSparkStreaming.scala:28), which has no missing parents
19/03/21 15:50:35 INFO MemoryStore: Block broadcast_26 stored as values in memory (estimated size 64.5 KB, free 324.6 KB)
19/03/21 15:50:35 INFO MemoryStore: Block broadcast_26_piece0 stored as bytes in memory (estimated size 22.2 KB, free 346.8 KB)
19/03/21 15:50:35 INFO BlockManagerInfo: Added broadcast_26_piece0 in memory on localhost:50448 (size: 22.2 KB, free: 1121.9 MB)
19/03/21 15:50:35 INFO SparkContext: Created broadcast 26 from broadcast at DAGScheduler.scala:1006
19/03/21 15:50:35 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 53 (MapPartitionsRDD[134] at saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:35 INFO TaskSchedulerImpl: Adding task set 53.0 with 2 tasks
19/03/21 15:50:35 INFO TaskSetManager: Starting task 0.0 in stage 53.0 (TID 52, localhost, partition 0,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:35 INFO TaskSetManager: Starting task 1.0 in stage 53.0 (TID 53, localhost, partition 1,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:35 INFO Executor: Running task 0.0 in stage 53.0 (TID 52)
19/03/21 15:50:35 INFO Executor: Running task 1.0 in stage 53.0 (TID 53)
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:35 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
19/03/21 15:50:35 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0053_m_000001_53' to file:/c:/test/result-1553154635000/_temporary/0/task_201903211550_0053_m_000001
19/03/21 15:50:35 INFO SparkHadoopMapRedUtil: attempt_201903211550_0053_m_000001_53: Committed
19/03/21 15:50:35 INFO Executor: Finished task 1.0 in stage 53.0 (TID 53). 2080 bytes result sent to driver
19/03/21 15:50:35 INFO TaskSetManager: Finished task 1.0 in stage 53.0 (TID 53) in 277 ms on localhost (1/2)
19/03/21 15:50:35 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0053_m_000000_52' to file:/c:/test/result-1553154635000/_temporary/0/task_201903211550_0053_m_000000
19/03/21 15:50:35 INFO SparkHadoopMapRedUtil: attempt_201903211550_0053_m_000000_52: Committed
19/03/21 15:50:35 INFO Executor: Finished task 0.0 in stage 53.0 (TID 52). 2080 bytes result sent to driver
19/03/21 15:50:35 INFO TaskSetManager: Finished task 0.0 in stage 53.0 (TID 52) in 335 ms on localhost (2/2)
19/03/21 15:50:35 INFO TaskSchedulerImpl: Removed TaskSet 53.0, whose tasks have all completed, from pool 
19/03/21 15:50:35 INFO DAGScheduler: ResultStage 53 (saveAsTextFiles at StopSparkStreaming.scala:28) finished in 0.335 s
19/03/21 15:50:35 INFO DAGScheduler: Job 26 finished: saveAsTextFiles at StopSparkStreaming.scala:28, took 0.351543 s
19/03/21 15:50:35 INFO JobScheduler: Finished job streaming job 1553154635000 ms.0 from job set of time 1553154635000 ms
19/03/21 15:50:35 INFO JobScheduler: Total delay: 0.680 s for time 1553154635000 ms (execution: 0.630 s)
19/03/21 15:50:35 INFO ShuffledRDD: Removing RDD 128 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 128
19/03/21 15:50:35 INFO MapPartitionsRDD: Removing RDD 127 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 127
19/03/21 15:50:35 INFO MapPartitionsRDD: Removing RDD 126 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 126
19/03/21 15:50:35 INFO UnionRDD: Removing RDD 70 from persistence list
19/03/21 15:50:35 INFO BlockManager: Removing RDD 70
19/03/21 15:50:35 INFO FileInputDStream: Cleared 1 old files that were older than 1553154575000 ms: 1553154570000 ms
19/03/21 15:50:35 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
19/03/21 15:50:35 INFO InputInfoTracker: remove old batch metadata: 1553154570000 ms
19/03/21 15:50:40 INFO RecurringTimer: Stopped timer for JobGenerator after time 1553154640000
19/03/21 15:50:40 INFO FileInputDStream: Finding new files took 5 ms
19/03/21 15:50:40 INFO FileInputDStream: New files at time 1553154640000 ms:

19/03/21 15:50:40 INFO JobScheduler: Added jobs for time 1553154640000 ms
19/03/21 15:50:40 INFO JobScheduler: Starting job streaming job 1553154640000 ms.0 from job set of time 1553154640000 ms
19/03/21 15:50:40 INFO JobGenerator: Stopped generation timer
19/03/21 15:50:40 INFO JobGenerator: Waiting for jobs to be processed and checkpoints to be written
19/03/21 15:50:40 INFO SparkContext: Starting job: saveAsTextFiles at StopSparkStreaming.scala:28
19/03/21 15:50:40 INFO DAGScheduler: Registering RDD 137 (map at StopSparkStreaming.scala:23)
19/03/21 15:50:40 INFO DAGScheduler: Got job 27 (saveAsTextFiles at StopSparkStreaming.scala:28) with 2 output partitions
19/03/21 15:50:40 INFO DAGScheduler: Final stage: ResultStage 55 (saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:40 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 54)
19/03/21 15:50:40 INFO DAGScheduler: Missing parents: List()
19/03/21 15:50:40 INFO DAGScheduler: Submitting ResultStage 55 (MapPartitionsRDD[139] at saveAsTextFiles at StopSparkStreaming.scala:28), which has no missing parents
19/03/21 15:50:40 INFO MemoryStore: Block broadcast_27 stored as values in memory (estimated size 64.5 KB, free 411.2 KB)
19/03/21 15:50:40 INFO MemoryStore: Block broadcast_27_piece0 stored as bytes in memory (estimated size 22.2 KB, free 433.4 KB)
19/03/21 15:50:40 INFO BlockManagerInfo: Added broadcast_27_piece0 in memory on localhost:50448 (size: 22.2 KB, free: 1121.9 MB)
19/03/21 15:50:40 INFO SparkContext: Created broadcast 27 from broadcast at DAGScheduler.scala:1006
19/03/21 15:50:40 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 55 (MapPartitionsRDD[139] at saveAsTextFiles at StopSparkStreaming.scala:28)
19/03/21 15:50:40 INFO TaskSchedulerImpl: Adding task set 55.0 with 2 tasks
19/03/21 15:50:40 INFO TaskSetManager: Starting task 0.0 in stage 55.0 (TID 54, localhost, partition 0,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:40 INFO TaskSetManager: Starting task 1.0 in stage 55.0 (TID 55, localhost, partition 1,PROCESS_LOCAL, 1894 bytes)
19/03/21 15:50:40 INFO Executor: Running task 1.0 in stage 55.0 (TID 55)
19/03/21 15:50:40 INFO Executor: Running task 0.0 in stage 55.0 (TID 54)
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks
19/03/21 15:50:40 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
19/03/21 15:50:40 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0055_m_000001_55' to file:/c:/test/result-1553154640000/_temporary/0/task_201903211550_0055_m_000001
19/03/21 15:50:40 INFO SparkHadoopMapRedUtil: attempt_201903211550_0055_m_000001_55: Committed
19/03/21 15:50:40 INFO Executor: Finished task 1.0 in stage 55.0 (TID 55). 2080 bytes result sent to driver
19/03/21 15:50:40 INFO TaskSetManager: Finished task 1.0 in stage 55.0 (TID 55) in 303 ms on localhost (1/2)
19/03/21 15:50:40 INFO FileOutputCommitter: Saved output of task 'attempt_201903211550_0055_m_000000_54' to file:/c:/test/result-1553154640000/_temporary/0/task_201903211550_0055_m_000000
19/03/21 15:50:40 INFO SparkHadoopMapRedUtil: attempt_201903211550_0055_m_000000_54: Committed
19/03/21 15:50:40 INFO Executor: Finished task 0.0 in stage 55.0 (TID 54). 2080 bytes result sent to driver
19/03/21 15:50:40 INFO TaskSetManager: Finished task 0.0 in stage 55.0 (TID 54) in 359 ms on localhost (2/2)
19/03/21 15:50:40 INFO TaskSchedulerImpl: Removed TaskSet 55.0, whose tasks have all completed, from pool 
19/03/21 15:50:40 INFO DAGScheduler: ResultStage 55 (saveAsTextFiles at StopSparkStreaming.scala:28) finished in 0.361 s
19/03/21 15:50:40 INFO DAGScheduler: Job 27 finished: saveAsTextFiles at StopSparkStreaming.scala:28, took 0.378762 s
19/03/21 15:50:40 INFO JobScheduler: Finished job streaming job 1553154640000 ms.0 from job set of time 1553154640000 ms
19/03/21 15:50:40 INFO JobScheduler: Total delay: 0.772 s for time 1553154640000 ms (execution: 0.741 s)
19/03/21 15:50:40 INFO ShuffledRDD: Removing RDD 133 from persistence list
19/03/21 15:50:40 INFO MapPartitionsRDD: Removing RDD 132 from persistence list
19/03/21 15:50:40 INFO BlockManager: Removing RDD 132
19/03/21 15:50:40 INFO BlockManager: Removing RDD 133
19/03/21 15:50:40 INFO MapPartitionsRDD: Removing RDD 131 from persistence list
19/03/21 15:50:40 INFO BlockManager: Removing RDD 131
19/03/21 15:50:40 INFO UnionRDD: Removing RDD 75 from persistence list
19/03/21 15:50:40 INFO FileInputDStream: Cleared 1 old files that were older than 1553154580000 ms: 1553154575000 ms
19/03/21 15:50:40 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
19/03/21 15:50:40 INFO InputInfoTracker: remove old batch metadata: 1553154575000 ms
19/03/21 15:50:40 INFO BlockManager: Removing RDD 75
19/03/21 15:50:40 INFO JobGenerator: Waited for jobs to be processed and checkpoints to be written
19/03/21 15:50:40 INFO JobGenerator: Stopped JobGenerator
19/03/21 15:50:40 INFO JobScheduler: Stopped JobScheduler
19/03/21 15:50:40 INFO StreamingContext: StreamingContext stopped successfully
19/03/21 15:50:40 INFO SparkUI: Stopped Spark web UI at http://192.168.17.10:4040
19/03/21 15:50:40 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/03/21 15:50:41 INFO MemoryStore: MemoryStore cleared
19/03/21 15:50:41 INFO BlockManager: BlockManager stopped
19/03/21 15:50:41 INFO BlockManagerMaster: BlockManagerMaster stopped
19/03/21 15:50:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/03/21 15:50:41 INFO SparkContext: Successfully stopped SparkContext
ssc is stopped!!!!!!!
calling awaitTerminationOrTimeout
confirmed! The streaming context is stopped. Exiting application...
19/03/21 15:50:41 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
19/03/21 15:50:41 INFO ShutdownHookManager: Shutdown hook called
19/03/21 15:50:41 INFO ShutdownHookManager: Deleting directory C:\Users\admin\AppData\Local\Temp\spark-ec9347c3-5dbd-4f27-a52f-40e71d565521
19/03/21 15:50:41 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.

Process finished with exit code 0

As the log shows, from 19/03/21 15:49:10 to 19/03/21 15:50:30 the job checks the marker directory (c://test//source1) every 10 seconds. Because the directory does not exist yet, each awaitTerminationOrTimeout call times out and the streaming job keeps running. Once the directory is detected, ssc.stop(true, true) is called to stop the context gracefully, and the while loop then exits.
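
One practical follow-up (my own note, not from the original post): remember to delete the marker directory again before restarting the job, otherwise the next run will stop itself on its first check. A minimal sketch using the same FileSystem API:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object RemoveShutdownMarker {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new Configuration())
    // second argument true = delete recursively
    fs.delete(new Path("c://test//source1"), true)
  }
}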

Reposted from blog.csdn.net/zhaoxiangchong/article/details/88719500