Spark flow summary

The Spark execution flow, step by step:
1. Build the Spark application's runtime environment and start the SparkContext.
2. The SparkContext registers with the resource manager and applies for the executor resources it needs to run.
3. The resource manager (the Master) allocates executor resources and starts StandaloneExecutorBackend.
4. Each executor reports its running status to the Master through a heartbeat mechanism.
5. The Master returns the allocated resources to the client, and the driver components are initialized.
6. RDD objects: inside the SparkContext, when an action operator is encountered, a DAG (directed acyclic graph) is built and submitted to the DAGScheduler.
7. DAGScheduler: splits the DAG into stages at shuffle boundaries, derives the tasks for each stage, packs them into a TaskSet, and submits the TaskSet to the TaskScheduler.
8. TaskScheduler: submits the tasks to the cluster and handles their scheduling and management.
9. Executor: runs the tasks and stores and manages data (blocks).
10. TaskRunner: the executor holds a thread pool; each task is wrapped in a TaskRunner, placed in the thread pool, and executed by calling its run method.
11. When the run finishes, the resources are released.
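
Steps 7 to 10 can be sketched with a small toy model. This is not Spark's API: the names Op, split_into_stages and run_stage are hypothetical and exist only for illustration, and the lineage is an assumed word-count-style chain. The sketch only shows the idea of cutting a lineage into stages at shuffle operations and running one task per partition on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    """One transformation in a toy lineage (hypothetical, not a Spark class)."""
    name: str
    shuffle: bool  # True for a wide dependency such as reduceByKey


def split_into_stages(lineage: List[Op]) -> List[List[str]]:
    """Cut the lineage into stages; every shuffle op opens a new stage."""
    stages: List[List[str]] = [[]]
    for op in lineage:
        if op.shuffle and stages[-1]:
            stages.append([])
        stages[-1].append(op.name)
    return stages


def run_stage(stage_id: int, num_partitions: int,
              pool: ThreadPoolExecutor) -> List[str]:
    """Create one task per partition (the toy 'task set') and run it on the pool."""
    def task(partition: int) -> str:
        return f"stage {stage_id} / partition {partition} done"
    # pool.map returns results in submission order
    return list(pool.map(task, range(num_partitions)))


# Assumed lineage: only reduceByKey needs a shuffle.
lineage = [
    Op("textFile", False),
    Op("flatMap", False),
    Op("map", False),
    Op("reduceByKey", True),
    Op("mapValues", False),
]

stages = split_into_stages(lineage)

# The thread pool stands in for the executor's pool of task runners.
# Stages run one after another; tasks inside a stage run concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = [run_stage(i, 2, pool) for i in range(len(stages))]

for ops, done in zip(stages, results):
    print(ops, done)
```

In this sketch the lineage is split into two stages, because the later stage has to wait for the shuffle output of the earlier one. Real Spark tracks dependencies between RDDs rather than a flat list, so treat this only as a mental model.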

 

 


Origin www.cnblogs.com/huSimple/p/11815605.html