Why choose Spark on YARN mode for production?

For development we run Spark in local[2] mode.
For running production jobs, we chose Spark on YARN mode.

Deploying a Spark application to YARN has the following advantages:

1. More convenient deployment of applications and services

  • Only the YARN service is required. Applications such as Spark and Storm do not need to ship their own services; after they are submitted through the client, YARN's distributed cache mechanism distributes them to each compute node.

2. Resource isolation mechanism

  • YARN is only responsible for managing and scheduling resources. Which services and applications run on a YARN cluster is entirely up to the user, so multiple services and applications of the same kind can run on YARN at the same time. YARN can use cgroups to isolate resources, so users developing a new service or application do not need to worry about resource isolation.

3. Elastic resource management

  • YARN can use queues to manage the multiple services running on a cluster, adjusting the amount of resources given to each according to the load on the different types of applications. This makes resource management elastic.
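As a rough illustration of the queue-based resource management described above, the following sketch builds a spark-submit command that targets a specific YARN queue and sets the executor resources. It is a dry run that only prints the command. The queue name, class name, jar name, and resource sizes are hypothetical placeholders, not values from this article.

```shell
# Dry run: build and print the command instead of executing it.
# All values below are hypothetical placeholders.
QUEUE="analytics"               # YARN queue the job is submitted to
APP_CLASS="com.example.MyApp"   # main class of the application
APP_JAR="my-app.jar"            # application jar

# --queue selects the YARN queue; --num-executors and --executor-memory
# set how many resources the job requests from that queue.
SUBMIT_CMD="spark-submit --master yarn --deploy-mode cluster --queue $QUEUE --num-executors 4 --executor-memory 2g --class $APP_CLASS $APP_JAR"
echo "$SUBMIT_CMD"
```

Moving a job to a different queue, or resizing it when the load changes, then only means changing these values at submit time.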

Spark on YARN has two modes: cluster mode and client mode.

Running in client mode:

  • ./spark-shell --master yarn
  • ./spark-shell --master yarn-client (deprecated since Spark 2.0 in favor of the form below)
  • ./spark-shell --master yarn --deploy-mode client

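Running in cluster mode (spark-shell is interactive and only supports client mode, so cluster mode is used through spark-submit):

  • ./spark-submit --master yarn-cluster <application-jar> (deprecated since Spark 2.0 in favor of the form below)
  • ./spark-submit --master yarn --deploy-mode cluster <application-jar>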

The main difference between client and cluster modes is where the driver runs: (a) in client mode, the driver runs in the client process; (b) in cluster mode, the driver runs in the ApplicationMaster.
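To make the two deploy modes concrete, here is a minimal sketch of a helper that prints the spark-submit command for a given mode. The helper name build_submit_cmd, the class name, and the jar name are assumptions made for illustration. The function only echoes the command, so it can be tried without a cluster.

```shell
# Hypothetical helper: prints the spark-submit command for a deploy mode.
# The class name and jar name are placeholders, not taken from the article.
build_submit_cmd() {
  mode="$1"   # expected to be "client" or "cluster"
  case "$mode" in
    client|cluster) ;;
    *) echo "unknown deploy mode: $mode" >&2; return 1 ;;
  esac
  echo "spark-submit --master yarn --deploy-mode $mode --class com.example.MyApp my-app.jar"
}

# client: the driver runs in the process that launched spark-submit.
build_submit_cmd client
# cluster: the driver runs inside the ApplicationMaster on the YARN cluster.
build_submit_cmd cluster
```

In practice, client mode suits interactive work because the driver's output appears in the local terminal, while cluster mode lets the submitting machine disconnect after submission.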

Origin blog.51cto.com/14309075/2415646