Spark Standalone HA (high availability) mode

1. HA architecture description


[Figure: Spark Standalone HA architecture diagram]

2. Host planning



Master nodes:     hadoop002, hadoop005
Slave nodes:      hadoop003, hadoop004
ZooKeeper nodes:  hadoop002, hadoop003, hadoop004
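
Every node must be able to resolve the others' hostnames. A minimal /etc/hosts sketch (only the hostnames come from this guide; the IP addresses are illustrative placeholders):

    # /etc/hosts on every node -- IPs below are placeholders, adjust to your network
    192.168.1.102  hadoop002
    192.168.1.103  hadoop003
    192.168.1.104  hadoop004
    192.168.1.105  hadoop005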

3. ZooKeeper installation
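
The original defers the ZooKeeper setup to a separate guide. As a minimal sketch, a zoo.cfg for the three-node ensemble planned above could look like this (the /training paths mirror this guide's layout but are assumptions):

    # conf/zoo.cfg -- identical on hadoop002, hadoop003 and hadoop004
    tickTime=2000
    initLimit=10
    syncLimit=5
    # dataDir is an assumed path; create it and write a unique myid file (1, 2 or 3) into it
    dataDir=/training/zookeeper/data
    clientPort=2181
    server.1=hadoop002:2888:3888
    server.2=hadoop003:2888:3888
    server.3=hadoop004:2888:3888

Start the ensemble with bin/zkServer.sh start on each of the three nodes before starting the Spark masters.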


4. Spark installation


  1. For installation and deployment, refer to "Spark Standalone cluster installation and test cases".
     Note: this setup uses four virtual machines.

  2. To configure Spark high availability, only spark-env.sh needs to be modified. Edit spark-env.sh on hadoop002 and replace its content with the following:

    export JAVA_HOME=/training/jdk1.8.0_171
    # Master HA: keep recovery state in the ZooKeeper ensemble under the /spark znode
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop002,hadoop003,hadoop004 -Dspark.deploy.zookeeper.dir=/spark"
    # history: configure the history server
    export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=/training/spark-2.4.8-bin-hadoop2.7/history"
    # Note: remove or comment out any SPARK_MASTER_HOST left over from the single-master setup

      Note: spark-env.sh must then be distributed to the other nodes, i.e. to hadoop003, hadoop004 and hadoop005, for example as shown below.
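
      A minimal sketch of the distribution step, run on hadoop002 (the install path matches the one in SPARK_HISTORY_OPTS above):

      for host in hadoop003 hadoop004 hadoop005; do
          scp /training/spark-2.4.8-bin-hadoop2.7/conf/spark-env.sh \
              "$host":/training/spark-2.4.8-bin-hadoop2.7/conf/
      done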

5. Run the test


  1. Start the Spark cluster on hadoop002: enter the Spark installation directory and execute sbin/start-all.sh
  2. Start the standby Master on hadoop005: enter the Spark installation directory and execute sbin/start-master.sh
  3. In a browser, open the web UI of the master on hadoop002 at hadoop002:8080, as shown below:
    [Figure: master web UI on hadoop002, Status: ALIVE]
  4. In a browser, open the web UI of the master on hadoop005 at hadoop005:8080, as shown below:
    [Figure: master web UI on hadoop005, Status: STANDBY]
  5. High availability test
    1) Kill the Master on hadoop002 with kill <process ID>, or shut it down with sbin/stop-master.sh (see the sketch after this list).
    2) Open hadoop005:8080 in the browser and keep refreshing the page; after a short wait you will see the following:
    [Figure: master web UI on hadoop005, now showing Status: ALIVE]
    At this point the Spark master has failed over to hadoop005, achieving high availability.
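
A minimal sketch of step 1), assuming jps from the JDK is on the PATH:

    # On hadoop002: find the Master daemon's PID, then kill it to trigger failover
    jps | grep Master            # prints e.g. "2345 Master"
    kill 2345                    # or, equivalently: sbin/stop-master.sh
    # Optionally inspect the election state in ZooKeeper (znode layout may vary by version):
    # bin/zkCli.sh -server hadoop002:2181 ls /spark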
