Spark Standalone HA (high availability) mode
1. HA architecture description
In Spark Standalone HA mode, two Master processes run at the same time: one is ALIVE and serves the cluster, while the other waits in STANDBY state. ZooKeeper provides leader election and stores the cluster state (workers, running applications, and drivers), so when the ALIVE Master fails, the STANDBY Master is elected, recovers the state, and takes over.
2. Host planning
Master nodes | Slave (worker) nodes |
---|---|
hadoop002, hadoop005 | hadoop003, hadoop004 |

ZooKeeper nodes |
---|
hadoop002, hadoop003, hadoop004 |
3. ZooKeeper installation
- Refer to the ZooKeeper cluster installation guide; a quick start-and-verify sketch follows.
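To bring ZooKeeper up before configuring Spark, start it on all three nodes and check the ensemble state; a minimal sketch, assuming zkServer.sh is on the PATH:

    # Run on hadoop002, hadoop003, and hadoop004
    zkServer.sh start
    zkServer.sh status   # one node should report "Mode: leader", the others "Mode: follower"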
4. Spark installation
- For installation and deployment, refer to "Spark Standalone cluster installation and test cases". Note: this setup uses four virtual machines.
- To configure Spark high availability, only spark-env.sh needs to be modified. Edit spark-env.sh on hadoop002 and replace its content with the following:

    export JAVA_HOME=/training/jdk1.8.0_171
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop002,hadoop003,hadoop004 -Dspark.deploy.zookeeper.dir=/spark"
    # history: configure the history server
    export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=/training/spark-2.4.8-bin-hadoop2.7/history"
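For the history server configured above to have anything to display, applications must write event logs into the directory it reads. The original setup does not show this file, so treat the following spark-defaults.conf sketch as an assumption that simply reuses the directory above:

    # conf/spark-defaults.conf
    spark.eventLog.enabled    true
    spark.eventLog.dir        /training/spark-2.4.8-bin-hadoop2.7/history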
Note: spark-env.sh must also be distributed to the other nodes, i.e. to hadoop003, hadoop004, and hadoop005, as sketched below.
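A minimal distribution sketch using scp, assuming passwordless SSH between the nodes and the installation path shown above:

    for host in hadoop003 hadoop004 hadoop005; do
        scp /training/spark-2.4.8-bin-hadoop2.7/conf/spark-env.sh \
            $host:/training/spark-2.4.8-bin-hadoop2.7/conf/
    done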
5. Run the test
- Start the Spark cluster on hadoop002. Enter the Spark installation directory and execute:
sbin/start-all.sh
- Start the standby Master on hadoop005. Enter the Spark installation directory and execute:
sbin/start-master.sh
(A quick verification sketch follows.)
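To confirm which daemons came up, jps can be run on each node; assuming the slaves file lists hadoop003 and hadoop004, as in the planning table:

    # Run on every node
    jps
    # Expected: Master on hadoop002 and hadoop005; Worker on hadoop003 and hadoop004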
- View the Spark web UI for hadoop002 by entering the following in a browser (this Master should show status ALIVE), as shown below:
hadoop002:8080
- View the Spark web UI for hadoop005 by entering the following in a browser (this Master should show status STANDBY), as shown below:
hadoop005:8080
- High availability test
1) Kill the Master on hadoop002 with kill <process ID>, or shut it down with the stop-master.sh command (see the sketch after this list).
2) Open hadoop005:8080 in the browser and refresh the page repeatedly. After a short while, the Master's status changes from STANDBY to ALIVE, as shown in the figure below:
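A sketch of step 1), assuming jps is available to find the Master's process ID:

    # On hadoop002: find and kill the ALIVE Master
    jps | grep Master    # the PID is the number on the left
    kill <PID>
    # or, more gracefully, from the Spark installation directory:
    sbin/stop-master.sh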
At this point, the Spark Master has failed over and high availability is in effect.
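As a final check, an application can be submitted with both Masters listed so the driver locates whichever one is ALIVE; a minimal sketch using the bundled SparkPi example, assuming the default master port 7077 and the paths from the installation above:

    bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://hadoop002:7077,hadoop005:7077 \
      examples/jars/spark-examples_2.11-2.4.8.jar 100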