Preface
This article documents the detailed installation steps for spark-3.1.2. It is recommended to bookmark it for later reference~~
1. Preparation in advance
- All cluster machines have their clocks synchronized
- Password-free (SSH key) login is set up between the machines
- The firewall is turned off on all machines
- JDK 1.8 is installed on all machines
- A Hadoop 3.2 environment is recommended, matching the hadoop3.2 build of the package
2. Upload the installation package to Linux
Installation package name: spark-3.1.2-bin-hadoop3.2.tgz
I uploaded mine to the softwares directory.
3. Unzip the installation package
First cd into the softwares directory, then extract the package to /usr/local/:
[root@cq01 softwares]# tar -zxvf spark-3.1.2-bin-hadoop3.2.tgz -C /usr/local/
Enter /usr/local and rename the extracted directory spark-3.1.2-bin-hadoop3.2 to spark:
[root@cq01 softwares]# cd /usr/local/
[root@cq01 local]# mv spark-3.1.2-bin-hadoop3.2/ spark
4. Configuration file
Enter the conf directory under the installation path to edit the configuration files:
[root@cq01 local]# cd /usr/local/spark/conf/
1.spark-env.sh.template
Rename to spark-env.sh
[root@cq01 conf]# mv spark-env.sh.template spark-env.sh
Edit spark-env.sh file
[root@cq01 conf]# vi spark-env.sh
Append the JDK installation path (JAVA_HOME) at the end of the file.
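For reference, the lines appended to spark-env.sh for a standalone cluster might look like this (the JDK path and master hostname below are examples; adjust them to your own machines):

```shell
# Path to the JDK installation (example path; point this at your own JDK 1.8)
export JAVA_HOME=/usr/local/jdk1.8.0
# Hostname of the Master node (example; use your own master's hostname)
export SPARK_MASTER_HOST=cq01
# Port the Master listens on (7077 is Spark's default)
export SPARK_MASTER_PORT=7077
```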
2.workers.template
Rename to workers
[root@cq01 conf]# mv workers.template workers
Add the worker (slave) nodes according to your own cluster (be careful not to include the Master node):
[root@cq01 conf]# vi workers
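Based on the distribution commands used later in this article, the workers file for this example cluster would simply list the worker hostnames, one per line:

```shell
# workers file: one worker hostname per line
# (the Master node cq01 is deliberately omitted)
cq02
cq03
```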
5. Distribute to other nodes
First return to the /usr/local path:
[root@cq01 conf]# cd /usr/local/
Distribute the configured spark directory to the other nodes (adjust according to the number of machines in your own cluster):
[root@cq01 local]# scp -r ./spark/ cq02:$PWD
[root@cq01 local]# scp -r ./spark/ cq03:$PWD
6. Configure global environment variables
After configuring the global environment variables, you can run the scripts under bin and sbin from anywhere. Note that the environment variables should be configured on the other machines as well.
[root@cq01 local]# vi /etc/profile
#spark environment
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
Reload the environment variables:
[root@cq01 local]# source /etc/profile
7. Start the cluster
Enter the sbin directory under the installation directory
[root@cq01 spark]# cd /usr/local/spark/sbin/
Start it up:
[root@cq01 sbin]# ./start-all.sh
If output showing the Master and Workers being launched appears, the startup is complete.
8. Check the process
Use the jps command to check the processes. Here I wrote a script to check all machines in the cluster at once.
[root@cq01 sbin]# jps-cluster.sh
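Note that jps-cluster.sh is not shipped with Spark; it is a small helper script. A minimal sketch, assuming password-free SSH between the nodes and the example hostnames used in this article, could look like this:

```shell
#!/bin/bash
# Hypothetical helper: run jps on every node of the cluster over SSH.
# The hostnames are examples; replace them with your own.
HOSTS="cq01 cq02 cq03"
for host in $HOSTS; do
    echo "============ $host ============"
    # Requires password-free SSH and jps on each node's PATH
    ssh "$host" "jps"
done
```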
If the Master process appears on the master node and a Worker process appears on each worker node, the cluster has started successfully.
9. Web access
The web UI provided by Spark 3.1.2 listens on port 8080, the same default port as Tomcat, so we can access it by entering the URL http://<Master's IP address>:8080, which brings up the cluster overview page.
10. Verification
Enter the bin directory of spark and execute the following command
[root@cq01 bin]# ./run-example --master local[1] SparkPi 5
If a line like "Pi is roughly 3.14..." appears in the output, the job ran successfully.
Summary
At this point, the installation of spark-3.1.2 is complete. If you have any questions, please feel free to leave a comment.