Detailed installation steps for Spark 3


Preface

This article records the detailed installation steps for spark-3.1.2. It is recommended to bookmark it and quietly study ahead of the pack~


1. Advance preparation

  1. Time is synchronized across all cluster machines
  2. Password-free (SSH key) login is set up between machines
  3. The firewall is turned off on all machines
  4. JDK 1.8 is installed on all machines
  5. A Hadoop 3.2 environment is recommended, to match the spark-3.1.2-bin-hadoop3.2 build (a quick sanity check is sketched below)
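
A quick way to confirm these prerequisites from the Master node (host names cq01, cq02 and cq03 are the ones used later in this article; substitute your own, and note that firewalld is the CentOS 7 service name):

    # JDK 1.8 should be visible on every machine
    java -version
    # Password-free login and time sync can be checked in one go
    ssh cq02 date
    ssh cq03 date
    # The firewall should be inactive (firewalld on CentOS 7)
    systemctl status firewalld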

2. Upload the installation package to Linux

Installation package name: spark-3.1.2-bin-hadoop3.2.tgz

I uploaded it to the softwares directory.

3. Extract the installation package

First cd into the softwares directory, then extract the installation package to /usr/local/:

    [root@cq01 softwares]# tar -zxvf spark-3.1.2-bin-hadoop3.2.tgz -C /usr/local/

Enter /usr/local and rename the extracted directory spark-3.1.2-bin-hadoop3.2 to spark:

    [root@cq01 softwares]# cd /usr/local/
    [root@cq01 local]# mv spark-3.1.2-bin-hadoop3.2/ spark

4. Configuration files

Enter the conf directory under the installation path and do the configuration there.

    [root@cq01 local]# cd /usr/local/spark/conf/

1. spark-env.sh.template

Rename to spark-env.sh

    [root@cq01 conf]# mv spark-env.sh.template spark-env.sh

Edit the spark-env.sh file:

    [root@cq01 conf]# vi spark-env.sh

Add the JDK installation path at the end of the file.
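
A minimal example of the line to append, assuming the JDK is installed at /usr/local/jdk (a hypothetical location; adjust to your actual JDK 1.8 path):

    # Hypothetical JDK location; point JAVA_HOME at your own installation
    export JAVA_HOME=/usr/local/jdk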

2. workers.template

Rename to workers

    [root@cq01 conf]# mv workers.template workers

Add the worker (slave) nodes according to your own cluster (be careful not to include the Master node in it):

    [root@cq01 conf]# vi workers 

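For the three-node cluster used in this article (cq01 as Master, cq02 and cq03 as slaves), the workers file would contain one host name per line:

    cq02
    cq03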

5. Distribute to other nodes

First return to the /usr/local path

    [root@cq01 conf]# cd /usr/local/

Distribute the configured directory to the other nodes (adjust according to the number of machines in your own cluster):

    [root@cq01 local]# scp -r ./spark/ cq02:$PWD
    [root@cq01 local]# scp -r ./spark/ cq03:$PWD
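
For clusters with more nodes, the same distribution can be scripted; a small sketch, assuming the host names follow this article's cq0N pattern:

    # Copy the configured Spark directory to every other node
    for host in cq02 cq03; do
        scp -r /usr/local/spark/ "$host":/usr/local/
    done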

6. Configure global environment variables

After configuring the global environment variables, you can run the scripts under bin and sbin from anywhere. Note that the environment variables of the other machines should be configured in the same way.

    [root@cq01 local]# vi /etc/profile
    #spark environment
    export SPARK_HOME=/usr/local/spark
    export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin


Reload the environment variables

    [root@cq01 local]# source /etc/profile
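
A quick check that the variables took effect (the version banner should report Spark 3.1.2):

    [root@cq01 local]# echo $SPARK_HOME
    [root@cq01 local]# spark-submit --version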

7. Start the cluster

Enter the sbin directory under the installation directory

    [root@cq01 spark]# cd /usr/local/spark/sbin/

Start the cluster:

    [root@cq01 sbin]# ./start-all.sh

If startup messages appear for the Master and each Worker, the startup is complete.
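
The output should look roughly like the following; the exact log file names vary with host and user (host names here are this article's):

    starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-cq01.out
    cq02: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-cq02.out
    cq03: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-cq03.out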

8. Check the process

Use the jps command to view the processes. Here I wrote a script, jps-cluster.sh, that checks the processes on all machines in the cluster at once; a sketch of it follows the command below.

    [root@cq01 sbin]# jps-cluster.sh 
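
jps-cluster.sh is a homemade helper rather than something that ships with Spark; a minimal sketch, relying on the password-free SSH set up earlier and this article's host names:

    #!/bin/bash
    # Run jps on every node; sourcing /etc/profile first makes sure
    # jps is on the PATH in the non-interactive SSH session.
    for host in cq01 cq02 cq03; do
        echo "============ $host ============"
        ssh "$host" "source /etc/profile; jps"
    done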

If a Master process appears on the master node and a Worker process on each slave node, the cluster has started successfully.

9. Web access

The web UI provided by spark-3.1.2 listens on port 8080 by default, the same port Tomcat uses, so we can access it by entering the URL http://<the Master VM's IP address>:8080, which brings up the interface below.

[screenshot: Spark Master web UI]

10. Verification

Enter Spark's bin directory and execute the following command (note that spark-submit options such as --master go before the example class name):

    [root@cq01 bin]# ./run-example --master local[1] SparkPi 5

If the output contains a line like "Pi is roughly 3.14...", the run was successful.
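
To exercise the standalone cluster itself rather than local mode, the same example can be pointed at the Master (assuming the default standalone master port, 7077):

    [root@cq01 bin]# ./run-example --master spark://cq01:7077 SparkPi 5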


Summary

At this point, the installation of spark-3.1.2 is complete. If you have any questions, please feel free to leave a message.

Origin: blog.csdn.net/qq_45263520/article/details/124421370