Setting up a fully distributed Spark environment on Hadoop

Configuring Scala

1) Download the Scala package scala-2.11.4.tgz and install it

tar -xzvf scala-2.11.4.tgz
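
The layout is up to you; a minimal sketch, assuming the tarball was downloaded to /usr/local so that the result matches the SCALA_HOME used below:

cd /usr/local
tar -xzvf scala-2.11.4.tgz        # extracts to scala-2.11.4/
mv scala-2.11.4 scala             # so that SCALA_HOME=/usr/local/scala is valid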

2) Add the Scala environment variables; in ~/.bashrc, add:

export SCALA_HOME=/usr/local/scala
export PATH=$SCALA_HOME/bin:$PATH

3) Run source ~/.bashrc to apply the changes, then verify the Scala installation:

scala -version
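
The output should look roughly like the following; the exact copyright line depends on the release:

Scala code runner version 2.11.4 -- Copyright 2002-2013, LAMP/EPFL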

Configuring Spark

Download the binary package spark-2.2.0-bin-hadoop2.7.tgz from http://spark.apache.org/downloads.html (2.2.0 is the latest release at the time of writing).
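
Older releases are also kept on the Apache archive; a direct download could look like this (URL assumed from the standard archive layout):

wget https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz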

Steps

  1. Extract the archive

    tar -xzvf spark-2.2.0-bin-hadoop2.7.tgz
    
  2. Rename the extracted directory

    mv spark-2.2.0-bin-hadoop2.7 spark
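
    Putting steps 1 and 2 together (assuming the work is done in /usr/local, so the final path matches the SPARK_HOME=/usr/local/spark used below):

    cd /usr/local
    tar -xzvf spark-2.2.0-bin-hadoop2.7.tgz
    mv spark-2.2.0-bin-hadoop2.7 spark      # result: /usr/local/spark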
    
  3. Configure the environment variables: run vi ~/.bashrc and add

    export SPARK_HOME=/usr/local/spark
    export PATH=$SPARK_HOME/bin:$PATH
    

    After saving, run source ~/.bashrc.
    Run spark-shell to check whether the configuration works.
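
    A quick sanity check once the scala> prompt appears (at this point spark-shell runs in local mode; the standalone cluster is set up in the following steps); the result should be 5050.0:

    scala> sc.parallelize(1 to 100).sum()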

  4. Go into the conf directory, copy spark-env.sh.template to spark-env.sh (commands shown below), and add the following:

    export JAVA_HOME=/usr/local/java
    export SCALA_HOME=/usr/local/scala
    export HADOOP_HOME=/usr/local/hadoop
    export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
    export SPARK_MASTER_HOST=master
    export SPARK_LOCAL_IP=192.168.1.151
    export SPARK_WORKER_MEMORY=800m
    export SPARK_WORKER_CORES=1
    export SPARK_HOME=/usr/local/spark
    export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
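
    The copy itself (paths assume Spark was placed in /usr/local/spark as above):

    cd /usr/local/spark/conf
    cp spark-env.sh.template spark-env.sh
    vi spark-env.sh                         # then append the settings above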
    
  5. Copy slaves.template to slaves (cp slaves.template slaves), then edit $SPARK_HOME/conf/slaves and add the following:

    master
    slave1
    slave2
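
    These entries are hostnames, so every node must be able to resolve them (typically via /etc/hosts, as in the Hadoop setup); a quick check from the master:

    getent hosts master slave1 slave2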
    
  6. Copy the configured spark directory and the .bashrc file to the slave1 and slave2 nodes

    scp -r /usr/local/spark slave1:/usr/local
    scp -r /usr/local/spark slave2:/usr/local
    scp -r ~/.bashrc slave1:~/
    scp -r ~/.bashrc slave2:~/
    

    Finally, run source ~/.bashrc on each node.

  7. On slave1 and slave2, edit $SPARK_HOME/conf/spark-env.sh and change export SPARK_LOCAL_IP=192.168.1.151 to the IP address of the respective node.
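
    For example, on slave1 (192.168.1.152 is a hypothetical address; substitute the node's real IP):

    sed -i 's/^export SPARK_LOCAL_IP=.*/export SPARK_LOCAL_IP=192.168.1.152/' /usr/local/spark/conf/spark-env.sh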

  8. Start the cluster from the master node

    sbin/start-all.sh
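
    Note that this is Spark's start-all.sh under $SPARK_HOME/sbin, not Hadoop's script of the same name; it starts a Master on this node and a Worker on every host listed in conf/slaves. To shut the cluster down later:

    cd /usr/local/spark
    sbin/stop-all.sh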
    
  9. Use jps to check whether the cluster has started successfully

      Compared with the plain Hadoop setup, master now additionally runs a Master process (and a Worker, since master is also listed in conf/slaves)

      slave1 and slave2 additionally run a Worker process
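
      For example, on the master node (PIDs are illustrative and will differ):

      $ jps
      3012 Master          # new with Spark
      3190 Worker          # master is also listed in conf/slaves
      ...                  # plus the existing Hadoop daemons (NameNode, ResourceManager, etc.)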

  10. Open http://master:8080/ in a browser; if the Spark Master web UI appears and lists the workers, the cluster has been set up successfully.
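
    If no browser is available, the same check can be done from the shell; an HTTP status of 200 means the UI is up:

    curl -s -o /dev/null -w '%{http_code}\n' http://master:8080/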

Reposted from blog.csdn.net/weixin_39394526/article/details/75555328