Hadoop 2.7.2 Distributed Installation Guide on Ubuntu 12.04

I. System and Version Preparation

JDK: jdk-7u2-linux-i586

Hadoop: hadoop-2.7.2

Installation directories:

/usr/local/jdk

/usr/local/hadoop

Nodes and IPs (add these entries to /etc/hosts on every node; note that the network needs to be restarted afterwards):

192.168.56.100 os.data0

192.168.56.101 os.data1

192.168.56.102 os.data2
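To make the new names take effect, restart networking on Ubuntu 12.04 (a minimal sketch; sudo service networking restart is equivalent), then confirm resolution from each node:

$ sudo /etc/init.d/networking restart
$ ping -c 1 os.data1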

II. Create the System User and Group

1. Create the hadoop user and group, with password hadoop:

$ sudo su
# adduser hadoop
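On Ubuntu, adduser also creates a matching hadoop group and prompts for the password interactively; the result can be double-checked with:

# id hadoop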

2. Grant sudo privileges:

As root (visudo is safer here, since it validates the syntax before saving):
vi /etc/sudoers

Add the following lines:

root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL

III. Configure Bidirectional Passwordless SSH

Set up passwordless SSH between all nodes in both directions; see my other blog post for the full walkthrough. A minimal sketch follows.
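Assuming the hadoop user on every node and an available ssh-copy-id: generate a key pair on each node, push the public key to every other node, then confirm login needs no password.

$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$ ssh-copy-id hadoop@os.data1
$ ssh-copy-id hadoop@os.data2
$ ssh os.data1

Repeat from os.data1 and os.data2 back to the other nodes so that logins work in both directions.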

IV. Ownership and Environment Variables

Assuming the JDK and Hadoop archives have already been unpacked into the directories listed above, give the hadoop user ownership of the Hadoop tree:

sudo chown -R hadoop:hadoop /usr/local/hadoop

Environment variable configuration:

sudo vi /etc/profile

Append the following at the end:

export JAVA_HOME=/usr/local/jdk
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

#set hadoop environment
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL

Reload so the changes take effect:

$ source /etc/profile
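A quick check that the variables took effect (version strings depend on the exact builds installed):

$ java -version
$ hadoop version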

V. Distributed Configuration

Create a few working directories under the Hadoop installation:

$ cd /usr/local/hadoop
/usr/local/hadoop$ mkdir -p tmp/dfs/data tmp/dfs/name
/usr/local/hadoop$ sudo chown -R hadoop:hadoop tmp

The configuration files to edit, all under /usr/local/hadoop/etc/hadoop, are:

hadoop-env.sh 

yarn-env.sh 

core-site.xml 

hdfs-site.xml 

yarn-site.xml 

mapred-site.xml 

slaves

1. hadoop-env.sh:

/usr/local/hadoop/etc/hadoop$ sudo vi hadoop-env.sh

Change the JAVA_HOME line to:

export JAVA_HOME=/usr/local/jdk

2. yarn-env.sh

Set the same line here:

export JAVA_HOME=/usr/local/jdk

3. core-site.xml

Contents:

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://os.data0:8020</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131072</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.hosts</name>
                <value>*</value>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.groups</name>
                <value>*</value>
        </property>
</configuration>

(The proxyuser keys must embed the proxying user's name, i.e. hadoop.proxyuser.<user>.hosts/groups; the bare hadoop.proxyuser.hosts form is not a recognized property.)

4. hdfs-site.xml

Contents:

<configuration>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>os.data0:9001</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>

(dfs.replication of 1 keeps a single copy of each block; with two datanodes, a value of 2 would also be reasonable.)

5. mapred-site.xml
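In Hadoop 2.x this file does not ship by default; it is created from the bundled template (assuming the stock 2.7.x tarball layout):

/usr/local/hadoop/etc/hadoop$ cp mapred-site.xml.template mapred-site.xml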

Contents:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>os.data0:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>os.data0:19888</value>
        </property>
</configuration>

6. yarn-site.xml

Contents:

<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>os.data0:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>os.data0:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>os.data0:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>os.data0:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>os.data0:8088</value>
        </property>
</configuration>

(The class property key follows the aux-service name, hence mapreduce_shuffle with an underscore.)

7. slaves

Contents (the datanode hostnames, one per line):

os.data1
os.data2

Copy the edited configuration files to the other nodes with scp; scp silently overwrites files of the same name on the destination, so double-check the target paths and verify on each node that the files were actually updated.
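A sketch of the copy, assuming the same /usr/local/hadoop layout on every node and the passwordless SSH from step III:

$ scp /usr/local/hadoop/etc/hadoop/* hadoop@os.data1:/usr/local/hadoop/etc/hadoop/
$ scp /usr/local/hadoop/etc/hadoop/* hadoop@os.data2:/usr/local/hadoop/etc/hadoop/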

VI. Format the NameNode

/usr/local/hadoop$ bin/hdfs namenode -format

If errors appear, resolve them before moving on; a successful format ends with a line like "Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted."

VII. Start the Cluster

/usr/local/hadoop$ sbin/start-all.sh

Check the running processes with jps on each node. (Note that start-all.sh is deprecated in Hadoop 2.x; it simply calls start-dfs.sh and start-yarn.sh, which can also be run directly.)
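If everything came up, jps on os.data0 should show roughly the following (pids are illustrative); os.data1 and os.data2 should show DataNode and NodeManager instead:

$ jps
2913 NameNode
3119 SecondaryNameNode
3281 ResourceManager
3620 Jps

The web UIs are then reachable at http://os.data0:8088 (ResourceManager) and http://os.data0:50070 (NameNode, the 2.x default port).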

Reposted from snv.iteye.com/blog/2358256