Installing and Using Standalone Hadoop

Step 1: Install the operating system and create a hadoop user
OS: RHEL 6.5
[root@hadoop ~]# useradd hadoop
[root@hadoop ~]# passwd hadoop
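Optionally, give the hadoop user sudo rights for later maintenance. This is an assumption about your setup, not something the steps below require; on RHEL 6.5 the wheel group must also be enabled in /etc/sudoers via visudo:
[root@hadoop ~]# usermod -aG wheel hadoop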

Step 2: Install Java
RHEL 6.5 ships with OpenJDK:
[root@hadoop ~]# java -version
java version "1.7.0_45"
OpenJDK Runtime Environment (rhel-2.4.3.3.el6-x86_64 u45-b15)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)

JAVA_HOME is /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64
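The export used later in this guide only lasts for the current shell. A minimal sketch to persist it for the hadoop user (the path comes from the java -version output above):
$ echo 'export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64' >> ~/.bashrc
$ echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bashrc
$ source ~/.bashrc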


Step 3: Set up SSH login permissions
For pseudo-distributed and fully distributed Hadoop, the NameNode needs to start the Hadoop daemons on all machines in the cluster; this is done over SSH, so passwordless login must be configured.
Configure SSH:
su - hadoop

mkdir ~/.ssh
chmod 700 ~/.ssh
/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
(Press Enter at every ssh-keygen prompt so the keys are generated without a passphrase.)
Check whether ~/.ssh/authorized_keys exists; if it does not, run the commands below, otherwise skip them:
$ touch ~/.ssh/authorized_keys
$ cd ~/.ssh
$ ls
Append the hadoop user's public keys to authorized_keys and restrict its permissions:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys

For a multi-node cluster, each node's public keys would be appended the same way and the finished authorized_keys copied to every node with scp.
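To confirm passwordless login works, connect to localhost (the very first connection asks once to accept the host key, after which no password should be requested):
$ ssh localhost date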

Step 4: Install standalone Hadoop
Download the package: hadoop-2.8.1.tar.gz
Upload it to the server, then create a suitable directory and extract the package:
cd /usr/local
mkdir hadoop
cp /usr/hadoop-2.8.1.tar.gz /usr/local/hadoop/
cd hadoop
tar -xzvf hadoop-2.8.1.tar.gz
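If the extraction was done as root, as above, hand the tree over to the hadoop user before continuing (a sketch, assuming everything under /usr/local/hadoop belongs to Hadoop):
chown -R hadoop:hadoop /usr/local/hadoop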
 
[hadoop@hadoop hadoop-2.8.1]$ export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64/jre
[hadoop@hadoop hadoop-2.8.1]$ ./bin/hadoop version
Hadoop 2.8.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 20fe5304904fc2f5a18053c389e43cd26f7a70fe
Compiled by vinodkv on 2017-06-02T06:14Z
Compiled with protoc 2.5.0
From source with checksum 60125541c2b3e266cbf3becc5bda666
This command was run using /usr/local/hadoop/hadoop-2.8.1/share/hadoop/common/hadoop-common-2.8.1.jar
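The export above disappears when the shell exits. One way to set it permanently for Hadoop itself, assuming the stock etc/hadoop/hadoop-env.sh shipped with 2.8.1, is to hard-code the path there:
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64|' \
    /usr/local/hadoop/hadoop-2.8.1/etc/hadoop/hadoop-env.sh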

Test: run the grep example that ships with Hadoop against the configuration files (still inside /usr/local/hadoop/hadoop-2.8.1):
mkdir input
cp -r /usr/local/hadoop/hadoop-2.8.1/etc/hadoop/* /usr/local/hadoop/hadoop-2.8.1/input/
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar grep input output 'dfs[a-z.]+'
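Note that the job refuses to start if the output directory already exists, so clear it before any re-run:
rm -rf /usr/local/hadoop/hadoop-2.8.1/output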

Result:
...
 File System Counters
                FILE: Number of bytes read=1500730
                FILE: Number of bytes written=2509126
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=12
                Map output records=12
                Map output bytes=274
                Map output materialized bytes=304
                Input split bytes=133
                Combine input records=0
                Combine output records=0
                Reduce input groups=5
                Reduce shuffle bytes=304
                Reduce input records=12
                Reduce output records=12
                Spilled Records=24
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=34
                Total committed heap usage (bytes)=274628608
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=468
        File Output Format Counters
                Bytes Written=214

Contents of the output directory:
[root@hadoop output]# ll
total 4
-rw-r--r--. 1 hadoop hadoop 202 Jul 23 14:57 part-r-00000
-rw-r--r--. 1 hadoop hadoop   0 Jul 23 14:57 _SUCCESS
[root@hadoop output]# vi part-r-00000

6       dfs.audit.logger
4       dfs.class
3       dfs.server.namenode.
3       dfs.logger
2       dfs.period
2       dfs.audit.log.maxfilesize
2       dfs.audit.log.maxbackupindex
1       dfsmetrics.log
1       dfsadmin
1       dfs.servers
1       dfs.log
1       dfs.file
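As a further smoke test, the same examples jar also contains a wordcount job; a minimal sketch (wc_output is an arbitrary directory name that must not exist yet):
cd /usr/local/hadoop/hadoop-2.8.1
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar wordcount input wc_output
cat wc_output/part-r-00000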



Reposted from blog.csdn.net/ghostliming/article/details/76430226