Some Hadoop errors I have run into

1、org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block

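The file and its blocks did exist. The error appeared because too many data streams had been opened and were not closed in time, so make sure every stream is closed as soon as it is no longer needed.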

2、INFO ipc.Client: Retrying connect to server: slave1/192.168.233.131:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

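Case 1: while formatting the NameNode, the firewall on the remote machine being contacted was still on (slave1 in the log above; 8485 is the default JournalNode RPC port).

1) ping master succeeds but telnet master 9000 fails, which suggests a firewall is blocking the port (assuming the service on that port is actually running).
2) On the master host, stop the firewall: /etc/init.d/iptables stop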
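
If telnet is not installed, the same reachability check can be sketched with bash's /dev/tcp redirection. This is only a minimal sketch: the function name port_open is made up for illustration, and the 3-second timeout is an arbitrary choice.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device and coreutils' timeout.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage, with the hosts and ports from the notes above:
#   port_open master 9000 && echo "master:9000 reachable" || echo "master:9000 NOT reachable"
#   port_open slave1 8485 && echo "slave1:8485 reachable" || echo "slave1:8485 NOT reachable"
```

A check that hangs until the timeout often points to a firewall silently dropping packets, while an immediate failure usually means nothing is listening on that port (or the firewall rejects the connection), so it is worth confirming that the daemon is running before touching iptables.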

Case 2: when running a MapReduce job, YARN had not been started.

3、java.io.IOException: Bad connect ack with firstBad

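1) A firewall is enabled on one of the nodes, so it cannot be connected to.

2) A node's process was force-killed (reportedly).

3) One of the machines went down outright.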

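4. All HDFS NameNodes are in standby and the zkfc process fails to start

Fix: hdfs zkfc -formatZK
This initializes the znode in ZooKeeper where zkfc keeps its automatic failover state. Run it on one of the NameNode hosts, then start zkfc again.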

5、Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied

Grant the user access to the file or directory in question, by changing either its owner or its permission bits.
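
As a rough sketch of what that can look like with the standard hdfs dfs -chown and -chmod commands: the helper name grant_access, the path /user/alice, the owner alice:hadoop, and the mode 750 are all made up for illustration. The HDFS_CMD variable is also just a convenience of this sketch, so that the commands can be previewed with HDFS_CMD=echo before running them for real.

```shell
# Command used to talk to HDFS; override with HDFS_CMD=echo for a dry run.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# grant_access PATH OWNER[:GROUP] MODE
# Recursively changes owner and permission bits of PATH.
# Note: chown normally has to be run as the HDFS superuser.
grant_access() {
  local path="$1" owner="$2" mode="$3"
  $HDFS_CMD dfs -chown -R "$owner" "$path" &&
  $HDFS_CMD dfs -chmod -R "$mode" "$path"
}

# Usage (hypothetical path and user):
#   grant_access /user/alice alice:hadoop 750
```

Running hdfs dfs -ls on the parent directory first shows the current owner and mode, which helps in deciding whether changing the owner or loosening the mode is the smaller fix.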

6、STARTUP_MSG: host = java.net.UnknownHostException: Name or service not known

Edit /etc/hosts so that the machine's hostname maps to its IP address.
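
A quick way to confirm the fix is to ask the system resolver directly. The sketch below assumes getent is available (it is on most Linux distributions); the function name hostname_resolves is made up for illustration, and the sample hosts entry reuses the slave1 address from the log in item 2.

```shell
# Example /etc/hosts entry, one line per cluster node:
#   192.168.233.131  slave1

# Return 0 if the given name resolves through the system resolver
# (/etc/hosts, DNS, ... as configured in nsswitch.conf).
hostname_resolves() {
  getent hosts "$1" >/dev/null
}

# Usage: check the local hostname, which is what Hadoop looks up at startup.
#   hostname_resolves "$(hostname)" || echo "$(hostname) does not resolve; fix /etc/hosts"
```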

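7. DataNode fails to start

Check the DataNode log.
The log shows "ulimit -a for user root".
Cause: the number of files the DataNode has open at runtime reached the system limit.
Current limit:
[root@centos-FI hadoop-2.4.1]# ulimit -n
1024
Fix:
Raise the maximum number of open files:
[root@centos-FI hadoop-2.4.1]# ulimit -n 65536
[root@centos-FI hadoop-2.4.1]# ulimit -n
65536
Start Hadoop again.
PS: ulimit only changes the limit for the current shell session and the processes started from it. The default comes back after a new login or a reboot.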
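
To make the higher limit survive a new login or reboot, the usual place on Linux systems that use pam_limits is /etc/security/limits.conf. The sketch below shows the entries as comments and adds a small check to run before starting Hadoop. The function name nofile_at_least is made up for illustration, and 65536 simply mirrors the value used above.

```shell
# Lines to add to /etc/security/limits.conf (applied at the next login).
# The "*" wildcard does not cover root, so root is named explicitly here
# because the daemons in this example run as root:
#   root  soft  nofile  65536
#   root  hard  nofile  65536

# Return 0 if the current soft limit on open files is at least $1.
nofile_at_least() {
  local current
  current=$(ulimit -n)
  [ "$current" = "unlimited" ] || [ "$current" -ge "$1" ]
}

# Usage: refuse to start when the limit is still too low.
#   nofile_at_least 65536 || echo "open-file limit is only $(ulimit -n)"
```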

8、Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag.

This happens because zkfc automatic failover is enabled: zkfc elects the active NameNode on its own, so manual switching is refused.
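
Before reaching for the forcemanual flag, it is usually enough to look at which NameNode zkfc has elected. A minimal sketch using hdfs haadmin -getServiceState follows; the NameNode IDs nn1 and nn2 are assumptions (use the IDs from dfs.ha.namenodes.* in your hdfs-site.xml), and the helper name ha_states and the HDFS_CMD override are made up for illustration.

```shell
# Command used to talk to HDFS; override with HDFS_CMD=echo for a dry run.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# ha_states ID...
# Prints "ID: active" or "ID: standby" for each NameNode ID given.
ha_states() {
  local id
  for id in "$@"; do
    printf '%s: ' "$id"
    $HDFS_CMD haadmin -getServiceState "$id"
  done
}

# Usage (hypothetical NameNode IDs):
#   ha_states nn1 nn2
#
# Forcing a manual transition bypasses zkfc and risks split-brain, so it is
# shown only for completeness:
#   hdfs haadmin -transitionToActive --forcemanual nn1
```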

9、org.apache.hadoop.HadoopIllegalArgumentException: Could not get the namenode ID of this node. You may run zkfc on the node other than namenode.

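The command used to start zkfc was wrong: use hadoop-daemon.sh, not hadoop-daemons.sh. hadoop-daemon.sh starts the daemon on the local host, whereas hadoop-daemons.sh runs the command on every host listed in the slaves file, and those hosts are usually not NameNodes. So run hadoop-daemon.sh start zkfc on each NameNode host.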
