Handling HDFS Balancer Errors

Copyright notice: this is an original article. Reposting is welcome; please credit the source: https://blog.csdn.net/zhangshenghang/article/details/82805302

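During a bulk import into HBase the server load was high, so HDFS did not rebalance its data in time and the data on one DataNode grew sharply. I had to run the balancer manually.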

After adding DataNodes to HDFS, run the following to even out data storage across the cluster:

 hdfs balancer -threshold 10 
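As I understand it, `-threshold 10` asks the balancer to bring every DataNode's utilization to within 10 percentage points of the cluster-wide utilization. The sketch below only illustrates that rule and is not the balancer's own code. The `classify_nodes` function, the `name used capacity` input format and the numbers are all made up for the example.

```shell
#!/bin/sh
# classify_nodes THRESHOLD: reads "name used capacity" lines on stdin
# (hypothetical input format) and labels each node relative to the
# cluster-wide utilization, using THRESHOLD in percentage points.
classify_nodes() {
  awk -v t="$1" '
    { name[NR] = $1; used[NR] = $2; cap[NR] = $3; U += $2; C += $3 }
    END {
      avg = 100 * U / C                 # cluster utilization in percent
      for (i = 1; i <= NR; i++) {
        u = 100 * used[i] / cap[i]      # utilization of this node in percent
        state = "balanced"
        if (u > avg + t) state = "over-utilized"
        else if (u < avg - t) state = "under-utilized"
        printf "%s %.1f%% %s\n", name[i], u, state
      }
    }'
}

# Made-up numbers: two fairly full nodes and one freshly added, nearly empty node.
printf '%s\n' 'dn1 80 100' 'dn2 70 100' 'dn3 10 100' | classify_nodes 10
```

With these numbers the cluster utilization is about 53.3%, so dn1 and dn2 are labelled over-utilized and dn3 under-utilized. A smaller threshold gives a more even result but means more data has to be moved.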

Suddenly some nodes started reporting errors:


18/09/21 17:51:37 WARN balancer.Dispatcher: Failed to move blk_1073837252_96442 with size=268435456 from 10.248.161.6:9866:DISK to 10.248.161.10:9866:DISK through 10.248.161.6:9866
java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:356)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$3000(Dispatcher.java:233)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1148)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
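A `NoRouteToHostException` here means the TCP connection to a DataNode's transfer port (9866 in the log above) could not be established, which commonly points to a firewall or a routing problem. A quick way to narrow it down is to try that port from the source DataNode named in the log line, or from the balancer host. This is only a sketch: `check_port` is a helper name I made up, and it assumes `bash` and the coreutils `timeout` command are available.

```shell
#!/bin/sh
# check_port HOST PORT: return 0 if a TCP connection to HOST:PORT can be
# opened within 3 seconds, non-zero otherwise. Uses bash's /dev/tcp
# pseudo-device, so bash itself must be installed.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage with the target address from the log line above:
#   check_port 10.248.161.10 9866 && echo reachable || echo unreachable
```

If the port is unreachable from one DataNode but reachable locally on the target machine, a host firewall on the target is a likely suspect.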

It turned out that the firewall had not been turned off on the newly added nodes.

On CentOS 7, run:

service firewalld status
service firewalld stop
systemctl disable firewalld.service
service firewalld status
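Turning firewalld off entirely is the simplest fix. If the firewall has to stay on, a possible alternative is to open the required ports with `firewall-cmd` instead. The sketch below only prints the commands so they can be reviewed first. `print_firewall_cmds` is a hypothetical helper, and 9866 is just the one port visible in the log; a real cluster needs more ports than that, so treat the port list as an assumption to fill in for your own setup.

```shell
#!/bin/sh
# print_firewall_cmds PORT...: print (do not run) the firewall-cmd calls that
# would permanently open each TCP port and then reload firewalld.
print_firewall_cmds() {
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "firewall-cmd --reload"
}

# 9866 is the DataNode transfer port seen in the balancer log.
# Review the output, then run the printed commands as root.
print_firewall_cmds 9866
```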

After that, the log showed that everything was back to normal.

The balancer took nearly ten hours to finish:

17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.blocksize = 268435456 (default=134217728)
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.32:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.31:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.13:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.7:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.12:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.9:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.40:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.35:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.10:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.6:9866
17/12/08 19:16:28 INFO balancer.Balancer: 0 over-utilized: []
17/12/08 19:16:28 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
17/12/08 19:16:28              311              3.05 TB                 0 B                0 B
17/12/08 19:16:28       Balancing took 9.853577777777778 hours
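To put the last two lines in perspective, the average throughput can be worked out from the 3.05 TB moved and the 9.85 hours taken. This is a rough calculation that assumes the balancer's "TB" is a binary unit (1024^4 bytes); `avg_mib_per_sec` is a name made up for the example.

```shell
#!/bin/sh
# avg_mib_per_sec TIB HOURS: average throughput in MiB/s,
# treating one "TB" as 1024 * 1024 MiB.
avg_mib_per_sec() {
  awk -v tib="$1" -v hours="$2" \
    'BEGIN { printf "%.1f\n", tib * 1024 * 1024 / (hours * 3600) }'
}

# Values taken from the balancer summary above.
avg_mib_per_sec 3.05 9.853577777777778   # prints 90.2
```

That is roughly 90 MiB/s for the whole cluster, spread over the ten DataNodes listed in the log.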
