How to avoid various problems caused by HBase writing too fast

First, let's briefly review the entire writing process

client API ==> RPC ==> server IPC ==> RPC queue ==> RPC handler ==> write WAL ==> write memstore ==> flush to filesystem

The write path starts with the client calling the API. The data is encoded into a request with protobuf and sent through the socket-based IPC module to the server's RPC queue. A handler responsible for processing RPCs then takes the request off the queue and performs the write: it first appends to the WAL file, then writes a copy into memory, i.e. the memstore. When certain conditions are met, the memstore is flushed to the underlying file system, producing an HFile.
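
For reference, a minimal client-side write might look like the sketch below; the table name "t", column family "cf", and class name are assumptions for illustration, and everything after table.put() happens on the server side as described above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();           // picks up hbase-site.xml from the classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("t"))) { // "t" is a hypothetical table
            Put put = new Put(Bytes.toBytes("row-1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            // The client encodes the Put into a protobuf request and sends it over the RPC/IPC layer;
            // a server-side handler then writes the WAL and the memstore as described above.
            table.put(put);
        }
    }
}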

What problems do you encounter when writing too fast?

When writes come in too fast, the memstore water level is pushed up quickly.
You may see logs similar to the following:

RegionTooBusyException: Above memstore limit, regionName=xxxxx ...

This means the region's memstore has grown to more than 4 times the flush size. An exception is thrown, the write request is rejected, and the client starts retrying. Normally a flush is triggered once the memstore reaches 128 MB; if the flush cannot keep up and the memstore grows to 128 MB * 4, writes are rejected with this exception. The defaults for the two related parameters are:

hbase.hregion.memstore.flush.size=128M
hbase.hregion.memstore.block.multiplier=4
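
Continuing the client sketch above, the retries triggered by RegionTooBusyException are governed by the usual client retry settings; the fragment below shows where you would tune them (the values are illustrative, not recommendations):

Configuration conf = HBaseConfiguration.create();
// Number of retries before a failed operation is given up on
// (RegionTooBusyException is retried like other transient server-side errors).
conf.setInt("hbase.client.retries.number", 10);   // illustrative value
// Base pause between retries; later attempts back off further from this.
conf.setLong("hbase.client.pause", 100);          // ms, illustrative value
Connection conn = ConnectionFactory.createConnection(conf);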

Or you may see a log like this:

regionserver.MemStoreFlusher: Blocking updates on hbase.example.host.com,16020,1522286703886: the global memstore size 1.3 G is >= than blocking 1.3 G size
regionserver.MemStoreFlusher: Memstore is above high water mark and block 528ms

This means the combined memstore memory of all regions on the RegionServer has exceeded the configured upper limit, 40% of the heap by default, so writes are blocked. The point is to give the flush threads time to flush data out of memory; otherwise, continuing to accept writes into the memstore would blow up the heap.

hbase.regionserver.global.memstore.upperLimit=0.4  # older versions; still accepted by newer versions
hbase.regionserver.global.memstore.size=0.4        # newer versions
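
As a back-of-the-envelope check against the MemStoreFlusher log above (the roughly 3.3 GB RegionServer heap is an assumption, not something the log states):

public class GlobalMemstoreLimit {
    public static void main(String[] args) {
        long heapBytes = 3_300L * 1024 * 1024;              // hypothetical RegionServer -Xmx of ~3.3 GB
        double fraction = 0.4;                              // hbase.regionserver.global.memstore.size
        long blockingBytes = (long) (heapBytes * fraction);
        // Prints roughly 1.3, i.e. the "blocking 1.3 G size" in the log above.
        System.out.println(blockingBytes / (1024.0 * 1024 * 1024) + " GB");
    }
}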

Once writes are blocked, the RPC queue starts to back up, and if you are unlucky this eventually ends in OOM. You may find the JVM crashing from OOM, or see a log similar to the following:

ipc.RpcServer: /192.168.x.x:16020 is unable to read call parameter from client 10.47.x.x
java.lang.OutOfMemoryError: Java heap space

I think this is a very poor design in HBase: it catches the OOM exception but does not terminate the process. The process may no longer be able to run properly afterwards, and you will find many other threads throwing OOM in the logs as well; for example, stop may not work at all and the RegionServer can be left hanging in a dead state.

How to avoid RS OOM?

One approach is to speed up flushing:

hbase.hstore.blockingWaitTime = 90000  # ms
hbase.hstore.flusher.count = 2
hbase.hstore.blockingStoreFiles = 10

When the number of store files reaches the hbase.hstore.blockingStoreFiles limit, flushes block and wait for compaction to finish, for at most hbase.hstore.blockingWaitTime; you can lower this wait time. hbase.hstore.flusher.count can be sized to the machine, but unfortunately the number of flusher threads is not adjusted dynamically with write pressure: outside bulk-import scenarios the extra threads are mostly wasted, and changing the value requires a restart.
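
To see how close regions are to the blockingStoreFiles limit, you can watch per-region store file counts; a minimal sketch, assuming the HBase 2.x Admin metrics API:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.RegionMetrics;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class StoreFileWatch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            for (ServerName server : admin.getClusterMetrics().getLiveServerMetrics().keySet()) {
                for (RegionMetrics region : admin.getRegionMetrics(server)) {
                    // Regions whose store file count approaches hbase.hstore.blockingStoreFiles (10 by default)
                    // are the ones whose flushes are about to block on compaction.
                    System.out.println(region.getNameAsString() + " storefiles=" + region.getStoreFileCount());
                }
            }
        }
    }
}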

By the same token, if flushes are sped up, compaction has to keep up; otherwise files pile up, scan performance drops, and overhead grows.

hbase.regionserver.thread.compaction.small = 1
hbase.regionserver.thread.compaction.large = 1

Adding compaction threads increases CPU and bandwidth overhead and may affect normal requests. If you are not bulk-importing data, one thread of each is generally enough. Fortunately, in Cloud HBase this configuration can be adjusted dynamically without a restart.
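
For comparison, open-source HBase also has an online configuration reload mechanism (Admin.updateConfiguration, also exposed as the update_config / update_all_config shell commands); whether a particular key such as the compaction thread counts is actually reloadable depends on the HBase version, so treat the sketch below as an assumption to verify against your release:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ReloadConfigSketch {
    public static void main(String[] args) throws IOException {
        // Ask the servers to re-read hbase-site.xml without a restart;
        // only keys marked as dynamically reloadable in your version take effect.
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            admin.updateConfiguration();   // a per-server overload taking a ServerName also exists
        }
    }
}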

The configurations above all require manual intervention, and if the intervention is not timely the server may already have gone OOM. Is there a better way to keep this under control?

hbase.ipc.server.max.callqueue.size = 1024 * 1024 * 1024 # 1G

This directly caps the total size of requests piled up in the RPC call queue. Once requests back up beyond a certain point the server cannot process them in time anyway and the client would time out first, while the backlog itself can lead to OOM; the default of 1 GB assumes a fairly large heap. When the queue limit is hit, the client receives a CallQueueTooBigException and retries automatically. This keeps the server from being overwhelmed when writes come in too fast and provides a degree of back pressure. We use this online and it works well for keeping smaller instance types stable.
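
On the client side, buffered writes make this back pressure easier to observe and absorb; below is a minimal sketch using BufferedMutator with an exception listener (table "t" and family "cf" are again hypothetical). The listener fires when buffered writes finally fail after the client's retries are exhausted, which is where you would log, slow down, or shed load.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BackPressureSketch {
    public static void main(String[] args) throws Exception {
        BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf("t"))
            // Called when buffered writes ultimately fail after retries (e.g. the server stays overloaded);
            // decide here whether to slow down, retry later, or drop, instead of failing silently.
            .listener((exception, mutator) -> {
                System.err.println("write failed after retries: " + exception.getMessage());
            });
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             BufferedMutator mutator = conn.getBufferedMutator(params)) {
            Put put = new Put(Bytes.toBytes("row-1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            mutator.mutate(put);   // buffered; sent to the server in batches
            mutator.flush();
        }
    }
}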
