The detailed process of writing, storing, and reading data in HBase



The process of writing and storing data in HBase

(Figure: the HBase write/flush/compact/split flow — original image: http://www.aboutyun.com/data/attachment/forum/201412/28/133846gvccfx6v2gixfbzr.jpg)

 

Client write -> data lands in the MemStore until the MemStore fills up -> the MemStore is flushed to disk as a StoreFile -> when the StoreFiles grow past a certain threshold, a Compact (merge) operation starts -> multiple StoreFiles are merged into one StoreFile, with version merging and deletion of removed data performed during the merge -> repeated compactions gradually produce larger and larger StoreFiles -> when a single StoreFile exceeds a size threshold, a Split operation is triggered: the current Region is split into two child Regions, the parent Region goes offline, and the HMaster assigns the two newly split child Regions to the appropriate HRegionServers, so that the load of the original Region is spread across two Regions. This flow shows that HBase only ever appends data; updates and deletes are all carried out during the Compact stage. A user write therefore only has to reach memory before returning, which is what guarantees HBase's high I/O performance.
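The MemStore → flush → StoreFile → compaction pipeline above can be sketched as a toy simulation. Everything here (class name, per-cell thresholds, the single-version dictionary model) is invented for illustration; real HBase uses byte-based thresholds and keeps multiple timestamped versions per cell.

```python
# Illustrative sketch of the LSM-style write path: writes hit memory,
# memory flushes to immutable files, files are periodically compacted.

class RegionSketch:
    MEMSTORE_LIMIT = 3      # flush after this many cells (toy threshold)
    COMPACT_LIMIT = 2       # compact when more than this many StoreFiles exist

    def __init__(self):
        self.memstore = {}        # row key -> latest value (volatile)
        self.storefiles = []      # list of flushed snapshots, oldest first

    def put(self, row, value):
        # Writes go to memory first and "return immediately".
        self.memstore[row] = value
        if len(self.memstore) >= self.MEMSTORE_LIMIT:
            self.flush()

    def flush(self):
        # The MemStore becomes an immutable StoreFile.
        self.storefiles.append(dict(self.memstore))
        self.memstore = {}
        if len(self.storefiles) > self.COMPACT_LIMIT:
            self.compact()

    def compact(self):
        # Merge all StoreFiles into one; newer files win, which is
        # where "version merge" and deletion happen in the real system.
        merged = {}
        for sf in self.storefiles:          # oldest first, so later wins
            merged.update(sf)
        self.storefiles = [merged]

    def get(self, row):
        # Read path: check the MemStore, then StoreFiles newest-first.
        if row in self.memstore:
            return self.memstore[row]
        for sf in reversed(self.storefiles):
            if row in sf:
                return sf[row]
        return None
```

Note that `put` never touches the older StoreFiles: updates simply shadow old values until a compaction physically removes them, mirroring the append-only behavior described above.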

In addition to the above process:

Supplement 1: The HStore is the core of HBase storage. It consists of two parts: the MemStore and the StoreFiles.
Supplement 2: Characteristics of the HLog:
In a distributed environment, system errors and machine failures cannot be avoided. If an HRegionServer exits unexpectedly, the in-memory data in its MemStores is lost; the HLog was introduced to prevent exactly this.
Working mechanism: each HRegionServer holds one HLog object, a class implementing a write-ahead log. Every time a user write goes into a MemStore, a copy of the data is also appended to the HLog file. HLog files are rolled periodically, and old files whose data has already been persisted to StoreFiles are deleted. When an HRegionServer terminates unexpectedly, the HMaster learns of it through ZooKeeper. The HMaster first processes the leftover HLog files, splitting the log entries by region and placing them in the corresponding region directories, and then redistributes the failed server's regions. Each HRegionServer that receives one of these regions discovers, while loading it, that there is a historical HLog to process; it replays the HLog entries into the MemStore and flushes them to StoreFiles, completing the data recovery.
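The log-then-memory ordering and the replay step can be sketched in a few lines. The class and method names are invented for the example; the point is only that the log is appended before the in-memory write, so the memory state is always reconstructible.

```python
# Toy write-ahead-log sketch: every write is appended to a durable log
# before entering the volatile memstore, so the memstore can be rebuilt
# (replayed) after a crash.

class WALRegionServer:
    def __init__(self, hlog=None):
        self.hlog = hlog if hlog is not None else []   # durable log
        self.memstore = {}                             # volatile memory

    def put(self, row, value):
        self.hlog.append((row, value))   # 1) append to the HLog first
        self.memstore[row] = value       # 2) then write to the MemStore

    @classmethod
    def recover(cls, hlog):
        # After a crash the MemStore is gone; replaying the HLog in
        # order rebuilds it, as in the HMaster-driven recovery above.
        server = cls(hlog=list(hlog))
        for row, value in hlog:
            server.memstore[row] = value
        return server
```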
Supplement 3: A Region is made up of StoreFiles; StoreFiles are composed of HFiles, and HFiles are composed of HBase data blocks. A data block contains many KeyValue pairs, and each KeyValue stores the value we need.
Supplement 4:
(Figure: the KeyValue storage layout of a table with two column families — original image: http://www.aboutyun.com/data/attachment/forum/201412/28/133844f93esoatjdw1y8t9.png)
Let's look at the figure above:
The table has two column families (one red, one yellow), and each column family has two columns. The figure shows the biggest characteristic of a column-oriented database: data belonging to the same column family is stored together. It also shows that when a cell has multiple versions, all of those versions are stored as well. Finally, notice what each stored KeyValue actually contains: r1 (the row key), cf1 (the column family name), c1 (the column qualifier), t1 (the version/timestamp), and the value itself (the last picture shows where the value lives). Since the row key, column family name, and qualifier are repeated in every single KeyValue, keeping them short saves a great deal of storage space!
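A rough back-of-the-envelope calculation makes the point concrete. The byte counts below are a simplification (real KeyValues also carry length fields and a key type), and the example names are made up; only the "every cell repeats every name" structure comes from the text above.

```python
# Rough per-cell size model: every KeyValue repeats the row key, the
# column family name, and the qualifier, plus an 8-byte timestamp.

def keyvalue_size(row, family, qualifier, value, timestamp_bytes=8):
    return len(row) + len(family) + len(qualifier) + timestamp_bytes + len(value)

# Same 3-byte value, verbose names vs terse names:
verbose = keyvalue_size("customer_record_00001", "personal_information", "first_name", "Bob")
terse   = keyvalue_size("c00001", "p", "fn", "Bob")
```

With these invented names, the verbose cell costs roughly three times the terse one to store the same 3-byte value, and that overhead multiplies across every cell and every version.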
We should also take away the following from this figure:
Looking at the second-to-last picture, filtering efficiency drops noticeably from left to right. When designing KeyValues, users should therefore consider moving the important filtering information left, to a suitable position: without changing the amount of data, this improves query performance. Put simply, try to place the query dimensions or filter information in the row key, because filtering on the row key is the most efficient.
With the above understood, we should also be aware of the following:
HBase stores data in sorted order within a specific key range. Because we generally write keys in sequence, the writes always land in the same Region, and since a Region can only be managed by one server, every write hits the same server. This creates a read/write hotspot and degrades cluster performance. There is a solution. What I can think of is: if we have, say, 9 servers, take the current time mod 9 and prepend it to the row key. Consecutive keys are then spread evenly across Regions on different servers. A further benefit is that, because adjacent data ends up distributed across different servers, users can read data in parallel with multiple threads, improving query throughput.
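The salted-prefix idea above can be sketched as follows. This is a common variant of the author's time-mod-9 scheme that uses a stable hash of the key instead of the current time, so the same key always lands in the same bucket (which makes later reads possible without scanning all buckets). The bucket count of 9 matches the 9-server example; the function name is invented.

```python
# Row-key salting sketch: prefix each key with (hash mod BUCKETS) so
# sequential keys spread across regions instead of hotspotting one.

import hashlib

BUCKETS = 9   # one bucket per server in the 9-server example

def salted_key(row_key: str) -> str:
    # md5 gives a stable hash, so the same key always maps to the same
    # bucket (unlike Python's built-in hash(), which is randomized).
    bucket = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % BUCKETS
    return "%d-%s" % (bucket, row_key)
```

To read back, a client issues one scan per bucket prefix in parallel, which is exactly the multi-threaded parallel read the text mentions.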
Regarding version control: either synchronize the clocks across all the servers, or simply set an explicit client-side timestamp when calling put to insert data. (If we do not set one explicitly, each region server stamps the data with its own server time.)
Supplement 5:
When designing a table there are two approaches: the tall table design and the wide table design. Given HBase's splitting rules, a tall table design (using composite row keys) splits more easily. However, if the data in the table needs to be modified frequently, a wide table design is the reasonable choice: HBase guarantees atomicity only at the row level, so a tall design, which spreads related data across multiple rows, cannot guarantee atomicity across those rows.
Supplement 6:
The write cache
Each put operation is actually an RPC round-trip: it transmits the client's data to the server and returns. That is fine for small data volumes, but if an application needs to store thousands of rows per second into an HBase table, a per-row RPC is not appropriate. The HBase API therefore provides a client-side write buffer, which collects put operations and then ships them to the server in a single RPC. Client buffering is disabled by default; it is activated by turning off auto-flush: table.setAutoFlush(false). The method void flushCommits() throws IOException forces the buffered data to be written to the server. Users can also configure the write buffer size via void setWriteBufferSize(long writeBufferSize) throws IOException. The default is 2 MB, which is a moderate value: the data users insert is generally small, but if your inserts are large, consider increasing it so the client can more efficiently group a given amount of data into a single RPC request. Setting the write buffer on every user's HTable is tedious; to avoid the trouble, a larger default can be set for everyone in hbase-site.xml:
<property>
  <name>hbase.client.write.buffer</name>
  <value>20971520</value>
</property>
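The effect of the write buffer can be modeled in a few lines. This is a toy model, not HBase client code: the class name, the byte accounting, and the tiny 64-byte buffer are invented so the batching behavior is visible.

```python
# Toy model of a client write buffer: puts accumulate locally and are
# sent in one batched "RPC" when the buffer exceeds its size limit, or
# on an explicit flush — instead of one RPC per put.

class BufferedTable:
    def __init__(self, write_buffer_size=64):
        self.write_buffer_size = write_buffer_size
        self.buffer = []            # pending (row, value) pairs
        self.buffered_bytes = 0
        self.rpc_calls = 0          # round-trips to the "server"
        self.server_rows = {}       # the "server side"

    def put(self, row, value):
        self.buffer.append((row, value))
        self.buffered_bytes += len(row) + len(value)
        if self.buffered_bytes >= self.write_buffer_size:
            self.flush_commits()

    def flush_commits(self):
        if not self.buffer:
            return
        self.rpc_calls += 1         # one RPC ships the whole batch
        for row, value in self.buffer:
            self.server_rows[row] = value
        self.buffer = []
        self.buffered_bytes = 0
```

With a 64-byte buffer and 15-byte rows, 20 puts cost only 4 round-trips instead of 20; with the real 2 MB (or 20 MB, as configured above) buffer the savings are correspondingly larger.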


Supplement 7:
HBase supports a number of compression algorithms, configurable at the column-family level. Unless there is a particular reason not to, we should use compression: it usually brings better performance. Based on our testing, we recommend the SNAPPY algorithm for HBase compression.





HBase read path:

client -> ZooKeeper -> -ROOT- -> .META. -> user data table. ZooKeeper records the location of -ROOT- (-ROOT- has only one region); -ROOT- records the region information of .META. (.META. may have multiple regions); and .META. records the region information of the user tables.
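The three-hop lookup chain above can be sketched with plain dictionaries standing in for the catalog tables. All server names, keys, and region boundaries here are invented; only the ZooKeeper → -ROOT- → .META. → region-server order comes from the text.

```python
# Sketch of the location lookup chain: the client asks ZooKeeper for
# -ROOT-, -ROOT- for the right .META. region, and .META. for the
# region server hosting the row.

ZOOKEEPER = {"root_location": "server-A"}   # where -ROOT- lives

# -ROOT- has exactly one region; it points at the .META. regions.
ROOT = [("", "server-B")]                   # (start key, .META. server)

# .META. rows map user-table regions (start, end) to their servers.
META = [
    ("", "row-m", "server-C"),       # rows before "row-m"
    ("row-m", "", "server-D"),       # rows from "row-m" on ("" = open end)
]

def locate(row_key):
    # Returns the chain of servers contacted for this row.
    chain = [ZOOKEEPER["root_location"],    # step 1: ZooKeeper -> -ROOT-
             ROOT[0][1]]                    # step 2: -ROOT-  -> .META.
    for start, end, server in META:         # step 3: .META.  -> region
        if start <= row_key and (end == "" or row_key < end):
            chain.append(server)
            return chain
    raise KeyError(row_key)
```

In practice clients cache the .META. locations, so steps 1 and 2 are only paid on the first lookup or after a region moves.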
Supplement 1:
In HBase, every storage file is divided into small storage blocks, which are loaded into memory during get or scan operations; they play a role similar to storage pages in an RDBMS. The default block size is 64 KB and can be set via: void setBlocksize(int s). (This 64 KB default for HFile blocks is unrelated to the 64 MB block size of HDFS.) HBase reads a data block into the memory cache sequentially, so when adjacent data is requested it can be served from memory without another disk read, effectively reducing disk I/O. The block cache parameter defaults to TRUE, meaning every block read is cached in memory. However, if the user reads a particular column family sequentially just once, it is best to set this property to FALSE to disable the cache, via: void setBlockCacheEnabled(boolean blockCacheEnable). The reason: if we only access a column family once but leave caching on, the mechanism loads blocks we will not revisit into memory, adding to our burden. The cache pays off when we access adjacent data repeatedly.
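The "adjacent reads hit memory" behavior can be shown with a tiny model. The 4-rows-per-block size and the class name are invented (standing in for the 64 KB default); the point is only that one "disk" read serves all subsequent reads within the same block.

```python
# Minimal block-cache sketch: reading any row loads its whole block
# into the cache, so adjacent reads skip the "disk".

BLOCK_SIZE = 4   # rows per block (stands in for the 64 KB default)

class BlockCachedStore:
    def __init__(self, rows):
        self.rows = rows             # sorted list of (key, value) pairs
        self.cache = {}              # block index -> block contents
        self.disk_reads = 0

    def get(self, index):
        block = index // BLOCK_SIZE
        if block not in self.cache:
            self.disk_reads += 1     # load the whole block once
            start = block * BLOCK_SIZE
            self.cache[block] = self.rows[start:start + BLOCK_SIZE]
        return self.cache[block][index - block * BLOCK_SIZE]
```

Reading 8 adjacent rows costs only 2 disk reads here; a one-pass sequential scan, by contrast, fills `self.cache` with blocks it will never touch again, which is exactly why the text suggests disabling the cache for that access pattern.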
Supplement 2:
1: Disable auto-flush.
When we have a large batch of data to insert, Put instances are sent to the region server one by one unless we disable auto-flush. With auto-flush disabled, put operations are only sent when the write buffer is full.
2: Use scan caching.
If HBase serves as the input source of a MapReduce job, it is best to set the cache of the Scan instance used as the job input to a number larger than the default of 1, using the setCaching() method. With the default, the map task makes a request to the region server for every single record it processes; with a value of, say, 500, the server can send 500 rows to the client in one batch. Of course, the right number depends on your situation. This caching is row-level, and it is explained on page 119.
3: Limit the scan scope.
This is easy to understand: when we are processing a large number of rows (especially as a MapReduce input source) and using Scan, we have the Scan.addFamily() method. If we only need a few columns from that column family, we must specify exactly those, because requesting too many columns loses efficiency.
4: Close the ResultScanner.
This does not improve our efficiency by itself, but failing to close it will hurt efficiency.
5: Block cache usage.
The block cache is controlled per scan via Scan.setCacheBlocks(). For rows that are accessed frequently we should use the block cache; but for a MapReduce job that scans a huge number of rows we should not. (This block cache is not the same as the one I mentioned in section 4.)
6: Optimize how row keys are retrieved.
The premise for using this is that we only need the row keys in the table. How to use it is explained on page 411.
7: Turning off WAL on Put.
This is what the book says, but personally I think this feature should not be used: once we turn it off, the server no longer writes the put to the WAL but only to the MemStore, so if the server fails, our data is lost.
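The payoff of tip 2 is easy to quantify with a one-line model. The RPC model and row counts here are invented for illustration; real scans also batch by result size, not just row count.

```python
# Toy illustration of scan caching: one "RPC" fetches up to `caching`
# rows, so the round-trip count is the ceiling of rows / caching.

def rpcs_for_scan(total_rows: int, caching: int) -> int:
    return -(-total_rows // caching)     # ceiling division

default_rpcs = rpcs_for_scan(100_000, 1)     # default: one RPC per row
batched_rpcs = rpcs_for_scan(100_000, 500)   # setCaching(500)
```

For a 100,000-row scan, caching of 500 turns 100,000 round-trips into 200, which is why the default of 1 is so costly for MapReduce inputs.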
