[Source reading] leveldb: LOG

After working through leveldb's basic architecture diagram, I marked some of the more important points (important mainly because I am a beginner), as follows:

[Figure: annotated leveldb architecture diagram]

[Note] The code in [db/filename.h] shows the file types leveldb uses:

// Owned filenames have the form:
//    dbname/CURRENT
//    dbname/LOCK
//    dbname/LOG
//    dbname/LOG.old
//    dbname/MANIFEST-[0-9]+
//    dbname/[0-9]+.(log|sst|ldb)
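
To make the patterns above concrete, here is a small sketch that sorts a file name into one of these kinds. `ClassifyFileName` and the labels it returns are names I made up for illustration; this is not leveldb's own parsing code. The one thing worth noticing is that `LOG` / `LOG.old` (the informational log) and the numbered `.log` files (the write log discussed below) are different things.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of leveldb: classify a file name (given
// without the "dbname/" prefix) according to the patterns listed above.
std::string ClassifyFileName(const std::string& name) {
  static const std::regex manifest("MANIFEST-[0-9]+");
  static const std::regex numbered("([0-9]+)\\.(log|sst|ldb)");
  if (name == "CURRENT") return "current";
  if (name == "LOCK") return "lock";
  if (name == "LOG" || name == "LOG.old") return "info-log";
  if (std::regex_match(name, manifest)) return "manifest";
  std::smatch m;
  if (std::regex_match(name, m, numbered)) {
    // Numbered files: ".log" is the write log, ".sst"/".ldb" are table files.
    return m[2].str() == "log" ? "write-log" : "table";
  }
  return "unknown";
}
```

For example, `ClassifyFileName("000003.log")` gives `"write-log"`, while `ClassifyFileName("LOG")` gives `"info-log"`.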

Below I walk through the code in the order in which data is written.

LOG

When data is written, it first goes to the log file (the numbered dbname/[0-9]+.log file in the list above). Because the file is written sequentially, the write is fast and the call can return right away.

Log format description [doc/log_format.md]

  • A log file consists of a sequence of Blocks; each Block is 32 KB.

  • Each Block holds one or more Records. A Record has one of four types (a fifth value, zero, is reserved for preallocated files, see [db/log_format.h]):

    • Full: the Record holds an entire piece of user data.
    • First: the Record holds the first fragment of a piece of user data.
    • Last: the Record holds the last fragment of a piece of user data.
    • Middle: the Record holds a fragment in between.
  • A Record consists of a header followed by the data:

    • Header section (7 bytes), made up of:
    • CRC, 32 bits: checksum of the type and the data.
    • Length, 16 bits: the length of the data portion.
    • Type, 8 bits: the Record type, one of the four types above.
    • Data section: Length bytes of payload.
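
The header layout can be sketched in C++ roughly like this. `PackHeader` is a hypothetical helper of mine, not a leveldb function. It assumes the little-endian field order described in doc/log_format.md and takes the checksum as a ready-made argument instead of computing it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr int kHeaderSize = 4 + 2 + 1;  // CRC + length + type, in bytes

// Hypothetical helper: pack the 7-byte record header.
// The checksum is passed in; computing it is out of scope for this sketch.
std::string PackHeader(uint32_t crc, uint16_t length, uint8_t type) {
  std::string h(kHeaderSize, '\0');
  // Bytes 0..3: checksum, least significant byte first.
  h[0] = static_cast<char>(crc & 0xff);
  h[1] = static_cast<char>((crc >> 8) & 0xff);
  h[2] = static_cast<char>((crc >> 16) & 0xff);
  h[3] = static_cast<char>((crc >> 24) & 0xff);
  // Bytes 4..5: length of the data portion, least significant byte first.
  h[4] = static_cast<char>(length & 0xff);
  h[5] = static_cast<char>(length >> 8);
  // Byte 6: record type.
  h[6] = static_cast<char>(type);
  return h;
}
```

Because the length field is only 16 bits, one Record can never describe more data than fits in a 32 KB Block, which is why larger data has to be split into fragments.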

1.1 Writing the log

Take the writing process as an example. Suppose we want to write these pieces of data:

    A: length 1000
    B: length 97270
    C: length 8000

We start by filling the first block. A is small, so it fits in a single FULL record. After it is written (7-byte header plus 1000 bytes of data), the first block has 31761 B of space left.

Next comes B. It is too large for one block, so it has to be sliced up. The first part of B goes into the rest of the first block as a record whose RecordType is First; after subtracting the 7-byte header, it carries 31754 B of data. The first block is now full, so we open a second block. Excluding the record header (CRC and so on), a block can hold 32761 B of data, which is still not enough for the rest of B. No matter: the second part of B fills the whole second block as a record whose RecordType is Middle. B then still has 32755 B of data left, which goes into the third block as a Last record. That leaves 6 B free in the third block, too small even for a header, so it is left empty as the trailer.

Finally C: like A, it is stored as a FULL record, and it lands in the fourth block.

The above process can be visually represented by a diagram:

To sum up: a log consists of multiple fixed-size blocks, each block consists of records, records are stored back to back, and one piece of data may be split across several records.
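
The byte counts for A, B and C can be checked with a few compile-time constants. This is only my own arithmetic for the example; the constant names other than kBlockSize and kHeaderSize are made up.

```cpp
#include <cassert>

constexpr int kBlockSize = 32768;  // 32 KB
constexpr int kHeaderSize = 7;     // 4 + 2 + 1 bytes

constexpr int kLenA = 1000, kLenB = 97270, kLenC = 8000;

// Block 1: A as a FULL record, then the FIRST fragment of B.
constexpr int kFreeAfterA = kBlockSize - (kHeaderSize + kLenA);
constexpr int kBFirst = kFreeAfterA - kHeaderSize;
// Block 2: a single MIDDLE fragment fills the whole block.
constexpr int kBMiddle = kBlockSize - kHeaderSize;
// Block 3: the LAST fragment holds whatever remains of B.
constexpr int kBLast = kLenB - kBFirst - kBMiddle;
constexpr int kTrailer = kBlockSize - (kHeaderSize + kBLast);

static_assert(kFreeAfterA == 31761, "space left in block 1 after A");
static_assert(kBFirst == 31754, "data carried by B's FIRST fragment");
static_assert(kBMiddle == 32761, "data carried by B's MIDDLE fragment");
static_assert(kBLast == 32755, "data carried by B's LAST fragment");
static_assert(kTrailer == 6, "bytes left over in block 3");
static_assert(kTrailer < kHeaderSize, "too small for a header, so it becomes the trailer");
// Block 4: C fits as a single FULL record.
static_assert(kHeaderSize + kLenC <= kBlockSize, "C fits in one block");
```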

The write class Writer exposes the interface function AddRecord:

Status AddRecord(const Slice& slice);

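A simplified look at this function:

status: the status
block_offset_: how far into the current block we have written (the offset)
leftover: how much space is left in the current block
left: the data still waiting to be written
kBlockSize: 32 KB (32768 bytes)
kHeaderSize: 7 (4+2+1 bytes)
type: the RecordType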


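do {  // do/while: an empty slice still emits one zero-length record
    if (leftover < kHeaderSize) {
        // not enough room for a header: pad the rest of the block with zeros and start a new block
    }
    // pick the RecordType based on left and block_offset_
    // the actual write is done by EmitPhysicalRecord, which builds the record header and appends the data
    // update status
    // update left
} while (status_is_ok && left > 0);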
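
To see the loop run, here is a self-contained sketch that writes into an in-memory string instead of a file. `SketchWriter` is a stand-in I wrote for illustration, not leveldb's Writer: it leaves the CRC bytes as zero, has no Status handling, and takes a std::string rather than a Slice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

constexpr size_t kBlockSize = 32768;
constexpr size_t kHeaderSize = 7;
enum RecordType { kFullType = 1, kFirstType = 2, kMiddleType = 3, kLastType = 4 };

// Hypothetical stand-in for the log writer; appends to a string, not a file.
class SketchWriter {
 public:
  void AddRecord(const std::string& payload) {
    const char* ptr = payload.data();
    size_t left = payload.size();
    bool begin = true;
    do {
      const size_t leftover = kBlockSize - block_offset_;
      if (leftover < kHeaderSize) {
        dest_.append(leftover, '\0');  // zero-filled trailer
        block_offset_ = 0;             // start a new block
      }
      const size_t avail = kBlockSize - block_offset_ - kHeaderSize;
      const size_t n = std::min(left, avail);
      const bool end = (left == n);
      const RecordType type = (begin && end) ? kFullType
                              : begin        ? kFirstType
                              : end          ? kLastType
                                             : kMiddleType;
      EmitPhysicalRecord(type, ptr, n);
      ptr += n;
      left -= n;
      begin = false;
    } while (left > 0);
  }
  const std::string& dest() const { return dest_; }

 private:
  void EmitPhysicalRecord(RecordType type, const char* ptr, size_t n) {
    char header[kHeaderSize] = {0};  // bytes 0..3 (CRC) stay zero in this sketch
    header[4] = static_cast<char>(n & 0xff);
    header[5] = static_cast<char>(n >> 8);
    header[6] = static_cast<char>(type);
    dest_.append(header, kHeaderSize);
    dest_.append(ptr, n);
    block_offset_ += kHeaderSize + n;
  }
  std::string dest_;
  size_t block_offset_ = 0;
};
```

Feeding it the A, B, C example from above (1000, 97270 and 8000 bytes) should produce three full blocks plus one 8007-byte record in the fourth block, with the type byte of each record sitting at offset 6 of its header.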

1.2 Reading the log

The read class Reader exposes the interface function ReadRecord:

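bool ReadRecord(Slice* record, std::string* scratch);
  // The actual reading is done by ReadPhysicalRecord: it reads one Block at a time from the file
  // (Read advances the offset internally, so Blocks are read in order) and detects the various bad-record cases.
  // Depending on the record type, data is appended to the memory that scratch points to.
  switch(recordtype) {
      case Full:    // the whole record is in one fragment: return it directly
      case First:   // start of a fragmented record: scratch is set to this fragment
      case Middle:  // this fragment is appended to scratch
      case Last:    // the final fragment is appended to scratch and the assembled record is returned
  }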
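
The reassembly idea behind that switch can be sketched as follows. This is my own simplified version, not leveldb's Reader: the fragments are assumed to be already parsed into (type, payload) pairs, `ReadLogicalRecord` is a made-up name, and the corruption reporting that the real code does is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum RecordType { kFullType = 1, kFirstType = 2, kMiddleType = 3, kLastType = 4 };
using Fragment = std::pair<RecordType, std::string>;

// Hypothetical helper: consume fragments starting at *pos and rebuild one
// logical record. Returns false when no complete record is left.
bool ReadLogicalRecord(const std::vector<Fragment>& fragments, size_t* pos,
                       std::string* record) {
  std::string scratch;  // collects the pieces of a fragmented record
  bool in_fragmented_record = false;
  while (*pos < fragments.size()) {
    const Fragment& f = fragments[(*pos)++];
    switch (f.first) {
      case kFullType:
        *record = f.second;  // complete on its own
        return true;
      case kFirstType:
        scratch = f.second;  // start collecting
        in_fragmented_record = true;
        break;
      case kMiddleType:
        if (in_fragmented_record) scratch += f.second;
        break;
      case kLastType:
        if (in_fragmented_record) {
          scratch += f.second;
          *record = scratch;  // all pieces collected
          return true;
        }
        break;  // a Last without a First is skipped in this sketch
    }
  }
  return false;
}
```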

Slice is a structure with only two members: a pointer to external memory and a size.
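
As a rough picture of that idea (a simplified stand-in of my own called `MiniSlice`, not leveldb's actual Slice class):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical simplified stand-in: a non-owning view of some bytes.
struct MiniSlice {
  const char* data;  // points into memory owned by someone else
  size_t size;       // number of bytes in the view
};

// Build a view over an existing string; no bytes are copied.
inline MiniSlice MakeSlice(const std::string& s) { return {s.data(), s.size()}; }

// Copy the viewed bytes out into a string that owns its memory.
inline std::string ToString(const MiniSlice& s) { return std::string(s.data, s.size); }
```

Since the view does not own the memory, the string it points into has to outlive it. That is also why a read interface like the one above takes a separate scratch buffer: the returned view needs somewhere to point.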

The switch...case in this code is written really well. I added some comments to the specific logic; see the code for the details.

Reference material

Design and Implementation LevelDB - The Basics

Origin blog.csdn.net/u012328476/article/details/104218904