EOSIO Source Code Analysis - Block File Storage

Preface

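In blockchain data storage, the block file is the heart of the matter. Block files make it possible to go back through historical data and trace where it came from, and because a blockchain is decentralized, block file storage reflects the two most essential properties of a blockchain: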

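Decentralization: data cannot be modified unnoticed, because any modification changes the block file data (and with it the block hashes that link the chain)
Data traceability: replaying the block data makes it possible to find where any piece of data originated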

Storage Structure

Block storage in EOS is fairly simple. Block data is stored in the following two files:

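blocks.log: the block data file, with blocks stored in block number order
blocks.index: an index of where each block's data sits in blocks.log. Each entry is a uint64_t, entries are also stored in block number order, and file size = number of blocks * sizeof(uint64_t) (the number of blocks equals the head block number when the log starts at block 1)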
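
The fixed-width index layout above comes down to simple arithmetic, sketched below. The helper names index_offset and index_file_size are hypothetical and not part of EOSIO; the sketch only assumes 8-byte entries and a log whose first entry belongs to first_block_num.

```cpp
#include <cstdint>

// Hypothetical helpers (not EOSIO API) that illustrate the fixed-width index layout.

// Byte offset inside blocks.index of the entry for block_num.
// Precondition: block_num >= first_block_num.
constexpr std::uint64_t index_offset(std::uint32_t block_num, std::uint32_t first_block_num) {
   return sizeof(std::uint64_t) * static_cast<std::uint64_t>(block_num - first_block_num);
}

// Expected size in bytes of blocks.index covering first_block_num..head_block_num.
// Precondition: head_block_num >= first_block_num.
constexpr std::uint64_t index_file_size(std::uint32_t head_block_num, std::uint32_t first_block_num) {
   return sizeof(std::uint64_t) *
          (static_cast<std::uint64_t>(head_block_num) - first_block_num + 1);
}
```

Because every entry has the same width, looking up a block's position is a single seek rather than a search.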

The block data structures in EOS are shown below, with the core fields annotated:

struct block_header
{
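   block_timestamp_type             timestamp;          /// time the block was produced
   account_name                     producer;           /// account that produced the block
   block_id_type                    previous;           /// hash (ID) of the previous block; this field links the blocks into a chain
   checksum256_type                 transaction_mroot;  /// Merkle root built from the transaction execution records
   checksum256_type                 action_mroot;       /// Merkle root built from the action execution records
   // other fields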
   ... 
};

struct signed_block_header : public block_header{
   signature_type    producer_signature;             /// the producer's signature over the block
};

struct transaction_receipt_header {
   enum status_enum {
      executed  = 0, ///< succeed, no error handler executed
      soft_fail = 1, ///< objectively failed (not executed), error handler executed
      hard_fail = 2, ///< objectively failed and error handler objectively failed thus no state change
      delayed   = 3, ///< transaction delayed/deferred/scheduled for future execution
      expired   = 4  ///< transaction expired and storage space refunded to user
   }; 
   fc::enum_type<uint8_t,status_enum>   status;              ///< result of executing the transaction
   uint32_t                             cpu_usage_us = 0; 	///< total billed CPU usage (microseconds)
   fc::unsigned_int                     net_usage_words; 	///< total billed NET usage, so we can reconstruct resource state when skipping context free data... hard failures...
};

struct transaction_receipt : public transaction_receipt_header {
   std::variant<transaction_id_type, packed_transaction> trx;   /// transaction data; the variant keeps compatibility with older versions
};

struct signed_block : public signed_block_header{
public:
   deque<transaction_receipt>              transactions; /// receipts of the transactions in this block
};
using signed_block_ptr = std::shared_ptr<signed_block>;

Block File Operations

All block file operations in EOS are encapsulated in the block_log class. To understand this class, focus on the following aspects.

Creating the genesis block at initialization

struct genesis_state {
   time_point                               initial_timestamp;
   public_key_type                          initial_key;
};

void block_log::reset( const genesis_state& gs, const signed_block_ptr& first_block, packed_transaction::cf_compression_type segment_compression ) {
   my->reset(1, gs);
   append(first_block, segment_compression);
}

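This function shows that when the genesis block is created, what gets recorded above all is the genesis timestamp and the genesis public key. Deciding whether two nodes are on the same chain essentially comes down to comparing these two values.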

Reading the head block on load

void block_log::open(const fc::path& data_dir) {
   my->close();

   if (!fc::is_directory(data_dir))
      fc::create_directories(data_dir);

   // Set up the paths of the log and index files
   my->block_file.set_file_path( data_dir / "blocks.log" );
   my->index_file.set_file_path( data_dir / "blocks.index" );

   my->reopen();
   auto log_size = fc::file_size( my->block_file.get_file_path() );
   auto index_size = fc::file_size( my->index_file.get_file_path() );

   if (log_size) {
      // Read the version number from the start of the file
      my->block_file.seek( 0 );
      my->version = 0;
      my->block_file.read( (char*)&my->version, sizeof(my->version) );
      // Read the head block, i.e. the latest irreversible block currently in the system
      my->head = read_head();
      if( my->head ) {
         my->head_id = my->head->id();
      } else {
         my->head_id = {};
      }
      // Read the index file; for compatibility with older versions, the index is adjusted and realigned here
      if (index_size) {
         uint64_t block_pos;
         my->block_file.seek_end(-sizeof(uint64_t));
         my->block_file.read((char*)&block_pos, sizeof(block_pos));

         uint64_t index_pos;
         my->index_file.seek_end(-sizeof(uint64_t));
         my->index_file.read((char*)&index_pos, sizeof(index_pos));
      } else {
         construct_index();
      }
   } else if (index_size) {
   }
}

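The code makes it plain that the log file is what matters most: when the index is missing, it is rebuilt from the log via construct_index().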
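
One way to see why the log is the primary file is to sketch how an index could be regenerated from it. The sketch below assumes a simplified layout in which every block in the log is followed by its own 8-byte start offset (the append code later in this article writes exactly that trailer) and in which there is no version header. The function name rebuild_index and the in-memory buffer are assumptions for illustration; this is not how construct_index() is actually written.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Toy model: the log is a byte buffer of records laid out as [block bytes][uint64_t pos],
// where pos is the offset at which that record's block bytes begin.
std::vector<std::uint64_t> rebuild_index(const std::vector<char>& log) {
   std::vector<std::uint64_t> positions;
   std::uint64_t end = log.size();
   while (end > 0) {
      if (end < sizeof(std::uint64_t)) throw std::runtime_error("truncated log");
      std::uint64_t pos = 0;
      std::memcpy(&pos, log.data() + end - sizeof(pos), sizeof(pos));
      if (pos > end - sizeof(pos)) throw std::runtime_error("corrupt trailing position");
      positions.push_back(pos);
      end = pos; // the previous record ends exactly where this block begins
   }
   // The walk ran from the newest block to the oldest; the index wants ascending order.
   std::reverse(positions.begin(), positions.end());
   return positions;
}
```

Walking backwards through the trailers avoids decoding any block, which is why the trailing offset is worth its 8 bytes per block.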

How a specific block is read

// Read a block's content from the log file, given its position pos
signed_block_ptr block_log::read_block(uint64_t pos)const {
  my->block_file.seek(pos);
  signed_block_ptr result = std::make_shared<signed_block>();
  auto ds = my->block_file.create_datastream();
  fc::raw::unpack(ds, *result);
  return result;
}

// Look up in the index file, by block number, the position where the block's content is stored
uint64_t block_log::get_block_pos(uint32_t block_num) const {
   if (!(my->head && block_num <= block_header::num_from_id(my->head_id) && block_num >= my->first_block_num))
      return npos;
   // This line shows the mapping between index position and block number directly
   my->index_file.seek(sizeof(uint64_t) * (block_num - my->first_block_num));
   uint64_t pos;
   // Read the index entry and return it
   my->index_file.read((char*)&pos, sizeof(pos));
   return pos;
}

// Read a block's content by block number
signed_block_ptr block_log::read_block_by_num(uint32_t block_num)const {
   try {
      signed_block_ptr b;
      uint64_t pos = get_block_pos(block_num);
      if (pos != npos) {
         b = read_block(pos);
         EOS_ASSERT(b->block_num() == block_num, reversible_blocks_exception,
                   "Wrong block was read from block log.", ("returned", b->block_num())("expected", block_num));
      }
      return b;
   } FC_LOG_AND_RETHROW()
}

From the code above, reading a block by its block number comes down to two steps:

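First, use the block number to read from the index file the position where the block's content is actually stored
Then, read the block content from the log file at that position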

How block data is stored

uint64_t detail::block_log_impl::append(const signed_block_ptr& b) {
   try {
      // Move both file pointers to the end of their files
      block_file.seek_end(0);
      index_file.seek_end(0);
      // Get the current position pos in the log file
      uint64_t pos = block_file.tellp();
      auto data = fc::raw::pack(*b);
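      // Write the block content to the log file, followed by its position pos; storing pos here lets startup recover the head block's position directly from the log file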
      block_file.write(data.data(), data.size());
      block_file.write((char*)&pos, sizeof(pos));
      // Write pos to the index file as well, so that the content of any block number can be read at any time
      index_file.write((char*)&pos, sizeof(pos));
      // Update the head block: the head is the block that was just appended
      head = b;
      head_id = b->id();

      flush();
      return pos;
   }
   FC_LOG_AND_RETHROW()
}

This code is functionally simple, but it is important to understand what the index and log files each store and what role each of them plays.
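
To make the division of labour between the two files concrete, here is a minimal in-memory sketch of the same write order. The type toy_block_log and its members are hypothetical stand-ins for blocks.log and blocks.index, blocks are opaque byte strings, and the version header of the real log is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// In-memory stand-ins for blocks.log and blocks.index (assumption: no version header).
struct toy_block_log {
   std::vector<char>          log;
   std::vector<std::uint64_t> index;

   // Same order as the append above: block bytes, then pos in the log, then pos in the index.
   std::uint64_t append(const std::string& block_bytes) {
      const std::uint64_t pos = log.size();
      log.insert(log.end(), block_bytes.begin(), block_bytes.end());
      const char* raw = reinterpret_cast<const char*>(&pos);
      log.insert(log.end(), raw, raw + sizeof(pos));
      index.push_back(pos);
      return pos;
   }

   // The head block can be found from the log alone: its last 8 bytes hold the start offset.
   std::string read_head() const {
      if (log.size() < sizeof(std::uint64_t)) return {};
      std::uint64_t pos = 0;
      const std::size_t tail = log.size() - sizeof(pos);
      std::memcpy(&pos, log.data() + tail, sizeof(pos));
      if (pos > tail) return {}; // corrupt trailing position
      return std::string(log.begin() + static_cast<std::ptrdiff_t>(pos),
                         log.begin() + static_cast<std::ptrdiff_t>(tail));
   }
};
```

In this model the trailer in the log serves startup (finding the head without the index), while the index serves random access by block number.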

Summary

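The log-related operations in EOS are fairly simple overall: the code is easy to follow and the data structures are straightforward. These are the points to take away: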

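How the genesis block is created, and how it serves to identify the chain within the system
What the index and log files each store, and what each of them is for
How blocks are read and written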

Having gone through the code, a few questions come to mind:

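The index file is the easy part: its size is sizeof(uint64_t) * block_number, so 1 KB holds 128 entries, 1 MB holds 128 * 1024 = 131,072, and 1 GB holds 128 * 1024 * 1024 = 134,217,728, roughly 134 million blocks. Its size grows linearly with the block number.
The log file stores the block data itself. If transaction volume surges for a while, the log file grows just as sharply and can quickly exceed 1 GB. Very large files are awkward to read and back up. Is there a way to split the log file sensibly, without ending up with too many files?
If third-party tools are brought in to analyze the log file, is the current file structure reasonable? For example, when building a block explorer, how should the underlying layer be designed, how can files be moved or read quickly, and how can file integrity be guaranteed?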
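
As one possible answer to the splitting question, the log could be cut into segments that each cover a fixed number of consecutive blocks. The sketch below only shows the bookkeeping for such a scheme. The names segment_range, segment_for and segment_file_name, the stride parameter, and the file naming pattern are all hypothetical and chosen for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical scheme: every `stride` consecutive blocks share one log/index file pair.
struct segment_range {
   std::uint32_t first; // first block number stored in the segment
   std::uint32_t last;  // last block number stored in the segment
};

// Segment containing block_num, with block numbers starting at 1.
// Preconditions: block_num > 0 and stride > 0.
constexpr segment_range segment_for(std::uint32_t block_num, std::uint32_t stride) {
   const std::uint32_t first = (block_num - 1) / stride * stride + 1;
   return {first, first + stride - 1};
}

// Illustrative file name only; the pattern is an assumption of this sketch.
std::string segment_file_name(std::uint32_t block_num, std::uint32_t stride) {
   const segment_range r = segment_for(block_num, stride);
   return "blocks-" + std::to_string(r.first) + "-" + std::to_string(r.last) + ".log";
}
```

With a fixed stride, the segment for any block number follows from arithmetic alone, so no extra lookup table is needed, and the stride bounds both the size of each file and the total number of files.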