PostgreSQL Kernel Source Code Analysis: The VACUUM Flow (2)

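  • Column: PostgreSQL kernel source code analysis
  • Author's homepage: senllang's homepage
  • Motto: As heaven moves with vigor, the noble person strives ceaselessly to improve; as the earth is broad and receptive, the noble person bears all things with great virtue.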

Table of Contents

  • Preface
  • Overview
  • Entry Function
  • Data Structures
  • Detailed Flow
  • Sub-flows
  • Conclusion

Preface

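This article analyzes the PostgreSQL 15 source code; the demonstrations were run on CentOS 8. Some excerpts, for example the struct VacuumCutoffs member of LVRelState, appear to come from the development branch that followed release 15.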


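  • Overview

VACUUM can be invoked in two ways: manually, through the SQL command or the command-line executable (vacuumdb), or automatically, through the autovacuum background service.

When invoked as a command, options control the optional parts of VACUUM's behavior. Autovacuum is a background service that runs periodically according to its configuration: it walks the list of databases and, within each database, visits every table together with the objects that belong to it.

VACUUM has two modes, lazy and full. Lazy mode removes dead tuples from the table and defragments each page so that the free space within the page becomes contiguous; it also cleans up the corresponding index entries and freezes tuples. When the age of a table's transaction IDs reaches the configured limit, a vacuum is forced in order to freeze old tuples and protect against transaction ID wraparound. Full mode goes further than lazy mode: it compacts the table file so that live tuples are stored together, and returns the file's free space to the operating system.

Autovacuum uses lazy mode, and the free space it produces can be reused by later inserts and updates. VACUUM FULL is expensive and blocks normal workloads, so it is generally reserved for maintenance windows.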

  • Entry Function

static inline void
table_relation_vacuum(Relation rel, struct VacuumParams *params,
                      BufferAccessStrategy bstrategy)

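  • On entry, the caller guarantees that a transaction is already in progress and that a ShareUpdateExclusive lock is already held on the relation.
  • The call goes through a callback assigned in the table access method (tableam); for heap tables the implementation is:

.relation_vacuum = heap_vacuum_rel,

In other words, how a table's data is cleaned up is decided by the table's access method, and the VACUUM command merely invokes it.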
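The indirection described above can be sketched as a table of function pointers. This is a simplified, self-contained model and not the PostgreSQL definitions: MockRelation, MockTableAm, mock_heap_vacuum_rel and mock_table_relation_vacuum are hypothetical stand-ins for Relation, TableAmRoutine, heap_vacuum_rel and table_relation_vacuum, and the real callback takes more arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real PostgreSQL types. */
struct MockRelation;

typedef struct MockTableAm
{
    /* Each table access method supplies its own vacuum routine. */
    void (*relation_vacuum)(struct MockRelation *rel);
} MockTableAm;

typedef struct MockRelation
{
    const char        *relname;
    const MockTableAm *rd_tableam;    /* access method chosen for this table */
    int                vacuum_calls;  /* only here to make the effect visible */
} MockRelation;

/* The heap access method's implementation of the callback. */
static void
mock_heap_vacuum_rel(MockRelation *rel)
{
    rel->vacuum_calls++;
}

static const MockTableAm mock_heapam_methods = {
    .relation_vacuum = mock_heap_vacuum_rel,
};

/* Generic entry point: it only dispatches through the callback. */
static inline void
mock_table_relation_vacuum(MockRelation *rel)
{
    assert(rel != NULL && rel->rd_tableam != NULL);
    rel->rd_tableam->relation_vacuum(rel);
}
```

A different access method would install its own function in relation_vacuum, and the caller of mock_table_relation_vacuum would not need to change.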

  • Data Structures

/* Phases of vacuum during which we report error context. */
typedef enum
{
    VACUUM_ERRCB_PHASE_UNKNOWN,
    VACUUM_ERRCB_PHASE_SCAN_HEAP,
    VACUUM_ERRCB_PHASE_VACUUM_INDEX,
    VACUUM_ERRCB_PHASE_VACUUM_HEAP,
    VACUUM_ERRCB_PHASE_INDEX_CLEANUP,
    VACUUM_ERRCB_PHASE_TRUNCATE
} VacErrPhase;

typedef struct LVRelState
{
    /* Target heap relation and its indexes */
    Relation    rel;
    Relation   *indrels;
    int         nindexes;

    /* Buffer access strategy and parallel vacuum state */
    BufferAccessStrategy bstrategy;
    ParallelVacuumState *pvs;

    /* Aggressive VACUUM? (must set relfrozenxid >= FreezeLimit) */
    bool        aggressive;
    /* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
    bool        skipwithvm;
    /* Wraparound failsafe has been triggered? */
    bool        failsafe_active;
    /* Consider index vacuuming bypass optimization? */
    bool        consider_bypass_optimization;

    /* Doing index vacuuming, index cleanup, rel truncation? */
    bool        do_index_vacuuming;
    bool        do_index_cleanup;
    bool        do_rel_truncate;

    /* VACUUM operation's cutoffs for freezing and pruning */
    struct VacuumCutoffs cutoffs;
    GlobalVisState *vistest;
    /* Tracks oldest extant XID/MXID for setting relfrozenxid/relminmxid */
    TransactionId NewRelfrozenXid;
    MultiXactId NewRelminMxid;
    bool        skippedallvis;

    /* Error reporting state */
    char       *dbname;
    char       *relnamespace;
    char       *relname;
    char       *indname;        /* Current index name */
    BlockNumber blkno;          /* used only for heap operations */
    OffsetNumber offnum;        /* used only for heap operations */
    VacErrPhase phase;
    bool        verbose;        /* VACUUM VERBOSE? */

    /*
     * dead_items stores TIDs whose index tuples are deleted by index
     * vacuuming. Each TID points to an LP_DEAD line pointer from a heap page
     * that has been processed by lazy_scan_prune.  Also needed by
     * lazy_vacuum_heap_rel, which marks the same LP_DEAD line pointers as
     * LP_UNUSED during second heap pass.
     */
    VacDeadItems *dead_items;   /* TIDs whose index tuples we'll delete */
    BlockNumber rel_pages;      /* total number of pages */
    BlockNumber scanned_pages;  /* # pages examined (not skipped via VM) */
    BlockNumber removed_pages;  /* # pages removed by relation truncation */
    BlockNumber frozen_pages;   /* # pages with newly frozen tuples */
    BlockNumber lpdead_item_pages;  /* # pages with LP_DEAD items */
    BlockNumber missed_dead_pages;  /* # pages with missed dead tuples */
    BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */

    /* Statistics output by us, for table */
    double      new_rel_tuples; /* new estimated total # of tuples */
    double      new_live_tuples;    /* new estimated total # of live tuples */
    /* Statistics output by index AMs */
    IndexBulkDeleteResult **indstats;

    /* Instrumentation counters */
    int         num_index_scans;
    /* Counters that follow are only for scanned_pages */
    int64       tuples_deleted; /* # deleted from table */
    int64       tuples_frozen;  /* # newly frozen */
    int64       lpdead_items;   /* # deleted from indexes */
    int64       live_tuples;    /* # live tuples remaining */
    int64       recently_dead_tuples;   /* # dead, but not yet removable */
    int64       missed_dead_tuples; /* # removable, but not removed */
} LVRelState;

/*
 * State returned by lazy_scan_prune()
 */
typedef struct LVPagePruneState
{
    bool        hastup;         /* Page prevents rel truncation? */
    bool        has_lpdead_items;   /* includes existing LP_DEAD items */

    /*
     * State describes the proper VM bit states to set for the page following
     * pruning and freezing.  all_visible implies !has_lpdead_items, but don't
     * trust all_frozen result unless all_visible is also set to true.
     */
    bool        all_visible;    /* Every item visible to all? */
    bool        all_frozen;     /* provided all_visible is also true */
    TransactionId visibility_cutoff_xid;    /* For recovery conflicts */
} LVPagePruneState;

  • Detailed Flow

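    • Check parameters and initialize: gather the table's configuration and the options passed in by the command; when the two conflict, the command's options take precedence.
    • Obtain OldestXmin and OldestMxact. This step must come first: these two values are needed both to decide whether a deleted tuple is really dead and to decide which XIDs and MXIDs can be frozen. Deleted tuples with xmax < OldestXmin will be removed. The same values also initialize the trackers for the new relfrozenxid and relminmxid: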

vacrel->NewRelfrozenXid = vacrel->cutoffs.OldestXmin;
vacrel->NewRelminMxid = vacrel->cutoffs.OldestMxact;
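The comparison xmax < OldestXmin is not a plain integer comparison, because transaction IDs wrap around a 32-bit counter. The following is a minimal sketch of that idea, restricted to normal XIDs. The names Xid, xid_precedes and deleted_tuple_is_removable are hypothetical; PostgreSQL's own routine for this purpose is TransactionIdPrecedes, which additionally handles the reserved XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

#define FIRST_NORMAL_XID ((Xid) 3)  /* XIDs 0..2 are reserved in PostgreSQL */

/*
 * True if a logically precedes b. Normal XIDs live on a circle of 2^32
 * values, so the comparison uses the signed 32-bit difference.
 */
static bool
xid_precedes(Xid a, Xid b)
{
    assert(a >= FIRST_NORMAL_XID && b >= FIRST_NORMAL_XID);
    return (int32_t) (a - b) < 0;
}

/*
 * Simplified rule from the text: a deleted tuple can be removed once its
 * deleting transaction committed and its xmax precedes OldestXmin.
 */
static bool
deleted_tuple_is_removable(Xid xmax, bool xmax_committed, Xid oldest_xmin)
{
    return xmax_committed && xid_precedes(xmax, oldest_xmin);
}
```

With this rule an XID just below the wraparound point still counts as older than a small XID that was assigned after the counter wrapped.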

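  • Allocate the memory that stores dead tuple TIDs, by calling dead_items_alloc.

The allocation never exceeds maintenance_work_mem (or autovacuum_work_mem for autovacuum workers, when that is set) and is never smaller than what is needed to hold the dead items of one page, so this parameter should not be configured too low.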
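The sizing rule can be sketched as follows. This is an approximation under stated assumptions, not the actual dead_items_alloc code: the function name sketch_dead_items_capacity is hypothetical, TID_SIZE assumes a 6-byte heap TID, MAX_TUPLES_PER_PAGE assumes the default 8 kB block size, and the cap on the largest single allocation is left out.

```c
#include <assert.h>
#include <stdint.h>

#define TID_SIZE            6    /* assumed bytes per heap TID */
#define MAX_TUPLES_PER_PAGE 291  /* assumed per-page maximum for 8 kB pages */

/* Number of dead-item slots to allocate for a given memory budget. */
static int64_t
sketch_dead_items_capacity(int64_t work_mem_kb, int64_t rel_pages)
{
    int64_t max_items;

    assert(work_mem_kb >= 0 && rel_pages >= 0);
    max_items = (work_mem_kb * 1024) / TID_SIZE;

    /* No point reserving more slots than the table could ever need. */
    if (max_items / MAX_TUPLES_PER_PAGE > rel_pages)
        max_items = rel_pages * MAX_TUPLES_PER_PAGE;

    /* Lower bound: always room for one page's worth of TIDs. */
    if (max_items < MAX_TUPLES_PER_PAGE)
        max_items = MAX_TUPLES_PER_PAGE;

    return max_items;
}
```

For example, a 64 MB budget on a large table gives roughly 11 million slots, while the same budget on a 10-page table is trimmed to 2910 slots.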

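  • Scan for dead tuples, remove them, clean up the index entries that point to them, and update the visibility map and free space map. lazy_scan_heap visits every block and collects dead tuples into the dead_items buffer; blocks that the visibility map already marks as all-visible are skipped. When the buffer no longer has room for one more page's worth of dead tuples, lazy_vacuum is called to perform the cleanup. The scan also works out whether the cleanup needs to be delayed.
  • Release the free space at the end of the table file.
  • Update the statistics in pg_class.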
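The scan loop described above can be modeled in a few lines. This is a toy simulation only: SketchPage, SketchStats, sketch_lazy_vacuum and sketch_lazy_scan_heap are hypothetical names, the per-page limit is deliberately tiny, and the real lazy_scan_heap applies further rules when deciding which pages to skip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ITEMS_PER_PAGE 4  /* tiny on purpose; a stand-in value */

typedef struct SketchPage
{
    bool all_visible;   /* bit taken from the visibility map */
    int  ndead;         /* dead items this page would contribute */
} SketchPage;

typedef struct SketchStats
{
    int scanned_pages;
    int skipped_pages;
    int vacuum_cycles;  /* rounds of index vacuuming plus second heap pass */
    int removed_items;
} SketchStats;

/* Stand-in for lazy_vacuum: consume everything collected so far. */
static void
sketch_lazy_vacuum(int *num_items, SketchStats *stats)
{
    stats->removed_items += *num_items;
    stats->vacuum_cycles++;
    *num_items = 0;
}

static SketchStats
sketch_lazy_scan_heap(const SketchPage *pages, size_t npages, int max_items)
{
    SketchStats stats = {0, 0, 0, 0};
    int         num_items = 0;

    assert(max_items >= SKETCH_MAX_ITEMS_PER_PAGE);

    for (size_t blkno = 0; blkno < npages; blkno++)
    {
        /* Pages already marked all-visible are skipped. */
        if (pages[blkno].all_visible)
        {
            stats.skipped_pages++;
            continue;
        }

        /* Not enough room for a worst-case page: clean up first. */
        if (max_items - num_items < SKETCH_MAX_ITEMS_PER_PAGE)
            sketch_lazy_vacuum(&num_items, &stats);

        assert(pages[blkno].ndead <= SKETCH_MAX_ITEMS_PER_PAGE);
        num_items += pages[blkno].ndead;
        stats.scanned_pages++;
    }

    /* Final cleanup for whatever is still buffered. */
    if (num_items > 0)
        sketch_lazy_vacuum(&num_items, &stats);

    return stats;
}
```

A small max_items forces several cleanup rounds, which is why a low memory setting leads to repeated index scans.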
  • Sub-flows

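    • Separate handling of new pages and empty pages

This is done by lazy_scan_new_or_empty.

It must be called before any cleanup of the page. A new page is the simpler case: it is typically at the end of the file, either freshly extended or left over from a bulk extension, and only its free space needs to be recorded.

An empty page has nothing on it, so it is trivially all-visible and can be reused: its free space is recorded and the page is marked all-visible. Note that this dirties the page and writes a WAL record.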
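The distinction between a new page and an empty page can be sketched from two page header fields. This is an illustration with hypothetical names (SketchPageHeader, SketchPageKind, sketch_classify_page); it assumes a 24-byte page header, that a never-initialized page has pd_upper equal to zero, and that an initialized page with no line pointers has pd_lower no larger than the header size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_HEADER_SIZE 24   /* assumed size of the fixed page header */

/* Only the two header fields this sketch needs. */
typedef struct SketchPageHeader
{
    uint16_t pd_lower;   /* offset to the end of the line pointer array */
    uint16_t pd_upper;   /* offset to the start of tuple data */
} SketchPageHeader;

typedef enum
{
    SKETCH_PAGE_NEW,        /* all zeroes, never initialized */
    SKETCH_PAGE_EMPTY,      /* initialized, but no line pointers */
    SKETCH_PAGE_HAS_ITEMS   /* needs the regular prune path */
} SketchPageKind;

static SketchPageKind
sketch_classify_page(const SketchPageHeader *hdr)
{
    assert(hdr != NULL);

    if (hdr->pd_upper == 0)
        return SKETCH_PAGE_NEW;
    if (hdr->pd_lower <= SKETCH_PAGE_HEADER_SIZE)
        return SKETCH_PAGE_EMPTY;
    return SKETCH_PAGE_HAS_ITEMS;
}
```

Only the SKETCH_PAGE_HAS_ITEMS case would go on to pruning; the other two are finished once their free space (and, for an empty page, the all-visible bit) is recorded.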

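  • When the buffer's pin count is not 1, the page cannot be pruned.

This is checked with ConditionalLockBufferForCleanup. A pin count other than 1 means that another backend is still using the page, so pruning is not allowed. In that case VACUUM only counts the dead tuples on the block, checks tuple visibility, and tracks the oldest XIDs for the new relfrozenxid.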
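The non-blocking check can be pictured with a small model. Everything here is hypothetical (SketchBuffer, sketch_conditional_lock_for_cleanup, sketch_process_page); the real ConditionalLockBufferForCleanup works on shared buffer headers and also takes the content lock, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchBuffer
{
    int  pin_count;        /* pins held by all backends, including ours */
    bool cleanup_locked;
} SketchBuffer;

typedef enum
{
    SKETCH_PRUNED,         /* got the cleanup lock: prune and defragment */
    SKETCH_COUNTED_ONLY    /* someone else holds a pin: inspect only */
} SketchPageAction;

/* Non-blocking: succeeds only when the caller's pin is the sole pin. */
static bool
sketch_conditional_lock_for_cleanup(SketchBuffer *buf)
{
    assert(buf != NULL && buf->pin_count >= 1);  /* caller holds a pin */

    if (buf->pin_count != 1)
        return false;
    buf->cleanup_locked = true;
    return true;
}

static SketchPageAction
sketch_process_page(SketchBuffer *buf)
{
    if (sketch_conditional_lock_for_cleanup(buf))
        return SKETCH_PRUNED;
    return SKETCH_COUNTED_ONLY;
}
```

The point of the conditional variant is that VACUUM does not wait behind a backend that is merely holding a pin; it falls back to the read-only path and moves on.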

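  • Steps for deciding whether a tuple is visible to all transactions:
    1. Check whether the line pointer is LP_UNUSED. If so, nothing needs to be done; the slot is already reusable.
    2. Check in the infomask whether the inserting transaction (xmin) is marked committed. If not, the tuple cannot be all-visible.
    3. Check whether xmin precedes OldestXmin. If not, the tuple is not all-visible.
    4. Otherwise the tuple is visible to all transactions; remember the newest xmin seen among the all-visible tuples.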
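The four steps above can be sketched as a loop over the items of one page. This is a simplified model that looks only at live tuples: SketchItem, SketchLpState, xid_precedes and sketch_page_all_visible are hypothetical names, and the committed flag stands in for the hint bit kept in the tuple's infomask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                        /* stand-in for TransactionId */

#define SKETCH_INVALID_XID      ((Xid) 0)
#define SKETCH_FIRST_NORMAL_XID ((Xid) 3)

typedef enum { SKETCH_LP_UNUSED, SKETCH_LP_NORMAL } SketchLpState;

typedef struct SketchItem
{
    SketchLpState lp;
    bool          xmin_committed;  /* stand-in for the infomask hint bit */
    Xid           xmin;
} SketchItem;

/* Wraparound-aware "a is older than b" for normal XIDs. */
static bool
xid_precedes(Xid a, Xid b)
{
    assert(a >= SKETCH_FIRST_NORMAL_XID && b >= SKETCH_FIRST_NORMAL_XID);
    return (int32_t) (a - b) < 0;
}

/*
 * Returns true if every item on the page is visible to all transactions.
 * On success, *cutoff_xid holds the newest xmin that was seen.
 */
static bool
sketch_page_all_visible(const SketchItem *items, size_t nitems,
                        Xid oldest_xmin, Xid *cutoff_xid)
{
    *cutoff_xid = SKETCH_INVALID_XID;

    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].lp == SKETCH_LP_UNUSED)
            continue;                       /* step 1: nothing to check */
        if (!items[i].xmin_committed)
            return false;                   /* step 2: inserter not committed */
        if (!xid_precedes(items[i].xmin, oldest_xmin))
            return false;                   /* step 3: too recent for someone */

        /* step 4: all-visible; track the newest xmin */
        if (*cutoff_xid == SKETCH_INVALID_XID ||
            xid_precedes(*cutoff_xid, items[i].xmin))
            *cutoff_xid = items[i].xmin;
    }
    return true;
}
```

The tracked value corresponds to the role of visibility_cutoff_xid in LVPagePruneState, which the struct comment describes as being used for recovery conflicts.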

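  • Freezing tuples

Freezing is performed one page at a time, in roughly two steps:

  1. Prepare: while scanning for dead tuples, the tuples that need freezing are recorded as well.
  2. Execute: for the collected tuples, mark the XIDs in the tuple header as frozen (conceptually FrozenTransactionId; for xmin, current versions set the frozen bits in the infomask instead of overwriting the value), and write a WAL record.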
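The prepare and execute split can be illustrated with a small model. It is a sketch under assumptions: SketchTuple, SKETCH_XMIN_FROZEN, sketch_prepare_freeze and sketch_execute_freeze are hypothetical, the flag bit is not the real infomask layout, only xmin is considered, and WAL logging is reduced to a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                  /* stand-in for TransactionId */

#define SKETCH_XMIN_FROZEN 0x01u       /* hypothetical flag bit */

typedef struct SketchTuple
{
    Xid      xmin;
    unsigned infomask;
} SketchTuple;

/* Wraparound-aware "a is older than b" for normal XIDs (>= 3). */
static bool
xid_precedes(Xid a, Xid b)
{
    assert(a >= 3 && b >= 3);
    return (int32_t) (a - b) < 0;
}

/*
 * Phase 1 (prepare): record which tuples need freezing. Nothing on the
 * page is modified. plan must have room for ntuples entries.
 */
static size_t
sketch_prepare_freeze(const SketchTuple *tuples, size_t ntuples,
                      Xid freeze_limit, size_t *plan)
{
    size_t nplans = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        if ((tuples[i].infomask & SKETCH_XMIN_FROZEN) == 0 &&
            xid_precedes(tuples[i].xmin, freeze_limit))
            plan[nplans++] = i;
    }
    return nplans;
}

/*
 * Phase 2 (execute): apply the plan to the page in one go. Returns the
 * number of WAL records a real implementation would emit (0 or 1 here).
 */
static int
sketch_execute_freeze(SketchTuple *tuples, const size_t *plan, size_t nplans)
{
    if (nplans == 0)
        return 0;

    for (size_t i = 0; i < nplans; i++)
        tuples[plan[i]].infomask |= SKETCH_XMIN_FROZEN;

    return 1;   /* one record covers the whole page */
}
```

Separating the two phases lets all changes to a page be applied and logged together instead of tuple by tuple.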


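Conclusion

Author email: [email protected]
Corrections for any errors or omissions are welcome; let us learn from each other.

Note: do not repost without permission!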

Reposted from blog.csdn.net/senllang/article/details/128968270