PostgreSQL kernel source code analysis: the vacuum process

 

  • Column content: PostgreSQL kernel source code analysis
  • Personal homepage: senllang's homepage
  • Motto: Tian Xingjian, the gentleman strives for self-improvement;

Article directory

  • Foreword
  • Overview
  • Entry function
  • Data cleanup flow
  • Analyze process
  • Transaction ID freezing process
  • End

Foreword

This article analyzes and interprets the PostgreSQL 15 source code; the demonstrations were run on CentOS 8.


  • One. Overview

PostgreSQL's MVCC mechanism creates a new tuple version on every update and keeps the old version in the table file alongside it. Over time this bloats the table and slows down queries and scans. Vacuum was introduced to clean up dead tuples on a regular basis. A dead tuple is a version that is obsolete for every transaction, meaning no transaction can see it any more.
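The notion of a dead tuple can be made concrete with a small sketch. This is a toy model, not PostgreSQL's actual visibility code: the `ToyTuple` struct, the `TOY_INVALID_XID` marker and the `toy_tuple_is_dead` helper are all hypothetical names made up for illustration, and transaction ID wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: real PostgreSQL tuples carry far more state. */
typedef uint32_t ToyXid;

#define TOY_INVALID_XID 0       /* hypothetical marker: "never deleted" */

typedef struct ToyTuple
{
    ToyXid      xmin;           /* transaction that created this version */
    ToyXid      xmax;           /* transaction that deleted or updated it */
    bool        xmax_committed; /* did the deleting transaction commit? */
} ToyTuple;

/*
 * A version is dead when its deleter committed before the oldest
 * transaction that is still running, so no current or future snapshot
 * can see it. Wraparound is ignored here for simplicity.
 */
static bool
toy_tuple_is_dead(const ToyTuple *tup, ToyXid oldest_running_xid)
{
    assert(tup != NULL);

    if (tup->xmax == TOY_INVALID_XID || !tup->xmax_committed)
        return false;           /* still live, or the delete never committed */

    return tup->xmax < oldest_running_xid;
}
```

In this model, an old version deleted by transaction 105 is dead once the oldest running transaction is 110, but not while transaction 103 is still running and might need to see it.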

The sections below describe what vacuum does, how it is implemented, and how it avoids conflicting with the normal workload while it cleans up.

  • Two. Entry function

    • Vacuum can be triggered in several ways: the SQL VACUUM command, the vacuumdb utility, and the autovacuum service that runs automatically in the background. Each does its own preparation and wrapping, and all of them end up in vacuum():
void
vacuum(List *relations, VacuumParams *params,
       BufferAccessStrategy bstrategy, bool isTopLevel)

  • Parameter Description

    • relations: the list of relations to process; if it is empty, every table in the current database is processed;
    • params: the options for this vacuum run, mostly optional behaviors such as whether it is a FULL vacuum or whether to skip conflicting pages. For details, see the analysis of the vacuum command;
    • bstrategy: passed in by autovacuum; otherwise it is NULL;
    • isTopLevel: derived from the context passed to ProcessUtility; it is true only when that context is PROCESS_UTILITY_TOPLEVEL. The possible context values are:
typedef enum
{
    PROCESS_UTILITY_TOPLEVEL,   /* toplevel interactive command */
    PROCESS_UTILITY_QUERY,      /* a complete query, but not toplevel */
    PROCESS_UTILITY_QUERY_NONATOMIC,    /* a complete query, nonatomic
                                         * execution context */
    PROCESS_UTILITY_SUBCOMMAND  /* a portion of a query */
} ProcessUtilityContext;
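How the boolean isTopLevel relates to this enum can be sketched as follows. This is a minimal standalone illustration: the enum is repeated locally so that the snippet compiles on its own, and `toy_is_top_level` is a hypothetical helper name, not a PostgreSQL function.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the enum so that this sketch compiles on its own. */
typedef enum
{
    PROCESS_UTILITY_TOPLEVEL,           /* toplevel interactive command */
    PROCESS_UTILITY_QUERY,              /* a complete query, but not toplevel */
    PROCESS_UTILITY_QUERY_NONATOMIC,    /* a complete query, nonatomic
                                         * execution context */
    PROCESS_UTILITY_SUBCOMMAND          /* a portion of a query */
} ProcessUtilityContext;

/*
 * Hypothetical helper: collapse the four-valued context into the boolean
 * that vacuum() receives. Only a toplevel command counts as "top level".
 */
static bool
toy_is_top_level(ProcessUtilityContext context)
{
    assert(context <= PROCESS_UTILITY_SUBCOMMAND);  /* sanity check */
    return context == PROCESS_UTILITY_TOPLEVEL;
}
```

A VACUUM typed directly at the prompt would map to true here, while the same statement reached as part of a larger query would map to false.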

  • Function flow

    • Check the transaction state. VACUUM must not already be inside a transaction block, because it starts and commits its own transactions; ANALYZE on its own is allowed to run inside one;
    • Check the list of tables; if none was specified, build a list of all tables;
    • Create a memory context that outlives individual transactions, since it is used repeatedly during the run;
static MemoryContext vac_context = NULL;
static BufferAccessStrategy vac_strategy;

  • If vacuum uses its own transactions, it first commits the transaction that the backend opened when the command started;
  • The vacuum and analyze processing of each table runs inside a try...catch block (PG_TRY), so vacuum's global state is reset even if an error is raised;
  • After the tables have been processed, the database's frozen transaction ID is updated;
  • Finally, the memory context is released and processing ends
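The overall flow above can be sketched as a toy function. Everything here is hypothetical: the `TOY_OPT_*` flags are only modeled on the real option bits, `toy_vacuum` does no actual work, and the rule that only a vacuum run updates the frozen transaction ID is an assumption of this sketch rather than a statement about every code path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_OPT_VACUUM  (1 << 0)    /* hypothetical flags, modeled on VACOPT_* */
#define TOY_OPT_ANALYZE (1 << 1)

typedef struct ToyResult
{
    bool        ok;                 /* false if the transaction check failed */
    int         tables_vacuumed;
    int         tables_analyzed;
    bool        frozen_xid_updated;
} ToyResult;

static ToyResult
toy_vacuum(const char *const *tables, size_t ntables, int options,
           bool in_transaction_block)
{
    ToyResult   res = {false, 0, 0, false};

    assert(tables != NULL || ntables == 0);

    /* Step 1: VACUUM manages its own transactions, so refuse to nest. */
    if ((options & TOY_OPT_VACUUM) && in_transaction_block)
        return res;

    /* Step 2: process each table; vacuum first, then analyze. */
    for (size_t i = 0; i < ntables; i++)
    {
        (void) tables[i];           /* a real implementation opens the table */
        if (options & TOY_OPT_VACUUM)
            res.tables_vacuumed++;
        if (options & TOY_OPT_ANALYZE)
            res.tables_analyzed++;
    }

    /* Step 3 (assumption): only a vacuum run updates the frozen xid. */
    if (options & TOY_OPT_VACUUM)
        res.frozen_xid_updated = true;

    res.ok = true;
    return res;
}
```

Calling `toy_vacuum` with `TOY_OPT_VACUUM` while `in_transaction_block` is true is rejected, whereas `TOY_OPT_ANALYZE` alone is accepted in the same situation, which mirrors the transaction check in the first step of the flow.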

  • Three. Data cleanup flow

    • The processing function is vacuum_rel
    • Parameter Description
      • relid: the OID of the relation to process
      • relation: a RangeVar holding the table information that corresponds to that OID

typedef struct RangeVar
{
    NodeTag     type;

    /* the catalog (database) name, or NULL */
    char       *catalogname;

    /* the schema name, or NULL */
    char       *schemaname;

    /* the relation/sequence name */
    char       *relname;

    /* expand rel by inheritance? recursively act on children? */
    bool        inh;

    /* see RELPERSISTENCE_* in pg_class.h */
    char        relpersistence;

    /* table alias & optional column aliases */
    Alias      *alias;

    /* token location, or -1 if unknown */
    int         location;
} RangeVar;

The first two parameters are taken from a VacuumRelation structure; the relations list passed to vacuum() is a List of these structures:

typedef struct VacuumRelation
{
    NodeTag     type;
    RangeVar   *relation;       /* table name to process, or NULL */
    Oid         oid;            /* table's OID; InvalidOid if not looked up */
    List       *va_cols;        /* list of column names, or NIL for all */
} VacuumRelation;

  • Start a transaction
  • Run the vacuum itself, which comes in two kinds
    • Lazy vacuum is performed by table_relation_vacuum; it is the path triggered by the SQL command and by autovacuum

static inline void
table_relation_vacuum(Relation rel, struct VacuumParams *params,
                      BufferAccessStrategy bstrategy)

While it runs, the table lock held is ShareUpdateExclusiveLock, which does not block ordinary reads and writes on the table

  • Full vacuum is performed by cluster_rel

/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
void cluster_rel(Oid tableOid, Oid indexOid, ClusterParams *params);

While it runs, it holds the strongest table lock, AccessExclusiveLock, which blocks all other access to the table
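The choice between the two paths, and the lock each one takes, can be sketched as below. The `TOY_OPT_FULL` flag, the `ToyLockMode` enum and the three helpers are hypothetical names for illustration; only the handler names and the two lock modes come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_OPT_FULL (1 << 0)   /* hypothetical flag, modeled on VACOPT_FULL */

/* Only the two lock modes discussed above are modeled. */
typedef enum ToyLockMode
{
    TOY_SHARE_UPDATE_EXCLUSIVE, /* stands in for ShareUpdateExclusiveLock */
    TOY_ACCESS_EXCLUSIVE        /* stands in for AccessExclusiveLock */
} ToyLockMode;

/* Full vacuum rewrites the table, so it needs the strongest lock. */
static ToyLockMode
toy_vacuum_lockmode(int options)
{
    return (options & TOY_OPT_FULL) ? TOY_ACCESS_EXCLUSIVE
                                    : TOY_SHARE_UPDATE_EXCLUSIVE;
}

/* Name of the function that does the work on each path. */
static const char *
toy_vacuum_handler(int options)
{
    return (options & TOY_OPT_FULL) ? "cluster_rel" : "table_relation_vacuum";
}

/* Does the lock block ordinary SELECT/INSERT/UPDATE/DELETE traffic? */
static bool
toy_blocks_reads_and_writes(ToyLockMode mode)
{
    assert(mode == TOY_SHARE_UPDATE_EXCLUSIVE || mode == TOY_ACCESS_EXCLUSIVE);
    return mode == TOY_ACCESS_EXCLUSIVE;
}
```

The practical consequence is that a lazy vacuum can run alongside the normal workload, while a full vacuum makes the table unavailable until it finishes.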

  • Four. Analyze process

    • The processing function is analyze_rel

  • Five. Transaction ID freezing process

    • The processing function is vac_update_datfrozenxid
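The core idea behind updating a database-wide frozen transaction ID can be sketched as taking the oldest per-table value. This is a simplified illustration under stated assumptions: `toy_xid_precedes` and `toy_oldest_frozen_xid` are hypothetical helpers, the comparison is modeled on the modulo-2^32 ordering PostgreSQL uses for normal transaction IDs, and the catalog scan and the other checks done by the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/*
 * Circular comparison: a precedes b if it is "behind" b by less than
 * 2^31 transactions. Special (non-normal) xids are not handled here.
 */
static bool
toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * The database-wide frozen xid can be no newer than the oldest
 * per-table frozen xid, so take the minimum in circular order.
 */
static ToyXid
toy_oldest_frozen_xid(const ToyXid *relfrozenxids, size_t n)
{
    ToyXid      oldest;

    assert(relfrozenxids != NULL && n > 0);

    oldest = relfrozenxids[0];
    for (size_t i = 1; i < n; i++)
    {
        if (toy_xid_precedes(relfrozenxids[i], oldest))
            oldest = relfrozenxids[i];
    }
    return oldest;
}
```

With per-table values 900, 750 and 1200 the result is 750. The circular comparison matters near wraparound: given 5 and 4294967000, the larger number is treated as the older one.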

End

Author email: [email protected]
If there are any mistakes or omissions, please point them out and learn from each other.

Note: Do not reprint without consent!

Origin blog.csdn.net/senllang/article/details/128890884