PostgreSQL plugin pg_stat_statements source code reading

pg_stat_statements code analysis

Overview of pg_stat_statements

pg_stat_statements is a PostgreSQL extension module that tracks and records execution statistics for SQL statements.

1. Function:

The pg_stat_statements extension module tracks the SQL statements executed against the database, together with performance statistics for each of them. The information it captures includes the query text, the number of calls, the total run time, the mean, minimum, and maximum run time, and more. These statistics help developers and administrators find the most time-consuming queries, so that optimization effort can be targeted.


2. Install and enable:

To use pg_stat_statements, the module must first be built and installed together with the server (it lives in PostgreSQL's contrib directory). Because it needs shared memory, it has to be loaded at server start: add pg_stat_statements to shared_preload_libraries in the configuration file postgresql.conf and restart the server. Then run CREATE EXTENSION pg_stat_statements in the database where you want to query the statistics.

3. Configuration options:

pg_stat_statements has several configuration options that can be tuned as needed, for example:

  • pg_stat_statements.max: The maximum number of statements to track.
  • pg_stat_statements.track: Which statements to track: none, top (statements issued directly by clients), or all (also statements nested inside functions).
  • pg_stat_statements.track_utility: Whether to track utility commands.
  • pg_stat_statements.save: Whether to save the statistics across server shutdowns.


4. View statistics:

Use SQL queries to retrieve the statistics from the pg_stat_statements view. Some of the important columns are:

  • query: The query text.
  • calls: The number of times the statement was executed.
  • total_exec_time: The total execution time, in milliseconds (named total_time before PostgreSQL 13).
  • mean_exec_time: The mean execution time (formerly mean_time).
  • min_exec_time and max_exec_time: The minimum and maximum execution time (formerly min_time and max_time).
  • rows: The total number of rows retrieved or affected by the statement.

pg_stat_statements initialization

PostgreSQL loads the preloaded extension libraries in the function process_shared_preload_libraries, which is defined in src/backend/utils/init/miscinit.c:

void process_shared_preload_libraries(void)
{
	process_shared_preload_libraries_in_progress = true;
	load_libraries(shared_preload_libraries_string,
				   "shared_preload_libraries",
				   false);
	process_shared_preload_libraries_in_progress = false;
	process_shared_preload_libraries_done = true;
}

This function preloads all the libraries listed in shared_preload_libraries.
It does so by calling load_libraries. The first argument is the value of the setting, and the second argument, the string "shared_preload_libraries", is the name of that setting in the configuration file postgresql.conf mentioned in the previous section. At this point shared_preload_libraries has been set to pg_stat_statements.
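
To make the loading step more concrete, here is a minimal sketch of how a comma-separated library list could be walked. It is not PostgreSQL's load_libraries: the names split_library_list, record_library, and loaded_libs are hypothetical, and where this sketch only records each name, a real server would open the shared object and call its _PG_init.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_LIBS 8
#define MAX_NAME 64

/* Hypothetical stand-in for the server's list of loaded libraries. */
static char loaded_libs[MAX_LIBS][MAX_NAME];
static int	loaded_count = 0;

/* Hypothetical loader: a real server would open the shared object here. */
static void
record_library(const char *name, size_t len)
{
	if (loaded_count < MAX_LIBS && len > 0 && len < MAX_NAME)
	{
		memcpy(loaded_libs[loaded_count], name, len);
		loaded_libs[loaded_count][len] = '\0';
		loaded_count++;
	}
}

/* Split "a, b,c" on commas, trim blanks, and pass each name to the loader. */
static int
split_library_list(const char *list)
{
	const char *p = list;

	loaded_count = 0;
	while (*p != '\0')
	{
		const char *start;
		const char *end;

		while (*p == ',' || isspace((unsigned char) *p))
			p++;				/* skip separators and leading blanks */
		start = p;
		while (*p != '\0' && *p != ',')
			p++;				/* advance to the end of this item */
		end = p;
		while (end > start && isspace((unsigned char) end[-1]))
			end--;				/* drop trailing blanks */
		record_library(start, (size_t) (end - start));
	}
	return loaded_count;
}
```

Calling split_library_list("pg_stat_statements, auto_explain") records the two names in order, which mirrors how each listed library gets its turn to be loaded.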

Hook

A hook is simply a global function pointer.

Working principle:

  • Each hook is a global function pointer that the server initializes to NULL. At the point where the hook applies, the server first checks whether the pointer is NULL. If it is not, the server calls the function it points to; otherwise it runs the standard function.
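
The check described above can be sketched in a few lines of plain C. This is an illustration only: demo_hook, standard_demo, demo, and counting_demo_hook are hypothetical names, not PostgreSQL symbols, and the pointer is file-local here whereas a server hook is a true global.

```c
#include <stddef.h>

/* A hook is a function pointer that stays NULL until someone installs it. */
typedef int (*demo_hook_type) (int arg);
static demo_hook_type demo_hook = NULL;

/* The built-in behaviour, used when no hook is installed. */
static int
standard_demo(int arg)
{
	return arg + 1;
}

/* The call site: use the hook if one is set, else run the standard code. */
static int
demo(int arg)
{
	if (demo_hook)
		return (*demo_hook) (arg);
	return standard_demo(arg);
}

/* What an extension might install: do extra work, then delegate. */
static int	demo_calls_seen = 0;

static int
counting_demo_hook(int arg)
{
	demo_calls_seen++;
	return standard_demo(arg);
}
```

With demo_hook left at NULL, demo(1) goes straight to standard_demo. After assigning demo_hook = counting_demo_hook, the same call returns the same result but also increments demo_calls_seen.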

Set the function pointer:

  • When the server loads a shared library, it first maps the library into memory and then calls the library's _PG_init function, if one is defined. Most extension libraries provide this function, so it is where we install our own hooks:

void
_PG_init(void)
{
	/*
	 * Install hooks.
	 */
	prev_shmem_request_hook = shmem_request_hook;
	shmem_request_hook = pgss_shmem_request;
	prev_shmem_startup_hook = shmem_startup_hook;
	shmem_startup_hook = pgss_shmem_startup;
	prev_post_parse_analyze_hook = post_parse_analyze_hook;
	post_parse_analyze_hook = pgss_post_parse_analyze;
	prev_planner_hook = planner_hook;
	planner_hook = pgss_planner;
	prev_ExecutorStart = ExecutorStart_hook;
	ExecutorStart_hook = pgss_ExecutorStart;
	prev_ExecutorRun = ExecutorRun_hook;
	ExecutorRun_hook = pgss_ExecutorRun;
	prev_ExecutorFinish = ExecutorFinish_hook;
	ExecutorFinish_hook = pgss_ExecutorFinish;
	prev_ExecutorEnd = ExecutorEnd_hook;
	ExecutorEnd_hook = pgss_ExecutorEnd;
	prev_ProcessUtility = ProcessUtility_hook;
	ProcessUtility_hook = pgss_ProcessUtility;
}
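
Each assignment pair above saves the previous hook value before overwriting it, so that several extensions can share one hook. The following sketch shows that chaining with two hypothetical extensions (ext_a and ext_b) and a hypothetical server hook (start_hook); none of these names come from PostgreSQL.

```c
#include <stddef.h>
#include <string.h>

typedef void (*start_hook_type) (void);
static start_hook_type start_hook = NULL;	/* hypothetical server hook */

static char call_log[64];		/* records the order in which code ran */

static void
standard_start(void)
{
	strcat(call_log, "S");
}

/* Call site in the "server". */
static void
server_start(void)
{
	if (start_hook)
		(*start_hook) ();
	else
		standard_start();
}

/* Extension A: remember the previous hook, then install our own. */
static start_hook_type ext_a_prev = NULL;

static void
ext_a_start(void)
{
	strcat(call_log, "A");		/* extension A's own work */
	if (ext_a_prev)
		ext_a_prev();
	else
		standard_start();
}

static void
ext_a_init(void)
{
	ext_a_prev = start_hook;
	start_hook = ext_a_start;
}

/* Extension B: same pattern, loaded after A. */
static start_hook_type ext_b_prev = NULL;

static void
ext_b_start(void)
{
	strcat(call_log, "B");		/* extension B's own work */
	if (ext_b_prev)
		ext_b_prev();
	else
		standard_start();
}

static void
ext_b_init(void)
{
	ext_b_prev = start_hook;
	start_hook = ext_b_start;
}
```

If ext_a_init() runs before ext_b_init(), a later server_start() enters B first, then A, then the standard function, so call_log ends up as "BAS". An extension that forgot to call its saved previous hook would silently cut the others out of the chain.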

When the main server process starts and sets up shared memory, it checks whether shmem_startup_hook is set. If it is, the hook function is executed.

How query execution changes once pg_stat_statements is installed

Start

void
ExecutorStart(QueryDesc *queryDesc, int eflags)
{
	pgstat_report_query_id(queryDesc->plannedstmt->queryId, false);

	if (ExecutorStart_hook)
		(*ExecutorStart_hook) (queryDesc, eflags);
	else
		standard_ExecutorStart(queryDesc, eflags);
}

Before the extension is loaded, ExecutorStart_hook == NULL, so this function calls standard_ExecutorStart.
Once the hook is installed, it calls pgss_ExecutorStart instead, which is defined in pg_stat_statements.c:

/*
 * ExecutorStart hook: start up tracking if needed
 */
static void
pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
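{
	if (prev_ExecutorStart)
		prev_ExecutorStart(queryDesc, eflags);
	else
		standard_ExecutorStart(queryDesc, eflags);

	/*
	 * If the query has queryId zero, don't track it.  This prevents double
	 * counting of optimizable statements that are directly contained in
	 * utility statements.
	 */
	if (pgss_enabled(exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0))
	{
		/*
		 * Set up to track total elapsed time in ExecutorRun.  Make sure the
		 * space is allocated in the per-query context so it will go away at
		 * ExecutorEnd.
		 */
		if (queryDesc->totaltime == NULL)
		{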
			MemoryContext oldcxt;

			oldcxt = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);
			queryDesc->totaltime = InstrAlloc(1, INSTRUMENT_ALL, false);
			MemoryContextSwitchTo(oldcxt);
		}
	}
}

This function still runs standard_ExecutorStart (or the previously installed hook). Afterwards, if tracking is enabled for the current nesting level and the query has a nonzero queryId, it allocates queryDesc->totaltime so that the executor accumulates the query's total run time.

Run

ExecutorRun follows the same pattern as ExecutorStart:

void
ExecutorRun(QueryDesc *queryDesc,
			ScanDirection direction, uint64 count,
			bool execute_once)
{
	if (ExecutorRun_hook)
		(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
	else
		standard_ExecutorRun(queryDesc, direction, count, execute_once);
}

When pg_stat_statements is not loaded, ExecutorRun_hook == NULL, so this function calls standard_ExecutorRun.

Once the hook is installed, it calls pgss_ExecutorRun instead:

static void
pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count,
				 bool execute_once)
{
	exec_nested_level++;
	PG_TRY();
	{
		if (prev_ExecutorRun)
			prev_ExecutorRun(queryDesc, direction, count, execute_once);
		else
			standard_ExecutorRun(queryDesc, direction, count, execute_once);
	}
	PG_FINALLY();
	{
		exec_nested_level--;
	}
	PG_END_TRY();
}

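This function still calls standard_ExecutorRun (or the previously installed hook), but it additionally tracks the nesting depth in exec_nested_level. PG_FINALLY ensures the counter is decremented even if execution raises an error. The query's total run time is accumulated in queryDesc->totaltime, which pgss_ExecutorStart allocated.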

ExecutorFinish and ExecutorEnd call into pg_stat_statements in the same way as ExecutorRun.
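
The reason for wrapping the call in PG_TRY/PG_FINALLY is that an error unwinds the stack with a non-local jump, which would otherwise skip the decrement. The sketch below imitates that situation with standard setjmp/longjmp. It is a simplified model, not PostgreSQL's macro implementation, and every name in it (error_stack, throw_error, run_nested, run_top_level, and the body functions) is hypothetical.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf *error_stack = NULL;	/* innermost active handler, if any */
static int	nested_level = 0;		/* plays the role of exec_nested_level */
static int	deepest_level_seen = 0;

/* Hypothetical error raise: jump to the innermost handler. */
static void
throw_error(void)
{
	if (error_stack == NULL)
		abort();
	longjmp(*error_stack, 1);
}

/* Run body() one nesting level deeper and always restore the level. */
static void
run_nested(void (*body) (void))
{
	jmp_buf		local;
	jmp_buf    *volatile saved = error_stack;
	int			failed;

	nested_level++;
	if (setjmp(local) == 0)
	{
		error_stack = &local;
		body();
		failed = 0;
	}
	else
		failed = 1;

	/* "finally" part: runs on both the normal and the error path */
	error_stack = saved;
	nested_level--;

	if (failed)
		throw_error();			/* re-throw to the outer handler */
}

static void
note_level(void)
{
	if (nested_level > deepest_level_seen)
		deepest_level_seen = nested_level;
}

static void
ok_body(void)
{
	note_level();
}

static void
failing_body(void)
{
	note_level();
	throw_error();
}

static void
nested_failing_body(void)
{
	run_nested(failing_body);	/* a query started from inside a query */
}

/* Top-level driver: returns 1 if an error propagated out, 0 otherwise. */
static int
run_top_level(void (*body) (void))
{
	jmp_buf		top;
	int			caught;

	if (setjmp(top) == 0)
	{
		error_stack = &top;
		run_nested(body);
		caught = 0;
	}
	else
		caught = 1;
	error_stack = NULL;
	return caught;
}
```

After run_top_level(nested_failing_body), the error has passed through two nesting levels, yet nested_level is back at zero. Without the cleanup step in run_nested, the counter would stay elevated and later top-level statements would be misclassified as nested.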

Hook                      Description
pgss_shmem_startup        Loads the external statistics file and initializes pgss_hash
pgss_post_parse_analyze   Initializes query jumbling and handles the query ID derived from the query tree; it is the first of these hooks to run when a statement is executed
pgss_ExecutorStart        Allocates queryDesc->totaltime when tracking is enabled
pgss_ExecutorRun          Wraps query execution and tracks the nesting depth while the run time accumulates in queryDesc->totaltime
pgss_ExecutorFinish       Wraps the executor's finish step in the same way as pgss_ExecutorRun
pgss_ExecutorEnd          Runs when query processing is complete and stores the statistics via pgss_store

Hash creation and initialization

When the server starts and finds that shmem_startup_hook is set, it executes the hook. For this extension the hook is pgss_shmem_startup, which creates and initializes a hash table in shared memory for storing the statistics.

Data storage

When pgss_ExecutorEnd runs, it calls pgss_store to save the statement's statistics in the shared-memory hash table. The excerpts below show the hash table creation in pgss_shmem_startup, followed by pgss_ExecutorEnd and pgss_store:

static void
pgss_shmem_startup(void)
{
    ……
	pgss_hash = ShmemInitHash("pg_stat_statements hash",
							  pgss_max, pgss_max,
							  &info,
							  HASH_ELEM | HASH_BLOBS);
    ……
}

/*
 * ExecutorEnd hook: store results if needed
 *
 * Note: this excerpt comes from an older release than the other listings
 * here; pgss_enabled() and pgss_store() take fewer arguments in it.
 */
static void
pgss_ExecutorEnd(QueryDesc *queryDesc)
{
    if (queryDesc->totaltime && pgss_enabled())
    {
        /* Make sure stats accumulation is done. */
        InstrEndLoop(queryDesc->totaltime);

        pgss_store(queryDesc->sourceText, queryDesc->totaltime->total,
                   queryDesc->estate->es_processed, &queryDesc->totaltime->bufusage);
    }

    if (prev_ExecutorEnd)
        prev_ExecutorEnd(queryDesc);
    else
        standard_ExecutorEnd(queryDesc);
}

static void pgss_store(const char *query, uint64 queryId,
		   int query_location, int query_len,
		   pgssStoreKind kind,
		   double total_time, uint64 rows,
		   const BufferUsage *bufusage,
		   const WalUsage *walusage,
		   const struct JitInstrumentation *jitusage,
		   JumbleState *jstate)
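{
    /*
        This function stores the statistics of a query execution in shared memory, for later analysis and query performance tuning.

        Parameters:
        - `query`: query text
        - `queryId`: query identifier
        - `query_location`: position of the statement text within the original query string
        - `query_len`: length of the statement text
        - `kind`: kind of statistics (planning or execution)
        - `total_time`: total time spent on the query
        - `rows`: number of rows affected
        - `bufusage`: buffer usage statistics
        - `walusage`: Write-Ahead Logging (WAL) usage statistics
        - `jitusage`: JIT (Just-In-Time) compilation usage statistics
        - `jstate`: query jumble state information

        Flow:
        - Check the inputs: make sure the pg_stat_statements extension is enabled and its shared memory has been allocated.
        - If `queryId` is 0, no module has computed a query identifier, so nothing is stored.
        - Normalize the query text and handle multi-statement query strings.
        - Build a `pgssHashKey` to use as the search key in the hash table.
        - Look up an existing statistics entry in the hash table while holding a shared lock.
        - If no entry exists, create one. This may involve generating the normalized query text, writing the query text to the external file, and creating the new hash table entry.
        - Update the entry's statistics, including the number of calls, total time, rows, buffer usage, WAL usage, and so on. Which counters are updated depends on the kind of statistics.
        - Finally release the lock and free any memory that is no longer needed.

        This function is called during query execution to capture and store the key statistics about query performance, which help database administrators and developers with tuning and troubleshooting.
    */
}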
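
The flow summarized in the comment above (skip a zero query ID, look up the entry, create it if missing, then update the counters) can be sketched without any PostgreSQL infrastructure. This is a simplified, single-threaded model under stated assumptions: a small fixed array stands in for the shared hash table, there is no locking or query-text file, and DemoEntry, demo_table, demo_lookup, and demo_store are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 4

typedef struct DemoEntry
{
	uint32_t	userid;			/* key: user OID */
	uint32_t	dbid;			/* key: database OID */
	uint64_t	queryid;		/* key: query identifier */
	int			in_use;			/* is this slot occupied? */
	int64_t		calls;
	double		total_time;
	int64_t		rows;
} DemoEntry;

/* Stand-in for the shared hash table (zero-initialized, so all slots free). */
static DemoEntry demo_table[DEMO_MAX_ENTRIES];

/* Find the entry for a key, or return NULL if there is none. */
static DemoEntry *
demo_lookup(uint32_t userid, uint32_t dbid, uint64_t queryid)
{
	for (int i = 0; i < DEMO_MAX_ENTRIES; i++)
	{
		DemoEntry  *e = &demo_table[i];

		if (e->in_use && e->userid == userid &&
			e->dbid == dbid && e->queryid == queryid)
			return e;
	}
	return NULL;
}

/* Record one execution; returns 1 if it was stored, 0 if it was skipped. */
static int
demo_store(uint32_t userid, uint32_t dbid, uint64_t queryid,
		   double total_time, int64_t rows)
{
	DemoEntry  *e;

	if (queryid == 0)
		return 0;				/* no query ID was computed: do not track */

	e = demo_lookup(userid, dbid, queryid);
	if (e == NULL)
	{
		/* No entry yet: claim the first free slot. */
		for (int i = 0; i < DEMO_MAX_ENTRIES && e == NULL; i++)
			if (!demo_table[i].in_use)
				e = &demo_table[i];
		if (e == NULL)
			return 0;			/* table full; this sketch just gives up */
		e->in_use = 1;
		e->userid = userid;
		e->dbid = dbid;
		e->queryid = queryid;
		e->calls = 0;
		e->total_time = 0.0;
		e->rows = 0;
	}

	/* Accumulate the statistics for this execution. */
	e->calls++;
	e->total_time += total_time;
	e->rows += rows;
	return 1;
}
```

Two calls with the same key end up in one entry whose calls, total_time, and rows are the sums of both executions, which is the behaviour the pg_stat_statements view exposes per statement.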

The statistics themselves are stored in the Counters structure. Here is its definition:

/*
 * The actual stats counters kept within pgssEntry.
 */
typedef struct Counters
{
	int64		calls[PGSS_NUMKIND];             /* # of times planned/executed */
	double		total_time[PGSS_NUMKIND];        /* total planning/execution time, in msec */
	double		min_time[PGSS_NUMKIND];          /* minimum planning/execution time, in msec */
	double		max_time[PGSS_NUMKIND];          /* maximum planning/execution time, in msec */
	double		mean_time[PGSS_NUMKIND];         /* mean planning/execution time, in msec */
	double		sum_var_time[PGSS_NUMKIND];      /* sum of variances in planning/execution time, in msec */
	int64		rows;                            /* total # of retrieved or affected rows */
	int64		shared_blks_hit;                 /* # of shared buffer hits */
	int64		shared_blks_read;                /* # of shared disk blocks read */
	int64		shared_blks_dirtied;             /* # of shared disk blocks dirtied */
	int64		shared_blks_written;             /* # of shared disk blocks written */
	int64		local_blks_hit;                  /* # of local buffer hits */
	int64		local_blks_read;                 /* # of local disk blocks read */
	int64		local_blks_dirtied;              /* # of local disk blocks dirtied */
	int64		local_blks_written;              /* # of local disk blocks written */
	int64		temp_blks_read;                  /* # of temp blocks read */
	int64		temp_blks_written;               /* # of temp blocks written */
	double		blk_read_time;                   /* time spent reading blocks, in msec */
	double		blk_write_time;                  /* time spent writing blocks, in msec */
	double		temp_blk_read_time;              /* time spent reading temp blocks, in msec */
	double		temp_blk_write_time;             /* time spent writing temp blocks, in msec */
	double		usage;                           /* usage factor */
	int64		wal_records;                     /* # of WAL records generated */
	int64		wal_fpi;                         /* # of WAL full page images generated */
	uint64		wal_bytes;                       /* total amount of WAL generated, in bytes */
	int64		jit_functions;                   /* total number of JIT functions emitted */
	double		jit_generation_time;             /* total time to generate JIT code */
	int64		jit_inlining_count;              /* number of times inlining time has been > 0 */
	double		jit_inlining_time;               /* total time to inline JIT code */
	int64		jit_optimization_count;          /* number of times optimization time has been > 0 */
	double		jit_optimization_time;           /* total time to optimize JIT code */
	int64		jit_emission_count;              /* number of times emission time has been > 0 */
	double		jit_emission_time;               /* total time to emit JIT code */
} Counters;
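
The fields calls, total_time, min_time, max_time, mean_time, and sum_var_time can all be maintained one sample at a time, without keeping the individual timings. A common way to do that is Welford's method, and the sketch below assumes that approach. DemoTiming, demo_timing_add, and demo_timing_variance are hypothetical names, and the struct holds a single timing kind rather than one slot per PGSS_NUMKIND.

```c
#include <stdint.h>

typedef struct DemoTiming
{
	int64_t		calls;			/* number of samples */
	double		total_time;		/* sum of samples, in msec */
	double		min_time;
	double		max_time;
	double		mean_time;		/* running mean */
	double		sum_var_time;	/* running sum of squared deviations */
} DemoTiming;

/* Fold one elapsed time (in msec) into the running statistics. */
static void
demo_timing_add(DemoTiming *t, double elapsed_ms)
{
	t->calls++;
	t->total_time += elapsed_ms;

	if (t->calls == 1)
	{
		/* First sample: it is the min, the max, and the mean. */
		t->min_time = elapsed_ms;
		t->max_time = elapsed_ms;
		t->mean_time = elapsed_ms;
	}
	else
	{
		/* Welford's update: uses the mean before and after this sample. */
		double		old_mean = t->mean_time;

		t->mean_time += (elapsed_ms - old_mean) / t->calls;
		t->sum_var_time += (elapsed_ms - old_mean) * (elapsed_ms - t->mean_time);

		if (elapsed_ms < t->min_time)
			t->min_time = elapsed_ms;
		if (elapsed_ms > t->max_time)
			t->max_time = elapsed_ms;
	}
}

/* Population variance; the standard deviation is its square root. */
static double
demo_timing_variance(const DemoTiming *t)
{
	return (t->calls > 1) ? t->sum_var_time / t->calls : 0.0;
}
```

Starting from a zeroed DemoTiming and adding the samples 2, 4, and 6 gives a mean of 4 and a sum_var_time of 8, so the variance is 8/3. Storing the running sum rather than the variance itself is what lets the standard deviation be derived later, when the data is read.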

Counters is embedded in pgssEntry, which is keyed by pgssHashKey:

typedef struct pgssEntry
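{
	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
	Counters	counters;		/* the statistics for this query */
	Size		query_offset;	/* query text offset in external file */
	int			query_len;		/* # of valid bytes in query string, or -1 */
	int			encoding;		/* query text encoding */
	slock_t		mutex;			/* protects the counters only */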
} pgssEntry;

typedef struct pgssHashKey
{
	Oid			userid;			/* user OID */
	Oid			dbid;			/* database OID */
	uint64		queryid;		/* query identifier */
	bool		toplevel;		/* query executed at top level */
} pgssHashKey;
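
Because the hash table is created with HASH_BLOBS, keys are treated as raw bytes. A struct like this one usually has padding after the trailing bool, so a key should be cleared before its fields are set, otherwise two logically equal keys might differ in their padding bytes. The sketch below illustrates the idea with a hypothetical DemoHashKey and helper functions. It relies on the compiler leaving padding untouched when individual members are assigned, which is the usual behaviour in practice but is not guaranteed by the C standard.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct DemoHashKey
{
	uint32_t	userid;			/* user OID */
	uint32_t	dbid;			/* database OID */
	uint64_t	queryid;		/* query identifier */
	bool		toplevel;		/* typically followed by padding bytes */
} DemoHashKey;

/* Build a key whose every byte, padding included, is well defined. */
static void
demo_key_init(DemoHashKey *key, uint32_t userid, uint32_t dbid,
			  uint64_t queryid, bool toplevel)
{
	memset(key, 0, sizeof(*key));	/* zero the padding first */
	key->userid = userid;
	key->dbid = dbid;
	key->queryid = queryid;
	key->toplevel = toplevel;
}

/* Byte-wise comparison, the way a blob-keyed hash table compares keys. */
static bool
demo_key_equal(const DemoHashKey *a, const DemoHashKey *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two keys initialized with the same four values compare equal even if the memory they occupy held different garbage beforehand. Changing only toplevel yields a different key, which is why the same statement can appear twice in the view, once as a top-level statement and once as a nested one.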

Fetching data

The pg_stat_statements_internal function reads all the entries out of the hash table:

static void
pg_stat_statements_internal(FunctionCallInfo fcinfo,
							pgssVersion api_version,
							bool showtext)
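{
    /*
        Fetch each entry's statistics counters from the hash table and compute the standard deviation where needed.

        Build one row from the statistics and related data, and add it to the result set.
    */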
}
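
As a rough model of that read path, the sketch below walks an array of entries, takes a private copy of each entry's counters, skips entries that were never executed, and derives a per-row mean. It is an assumption-laden simplification: DemoCounters, DemoStatEntry, DemoRow, and demo_fetch_rows are hypothetical, the array stands in for the hash table, and the locking that real shared-memory code would need around the copy is only mentioned in a comment.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoCounters
{
	int64_t		calls;
	double		total_time;		/* in msec */
	int64_t		rows;
} DemoCounters;

typedef struct DemoStatEntry
{
	uint64_t	queryid;
	DemoCounters counters;
} DemoStatEntry;

typedef struct DemoRow
{
	uint64_t	queryid;
	int64_t		calls;
	double		total_time;
	double		mean_time;		/* derived when the row is built */
	int64_t		rows;
} DemoRow;

/* Turn entries into output rows; returns the number of rows written. */
static size_t
demo_fetch_rows(const DemoStatEntry *entries, size_t nentries,
				DemoRow *out, size_t out_cap)
{
	size_t		nrows = 0;

	for (size_t i = 0; i < nentries; i++)
	{
		/*
		 * Work on a snapshot so the row is internally consistent.  Code
		 * reading shared memory would hold the entry's lock while copying.
		 */
		DemoCounters tmp = entries[i].counters;

		if (tmp.calls == 0)
			continue;			/* never executed: nothing to report */
		if (nrows == out_cap)
			break;				/* output buffer is full */

		out[nrows].queryid = entries[i].queryid;
		out[nrows].calls = tmp.calls;
		out[nrows].total_time = tmp.total_time;
		out[nrows].mean_time = tmp.total_time / (double) tmp.calls;
		out[nrows].rows = tmp.rows;
		nrows++;
	}
	return nrows;
}
```

Given three entries of which the middle one has zero calls, the function produces two rows. Copying the counters first keeps the expensive part, building the row, outside of any lock.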

Origin blog.csdn.net/weixin_47895938/article/details/132345989