PostgreSQL data recovery tool: pg_filedump

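0. Overview

Sooner or later a database will lose data because of some failure, and the data then has to be recovered. With a backup you can usually restore directly, but things get harder if there is no backup or the data files themselves are damaged.

In PostgreSQL, if only ordinary data files are damaged, we can set zero_damaged_pages=on so that damaged blocks are skipped when the table is read, and then copy the readable rows into a new table. But what if the metadata is damaged as well and the database can no longer start? In that case we need a tool that reads the data straight from the data files, like DUL or ODU for Oracle.

For PostgreSQL, pg_filedump provides similar functionality.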

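1. Installation

Installation is simple: clone the source, then build and install it. The build relies on pg_config, so the PostgreSQL development files should be installed and pg_config should be on the PATH.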

git clone git://git.postgresql.org/git/pg_filedump.git  
  
cd pg_filedump  
    
make 

make install 

Usage:

Usage: pg_filedump [-abcdfhikxy] [-R startblock [endblock]] [-D attrlist] [-S blocksize] [-s segsize] [-n segnumber] file

Display formatted contents of a PostgreSQL heap/index/control file
Defaults are: relative addressing, range of the entire file, block
               size as listed on block 0 in the file

The following options are valid for heap and index files:
  -a  Display absolute addresses when formatting (Block header
      information is always block relative)
  -b  Display binary block images within a range (Option will turn
      off all formatting options)
  -d  Display formatted block content dump (Option will turn off
      all other formatting options)
  -D  Decode tuples using given comma separated list of types
      Supported types:
        bigint bigserial bool char charN date float float4 float8 int
        json macaddr name oid real serial smallint smallserial text
        time timestamp timetz uuid varchar varcharN xid xml
      ~ ignores all attributes left in a tuple
  -f  Display formatted block content dump along with interpretation
  -h  Display this information
  -i  Display interpreted item details
  -k  Verify block checksums
  -o  Do not dump old values.
  -R  Display specific block ranges within the file (Blocks are
      indexed from 0)
        [startblock]: block to start at
        [endblock]: block to end at
      A startblock without an endblock will format the single block
  -s  Force segment size to [segsize]
  -t  Dump TOAST files
  -v  Ouput additional information about TOAST relations
  -n  Force segment number to [segnumber]
  -S  Force block size to [blocksize]
  -x  Force interpreted formatting of block items as index items
  -y  Force interpreted formatting of block items as heap items

The following options are valid for control files:
  -c  Interpret the file listed as a control file
  -f  Display formatted content dump along with interpretation
  -S  Force block size to [blocksize]

Report bugs to <[email protected]>

2. Usage test

2.1. Create a table and insert test data

bill@bill=>create table t_dump(id int,info text,crt_time timestamp);
CREATE TABLE

bill@bill=>insert into t_dump select generate_series(1,10),md5(random()::text),clock_timestamp();
INSERT 0 10

bill@bill=>select * from t_dump ;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  1 | 9af995e95f321e3521fcb6d41208af40 | 2021-02-03 13:26:23.502727
  2 | 2544880c1a22986487e563a6c89f377b | 2021-02-03 13:26:23.502932
  3 | de61423aaf82b8a1bbb49dc3d7809863 | 2021-02-03 13:26:23.502945
  4 | 398af8893872a1860e08ac424ecce885 | 2021-02-03 13:26:23.502951
  5 | 374e4a32688ec70a46fae44fda9e4ed8 | 2021-02-03 13:26:23.502958
  6 | bc8911e89c5be9329abf29cf68f5b4ce | 2021-02-03 13:26:23.502963
  7 | 136bfa992d70eb33b3cdd9e53376261b | 2021-02-03 13:26:23.50297
  8 | ffaa31f1a5ae272727c53ba37dd77706 | 2021-02-03 13:26:23.502975
  9 | e6fe762a144d51e15ea6bdefe2362242 | 2021-02-03 13:26:23.502982
 10 | 8669af9ca762b99e307f9ba2de6d77d2 | 2021-02-03 13:26:23.502988
(10 rows)

2.2. Find the file that stores the table

bill@bill=>select pg_relation_filepath('t_dump');
 pg_relation_filepath
----------------------
 base/16385/25316
(1 row)

-- run a checkpoint to make sure the data is flushed to disk
bill@bill=>checkpoint ;
CHECKPOINT
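
The path returned by pg_relation_filepath is relative to the data directory and, for a table in the default tablespace, has the form base/<database OID>/<relfilenode>. The following is a minimal sketch of how that path maps to files on disk. The data directory /pgdata is a hypothetical placeholder, and the ".N" suffix rule for additional segments assumes the standard layout in which a large relation is split into segment files.

```python
from pathlib import PurePosixPath


def split_relation_path(relpath):
    """Split 'base/<database oid>/<relfilenode>' into two integers."""
    parts = PurePosixPath(relpath).parts
    if len(parts) != 3 or parts[0] != "base":
        raise ValueError(f"not a default-tablespace relation path: {relpath!r}")
    return int(parts[1]), int(parts[2])


def segment_path(data_dir, relpath, segno=0):
    """Return the file for one segment: segment 0 has no suffix, segment N ends in '.N'."""
    base = PurePosixPath(data_dir) / relpath
    return str(base) if segno == 0 else f"{base}.{segno}"


db_oid, relfilenode = split_relation_path("base/16385/25316")
print(db_oid, relfilenode)                              # 16385 25316
print(segment_path("/pgdata", "base/16385/25316"))      # /pgdata/base/16385/25316
print(segment_path("/pgdata", "base/16385/25316", 1))   # /pgdata/base/16385/25316.1
```

The test table here is tiny, so only the first file exists; pg_filedump is run against it from inside base/16385.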

2.3. Read the data file with pg_filedump

pg13@cnndr4pptliot-> pg_filedump 25316

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: None
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      64 (0x0040)
 Block: Size 8192  Version    4            Upper    7472 (0x1d30)
 LSN:  logid      0 recoff 0x7595d5b8      Special  8192 (0x2000)
 Items:   10                      Free Space: 7408
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 64

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL


*** End of File Encountered. Last Block Read: 0 ***
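
The header fields above can be cross-checked by hand. The sketch below assumes the 24-byte PageHeaderData layout described in the PostgreSQL page layout documentation, a little-endian machine, and 4-byte line pointers. It rebuilds the header from the values printed above (rather than reading the real file) and derives the item count and free space from Lower and Upper.

```python
import struct

# PageHeaderData: pd_lsn (2 x uint32), then pd_checksum, pd_flags, pd_lower,
# pd_upper, pd_special, pd_pagesize_version (6 x uint16), then pd_prune_xid.
PAGE_HEADER = struct.Struct("<IIHHHHHHI")   # 24 bytes, little-endian assumed
ITEM_ID_SIZE = 4                            # each line pointer takes 4 bytes


def parse_page_header(page):
    """Decode the fixed page header found at the start of a block."""
    (xlogid, xrecoff, checksum, flags, lower, upper,
     special, size_version, prune_xid) = PAGE_HEADER.unpack_from(page, 0)
    return {
        "lsn": (xlogid, xrecoff),
        "checksum": checksum,
        "flags": flags,
        "lower": lower,
        "upper": upper,
        "special": special,
        "block_size": size_version & 0xFF00,   # high byte holds the page size
        "version": size_version & 0x00FF,      # low byte holds the layout version
        "prune_xid": prune_xid,
        # The line pointer array fills the space between the header and Lower.
        "items": (lower - PAGE_HEADER.size) // ITEM_ID_SIZE,
        # Free space as reported by pg_filedump: the gap between Lower and Upper.
        "free_space": upper - lower,
    }


# Header rebuilt from the printed values: Lower 64, Upper 7472, Special 8192.
sample = PAGE_HEADER.pack(0, 0x7595D5B8, 0, 0, 64, 7472, 8192, 8192 | 4, 0)
header = parse_page_header(sample)
print(header["items"], header["free_space"])   # 10 7408
```

With Lower = 64, the line pointer array is 64 - 24 = 40 bytes, which is 10 items, and Upper - Lower = 7472 - 64 = 7408 bytes of free space, matching the dump.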

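This output only describes the block layout, though, and does not show the rows themselves. To get a readable form, pass the column types with the -D option:

pg13@cnndr4pptliot-> pg_filedump -D int,text,timestamp 25316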

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: -D int,text,timestamp
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      64 (0x0040)
 Block: Size 8192  Version    4            Upper    7472 (0x1d30)
 LSN:  logid      0 recoff 0x7595d5b8      Special  8192 (0x2000)
 Items:   10                      Free Space: 7408
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
 Length (including item array): 64

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
COPY: 2 2544880c1a22986487e563a6c89f377b        2021-02-03 13:26:23.502932
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
COPY: 3 de61423aaf82b8a1bbb49dc3d7809863        2021-02-03 13:26:23.502945
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
COPY: 4 398af8893872a1860e08ac424ecce885        2021-02-03 13:26:23.502951
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
COPY: 5 374e4a32688ec70a46fae44fda9e4ed8        2021-02-03 13:26:23.502958
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
COPY: 6 bc8911e89c5be9329abf29cf68f5b4ce        2021-02-03 13:26:23.502963
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
COPY: 7 136bfa992d70eb33b3cdd9e53376261b        2021-02-03 13:26:23.502970
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
COPY: 8 ffaa31f1a5ae272727c53ba37dd77706        2021-02-03 13:26:23.502975
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
COPY: 9 e6fe762a144d51e15ea6bdefe2362242        2021-02-03 13:26:23.502982
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL
COPY: 10        8669af9ca762b99e307f9ba2de6d77d2        2021-02-03 13:26:23.502988


*** End of File Encountered. Last Block Read: 0 ***

The COPY: lines now show the actual rows stored in the table.
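
To load the recovered rows elsewhere, the COPY: lines have to be separated from the rest of the dump. A minimal sketch follows, assuming the fields after the "COPY: " prefix are tab-separated (the listing above shows them as runs of spaces) and that no unescaping of COPY text format is needed for this data. The function name is illustrative and not part of pg_filedump.

```python
def extract_copy_rows(dump_text, delimiter="\t"):
    """Collect the fields of every 'COPY: ' line in pg_filedump -D output."""
    prefix = "COPY: "
    rows = []
    for line in dump_text.splitlines():
        if line.startswith(prefix):
            rows.append(line[len(prefix):].split(delimiter))
    return rows


sample = (
    " Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL\n"
    "COPY: 1\t9af995e95f321e3521fcb6d41208af40\t2021-02-03 13:26:23.502727\n"
    " Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL\n"
    "COPY: 2\t2544880c1a22986487e563a6c89f377b\t2021-02-03 13:26:23.502932\n"
)
for row in extract_copy_rows(sample):
    print(row)
```

In practice the text would come from running pg_filedump and capturing its standard output; the rows can then be written to a file and loaded into a new table.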

This is still far from enough, however. If the table contains dead tuples, the output above gives no way to tell which rows are the ones we actually need.

bill@bill=>update t_dump set info = 'bill' where id = 1;
UPDATE 1

bill@bill=>checkpoint;
CHECKPOINT

bill@bill=>select * from t_dump;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  2 | 2544880c1a22986487e563a6c89f377b | 2021-02-03 13:26:23.502932
  3 | de61423aaf82b8a1bbb49dc3d7809863 | 2021-02-03 13:26:23.502945
  4 | 398af8893872a1860e08ac424ecce885 | 2021-02-03 13:26:23.502951
  5 | 374e4a32688ec70a46fae44fda9e4ed8 | 2021-02-03 13:26:23.502958
  6 | bc8911e89c5be9329abf29cf68f5b4ce | 2021-02-03 13:26:23.502963
  7 | 136bfa992d70eb33b3cdd9e53376261b | 2021-02-03 13:26:23.50297
  8 | ffaa31f1a5ae272727c53ba37dd77706 | 2021-02-03 13:26:23.502975
  9 | e6fe762a144d51e15ea6bdefe2362242 | 2021-02-03 13:26:23.502982
 10 | 8669af9ca762b99e307f9ba2de6d77d2 | 2021-02-03 13:26:23.502988
  1 | bill                             | 2021-02-03 13:26:23.502727
(10 rows)

Dump the file again:

pg13@cnndr4pptliot-> pg_filedump -D int,text,timestamp  25316

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 25316
* Options used: -D int,text,timestamp
*******************************************************************

Block    0 ********************************************************
<Header> -----
 Block Offset: 0x00000000         Offsets: Lower      68 (0x0044)
 Block: Size 8192  Version    4            Upper    7424 (0x1d00)
 LSN:  logid      0 recoff 0x7595db70      Special  8192 (0x2000)
 Items:   11                      Free Space: 7356
 Checksum: 0x0000  Prune XID: 0x000008d0  Flags: 0x0000 ()
 Length (including item array): 68

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727
 Item   2 -- Length:   72  Offset: 8048 (0x1f70)  Flags: NORMAL
COPY: 2 2544880c1a22986487e563a6c89f377b        2021-02-03 13:26:23.502932
 Item   3 -- Length:   72  Offset: 7976 (0x1f28)  Flags: NORMAL
COPY: 3 de61423aaf82b8a1bbb49dc3d7809863        2021-02-03 13:26:23.502945
 Item   4 -- Length:   72  Offset: 7904 (0x1ee0)  Flags: NORMAL
COPY: 4 398af8893872a1860e08ac424ecce885        2021-02-03 13:26:23.502951
 Item   5 -- Length:   72  Offset: 7832 (0x1e98)  Flags: NORMAL
COPY: 5 374e4a32688ec70a46fae44fda9e4ed8        2021-02-03 13:26:23.502958
 Item   6 -- Length:   72  Offset: 7760 (0x1e50)  Flags: NORMAL
COPY: 6 bc8911e89c5be9329abf29cf68f5b4ce        2021-02-03 13:26:23.502963
 Item   7 -- Length:   72  Offset: 7688 (0x1e08)  Flags: NORMAL
COPY: 7 136bfa992d70eb33b3cdd9e53376261b        2021-02-03 13:26:23.502970
 Item   8 -- Length:   72  Offset: 7616 (0x1dc0)  Flags: NORMAL
COPY: 8 ffaa31f1a5ae272727c53ba37dd77706        2021-02-03 13:26:23.502975
 Item   9 -- Length:   72  Offset: 7544 (0x1d78)  Flags: NORMAL
COPY: 9 e6fe762a144d51e15ea6bdefe2362242        2021-02-03 13:26:23.502982
 Item  10 -- Length:   72  Offset: 7472 (0x1d30)  Flags: NORMAL
COPY: 10        8669af9ca762b99e307f9ba2de6d77d2        2021-02-03 13:26:23.502988
 Item  11 -- Length:   48  Offset: 7424 (0x1d00)  Flags: NORMAL
COPY: 1 bill    2021-02-03 13:26:23.502727

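The dump now contains 11 records, while the table really holds only 10 rows. The extra record is a dead tuple left behind by the UPDATE. To tell which records are dead tuples, we need more detailed output.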

pg13@cnndr4pptliot-> pg_filedump -D int,text,timestamp  -i -f 25316|less

...

<Data> -----
 Item   1 -- Length:   72  Offset: 8120 (0x1fb8)  Flags: NORMAL
  XMIN: 2255  XMAX: 2256  CID|XVAC: 0
  Block Id: 0  linp Index: 11   Attributes: 3   Size: 24
  infomask: 0x0502 (HASVARWIDTH|XMIN_COMMITTED|XMAX_COMMITTED|HOT_UPDATED)

  1fb8: cf080000 d0080000 00000000 00000000  ................
  1fc8: 0b000340 02051800 01000000 43396166  ...@........C9af
  1fd8: 39393565 39356633 32316533 35323166  995e95f321e3521f
  1fe8: 63623664 34313230 38616634 30000000  cb6d41208af40...
  1ff8: 87a9524d 6d5d0200                    ..RMm]..

COPY: 1 9af995e95f321e3521fcb6d41208af40        2021-02-03 13:26:23.502727

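Item 1 has XMAX 2256 and its infomask includes XMAX_COMMITTED and HOT_UPDATED, which means this row version was replaced by a committed update in transaction 2256, so it is a dead tuple. Its linp Index of 11 points to the new version of the row, Item 11.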

For the internal structure of a data block, see:
https://www.postgresql.org/docs/13/storage-page-layout.html
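
As a worked example of that layout, the 72 bytes of Item 1 from the hex dump earlier can be decoded by hand. This is a sketch under several assumptions: a little-endian machine, the 23-byte HeapTupleHeaderData layout from htup_details.h, a text value short enough to use the 1-byte varlena header, and 8-byte alignment for the timestamp. It only handles this table's (int, text, timestamp) columns and is not a general decoder.

```python
import struct
from datetime import datetime, timedelta

# Bytes of Item 1, copied from the hex dump (offsets 0x1fb8 to 0x1fff).
RAW = bytes.fromhex(
    "cf080000 d0080000 00000000 00000000"
    "0b000340 02051800 01000000 43396166"
    "39393565 39356633 32316533 35323166"
    "63623664 34313230 38616634 30000000"
    "87a9524d 6d5d0200"
)

# t_xmin, t_xmax, t_cid, ctid block (hi, lo), ctid offset, infomask2, infomask, t_hoff
TUPLE_HEADER = struct.Struct("<IIIHHHHHB")
PG_EPOCH = datetime(2000, 1, 1)   # timestamps count microseconds from this date


def decode_item(raw):
    """Decode the header and the (int, text, timestamp) columns of one heap tuple."""
    (xmin, xmax, _cid, blk_hi, blk_lo, posid,
     infomask2, infomask, hoff) = TUPLE_HEADER.unpack_from(raw, 0)
    off = hoff                                   # user data starts at t_hoff
    (col_id,) = struct.unpack_from("<i", raw, off)
    off += 4
    varlena = raw[off]
    if not varlena & 0x01:
        raise ValueError("only 1-byte varlena headers are handled in this sketch")
    total_len = varlena >> 1                     # length includes the header byte
    col_info = raw[off + 1:off + total_len].decode("ascii")
    off = (off + total_len + 7) & ~7             # align the timestamp to 8 bytes
    (micros,) = struct.unpack_from("<q", raw, off)
    return {
        "xmin": xmin,
        "xmax": xmax,
        "ctid": ((blk_hi << 16) | blk_lo, posid),
        "natts": infomask2 & 0x07FF,             # low 11 bits: attribute count
        "hot_updated": bool(infomask2 & 0x4000),
        "xmax_committed": bool(infomask & 0x0400),
        "row": (col_id, col_info, PG_EPOCH + timedelta(microseconds=micros)),
    }


item = decode_item(RAW)
print(item["xmin"], item["xmax"], item["ctid"])   # 2255 2256 (0, 11)
print(item["row"][2])                             # 2021-02-03 13:26:23.502727
```

The decoded values agree with what pg_filedump printed for Item 1: XMIN 2255, XMAX 2256, a ctid pointing at item 11, three attributes, and the same row contents as the COPY: line.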

Relevant source header files:

#include "access/gin_private.h"  
#include "access/gist.h"  
#include "access/hash.h"  
#include "access/htup.h"  
#include "access/htup_details.h"  
#include "access/itup.h"  
#include "access/nbtree.h"  
#include "access/spgist_private.h"  
#include "catalog/pg_control.h"  
#include "storage/bufpage.h"  


Reposted from blog.csdn.net/weixin_39540651/article/details/113603638