Getting Started with TiDB: Physical Data Backup and Restore

Introduction

Snapshot backup is one implementation of full cluster backup. It is built on TiDB's multi-version concurrency control (MVCC) and backs up all data contained in the specified snapshot to the target storage. The size of the backup data is roughly the size of the cluster's (compressed) single-replica data. After the backup completes, you can run a snapshot restore on an empty cluster, or on a cluster with no conflicting data (no identical schemas or tables), to bring the cluster back to the data state at the time of the snapshot; the restore also rebuilds multiple replicas according to the cluster's replica settings.

TiDB Backup & Restore Overview | PingCAP Docs

Full backup

Prerequisites

First, make sure a cluster is up and running:

[root@master ~]# tiup cluster display  tidb-deploy
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster display tidb-deploy
Cluster type:       tidb
Cluster name:       tidb-deploy
Cluster version:    v6.5.1
Deploy user:        root
SSH type:           builtin
Dashboard URL:      http://192.168.66.20:2379/dashboard
Grafana URL:        http://192.168.66.20:3000
ID                   Role          Host           Ports                            OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----                            -------       ------  --------                      ----------
192.168.66.20:9093   alertmanager  192.168.66.20  9093/9094                        linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.66.20:3000   grafana       192.168.66.20  3000                             linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.66.10:2379   pd            192.168.66.10  2379/2380                        linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.20:2379   pd            192.168.66.20  2379/2380                        linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.21:2379   pd            192.168.66.21  2379/2380                        linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.20:9090   prometheus    192.168.66.20  9090/12020                       linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.66.10:4000   tidb          192.168.66.10  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.66.10:9000   tiflash       192.168.66.10  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.66.10:20160  tikv          192.168.66.10  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.66.20:20160  tikv          192.168.66.20  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.66.21:20160  tikv          192.168.66.21  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160

Prepare some data

# Create some test data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
create table temptable(name varchar(10));
insert into temptable values('1');
insert into temptable values('1');
insert into temptable values('1');

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.00 sec)

Run a full backup with br

1. Install the backup tool and run a full backup.

# Install the br backup tool
tiup install br

# --backupts '2022-09-08 13:30:00' specifies the backup snapshot time; if omitted, the snapshot is taken at the time the command runs

tiup br backup full --pd "192.168.66.10:2379" \
    --storage "local:///tmp/datatemp" \
    --ratelimit 128 \
    --log-file backupfull.log

Output:

[root@master ~]# tiup br backup full --pd "192.168.66.10:2379" \
>     --storage "local:///tmp/datatemp" \
>     --ratelimit 128 \
>     --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log 
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:32:19.052 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=17] [ranges-succeed=17] [ranges-failed=0] [backup-checksum=14.147284ms] [backup-fast-checksum=3.783683ms] [backup-total-ranges=80] [backup-total-regions=80] [total-take=4.551226548s] [BackupTS=440393304777424899] [total-kv=1227] [total-kv-size=294.5kB] [average-speed=64.71kB/s] [backup-data-size(after-compressed)=78.38kB] [Size=78375]

2. Check the snapshot timestamp of the backup.

# Check the backup timestamp of the backup files
tiup br validate decode --field="end-version" \
--storage "local:///tmp/datatemp" | tail -n1

Sample output (this example value is from the official docs) corresponds to the physical time 2022-09-08 13:30:00 +0800 CST; for the backup above, the value matches the BackupTS shown in the backup log:
435844546560000000
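The end-version is a TiDB TSO: the bits above the low 18 logical bits are a physical timestamp in milliseconds. As a quick sketch of how to decode the sample value above into wall-clock time (this uses TiDB's standard TSO layout; `date -d` assumes GNU date, as on the Linux hosts in this guide):

```shell
# Decode a TiDB TSO: physical milliseconds live above the low 18 logical bits.
ts=435844546560000000
physical_ms=$(( ts >> 18 ))
echo "$physical_ms"                      # milliseconds since the Unix epoch: 1662615000000
date -u -d "@$(( physical_ms / 1000 ))"  # the snapshot time in UTC (13:30 +0800)
```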

3. Make sure every machine has the complete backup files.

# On 192.168.66.10, copy its backup data to 192.168.66.20
# On 192.168.66.21, copy its backup data to 192.168.66.20
# Then copy the complete backup data from 192.168.66.20 to 192.168.66.10 and 192.168.66.21
scp -r /tmp/datatemp [email protected]:/tmp

scp -r /tmp/datatemp [email protected]:/tmp

scp -r /tmp/datatemp [email protected]:/tmp
scp -r /tmp/datatemp [email protected]:/tmp

Output:

[root@node2 datatemp]# ll
total 212
drwxr-xr-x 2 root root   4096 Mar 28 09:32 1
drwxr-xr-x 2 root root    210 Mar 28 09:44 4
drwxr-xr-x 2 root root   4096 Mar 28 09:44 5
-rw-r--r-- 1 root root     78 Mar 28 09:44 backup.lock
-rw-r--r-- 1 root root 196315 Mar 28 09:44 backupmeta
-rw-r--r-- 1 root root   7898 Mar 28 09:44 checkpoint.meta
drwxrwxrwx 2 root root     29 Mar 28 09:44 checkpoints

4. Delete the previous data, then run the restore.

# Connect to the database and delete the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;

mysql> drop database test;
Query OK, 0 rows affected (0.39 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA     |
| PERFORMANCE_SCHEMA |
| mysql              |
+--------------------+
4 rows in set (0.00 sec)

exit;

Run the restore and check the data

# Run the restore
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"

# Output:
[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T09.50.34+0800 
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:50:37.983 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=336.844µs] [restore-ranges=3] [total-take=3.072937062s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899] [RestoreTS=440393592991907878] [total-kv=5] [total-kv-size=347B] [average-speed=112.9B/s]

# Check the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.02 sec)

5. If data is added after a full backup and you then run a full restore, what happens? Is the cluster rolled back to the snapshot data, or are the two data sets merged?

Prepare the data:

#Drop the table created earlier, then create new data
drop table temptable;
create table temptable1(name varchar(10));
insert into temptable1 values('1');

create table temptable(name varchar(10));
insert into temptable values('2');

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable      |
| temptable1     |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from temptable;
+------+
| name |
+------+
| 2    |
+------+
1 row in set (0.00 sec)

Now run a snapshot restore of the backup taken earlier.

tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"

The restore fails with a checksum mismatch:

[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T09.57.03+0800 
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:57:05.519 +08:00] [INFO] [collector.go:73] ["Full Restore failed summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=241.554µs] [restore-ranges=3]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch

Checking the database shows that only part of the data was restored, mixed in with the existing rows:

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 2    |
| 1    |
| 1    |
+------+
3 rows in set (0.01 sec)

Now drop the test database and run the restore again; as shown below, this time it succeeds.

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;

tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"

[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T10.01.10+0800 
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 10:01:12.129 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=255.166µs] [restore-ranges=3] [total-take=2.085316769s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899] [RestoreTS=440393759492407297] [total-kv=5] [total-kv-size=347B] [average-speed=166.4B/s]

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;


Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.01 sec)

6. Destroy the cluster, then restore the data.

# Destroy the cluster
tiup cluster destroy tidb-deploy
# Redeploy it
tiup cluster deploy tidb-deploy  v6.5.1 ./topology.yaml --user root -p
# Check cluster status and start it
tiup cluster display tidb-deploy
tiup cluster start tidb-deploy
# Restore the data
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"

# Check the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;


mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.00 sec)

Conclusion:

Before restoring a snapshot backup, stop writes to the cluster, and run the snapshot restore only after all conflicting data has been removed from the target cluster.

Restoring a single database

1. Drop the database.

drop database test;
exit;

2. Restore the specified database.

tiup br restore db --pd "192.168.66.10:2379" \
--db "test" \
--storage "local:///tmp/datatemp"

3. Verify.

mysql --comments --host 127.0.0.1 --port 4000 -u root -p

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable      |
+----------------+
1 row in set (0.01 sec)

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.01 sec)

Restoring a single table

#Drop the table first
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop table temptable;
show tables;

mysql> drop table temptable;
Query OK, 0 rows affected (0.32 sec)

mysql> show tables;
Empty set (0.00 sec)

exit;

tiup br restore table --pd "192.168.66.10:2379" \
--db "test" \
--table "temptable" \
--storage "local:///tmp/datatemp"

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;

select * from temptable;

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> 
mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.00 sec)
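Besides `restore db` and `restore table`, br also accepts table filters for restoring several tables in one command. A hedged sketch, not run in this experiment; the `--filter` flag follows BR's table-filter syntax, and the paths match the examples above:

```shell
# Restore every table in the test schema whose name starts with "temptable",
# using a table filter instead of --db/--table.
tiup br restore full --pd "192.168.66.10:2379" \
    --filter 'test.temptable*' \
    --storage "local:///tmp/datatemp"
```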

Restoring tables under the mysql schema

Since br v5.1.0, snapshot backups include the system table data under the mysql schema, but by default this data is not restored. Since br v6.2.0, if you set --with-sys-table, the restore also restores the data of some system tables.

TiDB Snapshot Backup and Restore Guide | PingCAP Docs
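For example, to bring back restorable system-table data (such as user accounts) together with user data, pass the flag to a full restore. A hedged sketch based on the flag described above, not run in this experiment:

```shell
# --with-sys-table also restores supported system tables under the mysql schema
# (available since br v6.2.0).
tiup br restore full --pd "192.168.66.10:2379" \
    --with-sys-table \
    --storage "local:///tmp/datatemp"
```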

Incremental log backup

Check the data at the start of this experiment:

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;

Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
+------+
3 rows in set (0.00 sec)

Start the incremental log backup

1. Start the task.

# Start the log backup
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup"

Output:

[root@master ~]# tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-backup"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log start --task-name=pitr --pd=192.168.66.10:2379 --storage=local:///tmp/datatemplog-backup
Detail BR log in /tmp/br.log.2023-03-28T10.37.15+0800 
[2023/03/28 10:37:15.931 +08:00] [INFO] [collector.go:73] ["log start"] [streamTaskInfo="{taskName=pitr,startTs=440394327217668097,endTS=999999999999999999,tableFilter=*.*}"] [pausing=false] [rangeCount=2]
[2023/03/28 10:37:18.046 +08:00] [INFO] [collector.go:73] ["log start success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [backup-checksum=2.666236ms] [total-take=2.35706338s]

2. Check the task.

tiup br log status --task-name=pitr --pd="192.168.66.10:2379"

Output:

[root@master ~]# tiup br log status --task-name=pitr --pd="192.168.66.10:2379"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitr --pd=192.168.66.10:2379
Detail BR log in /tmp/br.log.2023-03-28T10.38.23+0800 
● Total 1 Tasks.
> #1 <
              name: pitr
            status: ● NORMAL
             start: 2023-03-28 10:37:15.834 +0800
               end: 2090-11-18 22:07:45.624 +0800
           storage: local:///tmp/datatemplog-backup
       speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 10:37:15.834 +0800; gap=1m8s

3. Add some new data.

# Add some new data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
insert into temptable values('2');
select * from temptable;

mysql> insert into temptable values('2');
Query OK, 1 row affected (0.01 sec)

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
| 2    |
+------+
4 rows in set (0.00 sec)

Restoring to a specific point in time with point-in-time recovery (PITR)

TiDB Log Backup and PITR Guide | PingCAP Docs

1. Destroy the cluster to simulate a failure.

# Destroy the cluster
tiup cluster destroy tidb-deploy
# Redeploy it
tiup cluster deploy tidb-deploy  v6.5.1 ./topology.yaml --user root -p
# Check cluster status and start it
tiup cluster display tidb-deploy
tiup cluster start tidb-deploy

2. Check the data.

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;


mysql> use test;
Database changed
mysql> select * from temptable;
ERROR 1146 (42S02): Table 'test.temptable' doesn't exist

3. Prepare the data for the restore.

# If you use shared/distributed storage, this step is unnecessary
cd /tmp/datatemplog-backup
# Copy the incremental data to the other machines
scp -r /tmp/datatemplog-backup [email protected]:/tmp
scp -r /tmp/datatemplog-backup [email protected]:/tmp

4. Run the restore.

tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup" \
--full-backup-storage="local:///tmp/datatemp" 

Output:

[root@master ~]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-backup" \
> --full-backup-storage="local:///tmp/datatemp" 
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-backup --full-backup-storage=local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T10.57.44+0800 
Error: the full snapshot(from cluster ID:7215396801192470494) and log(from cluster ID:7215412685566300640) come from different cluster.: [BR:Common:ErrInvalidArgument]invalid argument

Conclusion:

If you destroy the cluster and create a new one, this restore fails, because the full backup and the log backup come from different clusters.

A snapshot restore, by contrast, succeeds:

[root@master ~]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T11.00.13+0800 
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 11:00:17.001 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=199.397µs] [restore-ranges=3] [total-take=3.107023133s] [RestoreTS=440394688506888228] [total-kv=5] [total-kv-size=347B] [average-speed=111.7B/s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899]

(Key point) PITR fails when the full backup comes from a previous cluster

# Start the incremental log backup
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
 --storage="local:///tmp/datatemplog-nihao"

 tiup br log status --task-name=pitr --pd="192.168.66.10:2379"

 Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitr --pd=192.168.66.10:2379
Detail BR log in /tmp/br.log.2023-03-28T11.05.00+0800 
● Total 1 Tasks.
> #1 <
              name: pitr
            status: ● NORMAL
             start: 2023-03-28 11:04:51.492 +0800
               end: 2090-11-18 22:07:45.624 +0800
           storage: local:///tmp/datatemplog-nihao
       speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 11:04:51.492 +0800; gap=10s

# Restore to a specified point in time
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup" \
--full-backup-storage="local:///tmp/datatemp" \
--restored-ts '2022-05-15 18:00:00+0800'

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('2');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;

mysql> insert into temptable values('4');
Query OK, 1 row affected (0.00 sec)

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 1    |
| 1    |
| 2    |
| 3    |
| 4    |
+------+
6 rows in set (0.01 sec)

use test;
delete from temptable;
select * from temptable;


# Make sure every machine has the complete data
scp -r /tmp/datatemplog-nihao [email protected]:/tmp
scp -r /tmp/datatemplog-nihao [email protected]:/tmp

# Run the restore
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-nihao" \
--full-backup-storage="local:///tmp/datatemp" \
--restored-ts '2023-03-28 11:11:07+0800'

[root@master ~]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-nihao" \
> --full-backup-storage="local:///tmp/datatemp" 
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-nihao --full-backup-storage=local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T11.16.52+0800 
Error: failed to check task exits: log backup task is running: pitr, please stop the task before restore, and after PITR operation finished, create log-backup task again and create a full backup on this cluster

# Check the status
tiup br log status --task-name=pitr --pd="192.168.66.10:2379"

# Stop the task
tiup br log stop --task-name=pitr

# Then run (each line must end with a backslash; in the transcript below the
# continuations were lost, so only --storage actually reached br)
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-nihao" \
--full-backup-storage="local:///tmp/datatemp" \
--restored-ts '2023-03-28 11:11:07+0800'

> --storage="local:///tmp/datatemplog-nihao" 
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-nihao
Detail BR log in /tmp/br.log.2023-03-28T11.19.56+0800 
Error: restore log from 0(1970-01-01 08:00:00 +0800 CST) to 440394761238478850(2023-03-28 11:04:51.492 +0800 CST),  but the current existed log from 440394761238478850(2023-03-28 11:04:51.492 +0800 CST) to 440394761238478850(2023-03-28 11:04:51.492 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument

Conclusion:

If the full backup data comes from a previous cluster and the incremental log data was produced later on a new cluster, the PITR restore fails.

Full experiment

1. Check the cluster's health
# Check the cluster; the output below shows it is healthy
[root@master ~]# tiup cluster display tidb-deploy
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster display tidb-deploy
Cluster type:       tidb
Cluster name:       tidb-deploy
Cluster version:    v6.5.1
Deploy user:        root
SSH type:           builtin
Dashboard URL:      http://192.168.66.20:2379/dashboard
Grafana URL:        http://192.168.66.20:3000
ID                   Role          Host           Ports                            OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----                            -------       ------  --------                      ----------
192.168.66.20:9093   alertmanager  192.168.66.20  9093/9094                        linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.66.20:3000   grafana       192.168.66.20  3000                             linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.66.10:2379   pd            192.168.66.10  2379/2380                        linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.20:2379   pd            192.168.66.20  2379/2380                        linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.21:2379   pd            192.168.66.21  2379/2380                        linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.66.20:9090   prometheus    192.168.66.20  9090/12020                       linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.66.10:4000   tidb          192.168.66.10  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.66.10:9000   tiflash       192.168.66.10  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.66.10:20160  tikv          192.168.66.10  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.66.20:20160  tikv          192.168.66.20  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.66.21:20160  tikv          192.168.66.21  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 11

2. Prepare data.
# Insert the prepared data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('2');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;

mysql> select * from temptable;
+------+
| name |
+------+
| 2    |
| 3    |
| 4    |
+------+
3 rows in set (0.01 sec)

3. Run a full backup
mkdir /tmp/full_data
chmod 777 /tmp/full_data

tiup br backup full --pd "192.168.66.10:2379" \
    --storage "local:///tmp/full_data" \
    --ratelimit 128 \
    --log-file backupfull.log

# Output:
[root@master ~]# tiup br backup full --pd "192.168.66.10:2379" \
>     --storage "local:///tmp/full_data" \
>     --ratelimit 128 \
>     --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/full_data --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log 
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 12:26:18.986 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=19] [ranges-succeed=19] [ranges-failed=0] [backup-checksum=13.20605ms] [backup-fast-checksum=4.260555ms] [backup-total-regions=80] [backup-total-ranges=80] [total-take=4.306455012s] [BackupTS=440396021403222019] [total-kv=1266] [total-kv-size=359.4kB] [average-speed=83.45kB/s] [backup-data-size(after-compressed)=86.52kB] [Size=86517]



4. Start the incremental log backup
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
 --storage="local:///tmp/full_data-upload"

# Output:
[root@master ~]# tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
>  --storage="local:///tmp/full_data-upload"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log start --task-name=pitr --pd=192.168.66.10:2379 --storage=local:///tmp/full_data-upload
Detail BR log in /tmp/br.log.2023-03-28T12.27.46+0800 
[2023/03/28 12:27:46.499 +08:00] [INFO] [collector.go:73] ["log start"] [streamTaskInfo="{taskName=pitr,startTs=440396065393344515,endTS=999999999999999999,tableFilter=*.*}"] [pausing=false] [rangeCount=2]
[2023/03/28 12:27:48.681 +08:00] [INFO] [collector.go:73] ["log start success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [backup-checksum=1.447661ms] [total-take=2.360156816s]

5. Add some new data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('100');
insert into temptable values('300');
insert into temptable values('400');
select * from temptable;

# Output:
mysql> select * from temptable;
+------+
| name |
+------+
| 2    |
| 3    |
| 4    |
| 100  |
| 300  |
| 400  |
+------+
6 rows in set (0.00 sec)

6. Check the backup data. (The steps below give every machine the complete incremental and full data; without a distributed file system, each machine holds only part of it.)

# Stop the incremental log backup first
tiup br log stop --task-name=pitr

cd /tmp/full_data-upload
cd /tmp/full_data

scp -r /tmp/full_data [email protected]:/tmp
scp -r /tmp/full_data-upload [email protected]:/tmp
 
scp -r /tmp/full_data [email protected]:/tmp
scp -r /tmp/full_data-upload [email protected]:/tmp
 
scp -r /tmp/full_data [email protected]:/tmp
scp -r /tmp/full_data [email protected]:/tmp
scp -r /tmp/full_data-upload [email protected]:/tmp
scp -r /tmp/full_data-upload [email protected]:/tmp
 
7. Delete the data currently in the database

mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA     |
| PERFORMANCE_SCHEMA |
| mysql              |
+--------------------+
4 rows in set (0.00 sec)

8. Full plus incremental restore
#--restored-ts '2023-03-28 12:35:00+0800' would specify the restore point in time
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-upload"  \
--full-backup-storage="local:///tmp/full_data" 


# Output below: the log backup must be started before the full backup, otherwise there is a gap in the logs
[root@master full_data]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-upload"  \
> --full-backup-storage="local:///tmp/full_data"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-upload --full-backup-storage=local:///tmp/full_data
Detail BR log in /tmp/br.log.2023-03-28T12.43.47+0800 
Error: it has gap between full backup ts:440396021403222019(2023-03-28 12:24:58.639 +0800 CST) and log backup ts:440396065393344515(2023-03-28 12:27:46.448 +0800 CST). : [BR:Common:ErrInvalidArgument]invalid argument
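The two TSOs in the error message make the gap concrete: decoding their physical parts (the bits above the low 18 logical bits, as in the TSO sketch earlier) shows the log backup started about 168 seconds after the full backup's snapshot, so the log stream cannot cover the restore range:

```shell
# Decode the physical part of each TSO from the error and compute the gap.
full_ts=440396021403222019   # full backup BackupTS (12:24:58.639 +0800)
log_ts=440396065393344515    # log backup start ts  (12:27:46.448 +0800)
gap_ms=$(( (log_ts >> 18) - (full_ts >> 18) ))
echo "gap: ${gap_ms} ms"     # prints "gap: 167809 ms", roughly 168 seconds
```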


9. Check the current data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
mysql> use test;
ERROR 1049 (42000): Unknown database 'test'

10. Start the incremental log backup first
tiup br log start --task-name=pitrone --pd="192.168.66.10:2379" \
 --storage="local:///tmp/full_data-uploadd"

# Check the status
tiup br log status --task-name=pitrone

[root@master full_data]# tiup br log status --task-name=pitrone
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitrone
Detail BR log in /tmp/br.log.2023-03-28T12.49.34+0800 
● Total 1 Tasks.
> #1 <
              name: pitrone
            status: ● NORMAL
             start: 2023-03-28 12:49:01.889 +0800
               end: 2090-11-18 22:07:45.624 +0800
           storage: local:///tmp/full_data-uploadd
       speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 12:49:01.889 +0800; gap=34s

11. Add some new data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
create database test;
use test;
create table temptable(name varchar(20));
select * from temptable;
insert into temptable values('1');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 3    |
| 4    |
+------+
3 rows in set (0.00 sec)


12. Run a full backup
mkdir /tmp/full_datad
chmod 777 /tmp/full_datad

tiup br backup full --pd "192.168.66.10:2379" \
    --storage "local:///tmp/full_datad" \
    --ratelimit 128 \
    --log-file backupfull.log

# Output:
[root@master full_data-upload]# tiup br backup full --pd "192.168.66.10:2379" \
>     --storage "local:///tmp/full_datad" \
>     --ratelimit 128 \
>     --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/full_datad --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log 
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 12:58:40.782 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=21] [ranges-succeed=21] [ranges-failed=0] [backup-checksum=25.683732ms] [backup-fast-checksum=5.276318ms] [backup-total-ranges=80] [backup-total-regions=80] [total-take=4.335096191s] [total-kv-size=362.7kB] [average-speed=83.66kB/s] [backup-data-size(after-compressed)=90.19kB] [Size=90188] [BackupTS=440396550632374275] [total-kv=1271]

13. After the full backup completes, insert some data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('100');
insert into temptable values('300');
insert into temptable values('400');
select * from temptable;

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 3    |
| 4    |
| 100  |
| 300  |
| 400  |
+------+
6 rows in set (0.01 sec)

The inserts finished at about 13:00.

14. At 13:02, delete the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;

show databases;

15. Stop the incremental log backup
tiup br log stop --task-name=pitrone

# Output:
[root@master full_data-upload]# tiup br log stop --task-name=pitrone
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log stop --task-name=pitrone
Detail BR log in /tmp/br.log.2023-03-28T13.03.21+0800 
[2023/03/28 13:03:21.765 +08:00] [INFO] [collector.go:73] ["log stop"] [streamTaskInfo="{taskName=pitrone,startTs=440396399742550018,endTS=999999999999999999,tableFilter=*.*}"]
[2023/03/28 13:03:24.016 +08:00] [INFO] [collector.go:73] ["log stop success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [total-take=2.345402948s]

16. Sync the backup data between nodes so that every node holds the complete full and incremental backups
#Run on 192.168.66.10
scp -r /tmp/full_datad [email protected]:/tmp
scp -r /tmp/full_data-uploadd [email protected]:/tmp
#Run on 192.168.66.21
scp -r /tmp/full_datad [email protected]:/tmp
scp -r /tmp/full_data-uploadd [email protected]:/tmp

#Run on 192.168.66.20
scp -r /tmp/full_datad [email protected]:/tmp
scp -r /tmp/full_datad [email protected]:/tmp
scp -r /tmp/full_data-uploadd [email protected]:/tmp
scp -r /tmp/full_data-uploadd [email protected]:/tmp
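The per-node copies above can also be driven by a small loop instead of typing each scp by hand. The sketch below (run on 192.168.66.20, using this demo cluster's hosts and paths) only prints the commands as a dry run; remove the `echo` to actually copy:

```shell
# Print the scp commands that push both backup directories to the
# other two nodes (dry run; drop `echo` to execute them for real).
for host in 192.168.66.10 192.168.66.21; do
  for dir in /tmp/full_datad /tmp/full_data-uploadd; do
    echo scp -r "$dir" "root@${host}:/tmp"
  done
done
```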

17. Restore to a specified point in time. The drop above was executed after 13:01:00, so restore the data as it was at 13:01:00
#--restored-ts '2023-03-28 13:01:00+0800' specifies the point in time to restore to
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad" \
--restored-ts '2023-03-28 13:01:00+0800'

#The output below shows that the full backup and the incremental log backup partially overlap
[root@master full_data]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-uploadd" \
> --full-backup-storage="local:///tmp/full_datad" \
> --restored-ts '2023-03-28 13:01:00+0800'
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-uploadd --full-backup-storage=local:///tmp/full_datad --restored-ts 2023-03-28 13:01:00+0800
Detail BR log in /tmp/br.log.2023-03-28T13.10.15+0800 
Error: restore log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396587991040000(2023-03-28 13:01:00 +0800 CST),  but the current existed log from 440396399742550018(2023-03-28 12:49:01.889 +0800 CST) to 440396399742550018(2023-03-28 12:49:01.889 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument
[root@master full_data]# vi /tmp/br.log.2023-03-28T13.10.15+0800

18. Truncate the incremental log data recorded before the full snapshot
FULL_BACKUP_TS=`tiup br validate decode --field="end-version" --storage "local:///tmp/full_datad"| tail -n1`
tiup br log truncate --until=${FULL_BACKUP_TS} --storage="local:///tmp/full_data-uploadd"

#Run the restore
#Specifying --restored-ts '2023-03-28 13:01:00+0800' still fails, so it is omitted here
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad" 

[root@master full_data-upload]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-uploadd" \
> --full-backup-storage="local:///tmp/full_datad"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-uploadd --full-backup-storage=local:///tmp/full_datad
Detail BR log in /tmp/br.log.2023-03-28T13.32.18+0800 
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 13:32:22.795 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=223.347µs] [restore-ranges=3] [total-take=4.185697983s] [restore-data-size(after-compressed)=5.109kB] [Size=5109] [BackupTS=440396550632374275] [RestoreTS=440397080766185509] [total-kv=5] [total-kv-size=347B] [average-speed=82.9B/s]
Restore Meta Files <.........................................................................................................................................................................................> ?%
Restore KV Files <...........................................................................................................................................................................................> ?%
[2023/03/28 13:32:23.881 +08:00] [INFO] [collector.go:73] ["restore log success summary"] [total-take=1.0855745s] [restore-from=440396550632374275] [restore-to=440396550632374275] [restore-from="2023-03-28 12:58:37.488 +0800"] [restore-to="2023-03-28 12:58:37.488 +0800"] [total-kv-count=0] [total-size=0B] [average-speed=0B/s]


mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable      |
+----------------+
1 row in set (0.00 sec)

mysql> select * from temptable;
+------+
| name |
+------+
| 1    |
| 3    |
| 4    |
+------+
3 rows in set (0.00 sec)

# Restoring to a specified point in time still fails
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad" \
--restored-ts '2023-03-28 13:00:00+0800'

Detail BR log in /tmp/br.log.2023-03-28T13.35.20+0800 
Error: restore log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396572262400000(2023-03-28 13:00:00 +0800 CST),  but the current existed log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396550632374275(2023-03-28 12:58:37.488 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument
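The large numbers in this error are TSOs again, and the TSO that BR derives from `--restored-ts` can be reproduced by the reverse conversion (Unix epoch milliseconds shifted left by 18 bits, logical part zero). A sketch reproducing the requested TS from the error above:

```shell
# Convert a wall-clock time into the TSO that BR derives from --restored-ts:
# tso = unix_epoch_ms << 18 (logical counter zero).
TARGET='2023-03-28 13:00:00 +0800'
TARGET_TSO=$(( $(date -d "$TARGET" +%s) * 1000 << 18 ))
echo "$TARGET_TSO"   # 440396572262400000, matching the requested TS in the error
```

Comparing this against the error text shows why BR refuses: the target TSO 440396572262400000 (13:00:00) is greater than the end of the log range BR reports, 440396550632374275 (12:58:37), so the requested point in time is outside the available log data.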

Conclusion:

Point-in-time restore proved problematic in this setup. Whenever --restored-ts was specified, BR rejected the restore, because it considered the log backup's TS range to end at the full backup's TS, so the requested point in time fell outside the available log data. To combine full and incremental backups, the incremental log backup task must be started first and the full snapshot backup taken afterwards; even with that order, restoring without --restored-ts succeeded in this test, while restoring to a specified point in time did not.


Reposted from blog.csdn.net/S1124654/article/details/129808641