MySQL 8.0.11: Quickly Generating Millions or Even Tens of Millions of Rows of Test Data

Background:

Generate tens of millions of rows of data for testing and validation.

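1. Use existing production data.

Check the row counts in the existing production environment; any table with tens of millions of rows can be used directly (note that table_rows is only an estimate for InnoDB tables):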
SELECT table_schema,table_name,table_rows FROM information_schema.tables WHERE table_rows >10000000;

Then simply back the table up and restore it into the test environment.

2. Use sysbench to generate tens or hundreds of millions of rows per table:

sysbench here was installed from the RPM package:
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql prepare
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql run

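Note that table_size sets the number of rows per table and tables sets the number of tables to generate; delete the test data manually once you are done with it.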
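
As a rough sketch of how table_size and tables combine, the snippet below assembles the same sysbench command line for different actions and computes the total row count. The helper name sysbench_cmd is hypothetical, and the use of cleanup assumes the bundled OLTP Lua scripts, which provide a cleanup command that drops the generated sbtest tables.

```python
import shlex

def sysbench_cmd(action, host, password, table_size, tables, threads=50,
                 script="/usr/share/sysbench/oltp_read_only.lua"):
    """Build the sysbench argument list for one action (prepare, run, cleanup).

    Hypothetical helper: it only assembles the flags already used above.
    """
    return [
        "sysbench", script,
        f"--mysql-host={host}", "--mysql-port=3306",
        "--mysql-db=sbtest", "--mysql-user=root",
        f"--mysql-password={password}",
        f"--table_size={table_size}", f"--tables={tables}",
        f"--threads={threads}", "--db-driver=mysql",
        action,
    ]

table_size, tables = 10_000_000, 20
total_rows = table_size * tables  # rows per table times number of tables

prepare = sysbench_cmd("prepare", "172.16.1.81", "xxxxxx", table_size, tables)
cleanup = sysbench_cmd("cleanup", "172.16.1.81", "xxxxxx", table_size, tables)

print(f"total rows generated: {total_rows:,}")
print(shlex.join(prepare))
print(shlex.join(cleanup))  # assumed way to drop the generated tables afterwards
```

With the values used above, prepare creates 20 tables of 10 million rows each, 200 million rows in total, so check the available disk space first.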

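3. Write your own SQL to generate tens of millions of rows.

Create a table holding the ten digits 0-9, then create a second table to hold the generated rows. Cross-joining eight copies of the digit table yields 10^8 = 100 million rows; drop the a7 term and its join to get 10 million instead: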
CREATE TABLE a (i int);
INSERT INTO a(i) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

CREATE TABLE bigtable (i BIGINT UNSIGNED);
INSERT INTO bigtable (i)
SELECT
    a.i*1
   +a1.i*10
   +a2.i*100
   +a3.i*1000
   +a4.i*10000
   +a5.i*100000
   +a6.i*1000000
   +a7.i*10000000
   AS id
FROM  a 
CROSS JOIN a AS a1
CROSS JOIN a AS a2
CROSS JOIN a AS a3
CROSS JOIN a AS a4
CROSS JOIN a AS a5
CROSS JOIN a AS a6
CROSS JOIN a AS a7;
Query OK, 100000000 rows affected (8 min 47.86 sec)
Records: 100000000  Duplicates: 0  Warnings: 0
-- Verify with a query:
mysql> SELECT MIN(b.i),MAX(b.i),COUNT(1) from bigtable b;  
+----------+----------+-----------+
| MIN(b.i) | MAX(b.i) | COUNT(1)  |
+----------+----------+-----------+
|        0 | 99999999 | 100000000 |
+----------+----------+-----------+
1 row in set (1 min 24.20 sec)
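
The reason the cross join produces every integer exactly once is positional notation: each joined copy of table a supplies one decimal digit, and the weighted sum turns each digit combination into a unique number. A minimal Python sketch of that arithmetic, scaled down to 4 digits (the function name cross_join_ids is made up for illustration):

```python
from itertools import product

DIGITS = range(10)  # plays the role of table a (values 0..9)

def cross_join_ids(n_digits):
    """Yield sum(digit_k * 10**k) for every combination of n_digits digits,
    mirroring a.i*1 + a1.i*10 + a2.i*100 + ... in the SQL above."""
    for combo in product(DIGITS, repeat=n_digits):
        yield sum(d * 10 ** k for k, d in enumerate(combo))

ids = list(cross_join_ids(4))

# Same checks as the verification query: MIN, MAX, COUNT, plus uniqueness.
print(min(ids), max(ids), len(ids), len(set(ids)))
# prints: 0 9999 10000 10000
```

With 8 digits the same argument gives 0 through 99999999, which matches the MIN, MAX and COUNT shown above.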


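MariaDB 10.2 and 10.3 and MySQL 8.0 (8.0.11 here) support the WITH clause (common table expressions), so the generation above can be written as a single SQL statement.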
WITH a AS (
SELECT 1 AS i
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
UNION ALL SELECT 9
UNION ALL SELECT 0)
,b as (
SELECT
    a.i*1
   +a1.i*10
   +a2.i*100
   +a3.i*1000
   +a4.i*10000
   +a5.i*100000
   +a6.i*1000000
   +a7.i*10000000
   AS id
FROM  a 
CROSS JOIN a AS a1
CROSS JOIN a AS a2
CROSS JOIN a AS a3
CROSS JOIN a AS a4
CROSS JOIN a AS a5
CROSS JOIN a AS a6
CROSS JOIN a AS a7)
SELECT MIN(b.id),MAX(b.id),COUNT(1) FROM b;
min(b.id)	max(b.id)	count(1)
0	99999999	100000000
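This took 3 min 24 s; the actual time depends on the machine's performance. Because an INSERT has to write a large amount of data to disk,
you can temporarily disable the binlog before running the insert (for example, SET sql_log_bin = 0 for the current session).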
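
The WITH form above follows a regular pattern, so it can be generated for any number of digits. The sketch below builds the statement as a string and runs a 3-digit version against an in-memory SQLite database purely as a stand-in for checking the logic. The function name build_digits_sql is hypothetical, and it is an assumption that the generated text also runs unchanged on MySQL 8.0 and MariaDB 10.2+.

```python
import sqlite3

def build_digits_sql(n_digits):
    """Return a WITH query that produces 0 .. 10**n_digits - 1 as column id
    and reports MIN, MAX and COUNT over it."""
    # CTE a: the ten digits 0..9, one row each.
    digits = " UNION ALL ".join(
        "SELECT 0 AS i" if d == 0 else f"SELECT {d}" for d in range(10)
    )
    aliases = [f"a{k}" for k in range(n_digits)]
    # One weighted term per digit position: a0.i*1 + a1.i*10 + ...
    terms = " + ".join(f"{alias}.i*{10 ** k}" for k, alias in enumerate(aliases))
    joins = " CROSS JOIN ".join(f"a AS {alias}" for alias in aliases)
    return (
        f"WITH a AS ({digits}), "
        f"b AS (SELECT {terms} AS id FROM {joins}) "
        "SELECT MIN(b.id), MAX(b.id), COUNT(1) FROM b"
    )

sql = build_digits_sql(3)  # scaled down: 10**3 rows instead of 10**8
row = sqlite3.connect(":memory:").execute(sql).fetchone()
print(sql)
print(row)  # (0, 999, 1000)
```

Calling build_digits_sql(8) yields a statement equivalent to the 100-million-row query above, and build_digits_sql(7) gives the 10-million-row variant.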

Reposted from blog.csdn.net/vkingnew/article/details/81194017