MySQL 8.0.12 深入理解bit类型

版权声明:本文为博主原创文章,转载请注明出处 https://blog.csdn.net/vkingnew/article/details/82687471
背景:
在阿里巴巴推荐的MySQL建表规范里要求如下:
表达是与否概念的字段,必须使用 is_xxx 的方式命名,数据类型是 unsigned tinyint
( 1 表示是,0 表示否)。

解释:
在MySQL里表示是和否的概念 可以使用如下三种方案:
1.使用bit(1)类型,此时bit允许存储的是ASCII中的0和1. 0表示否1表示是的概念。
2.使用tinyint unsigned类型,此时存储的是十进制数字0和1.
3.使用boolean类型,MySQL并不真正支持此类型,是为兼容其他类型的数据。
TRUE等同于1,表示是;False等同于0表示否;TRUE和FALSE不区分大小写。
true和false在存储的时候是以0和1存储的。

试验验证:
1.前期使用sysbench创建一张1000万条记录的表t1,复制三个表添加不同类型的数据:
mysql> desc t1;
+----------------+-----------+------+-----+-------------------+-----------------------------+
| Field          | Type      | Null | Key | Default           | Extra                       |
+----------------+-----------+------+-----+-------------------+-----------------------------+
| id             | int(11)   | NO   | PRI | NULL              | auto_increment              |
| k              | int(11)   | NO   | MUL | 0                 |                             |
| c              | char(120) | NO   |     |                   |                             |
| pad            | char(60)  | NO   |     |                   |                             |
| LastModifyTime | datetime  | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+----------------+-----------+------+-----+-------------------+-----------------------------+
5 rows in set (0.01 sec)

mysql> create table b1 like t1;
Query OK, 0 rows affected (0.02 sec)

mysql> create table b2 like t1;
Query OK, 0 rows affected (0.04 sec)
mysql> create table b3 like t1;
Query OK, 0 rows affected (0.06 sec)

2.添加字段:
b1表采用bit(1) 类型;
b2表采用tinyint unsigned 类型;
b3表采用bool类型。
mysql> alter table b1 add column b bit(1) not null default b'0' comment '0否 1是';
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table b2 add column b tinyint unsigned not null default 0 comment '0否 1是';
Query OK, 0 rows affected (0.02 sec)
mysql> alter table b3 add column b bool not null default false comment 'false否0 true 是1';
Query OK, 0 rows affected (0.44 sec)
Records: 0  Duplicates: 0  Warnings: 0

3.插入数据:
mysql> insert into b1 select * from t1;
Query OK, 10000000 rows affected (3 min 12.15 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> insert into b2 select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000  Duplicates: 0  Warnings: 0
mysql> insert into b3 select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

4.更新数据:
--更新操作:
mysql> update b1 set b=case when id%2=1 then b'1' when id%2=0 then b'0' end;
Query OK, 5000000 rows affected (2 min 3.22 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

mysql> update b2 set b=case when id%2=1 then 1 when id%2=0 then 0 end;
Query OK, 5000000 rows affected (1 min 50.12 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

mysql> update b3 set b=case when id%2=1 then true when id%2=0 then false end;
Query OK, 5000000 rows affected (1 min 51.17 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

5.数据查询:
--查询:
mysql> select id,k,lastmodifytime,bin(b) from b1 limit 2;
+----+---------+---------------------+--------+
| id | k       | lastmodifytime      | bin(b) |
+----+---------+---------------------+--------+
|  1 | 5014614 | 2018-09-13 09:38:02 | 1      |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0      |
+----+---------+---------------------+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b2 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:42:06 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b3 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:48:28 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

将b写为bit类型时候需要使用函数bin函数方可正确查询,在命令模式下查询出来的结果为空。
mysql> select id,k,lastmodifytime,b from b1 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:38:02 |   |
|  2 | 5024801 | 2018-09-12 14:39:49 |   |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)
在SQLyog等客户端工具中查询:
b1的查询:
    id        k  lastmodifytime       b       
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:38:02  b'1'    
     2  5024801  2018-09-12 14:39:49  b'000000
b2和b3的查询:
    id        k  lastmodifytime            b  
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:48:28         1
     2  5024801  2018-09-12 14:39:49         0

结论:在易读性上使用tinyint类型和bool类型是等同的,使用bit类型则需要使用函数转换。

6.数据存储空间:
由于使用update等语句会对真实的表有碎片,在查询表真实空间大小前整理表碎片:
mysql>optimize table b1;
mysql>optimize table b2;
mysql>optimize table b3;
# du -sb b*
2868903936 b1.ibd
2868903936 b2.ibd
2868903936 b3.ibd
结论:当使用bool类型和bit(1) 的时候和tinyint是存储上是等同的。

7.查询验证:
对于b1表正确的查询语句:(b1表使用bit类型)
 select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
对于b2表正确的查询语句:(b2表使用tinyint类型)
select id,k,lastmodifytime,b,bin(b) from b1 where b=0 limit 2;
对于b3表正确的查询语句:(b3表使用boolean类型)
select id,k,lastmodifytime,b,bin(b) from b1 where b is false limit 2;

现在对B1、B2、B3表均采用上述查询:
--B1表查询:
mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

查询结果一致。

--B2表:
mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

--B3表:
mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

上述三个语句在查询的时候均可直接查询出结果。



8.对b字段加索引查询:
mysql> alter table b1 add key ix_b(b);
Query OK, 0 rows affected (11.45 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table b2 add key ix_b(b);
Query OK, 0 rows affected (11.93 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table b3 add key ix_b(b);
Query OK, 0 rows affected (11.59 sec)
Records: 0  Duplicates: 0  Warnings: 0

--对B1表查询的执行计划:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
查询走了执行计划。
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: no matching row in const table
1 row in set, 1 warning (0.00 sec)
提示信息:没有匹配的数据。

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.02 sec)

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.02 sec)
全表扫描。

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

走索引。
实际查询一次:
mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2;
Empty set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

结论:针对bit类型 最正确的查询方式是 
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
其他的会发生隐式类型转换。


针对tinyint类型的执行计划:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b is false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
全表扫描。
如下的查询语句执行计划如下:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
查询语句:
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=0 limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b='0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=false limit 2;

--针对boolean类型:
查询的执行计划如B2.

分析:
MySQL把BIT当做字符串类型, 而不是数据类型。当检索BIT(1)列的值, 结果是一个字符串且内容是二进制位0或1, 而不是ASCII值”0″或”1″.
结论:
对于bit 类型的数值不使用索引,mysql 检索bit的值是不管是数值还是字符,mysql会对where 条件进行类型转化,将字符转换为数值,并比较数值对应的ascii码,如果值为1,则返回结果,否则,结果为空。

9.bit类型在ETL中:
针对boolean类型在MySQL数据库中是以0和1存储的,查询可以直接查询出。
采用bit(1)类型存储需要做转换:
mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)
可以看到如不做转换的时候则查询出来为空,在做数据ETL的时候对其他数据库不便于读取。

10.MySQL bit类型的bool运算和bit类型的存储:
mysql> select b'0'=0,b'0'='0',b'110000'='0',b'110000'+0;
+--------+----------+---------------+-------------+
| b'0'=0 | b'0'='0' | b'110000'='0' | b'110000'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          48 |
+--------+----------+---------------+-------------+
1 row in set (0.03 sec)

mysql> select b'1'=1,b'1'='1',b'110001'='1',b'110001'+0;
+--------+----------+---------------+-------------+
| b'1'=1 | b'1'='1' | b'110001'='1' | b'110001'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          49 |
+--------+----------+---------------+-------------+
1 row in set (0.00 sec)
结论:b'110000'为十进制的48,而不是字符串0.

验证:
mysql> create table b(a bit(64));
Query OK, 0 rows affected (0.04 sec)

mysql> create table bb(a bit(65));
ERROR 1439 (42000): Display width out of range for column 'a' (max = 64)

--插入数据:
insert into b values(b'1'),(b'0'),(b'01000001'),(b'01011010'),(b'01100001'),(b'00110000'),(b'00110001');
mysql> select a,a+0,bin(a),oct(a),hex(a) from b;
+----------+------+---------+--------+--------+
| a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+------+---------+--------+--------+
|          |    1 | 1       | 1      | 1      |
|          |    0 | 0       | 0      | 0      |
|        A |   65 | 1000001 | 101    | 41     |
|        Z |   90 | 1011010 | 132    | 5A     |
|        a |   97 | 1100001 | 141    | 61     |
|        0 |   48 | 110000  | 60     | 30     |
|        1 |   49 | 110001  | 61     | 31     |
+----------+------+---------+--------+--------+
7 rows in set (0.02 sec)

mysql> select lpad(bin(a),8,'0') la,a,a+0,bin(a),oct(a),hex(a) from b;
+----------+----------+------+---------+--------+--------+
| la       | a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+----------+------+---------+--------+--------+
| 00000000 |          |    0 | 0       | 0      | 0      |
| 00000001 |          |    1 | 1       | 1      | 1      |
| 00110000 |        0 |   48 | 110000  | 60     | 30     |
| 00110001 |        1 |   49 | 110001  | 61     | 31     |
| 01000001 |        A |   65 | 1000001 | 101    | 41     |
| 01011010 |        Z |   90 | 1011010 | 132    | 5A     |
| 01100001 |        a |   97 | 1100001 | 141    | 61     |
+----------+----------+------+---------+--------+--------+
7 rows in set (0.00 sec)

结论:bit类型表示方式为bit(m),M的取值范围为1到64.bit类型存储的二进制字符串。
bit类型的数据范围为bit(1) 到bit(64),换算成十进制范围0到2^64减1;
tinyint unsigned的数据范围,十进制表示为0到255.
bit的存储近似为(M+7)/8 bytes,而tinyint为1byte。
当使用bit(1)和tinyint的时候使用的存储空间一致。


11.JDBC和bit类型:

MySQL Type Name	Return value of GetColumnTypeName	Return value of GetColumnClassName
BIT(1)	    BIT	     java.lang.Boolean
BIT( > 1)	BIT	     byte[]
TINYINT	    TINYINT	 java.lang.Boolean if the configuration property tinyInt1isBit is set to true (the default) and the storage size is 1, or java.lang.Integer if not.
在一些应用场景下tinyint会被默认读取为true或者false,而不是想要的数据0和1.
jdbc会把tinyint 认为是java.sql.Types.BIT
此时需要在连接配置上使用类似如下的配置:
jdbc:mysql://localhost/databaseName?tinyInt1isBit=false

结论:
从易于读取性和大众接受程度、软件应用的通用性上推荐使用tinyint表示是和否。

猜你喜欢

转载自blog.csdn.net/vkingnew/article/details/82687471