[Repost] Three MySQL essentials: indexes, locks, transactions

Indexes

An index is like a book's table of contents: from the entry you can jump straight to the page that holds the content you are looking for.

Advantages of indexes: 1. Data is kept in sorted order. 2. Fast lookups.
Disadvantages of indexes: 1. They take up space. 2. They slow down updates to the table.

Caveats: on a small table a full table scan is faster, so indexes pay off only on large tables. On extremely large tables indexes largely stop being effective.

By implementation, indexes fall into two kinds: the clustered index and the secondary index (also called an auxiliary or non-clustered index).

By function, they fall into six kinds: ordinary index, unique index, primary key index, composite index, foreign key index, and full-text index.

The six kinds in detail:

  1. Ordinary index: the basic index, with no constraints.
  2. Unique index: like an ordinary index, but with a uniqueness constraint.
  3. Primary key index: a special unique index that does not allow NULL.
  4. Composite index: an index built over several columns together, so it can cover multiple columns.
  5. Foreign key index: only InnoDB tables support foreign keys. They are used to guarantee data consistency and integrity and to implement cascading operations.
  6. Full-text index: MySQL's built-in full-text index is available only on InnoDB and MyISAM, and its default parser splits words on whitespace, so it suits English-like text (Chinese, Japanese, and Korean need the ngram parser added in MySQL 5.7.6). In practice a dedicated full-text search engine (ES, Solr) is generally used.
  • Note: a primary key is a unique index, but a unique index is not necessarily a primary key. A unique index column may be NULL (MySQL allows multiple NULL values in a unique index), while a primary key column can never be NULL.

Furthermore, InnoDB clusters a table's data by its primary key. If no primary key is defined, MySQL picks the first unique index whose columns are all NOT NULL to serve as the clustered index. If there is no such index either, InnoDB implicitly creates a hidden 6-byte row ID as the clustered index, which users cannot view or access.

Simply put:

  1. When you define a primary key, a unique index is created for it automatically, and the primary key becomes the clustered index.

  2. When no primary key is defined, a unique NOT NULL index is chosen as the clustered index. If there is none, a hidden 6-byte row ID is generated.

MySQL stores data in pages, 16 KB each by default. When you run a query, it does not load just one record: the whole page that holds the record is loaded into the page cache (the InnoDB buffer pool). This is much like the locality principle in an operating system.

MySQL uses the B+ tree as its index structure. Before the B+ tree, a word about the B-tree: a B-tree is a balanced multi-way search tree. Compared with an ordinary binary tree, it never becomes badly unbalanced, and each node has many branches instead of two.

A characteristic of the B-tree: it stores data in non-leaf nodes as well.

[Figure: B-tree structure]

Because of this, a non-leaf node cannot hold a large number of index entries.

The B+ tree optimizes the B-tree on exactly this point.

[Figure: B+ tree structure]

In a B+ tree all data is kept in the leaf nodes; non-leaf nodes hold only index keys and pointers.

Assume a non-leaf node is one 16 KB page, each index key (the primary key) is a bigint of 8 B, and each pointer is 8 B. One page can then hold about 1,000 index entries (16 KB / (8 B + 8 B) = 1,024).

So how many index entries can a three-level B+ tree hold?

[Figure: three-level B+ tree]

About 1 billion index entries (roughly 1,000 × 1,000 × 1,000). The height of a B+ tree is typically 2 to 4 levels, and because MySQL keeps the root node resident in memory at run time, a lookup needs only about 2 to 3 I/Os. You could say the B+ tree is designed around the characteristics of mechanical disks.
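
The arithmetic above can be checked with a few lines of Python. This is only a sketch under the text's own simplifying assumptions (16 KB page, 8-byte key, 8-byte pointer, no page header overhead, every level counted as index entries); the function names `fanout` and `capacity` are made up for illustration.

```python
PAGE_SIZE = 16 * 1024   # 16 KB page, as assumed in the text
KEY_SIZE = 8            # bigint primary key, 8 bytes
POINTER_SIZE = 8        # pointer size assumed by the text


def fanout(page_size=PAGE_SIZE, key_size=KEY_SIZE, pointer_size=POINTER_SIZE):
    """Number of (key, pointer) entries that fit in one page."""
    return page_size // (key_size + pointer_size)


def capacity(levels, entries_per_page=None):
    """Index entries reachable through a tree with the given number of levels."""
    if entries_per_page is None:
        entries_per_page = fanout()
    return entries_per_page ** levels


print(fanout())      # 1024
print(capacity(3))   # 1073741824, i.e. about 1 billion
```

Swapping the 8-byte key for a 36-byte UUID string in `fanout` shows how quickly a large key shrinks the fan-out of each page.
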
Knowing how the index is designed, we can draw a few more conclusions:

  1. A MySQL primary key should not be too large. A key such as a UUID wastes space in the non-leaf nodes of the B+ tree.

  2. A MySQL primary key is best auto-incrementing. With a UUID, inserts land at random positions in the B+ tree and keep triggering page splits, which seriously hurts performance.
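
The effect of key order on page splits can be illustrated with a toy model. The sketch below is not InnoDB's algorithm: it assumes tiny leaf pages of 4 keys, unique keys, and a simple rule that an insert past the right edge opens a fresh page while an insert into a full page elsewhere splits it in half. `PAGE_CAPACITY` and `insert_all` are hypothetical names.

```python
import bisect
import random

PAGE_CAPACITY = 4  # hypothetical tiny page so splits are easy to see


def insert_all(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted 'leaf pages'; return (pages, rows_moved)."""
    pages = [[]]
    rows_moved = 0
    for key in keys:
        # Find the page whose key range should contain this key.
        first_keys = [p[0] for p in pages if p]
        idx = max(bisect.bisect_right(first_keys, key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Append at the right edge: open a fresh page, move nothing.
            pages.append([key])
            continue
        # Insert into a full page in the middle: split it, move the upper half.
        mid = capacity // 2
        right = page[mid:]
        del page[mid:]
        rows_moved += len(right)
        pages.insert(idx + 1, right)
        bisect.insort(page if key < right[0] else right, key)
    return pages, rows_moved


sequential = list(range(1, 1001))       # auto-increment style keys
shuffled = sequential[:]
random.Random(42).shuffle(shuffled)     # stands in for UUID-like random keys

print(insert_all(sequential)[1])        # 0: nothing is ever moved
print(insert_all(shuffled)[1])          # a large number of moved rows
```

In this model sequential keys only ever append, while random keys keep landing in full pages and force existing rows to be moved.
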

So if the project shards its databases and tables, it usually needs a sharding key that acts as the primary key. How should that be handled? In practice we can keep an auto-increment primary key and put a unique index on the logical primary key.

Lock mechanism

Talk about locks in MySQL and a flood of concepts pours out. In fact locks can be classified along a few dimensions, which we go through one by one.

  1. Type dimension
  • Shared lock (read lock / S lock)

  • Exclusive lock (write lock / X lock)

  • Finer-grained types:
    • Intention shared lock (IS)
    • Intention exclusive lock (IX)
  • Pessimistic lock (uses the database's own locks, i.e. for update)
  • Optimistic lock (uses a version-number field, similar to a CAS mechanism, and is managed by the application itself. Drawback: under high concurrency there are many wasted retries)
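
The version-number approach to optimistic locking can be sketched as a compare-and-set UPDATE. The example below uses Python's built-in `sqlite3` purely as a stand-in so that it runs without a MySQL server; the `account` table and the `withdraw` function are hypothetical, and the same UPDATE pattern is assumed to carry over to MySQL.

```python
import sqlite3


def withdraw(conn, account_id, amount, max_retries=3):
    """Compare-and-set update guarded by a version column."""
    for _ in range(max_retries):
        balance, version = conn.execute(
            "SELECT balance, version FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        if balance < amount:
            return False
        cur = conn.execute(
            "UPDATE account SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (balance - amount, account_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:   # nobody changed the row since we read it
            return True
        # rowcount == 0: the version moved on, so re-read and retry
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 100, 0)")
conn.commit()

print(withdraw(conn, 1, 30))   # True
```

The retry loop is where the drawback mentioned above shows up: under heavy contention most attempts find a stale version and have to start over.
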

  2. Lock granularity (granularity dimension)
  • Table lock

  • Page lock (MySQL BerkeleyDB engine)

  • Row lock (InnoDB)

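  3. Lock algorithms (algorithm dimension)
  • Record Lock (a single row record)

  • Gap Lock (locks a range, but not the record itself)

  • Next-Key Lock (Record Lock + Gap Lock: locks a range and the record itself. This is the lock MySQL uses to prevent phantom reads)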

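  4. Do default reads take locks?
  • No. By default the MVCC mechanism ("consistent non-locking read") guarantees correct isolation at the RR level without taking locks.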

You can choose to lock manually: select xxxx for update (exclusive lock) or select xxxx lock in share mode (shared lock). This is called a "consistent locking read".

With these locks, phantom reads are avoided at the RR level. Of course, the default MVCC read avoids phantom reads as well.
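
To picture how Next-Key Locks keep new rows out of a locked range, the index can be viewed as a series of left-open, right-closed intervals. The sketch below only computes those intervals for a list of index values; it does not talk to MySQL, and `next_key_intervals` and `covering_interval` are illustrative names.

```python
import math


def next_key_intervals(index_values):
    """Intervals (previous, current] for each index value, plus the gap after the last one."""
    values = sorted(index_values)
    bounds = [-math.inf] + values + [math.inf]
    return list(zip(bounds[:-1], bounds[1:]))


def covering_interval(index_values, key):
    """The interval a new key would fall into; an insert waits if that interval is locked."""
    for low, high in next_key_intervals(index_values):
        if low < key <= high:
            return (low, high)
    return None


print(next_key_intervals([10, 11, 13, 20]))
# [(-inf, 10), (10, 11), (11, 13), (13, 20), (20, inf)]
print(covering_interval([10, 11, 13, 20], 12))   # (11, 13)
```

If a locking read holds the next-key lock on (11, 13], another transaction's insert of 12 has to wait, which is why the locked range cannot gain a phantom row.
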

Since RR already prevents phantom reads, what is SERIALIZABLE good for?

It prevents lost updates. For example:

[Figure: lost update example]

In this situation we have to use the SERIALIZABLE level so that the reads are serialized.
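
A lost update is easy to reproduce in a few lines. The schedule below is a hypothetical one (two withdrawals from the same balance), not the one in the original figure, and it models transactions as plain read-then-write steps rather than real database sessions.

```python
def run_interleaved(balance, withdrawals):
    """Every transaction reads first, then all write: the classic lost-update schedule."""
    snapshots = [balance for _ in withdrawals]   # all reads happen before any write
    for seen, amount in zip(snapshots, withdrawals):
        balance = seen - amount                  # each write overwrites the previous one
    return balance


def run_serial(balance, withdrawals):
    """Transactions run one after another, as serialized execution would force."""
    for amount in withdrawals:
        seen = balance
        balance = seen - amount
    return balance


print(run_interleaved(100, [30, 50]))   # 50: the withdrawal of 30 is lost
print(run_serial(100, [30, 50]))        # 20
```
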

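Finally, row locks work by locking index records, ultimately the clustered index. If your query fails to hit an index, MySQL has to scan the whole table and lock every row it reads, so the row lock in effect degrades into a table lock.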

Transactions

Transactions are an evergreen topic in databases. ACID: atomicity, consistency, isolation, durability.

Of the four properties the most important is consistency, and consistency is guaranteed by atomicity, isolation, and durability.

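  • Atomicity is guaranteed by the Undo Log. The Undo Log keeps the record as it was before each change, so that it can be rolled back when an error occurs.

  • Isolation is guaranteed by MVCC and locks. More on this below.

  • Durability is guaranteed by the Redo Log. Before data is really modified, the change is first written to the Redo Log; only after the Redo Log write succeeds is it really written into the B+ tree. If power is lost before the modified pages reach disk, the committed changes can be recovered from the Redo Log.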
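
The division of labor between the two logs can be sketched with a toy key-value engine. This is a deliberately simplified model, not InnoDB's design: the class `TinyEngine` and all of its fields are hypothetical, the redo log is written only at commit, and `None` is used to mean "key did not exist".

```python
class TinyEngine:
    """Toy store with an undo log (rollback) and a redo log (crash recovery)."""

    def __init__(self):
        self.pages = {}      # stands in for B+ tree pages on disk (never flushed here)
        self.cache = {}      # in-memory view of the data
        self.redo_log = []   # durable, append-only list of committed changes
        self.undo_log = []   # old values saved by the running transaction
        self.pending = []    # changes made by the running transaction

    def update(self, key, value):
        self.undo_log.append((key, self.cache.get(key)))  # save the old value first
        self.cache[key] = value
        self.pending.append((key, value))

    def rollback(self):
        while self.undo_log:
            key, old = self.undo_log.pop()   # undo in reverse order
            if old is None:
                self.cache.pop(key, None)
            else:
                self.cache[key] = old
        self.pending.clear()

    def commit(self):
        self.redo_log.extend(self.pending)   # the redo log is written before the pages
        self.pending.clear()
        self.undo_log.clear()

    def recover(self):
        """After a crash the cache is gone: replay the redo log over the pages."""
        self.cache = dict(self.pages)
        for key, value in self.redo_log:
            self.cache[key] = value


engine = TinyEngine()
engine.update("a", 1)
engine.commit()
engine.update("a", 2)
engine.rollback()
print(engine.cache)   # {'a': 1}
```

Clearing `engine.cache` and calling `engine.recover()` simulates a power loss: the committed value comes back from the redo log even though the pages were never written.
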

Now back to isolation.

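Isolation levels:

  1. Read uncommitted (RU)
  2. Read committed (RC)
  3. Repeatable read (RR)
  4. Serializable

Each level addresses different problems, usually these three: dirty reads, non-repeatable reads, and phantom reads. The classic chart, as defined by the SQL standard:

  Isolation level   Dirty read   Non-repeatable read   Phantom read
  RU                possible     possible              possible
  RC                no           possible              possible
  RR                no           no                    possible
  Serializable      no           no                    no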

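One point to note about phantom reads: in the database standard the RR level allows phantom reads, but thanks to MySQL's optimizations, MySQL's RR level does not. With a default select, MySQL relies on the MVCC mechanism to avoid phantom reads. You can also use locks: with for update (X lock) or lock in share mode (S lock), MySQL uses Next-Key Locks to make sure no phantom read occurs. The former is called a snapshot read, the latter a current read.

How it works: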

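  • Why RU allows dirty reads: a plain select at RU reads the newest version of a row directly instead of going through a Read View, so it can see changes that another transaction has not committed yet. RC and RR read through a Read View and see only committed data.

  • Why RC reads are not repeatable: RC creates a new Read View for every SQL statement, so each read may see different data. An RR transaction uses the same Read View from start to finish.

  • Why RR does not suffer phantom reads: explained above.
    So what is the difference between RR and Serializable? Answer: lost updates, as already covered in the locks part of this article.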
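
The Read View difference between RC and RR can be sketched in a small model. Here a Read View is reduced to "the set of transaction IDs that had committed when the view was created", which is a simplification of InnoDB's real structure; `Row`, `Reader`, and the `committed` set are hypothetical names.

```python
class Row:
    def __init__(self):
        self.versions = []   # newest last: (trx_id, value)

    def write(self, trx_id, value):
        self.versions.append((trx_id, value))

    def read(self, read_view):
        """Newest version written by a transaction the Read View treats as committed."""
        for trx_id, value in reversed(self.versions):
            if trx_id in read_view:
                return value
        return None


class Reader:
    def __init__(self, level, committed):
        self.level = level           # "RC" or "RR"
        self.committed = committed   # shared, live set of committed transaction IDs
        self.view = None

    def select(self, row):
        # RC takes a new Read View for every statement; RR keeps its first one.
        if self.level == "RC" or self.view is None:
            self.view = frozenset(self.committed)
        return row.read(self.view)


committed = {1}
row = Row()
row.write(1, "old")
rc, rr = Reader("RC", committed), Reader("RR", committed)
print(rc.select(row), rr.select(row))   # old old

row.write(2, "new")    # transaction 2 writes...
committed.add(2)       # ...and commits
print(rc.select(row), rr.select(row))   # new old
```

The second line of output is the non-repeatable read: the RC reader sees the newly committed value, while the RR reader keeps seeing the value from its original Read View.
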

About MVCC: the full name is Multi-Version Concurrency Control.

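Every InnoDB clustered index record has 4 hidden fields: the primary key (RowID), the ID of the transaction that last changed the row (the core of MVCC), the Undo Log pointer (the core of isolation), and the delete mark (on deletion the record is not removed right away; it is marked and then removed asynchronously).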

In essence, MVCC is implemented with a linked list of Undo Log records.

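How MVCC is implemented: a transaction modifies the original data under an exclusive lock and stores the pre-modification data in the Undo Log, linked to the data through the rollback pointer. If the modification succeeds, nothing more is done; if it fails, the data in the Undo Log is restored.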
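
The Undo Log linked list can be sketched as record versions chained through a rollback pointer. The names below (`Version`, `roll_ptr`, `update`, `rollback`, `history`) are illustrative and only mirror the idea described above, not InnoDB's on-disk record format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    trx_id: int                            # transaction that wrote this version
    roll_ptr: Optional["Version"] = None   # previous version, kept in the undo log


def update(current, value, trx_id):
    """Write a new version whose rollback pointer links back to the old one."""
    return Version(value, trx_id, roll_ptr=current)


def rollback(current):
    """A failed modification restores the version saved in the undo log."""
    return current.roll_ptr


def history(current):
    """Walk the undo chain from newest to oldest."""
    out = []
    while current is not None:
        out.append((current.trx_id, current.value))
        current = current.roll_ptr
    return out


v1 = Version("a", trx_id=1)
v2 = update(v1, "b", trx_id=2)
v3 = update(v2, "c", trx_id=3)
print(history(v3))   # [(3, 'c'), (2, 'b'), (1, 'a')]
```

A reader that is not allowed to see transaction 3 simply follows `roll_ptr` until it reaches a version it may see, which is how old snapshots stay readable without locks.
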

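One more remark: we usually think of MVCC as something like optimistic locking, that is, based on version numbers, but InnoDB does not actually implement it that way. Of course, this does not affect how we use MySQL.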

Origin www.cnblogs.com/Tu9oh0st/p/11229344.html