Deadlock During Batch Updates

Problem description:

When multiple threads ran batch updates at the same time, MySQL reported the error: Deadlock found when trying to get lock; try restarting transaction

Analysis:

MySQL deadlocks mainly arise in two situations:

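- The first: when insert, update, or delete statements cannot use an index on their condition columns, InnoDB scans the whole table and locks every row it scans, which in effect locks the table. Add an index on the condition columns so that only the matching rows are locked and less data is held.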

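- The second: row-level locking itself. InnoDB row locks are placed on index records, not on the rows directly, and they only stay narrow if the statement can actually use the index; otherwise InnoDB locks every row it scans, which behaves much like a table lock. When a statement finds rows through a non-primary-key (secondary) index, InnoDB locks the secondary index entry first and then the matching primary key index record. So an update that operates on a secondary index locks the secondary index entries as well as the row records, and holding these extra locks can lead to deadlock.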

Query the database deadlock log:

show engine innodb status\G (the output below is a reference example, used for analysis only)

=====================================
2023-06-20 14:46:28 0x7f732fcff700 INNODB MONITOR OUTPUT # time at which this status output was generated
=====================================
Per second averages calculated from the last 37 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 609 srv_active, 0 srv_shutdown, 23969851 srv_idle
srv_master_thread log flush and writes: 0
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 100
OS WAIT ARRAY INFO: signal count 98
RW-shared spins 0, rounds 0, OS waits 0
RW-excl spins 29, rounds 870, OS waits 25
RW-sx spins 1, rounds 30, OS waits 0
Spin rounds per wait: 0.00 RW-shared, 30.00 RW-excl, 30.00 RW-sx
------------------------
LATEST DETECTED DEADLOCK # the most recent deadlock record
------------------------
2023-06-20 14:46:15 0x7f7350cf3700
*** (1) TRANSACTION:
TRANSACTION 10298, ACTIVE 11 sec starting index read
# Transaction id 10298 has been active for 11 seconds and is starting an index read
mysql tables in use 1, locked 1
# The transaction is using 1 table and holds locks on 1 table
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
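# LOCK WAIT means the transaction is waiting for a lock. It has 3 lock structures using 1136 bytes of heap memory and 2 row locks; undo log entries 1 shows it has already modified 1 clustered index record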
MySQL thread id 7623, OS thread handle 140132789073664, query id 6006191 127.0.0.1 root updating
# Thread information for the transaction: MySQL thread id, OS thread handle, query id, client host, and user
update medicine_control set current_count=1 where id='2'
# The SQL statement involved in the deadlock
*** (1) HOLDS THE LOCK(S): # locks held by the transaction
RECORD LOCKS space id 55 page no 4 n bits 88 index PRIMARY of table `jeecg-boot`.`medicine_control` trx id 10298 lock_mode X locks rec but not gap
Record lock, heap no 21 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
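# The transaction holds a record lock (row lock) on table `medicine_control`: tablespace id 55, page number 4, with a lock bitmap of 88 bits. The lock is on the table's PRIMARY key index and is an X (exclusive) lock on the record only, not a gap lock. n_fields 12 means the index record has 12 fields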
 0: len 1; hex 31; asc 1;;
 1: len 6; hex 00000000283a; asc     (:;;
 2: len 7; hex 020000012510db; asc     %  ;;
 3: len 6; hex e5a5b6e5a5b6; asc       ;;
 4: len 12; hex e79b98e5b0bce8a5bfe69e97; asc             ;;
 5: len 4; hex 80000001; asc     ;;
 6: len 4; hex 80000005; asc     ;;
 7: len 4; hex 80000000; asc     ;;
 8: len 5; hex 6a65656367; asc jeecg;;
 9: len 5; hex 99a60eadf7; asc      ;;
 10: len 3; hex 6a6f62; asc job;;
 11: len 5; hex 99a75e0780; asc   ^  ;;
 # The record contents: hex is each field's value in hexadecimal, and asc shows its printable characters, with non-printable bytes shown as spaces
 
*** (1) WAITING FOR THIS LOCK TO BE GRANTED: # the lock the transaction is waiting for
RECORD LOCKS space id 55 page no 4 n bits 88 index PRIMARY of table `jeecg-boot`.`medicine_control` trx id 10298 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
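# The lock being waited for is a record lock (row lock) on table `medicine_control`: tablespace id 55, page number 4, with a lock bitmap of 88 bits. The lock is on the table's PRIMARY key index and is an X (exclusive) lock on the record only, not a gap lock. n_fields 12 means the index record has 12 fields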
 0: len 1; hex 32; asc 2;;
 1: len 6; hex 00000000283b; asc     (;;;
 2: len 7; hex 01000002012bd8; asc      + ;;
 3: len 6; hex e788b7e788b7; asc       ;;
 4: len 6; hex e69f90e69f90; asc       ;;
 5: len 4; hex 80000002; asc     ;;
 6: len 4; hex 80000002; asc     ;;
 7: len 4; hex 80000000; asc     ;;
 8: len 5; hex 6c6979616e; asc liyan;;
 9: len 5; hex 99a67b3730; asc   {70;;
 10: len 3; hex 6a6f62; asc job;;
 11: len 5; hex 99a75e0780; asc   ^  ;;
 
 
*** (2) TRANSACTION: # transaction 2
TRANSACTION 10299, ACTIVE 7 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 7625, OS thread handle 140133576603392, query id 6006195 127.0.0.1 root updating
update medicine_control set current_count=2 where id='1'
 
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 55 page no 4 n bits 88 index PRIMARY of table `jeecg-boot`.`medicine_control` trx id 10299 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 1; hex 32; asc 2;;
 1: len 6; hex 00000000283b; asc     (;;;
 2: len 7; hex 01000002012bd8; asc      + ;;
 3: len 6; hex e788b7e788b7; asc       ;;
 4: len 6; hex e69f90e69f90; asc       ;;
 5: len 4; hex 80000002; asc     ;;
 6: len 4; hex 80000002; asc     ;;
 7: len 4; hex 80000000; asc     ;;
 8: len 5; hex 6c6979616e; asc liyan;;
 9: len 5; hex 99a67b3730; asc   {70;;
 10: len 3; hex 6a6f62; asc job;;
 11: len 5; hex 99a75e0780; asc   ^  ;;
 
 
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 4 n bits 88 index PRIMARY of table `jeecg-boot`.`medicine_control` trx id 10299 lock_mode X locks rec but not gap waiting
Record lock, heap no 21 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 1; hex 31; asc 1;;
 1: len 6; hex 00000000283a; asc     (:;;
 2: len 7; hex 020000012510db; asc     %  ;;
 3: len 6; hex e5a5b6e5a5b6; asc       ;;
 4: len 12; hex e79b98e5b0bce8a5bfe69e97; asc             ;;
 5: len 4; hex 80000001; asc     ;;
 6: len 4; hex 80000005; asc     ;;
 7: len 4; hex 80000000; asc     ;;
 8: len 5; hex 6a65656367; asc jeecg;;
 9: len 5; hex 99a60eadf7; asc      ;;
 10: len 3; hex 6a6f62; asc job;;
 11: len 5; hex 99a75e0780; asc   ^  ;;
 
*** WE ROLL BACK TRANSACTION (2) # outcome: transaction 2 is rolled back
------------
TRANSACTIONS # current sessions and their transactions; in most sessions no transaction has started
------------
Trx id counter 10301
Purge done for trx's n:o < 10301 undo n:o < 0 state: running but idle
History list length 61
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421608706154464, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706153592, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706152720, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706151848, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706150976, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706150104, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706148360, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706147488, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706146616, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706145744, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706144872, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421608706144000, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 10298, ACTIVE 24 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2
MySQL thread id 7623, OS thread handle 140132789073664, query id 6006198 127.0.0.1 root
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
2048 OS file reads, 24777 OS file writes, 11472 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.59 writes/s, 0.54 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 3 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 2 buffer(s)
Hash table size 34679, node heap has 5 buffer(s)
0.00 hash searches/s, 0.27 non-hash searches/s
---
LOG
---
Log sequence number          2246453180
Log buffer assigned up to    2246453180
Log buffer completed up to   2246453180
Log written up to            2246453180
Log flushed up to            2246453180
Added dirty pages up to      2246453180
Pages flushed up to          2246453180
Last checkpoint at           2246453180
9242 log i/o's done, 0.14 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137363456
Dictionary memory allocated 835752
Buffer pool size   8192
Free buffers       6046
Database pages     2131
Old database pages 788
Modified db pages  0
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1923, created 208, written 13739
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 2131, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=920, Main thread ID=140133220153088 , state=sleeping
Number of rows inserted 416, updated 2599, deleted 440, read 821958
0.00 inserts/s, 0.08 updates/s, 0.00 deletes/s, 0.11 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
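
Only the LATEST DETECTED DEADLOCK section of this dump matters for the analysis, and picking it out by eye is tedious. As a rough sketch, the Python helper below pulls the transaction ids, the first line of each statement, and the rolled-back transaction out of that section. The function name `summarize_deadlock` and the layout of the returned dictionary are my own assumptions, and the parsing relies only on the line shapes visible in the example output above; other MySQL versions may format this section differently.

```python
import re

STATEMENT_START = re.compile(r"(update|insert|delete|select|replace)\b", re.IGNORECASE)


def summarize_deadlock(status_text):
    """Summarize the LATEST DETECTED DEADLOCK section of InnoDB status text.

    Returns {"transactions": {n: {"trx_id": ..., "statement": ...}}, "victim": n}.
    Hypothetical helper: it only understands the line shapes shown in the example log.
    """
    transactions = {}
    victim = None
    current = None
    in_section = False

    for raw_line in status_text.splitlines():
        line = raw_line.strip()

        if line.startswith("LATEST DETECTED DEADLOCK"):
            in_section = True
            continue
        if not in_section:
            continue
        if line.startswith("TRANSACTIONS"):
            break  # the next section of the monitor output begins here

        # "*** (1) TRANSACTION:" opens the block for deadlock participant 1.
        header = re.match(r"\*\*\* \((\d+)\) TRANSACTION:", line)
        if header:
            current = int(header.group(1))
            transactions[current] = {"trx_id": None, "statement": None}
            continue

        # "*** WE ROLL BACK TRANSACTION (2)" names the victim.
        rollback = re.match(r"\*\*\* WE ROLL BACK TRANSACTION \((\d+)\)", line)
        if rollback:
            victim = int(rollback.group(1))
            continue

        if current is None:
            continue

        # "TRANSACTION 10298, ACTIVE 11 sec ..." carries the InnoDB transaction id.
        trx = re.match(r"TRANSACTION (\d+),", line)
        if trx:
            transactions[current]["trx_id"] = int(trx.group(1))
            continue

        # Keep only the first line of the SQL statement; long statements may wrap.
        if transactions[current]["statement"] is None and STATEMENT_START.match(line):
            transactions[current]["statement"] = line

    return {"transactions": transactions, "victim": victim}
```

Feeding it the text captured from show engine innodb status\G gives a compact view of which statements collided and which transaction was rolled back, without scrolling through the unrelated sections.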

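Analyzing the deadlock log shows that the deadlock was caused by two update statements competing with each other for the same locks. My own problem was much the same, and it came down to two points: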

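1. My record locks were taken on a non-primary-key (secondary) index, which caused the conflict. It is better to put secondary indexes on columns that are not updated frequently; otherwise a single change locks every record related to that index entry, which easily leads to deadlocks.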

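2. An ordinary update does not actually hold locks for long, since it finishes quickly. The main cause was that my update was nested inside a very long transaction: a transaction does not release the locks it has acquired until it commits, so the locks were held for a long time, and that is what led to the deadlock.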
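
In the example log, transaction 10298 holds the record with key '1' and waits for key '2', while transaction 10299 holds key '2' and waits for key '1': the two transactions took the same rows in opposite orders. One common way to avoid that pattern, which is my own assumption and not something the log itself prescribes, is to have every batch touch its rows in the same sorted key order. The sketch below simulates this with plain Python threads; `row_locks`, `current_count`, and `batch_update` are hypothetical in-memory stand-ins for InnoDB record locks and the `medicine_control` table, not real database calls.

```python
import threading

# Hypothetical stand-in for the table: one lock per primary key value,
# playing the role of the exclusive record locks seen in the deadlock log.
row_locks = {"1": threading.Lock(), "2": threading.Lock()}
current_count = {"1": 0, "2": 0}


def batch_update(updates):
    """Apply {id: new_count} as one 'transaction', locking rows in sorted id order."""
    acquired = []
    try:
        for row_id in sorted(updates):
            row_locks[row_id].acquire()  # like the X record lock taken by UPDATE
            acquired.append(row_id)
            current_count[row_id] = updates[row_id]
    finally:
        # Like COMMIT: locks are released only when the whole batch is done.
        for row_id in reversed(acquired):
            row_locks[row_id].release()


def run_concurrently():
    """Run two batches that list the same rows in opposite orders.

    Returns True if both finished, which they should, because sorting makes
    both batches request the lock for id '1' before the lock for id '2'.
    """
    first = threading.Thread(target=batch_update, args=({"1": 10, "2": 1},), daemon=True)
    second = threading.Thread(target=batch_update, args=({"2": 2, "1": 20},), daemon=True)
    first.start()
    second.start()
    first.join(timeout=5)
    second.join(timeout=5)
    return not (first.is_alive() or second.is_alive())
```

With real SQL the equivalent idea would be to sort the batch by primary key before issuing the update statements, and to keep the surrounding transaction as short as possible so the locks are released quickly.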

Solution:

As far as possible, do not have update statements operate on non-primary-key indexes, so that fewer resources get locked.
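
MySQL's error message itself says "try restarting transaction", so alongside the fix above the application can simply retry a transaction that was chosen as the deadlock victim. The following is a minimal sketch under stated assumptions: `DatabaseError` is a hypothetical stand-in for whatever exception your driver raises, `run_with_deadlock_retry` is a made-up helper name, and the only MySQL fact it relies on is that deadlocks are reported with error number 1213 (ER_LOCK_DEADLOCK).

```python
import time

DEADLOCK_ERRNO = 1213  # MySQL error number for ER_LOCK_DEADLOCK


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error number."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def run_with_deadlock_retry(transaction, max_attempts=3, backoff_seconds=0.05):
    """Call transaction() and retry it if it fails with a deadlock error.

    transaction is assumed to be a callable that runs one short, complete
    transaction (begin, statements, commit) and raises DatabaseError on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, or the final failed attempt.
            if exc.errno != DEADLOCK_ERRNO or attempt == max_attempts:
                raise
            # Wait a little longer after each failure so the competing
            # transaction has a chance to finish.
            time.sleep(backoff_seconds * attempt)
```

A retry only makes sense when the callable re-runs the whole transaction from the start, because MySQL has already rolled the victim back; it also works best when that transaction is short, which ties in with the second point above.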

Reposted from blog.csdn.net/weixin_49363689/article/details/131300539