Problem phenomenon
##### 统一监控告警平台-DB: [触发]
> `告警集群:` jq-prod-db-prometheus
> `告警描述:` 数据库Oracle堵塞会话超过5分钟,会话ID:6055
> `告警环境:` prod
> `告警类型:` oracle_prod
> `告警实例:` nbadb02
> `告警IP:` 11.11.11.11
> `实例别名:` nbadb02
> `当前数值:` 2
> `告警级别:` 严重
> `触发时间:` 2021-03-16 14:11:59
> `持续时间:` 10s
> `告警次数:` 1
> `告警应用:`nba_oracle
> `应用负责人:`
Process
- Log in
11.11.11.11
to oracle and set the column width (if you use plsql, you don’t need to set it)
[bzops@nbadb02 ~]$ sudo -i
[root@nbadb02 ~]# su - oracle
[oracle@nbadb02 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 2 15:05:36 2021
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>
SQL> col sample_time for a30
col event for a30
col program for a30
col machine for a40
set lines 200 pages 999
- Check the current active transaction and find that the session
6055
sum 1573
has started at 14:07, and it has not ended yet,
and it last_call_et
is 0 at the same time , indicating that the session is still executing SQL, and there is no idle . The undo
amount 6055
occupied has reached 11M. , Indicating that the transaction is relatively large
SQL> select a.inst_id inst,A.sid,a.serial#,a.blocking_session blk,a.last_call_et call,a.status,a.sql_id,a.PLSQL_ENTRY_OBJECT_ID plsql,PREV_SQL_ID,b.start_date,a.event,b.USED_UBLK*8/1024 m,a.seconds_in_wait wait,a.osuser,a.program,a.logon_time,b.xid,a.machine
from gv$session a,gv$transaction b where a.taddr=b.addr and b.addr is not null and a.inst_id=b.inst_id order by 5 desc;
INST SID SERIAL# BLK CALL STATUS SQL_ID PLSQL PREV_SQL_ID START_DATE EVENT M WAIT OSUSER PROGRAM LOGON_TIME XID MACHINE
1 2 1545 7417 0 ACTIVE crwj8kgsbsghz crwj8kgsbsghz 2021/3/16 14:14:59 SQL*Net message to client 0.046875 0 nbauser JDBC Thin Client 2021/3/16 14:11:16 58001500481BE003 nba-service-daemon-58f676dcf6-bk8rp
2 2 6059 34681 0 INACTIVE 2xuyhc81va9jk 2021/3/16 14:11:49 SQL*Net message from client 0.8984375 0 nbauser JDBC Thin Client 2021/3/16 14:03:07 F6000400896A0A01 nba-service-mq-6c654b54df-dzw62
3 2 6055 46555 0 INACTIVE dxupr3kq3k911 2021/3/16 14:07:10 SQL*Net message from client 11.4609375 0 nbauser JDBC Thin Client 2021/3/16 14:06:26 3F001C004161E604 nba-service-web-64cd6dcc69-w45f2
4 2 1573 24181 0 INACTIVE 2uy2z5g8ky2ab 2021/3/16 14:07:01 SQL*Net message from client 1.578125 0 nbauser JDBC Thin Client 2021/3/16 13:07:09 D3000D00D01C6801 nba-service-mq-6c654b54df-pqgp6
- Checking the conversation history, you can see that many conversations are
6055
blocked, which confirms that 6055
the transactions checked above are relatively large
SQL> select sample_time,session_id as sid,sql_id,event,plsql_entry_object_id as plsql,sql_plan_hash_value as plan,sql_plan_line_id as line,sql_exec_id as execid,blocking_session blk,program as program,xid,user_id ,machine from gv$active_session_history
where blocking_session is not null ;
SAMPLE_TIME SID SQL_ID EVENT PLSQL PLAN LINE EXECID BLK PROGRAM XID USER_ID MACHINE
1 16-MAR-21 02.14.26.440 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f2
2 16-MAR-21 02.14.26.440 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
3 16-MAR-21 02.14.26.440 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
4 16-MAR-21 02.14.26.440 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
5 16-MAR-21 02.14.26.440 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf
6 16-MAR-21 02.14.26.440 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn
7 16-MAR-21 02.14.26.440 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
8 16-MAR-21 02.14.26.440 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs8
9 16-MAR-21 02.14.25.440 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
10 16-MAR-21 02.14.25.440 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f2
11 16-MAR-21 02.14.25.440 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
12 16-MAR-21 02.14.25.440 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
13 16-MAR-21 02.14.25.440 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf
14 16-MAR-21 02.14.25.440 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn
15 16-MAR-21 02.14.25.440 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
16 16-MAR-21 02.14.25.440 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs8
17 16-MAR-21 02.14.24.430 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn
18 16-MAR-21 02.14.24.430 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs8
19 16-MAR-21 02.14.24.430 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
20 16-MAR-21 02.14.24.430 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf
21 16-MAR-21 02.14.24.430 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
22 16-MAR-21 02.14.24.430 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
23 16-MAR-21 02.14.24.430 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
24 16-MAR-21 02.14.24.430 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f2
25 16-MAR-21 02.14.23.430 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
26 16-MAR-21 02.14.23.430 PM 1731 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065086 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-w45f2
27 16-MAR-21 02.14.23.430 PM 11724 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065049 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-76hs8
28 16-MAR-21 02.14.23.430 PM 1773 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064803 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
29 16-MAR-21 02.14.23.430 PM 3742 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064614 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-dtkz6
30 16-MAR-21 02.14.23.430 PM 6108 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064714 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
31 16-MAR-21 02.14.23.430 PM 7261 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49065133 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-5h6zf
32 16-MAR-21 02.14.23.430 PM 9541 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064892 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-wcctn
33 16-MAR-21 02.14.22.400 PM 10690 08qkr18cnk9rk enq: TX - row lock contention 585082728 1 49064975 6055 JDBC Thin Client 101 nba-service-web-64cd6dcc69-64v4t
................
- Now that the blocker is known
6055
, not only is 6055
a session not over yet, and the session is not idle, the
next step is to find out the 6055
running sql and send it to the development to evaluate whether the transaction is normal or not
. The session can be recorded through sql_trace in Oracle. over a period of time to run what sql
(if sql execution time is greater than one second, in fact, through the active session history v $ active_session_history can also be positioned, but 6055
the last_call_et
is 0, indicating sql execution is less than one second)
SQL> exec dbms_system.set_sql_trace_in_session(6055,46555,true);
PL/SQL procedure successfully completed.
间隔1分钟
SQL> exec dbms_system.set_sql_trace_in_session(6055,46555,false);
PL/SQL procedure successfully completed.
- After collecting sql_trace, enter the oracle trace directory, sort by time, you can see the latest log beginning with ora_
-rw-r----- 1 oracle asmadmin 74K Mar 16 14:15 nba2_ora_149894.trm
-rw-r----- 1 oracle asmadmin 2.3M Mar 16 14:15 nba2_ora_149894.trc
- Format it through the tkprof tool
[oracle@nbadb02 trace]$ tkprof nba2_ora_149894.trc
output = bk.log
TKPROF: Release 11.2.0.4.0 - Development on Tue Mar 16 14:16:08 2021
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
[oracle@nbadb02 trace]$ sz bk.log
- Filter out the recursive sql of the sys user in bk.log, you can see the
6055
executed sql logic
SQL ID: dzns8cts4u1a4 Plan Hash: 383260619
update tab1 set col1=:1 , col2=:2 , col3=
:3 , col4=:4 , col5=:5 , col6=:6 , col7=:7 , col8=:8 ,
col9=:9 , col10=:10 , col11=:11 , col12=:12 , col13=:13
where
ID=:14 and VERSION=:15
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1878 0.03 0.05 0 0 0 0
Execute 1878 0.28 0.46 202 9475 17465 1878
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3756 0.31 0.51 202 9475 17465 1878
....
SQL ID: dxupr3kq3k911 Plan Hash: 743517995
select sysinterfa0_.ID as ID400_, sysinterfa0_.BRAND_ID as BRAND2_400_,
sysinterfa0_.CONTENT as CONTENT400_, sysinterfa0_.CREATE_TIME as
CREATE4_400_, sysinterfa0_.ERROR_COUNT as ERROR5_400_, sysinterfa0_.OP_TIME
as OP6_400_, sysinterfa0_.REMARK as REMARK400_, sysinterfa0_.SHOP_ID as
SHOP8_400_, sysinterfa0_.SLIP_CODE as SLIP9_400_, sysinterfa0_.SOURCE_SYS
as SOURCE10_400_, sysinterfa0_.STATUS as STATUS400_,
sysinterfa0_.TARGET_SYS as TARGET12_400_, sysinterfa0_.TYPE as TYPE400_,
sysinterfa0_.VERSION as VERSION400_
from
tab2 sysinterfa0_ where sysinterfa0_.SLIP_CODE=:1 and
sysinterfa0_.TYPE=:2 and sysinterfa0_.SOURCE_SYS=:3 and
sysinterfa0_.TARGET_SYS=:4 and sysinterfa0_.BRAND_ID=:5
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1913 0.03 0.04 0 0 0 0
Execute 1913 0.02 0.03 0 0 0 0
Fetch 1913 0.04 0.09 55 9854 0 1878
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5739 0.10 0.17 55 9854 0 1878
....
select shop.id
from tab1 sku,
tab2 pro,
tab3 br,
tab4 shop
where sku.product_id = pro.id
and pro.brand_id = br.brand_id
and br.shop_id = shop.id
and sku.code = :1
and shop.IS_SYNC_BMS = 1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 382 0.00 0.00 0 0 0 0
Execute 382 0.00 0.00 0 0 0 0
Fetch 382 0.11 0.14 1 15280 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1146 0.12 0.16 1 15280 0 0
.....
Solution
- After sending the above sql to the development and analysis, feedback that the transaction logic has room for optimization, and now we are doing bulk inventory operations. Blocking can be ignored, and the logic will be optimized in the future