Oracle 修改密码'library cache lock'事件导致用户被hang住

版权声明:本文为作者原创,转载请标明出处! https://blog.csdn.net/kiral07/article/details/87876185

Oracle 修改密码'library cache lock'事件导致用户被hang住

一、案例背景

某项目Oracle 11g rac环境下,修改用户密码时会话被hang住,并且客户端用户在登录时也会被hang住。
现象如下:

SQL> alter user XXX identified by xxxxx;----->在修改时就会hang住,会话一直在等待

SQL> conn XXX/xxxx;----->无法进入用户会话

此时抓取hang Analyze以及做Systemstate Analysis获取trace分析

sqlplus -prelim / as sysdba
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
--等一分钟后再次执行hanganalyze分析
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
oradebug -g all dump systemstate 266

分析dump文件发现有上百个”library cache lock“等待事件,并且源头都是来自于同一个会话。

附trace log:

*** 2019-02-21 17:33:14.660
===============================================================================
HANG ANALYSIS:
  instances (db_name.oracle_sid): zjedb.zjedb1, zjedb.zjedb2
  oradebug_node_dump_level: 3
  analysis initiated by oradebug
  os thread scheduling delay history: (sampling every 1.000000 secs)
    0.000000 secs at [ 17:33:14 ]
      NOTE: scheduling delay has not been sampled for 0.340580 secs    0.000000 secs from [ 17:33:10 - 17:33:15 ], 5 sec avg
    0.000000 secs from [ 17:32:15 - 17:33:15 ], 1 min avg
    0.000000 secs from [ 17:28:14 - 17:33:15 ], 5 min avg
  vktm time drift history
===============================================================================
 
Chains most likely to have caused the hang:
 [a] Chain 1 Signature: <not in a wait><='library cache lock'
     Chain 1 Signature Hash: 0x24734cf
 [b] Chain 2 Signature: <not in a wait><='library cache lock'
     Chain 2 Signature Hash: 0x24734cf
 [c] Chain 3 Signature: <not in a wait><='library cache lock'
     Chain 3 Signature Hash: 0x24734cf
 
===============================================================================
Non-intersecting chains:
 
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (zjedb.zjedb1)
                   os id: 8118
              process id: 96, oracle@zjerac1
              session id: 6
        session serial #: 36237
    }
    is waiting for 'library cache lock' with wait info:
    {
                      p1: 'handle address'=0x89d9bbab0
                      p2: 'lock address'=0x89e0f7560
                      p3: '100*mode+namespace'=0x4f0003
            time in wait: 6 min 47 sec
           timeout after: 8 min 12 sec---->此会话已被阻塞6min以上
                 wait id: 7
                blocking: 0 sessions
             current sql: <none>
             short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+2022<-kslwaitctx()+163<-kjusuc()+3400<-ksipgetctxi()+1759<-kqlmLock()+2679<-kqlmClusterLock()+191<-kgllkal()+2183<-kglLock()+1308<-kglget()+343<-kziasfc()+3126<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
            wait history:
              * time between current wait and wait #1: 0.000102 sec
              1.       event: 'library cache: mutex X'
                 time waited: 0.010957 sec
                     wait id: 6               p1: 'idn'=0x76e11533
                                              p2: 'value'=0xc400000000
                                              p3: 'where'=0x50
              * time between wait #1 and #2: 0.000045 sec
              2.       event: 'latch: ges resource hash list'
                 time waited: 0.003486 sec
                     wait id: 5               p1: 'address'=0x8723bf798
                                              p2: 'number'=0x70
                                              p3: 'tries'=0x0
              * time between wait #2 and #3: 0.000034 sec
              3.       event: 'library cache lock'
                 time waited: 5 min 13 sec
                     wait id: 4               p1: 'handle address'=0x89d9bbab0
                                              p2: 'lock address'=0x89e0f7460
                                              p3: '100*mode+namespace'=0x4f0002
    }
    and is blocked by---->阻塞的源头
 => Oracle session identified by:
    {
                instance: 1 (zjedb.zjedb1)
                   os id: 7964
              process id: 140, oracle@zjerac1
              session id: 2273 ----->此为阻塞源头的会话sid
        session serial #: 52199
    }
    which is not in a wait:
    {
               last wait: 0.997701 sec ago
                blocking: 388 sessions----->阻塞了300多个会话,等待很严重
             current sql: <none>
             short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-__nanosleep()+16<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
            wait history:
              1.       event: 'library cache lock'
                 time waited: 6 min 46 sec
                     wait id: 7               p1: 'handle address'=0x89d9bbab0
                                              p2: 'lock address'=0x88edfd2a8
                                              p3: '100*mode+namespace'=0x4f0003
              * time between wait #1 and #2: 0.000314 sec
              2.       event: 'library cache load lock'
                 time waited: 0.001645 sec
                     wait id: 6               p1: 'object address'=0x89d9bbab0
                                              p2: 'lock address'=0x88edfd0a8
                                              p3: '100*mask+namespace'=0x4f0003
              * time between wait #2 and #3: 0.000079 sec
              3.       event: 'latch: ges resource hash list'
                 time waited: 0.002859 sec
                     wait id: 5               p1: 'address'=0x8723bf798
                                              p2: 'number'=0x70
                                              p3: 'tries'=0x0
    }
 
Chain 1 Signature: <not in a wait><='library cache lock'
Chain 1 Signature Hash: 0x24734cf
-------------------------------------------------------------------------------
 
===============================================================================
Intersecting chains:
 
-------------------------------------------------------------------------------
Chain 2:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (zjedb.zjedb1)
                   os id: 8149
              process id: 160, oracle@zjerac1
              session id: 7
        session serial #: 24737
    }
    is waiting for 'library cache lock' with wait info:
    {
                      p1: 'handle address'=0x89d9bbab0
                      p2: 'lock address'=0x88f8d4ed0
                      p3: '100*mode+namespace'=0x4f0003
            time in wait: 6 min 47 sec
           timeout after: 8 min 12 sec
                 wait id: 6
                blocking: 0 sessions
             current sql: <none>
             short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+2022<-kslwaitctx()+163<-kjusuc()+3400<-ksipgetctxi()+1759<-kqlmLock()+2679<-kqlmClusterLock()+191<-kgllkal()+2183<-kglLock()+1308<-kglget()+343<-kziasfc()+3126<-kpolnb()+6976<-kpoauth()+872<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
            wait history:
              * time between current wait and wait #1: 0.000130 sec
              1.       event: 'latch: ges resource hash list'
                 time waited: 0.005830 sec
                     wait id: 5               p1: 'address'=0x8723bf798
                                              p2: 'number'=0x70
                                              p3: 'tries'=0x0
              * time between wait #1 and #2: 0.000044 sec
              2.       event: 'library cache lock'
                 time waited: 5 min 13 sec
                     wait id: 4               p1: 'handle address'=0x89d9bbab0
                                              p2: 'lock address'=0x88f8d4dd0
                                              p3: '100*mode+namespace'=0x4f0002
              * time between wait #2 and #3: 0.000216 sec
              3.       event: 'latch: ges resource hash list'
                 time waited: 0.005371 sec
                     wait id: 3               p1: 'address'=0x8723bf798
                                              p2: 'number'=0x70
                                              p3: 'tries'=0x0
    }
    and is blocked by 'instance: 1, os id: 7964, session id: 2273',-->其余会话也是被这个sid阻塞
    which is a member of 'Chain 1'.
 
Chain 2 Signature: <not in a wait><='library cache lock'
Chain 2 Signature Hash: 0x24734cf


 ----------------------------------------
    SO: 0x87998d798, type: 4, owner: 0x869081560, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x869081560, name=session, file=ksu.h LINE:12729, pg=0
    (session) sid: 2273 ser: 52201 trans: (nil), creator: 0x869081560
              flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
              flags2: (0x9) -/-/INC
              DID: , short-term DID: 
              txn branch: (nil)
              edition#: 0              oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
    ksuxds FALSE at location: 0
    service name: zjedb
    client details:---->被阻塞的客户端信息
      O/S info: user: WIN-0ELFKQG4A9V$, term: unknown, ospid: 1234
      machine: WIN-0ELFKQG4A9V program: JDBC Thin Client
      application name: JDBC Thin Client, hash value=2546894660
    Current Wait Stack:
     0: waiting for 'library cache lock'
        handle address=0x89d9bbab0, lock address=0x886dac9b0, 100*mode+namespace=0x4f0002
        wait_id=2 seq_num=3 snap_id=1
        wait times: snap=1 min 54 sec, exc=1 min 54 sec, total=1 min 54 sec
        wait times: max=15 min 0 sec, heur=1 min 54 sec
        wait counts: calls=231 os=231
        in_wait=1 iflags=0x15a2
    There is at least one session blocking this session.
      Dumping 1 direct blocker(s):
        inst: 1, sid: 1145, ser: 52551
      Dumping final blocker:
        inst: 1, sid: 1145, ser: 52551
    Wait State:
      fixed_waits=0 flags=0x22 boundary=(nil)/-1
    Session Wait History:
        elapsed time of 0.000459 sec since current wait
     0: waited for 'SQL*Net message from client'

二、处理方式

通过分析之后已经知道了’library cache lock’都是被同一个会话锁阻塞,那么将其手动kill方可解决故障。

alter system kill session 'sid,serial#' immediate;

三、拓展研究

在11g中存在”密码验证延迟“功能,此特性主要用来防止数据库密码被暴力破解,在极少数场景的生产系统业务修改密码,会导致大量会话在等待登录验证,造成大量’library cache lock’。

附MOS文档
文档 ID 1309738.1
In Oracle 11g Release 2 and higher, in order to disable the wait between login failures the event 28401 needs to be explicitly enabled:

The event can be set as follows:

alter system set event =“28401 TRACE NAME CONTEXT FOREVER, LEVEL 1” scope=spfile;
To unset the event, set as follows:

SQL> Alter system set event= '28401 trace name context off' scope=spfile ;
---->密码验证延迟关闭
For more information see:

Document:7715339.8 Logon failures causes “row cache lock” waits - Allow disable of logon delay

The following functions match per above bug:
nanosleep nanosleep kziasfc kpolnb kpoauth

Note: Care should be taken when setting this event, as this is disabling the sleep time which can leave the system more vulnerable.

猜你喜欢

转载自blog.csdn.net/kiral07/article/details/87876185