How to Trace and Diagnose Blocked or Hung Sessions

 
Blocked or Hung sessions are usually the culprit when application batch jobs running on a database suddenly take longer than expected to complete or sometimes don’t complete at all.

I. Identify the Database Session that is Being Blocked

To trace these sessions you first need the SID of a session that you believe is currently being blocked. Executing the statement below will help you identify the blocked session and the resource wait events associated with it.

select event,
p1,
p2,
p3
from v$session_wait
where sid='' and wait_time=0

If you do not know what session is currently being blocked, you can execute the query below to check for all waiting sessions (potentially being blocked) on the database.

select sid,
event,
seq#,
p1,
p1raw,
p2,
p3
from v$session
where wait_time=0;

wait_time=0 indicates that the session is currently waiting for a resource
wait_time greater than 0 indicates the session’s last wait time

II. Determine Whether the Database Session is Really Being Blocked

Oracle processes will be in one of two possible states for the duration of their existence:

o They will be waiting for a resource or call
o They will be on the CPU

This means that sessions/processes on the CPU are not being blocked and are running. Processes waiting will either wait for a resource (an enqueue, a lock , a latch etc), or a call ( i.e. sql-net message from client, db file sequential read). In the case of calls, the session is rarely blocked but if the session is waiting for a resource then it is usually blocked.

Always make sure to execute the query multiple times to verify that it is indeed a genuine blocker. Normal processes will sometimes have to wait for resources but these should not take long.

For a complete list of wait events, check the Oracle documentation here.
Or you can execute this statement below:
SELECT name FROM V$EVENT_NAME ORDER BY name;

III. Identify the Blocker and the Related Resource Type being Held by the Blocker

ENQUEUE

select sid,type,lmode,request,id1,id2,request from v$lock
where id1={p2 from select against v$session}
and id2={p3 from select against v$session};

The following table can be used to determine which sessions with lmode > 0 are blocking:-

LMODE Description Name NULL SS SX S SSX X
—– ———– ——- —– —– —– —– —– —–
0,1 No Lock NULL YES YES YES YES YES YES
2 Row-Share SS YES YES YES YES YES no
3 Row-Exclusive SX YES YES YES no no no
4 Share S YES YES no YES no no
5 Share Row-Excl SSX YES YES no no no no
6 Exclusive X YES no no no no no

LATCH

select sid from v$latchholder where laddr=&p1raw; — where &p1raw is p1raw from select against v$session

PIN

SELECT s.sid, case kglpnmod when 2 then 'S' when 3 then 'X' end "Mode",
case kglpnreq when 2 then 'S' when 3 then 'X' end "Req"
FROM x$kglpn p, v$session s
AND kglpnhdl='&p1raw';
— where &p1raw is p1raw from select against v$session
NOTE: An X request will be blocked by any pins held in S mode on the object.
An S request will be blocked by any X mode pin held, or may queue behind some other X request (as X requests take precedence over
current S requests and do not operate on a FIFO basis

LIBRARY CACHE LOCK

select s.sid,
case x.KGLLKMOD when 1 then 'NULL' when 2 then 'S' when 3 then 'X' end "Mode",
case x.KGLLKREQ when 1 then 'NULL' when 2 then 'S' when 3 then 'X' end "Req",
x.KGLNAOBJ "Object Name"
from v$session s,x$kgllk x
where x.KGLLKADR=s.saddr
and KGLLKHDL='&p1raw';
— where &p1raw is p1raw from select against v$session

LIBRARY CACHE LOAD LOCK

Not currently possible

BUFFER BUSY WAIT

Not currently possible

ROW CACHE LOCK

select s.sid,
case KQRFPMOD when 0 then 'NULL' when 3 then 'S' when 5 then 'X' end "Mode",
case KQRFPREQ when 0 then 'NULL' when 3 then 'S' when 5 then 'X' end "Req"
from v$session s, X$KQRFP x
where x.KQRFPSES=s.saddr
and KQRFPCID='&p1raw';
— where &p1raw is p1raw from select against v$session

IV. What To do After Identifying the Blocker

The following describes the possible scenarios that will occur after the steps performed above. But it is important to note that the easiest way to get rid of the blocking session is by killing it (that is assuming the blocking session isn’t a critical Oracle Database process).
alter system disconnect session 'sid,serial#' immediate;
Don’t forget to take the session details and errorstacks of the blocker before killing it for root cause analysis. Also, you might need to make sure that the session owner is aware that he/she is blocking an important batch job.

1. For blocking sessions with untraceable resource waits i.e. BUFFER BUSY WAIT and LIBRARY CACHE LOAD LOCK, you need to provide a systemstate dump for Oracle Support to analyze. You need to login as sysdba and execute the following:
alter session set events 'immediate trace name systemstate level 266';
This should be executed twice, one right after the other.
A detailed explanation of how to take systemstate dumps for RAC environments go to thisĀ link.

2. Another Scenario is when there is no blocker at all. If this is the case, you need to consider running an AWR and doing some performance analysis of your queries which is not covered by this article.

3. The session you are investigating has come to a deadlock.
For instance, say my first blocking session is SID 147 and waits for an enqueue
SID 167 holds the enqueue and waits for a library cache lock
SID 187 holds the library cache lock and waits for a library cache pin
SID 167 holds the pin

Here we have a deadlock between SID 187 and 167 (SID 147 is just a victim – we are not really interested in
it any more as we know once 187/167 are clear it will acquire the enqueue). We need to take errorstacks from SID
167 and 187.

Taking errorstacks

First note all the SIDs you believe are causing the problem (blockers not blocked/victims)

select process from v$session where sid={sid};
For each one login as sysdba and run:

oradebug setospid {process}
oradebug dump errorstack 3 (Do this 3 times, one after the other}
This will write a trace file to user_dump_dest (The filename will have the spid in the name)
Oracle Support will need these files.

If you need more help with database issues, feel free to send me an email (jimmycdo@yahoo.com) and I'll gladly help.

About Jimbart