UPDATE T1
SET EXP_DATE = SYSDATE
WHERE P_TIME IS NOT NULL
AND EXP_DATE IS NULL
The update was running on a 20M rows table, only using two NULL/NOT NULL filtering criteria, and updates around 10-20K rows.
Since it run for about an hour, using a FULL TABLE SCAN, in this hour other updates that were running simultaneously on the same table (with different criteria) were suffering from row locks on a daily basis, and this was simply not acceptable.
Analyzing this intrusive update indicated that the update was spending most of its time on IO Waits, as many unneeded blocks were returned by the Exadata storage tier to the Exadata database tier, and only then database tied was able to start applying the null filters ( i.e "P_TIME IS NOT NULL AND EXP_DATE IS NULL").
Index was not an option here. So what is?
Since nulls are used here, Exadata Storage Index (which is part of Exadata offloading/smart scan features) could be the solution, if we just let Oracle use it.
For those of you who are yet unfamiliar with Exadata storage indexes, here is a brief explanation:
Storage Indexes is a very powerful Exadata feature, when used. Don't mix it with Oracle’s traditional B-Tree or bitmapped indexes as they are not stored in the database. Instead, its a feature (in-memory structures) of the storage server software that is designed to eliminate disk I/O. They identify locations where the requested records are not, instead of the other way around.
They store minimum and maximum column values for disk storage units. Because SQL predicates are passed to the storage servers when Smart Scans are performed, the storage software can check the predicates against the Storage Index metadata (maximum and minimum values) before doing the requested I/O. Any storage region that cannot possibly have a matching row is skipped.
In addition to the minimum and maximum values, there is a flag indicating whether any of the records in a storage region contain nulls.
You can read more about exadata storage index in the following link
http://www.oracle.com/technetwork/issue-archive/2011/11-may/o31exadata-354069.html
Anyway, before we drill into how storage indexes tremendously helped in this case, here are some basic info on the update: its text, execution plan, and runtime statistics over the last two weeks:
SQL> select sql_text from dba_hist_sqltext where sql_id='bxcvv372pkr7m';
UPDATE T1 SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL
SQL> select * from table ( DBMS_XPLAN.DISPLAY_AWR('bxcvv372pkr7m',3865564104) );
PLAN_TABLE_OUTPUT
---------------------------------------------------
SQL_ID bxcvv372pkr7m
--------------------
UPDATE T1 SET EXP_DATE = SYSDATE WHERE P_TIME
IS NOT NULL AND EXP_DATE IS NULL
Plan hash value: 3865564104
--------------------------------------------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | TABLE ACCESS STORAGE FULL | T1 |
--------------------------------------------------------------------------------------------------
This above plan (to achieve an easier display, I have removed several fields from it) is showing us that full table scan is being used while running the update, and that smart scan can be used here, theoretically.
Selecting basic statistics on Table T1 shows that it has almost 20M rows over 1M blocks (block size is 8k).
SQL> select num_rows,blocks,last_analyzed from dba_tables a where table_name='T1';
NUM_ROWS BLOCKS LAST_ANAL
------------------- ------------ -----------------
19309303 1096332 28-NOV-13
And here are some statistics on its execution taken from dba_hist_sqlstat (b.t.w., dba_hist_sqlstat is an excellent place to grab overtime statistics information on sql executions):
select trunc(b.begin_interval_time) btime ,
a.plan_hash_value plan_hs,
sum(a.executions_delta) execs,
sum(trunc(a.elapsed_time_delta/ 1000000/decode(a.executions_delta,0,1,a.executions_delta) ,3) ) tm_sec,
sum(trunc(a.rows_processed_delta/decode(a.executions_delta,0,1,a.executions_delta) ,2) ) nrows,
sum(trunc( a.buffer_gets_delta/decode(a.executions_delta,0,1,a.executions_delta) ,2)) buff_avg,
sum(a.PHYSICAL_READ_BYTES_DELTA) pysrd,
sum(a.PHYSICAL_READ_REQUESTS_DELTA) pysrq,
sum(a.DISK_READS_DELTA) dkrd,
sum(a.IO_INTERCONNECT_BYTES_DELTA) intercn,
sum(a.IO_OFFLOAD_RETURN_BYTES_DELTA) offld,
sum(a.IO_OFFLOAD_ELIG_BYTES_DELTA) elig
from dba_hist_sqlstat a , dba_hist_snapshot b
where sql_id='bxcvv372pkr7m'
and a.snap_id=b.snap_id
and b.begin_interval_time > sysdate -14
group by trunc(begin_interval_time), plan_hash_value
order by trunc(begin_interval_time) desc;
btime plan_hs execs tm_sec nrows buff_avg pysrd pysrq dkrd intercn offld elig
------ ------ ------ ------ ------ ------ ------ ------ ------ ------
01-DEC-2013 3865564104 1 1751 16752 4327717 32936501248 104370 4020569 32936501248 0 0
28-NOV-2013 3865564104 1 2511 0 3873633 30624784384 59219 3738377 30624784384 0 0
27-NOV-2013 3865564104 1 3042 21454 5210873 41140527104 74379 5022037 41140527104 0 0
26-NOV-2013 3865564104 1 1504 23270 3220407 25268518912 50146 3084536 25268518912 0 0
25-NOV-2013 3865564104 1 1730 0 1954352 15447539712 27075 1885686 15447539712 0 0
24-NOV-2013 3865564104 1 1712 17479 4329434 33956782080 77921 4145115 33956782080 0 0
21-NOV-2013 3865564104 1 2492 16112 4303976 34013880320 71177 4152085 34013880320 0 0
20-NOV-2013 3865564104 1 2749 14181 4352718 33895473152 81683 4137631 33895473152 0 0
19-NOV-2013 3865564104 1 2591 18819 5365660 42522451968 76118 5190729 42522451968 0 0
As indicated by IO_INTERCONNECT_BYTES_DELTA, exadata storage tier had to pass huge amount of IO to the database tier, and as indicate by the zero values of IO_OFFLOAD_RETURN_BYTES_DELTA, although plan is showing "TABLE ACCESS STORAGE FULL", there was no exadata smart scan/offloading here.
So why is that? Why does Oracle not using Exadata powerful offloading abilities?
This is because in order for smart scan to happen, few basic prerequisites need to be fulfilled:
1) There must be a full scan of an object (and indeed there is)
2) The object must be stored on Exadata storage (and indeed it is)
3) The scan must use Oracle’s Direct Path Read mechanism (and it does not) => So this unfulfilled prerequisite disables the use of smart scan here!!!!
How do I know direct path read was not used?
Lets look at DBA_HIST_ACTIVE_SESS_HISTORY (b.t.w, its another great place to historically analyze what was going on while SQL was executed). It will show us what Oracle was doing while running this update over the last two weeks:
SQL> select nvl(event,'CPU') event,count(*)
from DBA_HIST_ACTIVE_SESS_HISTORY
where sql_id='bxcvv372pkr7m'
and sample_time > sysdate - 14
group by event
order by count(*) desc;
EVENT COUNT(*)
---------------------------------------------------------------- ----------
cell multiblock physical read 2018
CPU 94
cell single block physical read 37
read by other session 1
And indeed, what we see here is that most of the time Oracle was waiting on "cell multiblock physical read", which is the regular full scan of exadata (also known as "db file scattered read" in non exeadata machines).
If a smart scan would have been used here I would expect to see the "cell smart table scan" wait event.
OK, so lets undo the third restriction. Lets force Oracle use direct path read.
How do we force it? By telling oracle, using hints for example, to run the update on parallel:
SQL> select sql_text from dba_hist_sqltext where sql_id='88s54njj4n3mj';
UPDATE /*+ full(A) parallel(A,8) */ T1 A SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL
Lets run it with the new hints and see what Oracle will do this time:
When Oracle is using Storage Index, the only way to verify it is by checking the statistics "cell physical IO bytes saved by storage index" is increasing.
SQL> select name,value
from v$mystat s, v$statname n
where s.statistic# = n.statistic#
and n.name in ('cell physical IO bytes saved by storage index',
'cell physical IO interconnect bytes returned by smart scan');
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 0
cell physical IO interconnect bytes returned by smart scan 0
The statistics is showing us that up till now, no storage index offload was used by this session.
Now, lets run the update and see whether this statistic will increase as expected.
SQL> UPDATE /*+ full(A) parallel(A,8) */ T1 A SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL;
SQL> select name,value
from v$mystat s, v$statname n
where s.statistic# = n.statistic#
and n.name in ('cell physical IO bytes saved by storage index',
'cell physical IO interconnect bytes returned by smart scan');
NAME VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 3106807808
cell physical IO interconnect bytes returned by smart scan 4052928
Yes! the above update was using storage index. It is indicating that 3,106,807,808 bytes were saved by storage index. Wow!!! That's a lot!!!
And see other statistics. Now, after storage index was used, only 4052928 bytes were returned from the storage tier to the database tier, through the interconnect.
This is a dramatic saving.
The update has finished in 8 seconds!!! I will say that again folks, as you may not hear me right.
Updating 15k rows in 20M rows table now took only 8 second!!!
No more 1 hour update. Big wow! I call this a huge saving, a tremendous improvement, great relief.
Lets view that execution plan this time:
SQL> select * from table ( DBMS_XPLAN.DISPLAY_cursor) ;
PLAN_TABLE_OUTPUT
------------------------------------
SQL_ID 43x29vnth9z4x, child number 0
-------------------------------------
UPDATE /*+ full(A) parallel(A,8) */ T1 A SET
EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL
Plan hash value: 4279346641
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | PX COORDINATOR
| 3 | PX SEND QC (RANDOM) | :TQ10000 | Q1,00 | P->S | QC (RAND) |
| 4 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
|* 5 | TABLE ACCESS STORAGE FULL| T1 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - storage(:Z>=:Z AND :Z<=:Z AND ("EXP_DATE" IS NULL AND "P_TIME" IS NOT NULL))
filter(("EXP_DATE" IS NULL AND "P_TIME" IS NOT NULL))
The plan is showing us that parallel execution did happen, and that Smart Scan is possible now, as the predicate information is displaying the storage line.
And finally, lets observe which wait events occurred this time while running the update:
SQL> select nvl(event,'CPU') event,count(*)
from v$active_session_history
where sql_id='43x29vnth9z4x'
and sample_time > sysdate - 1/24
group by event
order by count(*) desc;
EVENT COUNT(*)
---------------------------------------------------------------- ----------
cell smart table scan 8
As expected, we now see the wait event "cell smart table scan", which proves that Smart Scan was used now.
Needless to say that after this change locking situation stopped occuring.
Mission completed. Well done :)
That's all for now. Hope you've enjoyed this post. As always, comments are most welcomed.
Merav Kedem,
Founder of
DB OPTimize
Oracle Performance Tuning & DBA Consulting
Realy Thanks for Sharing your Oracle knowledge
ReplyDeleteAwesome results!
ReplyDelete