Merav Kedem's Blog

Exadata Storage Index solving an endless update

2013-12-06T02:19:00.001-08:00

The following simple update was running every evening, endlessly, on a production Exadata machine, Oracle version 11.2.0.3 (well, not truly endlessly, but for about an 30-60 minutes, which is considered to be quite endless in my standards...).

UPDATE T1
SET EXP_DATE = SYSDATE
WHERE P_TIME IS NOT NULL
AND EXP_DATE IS NULL

The update was running on a 20M rows table, only using two NULL/NOT NULL filtering criteria, and updates around 10-20K rows.
Since it run for about an hour, using a FULL TABLE SCAN, in this hour other updates that were running simultaneously on the same table (with different criteria) were suffering from row locks on a daily basis, and this was simply not acceptable.
Analyzing this intrusive update indicated that the update was spending most of its time on IO Waits, as many unneeded blocks were returned by the Exadata storage tier to the Exadata database tier, and only then database tied was able to start applying the null filters ( i.e "P_TIME IS NOT NULL AND EXP_DATE IS NULL").

Index was not an option here. So what is?

Since nulls are used here, Exadata Storage Index (which is part of Exadata offloading/smart scan features) could be the solution, if we just let Oracle use it.

For those of you who are yet unfamiliar with Exadata storage indexes, here is a brief explanation:

Storage Indexes is a very powerful Exadata feature, when used. Don't mix it with Oracle’s traditional B-Tree or bitmapped indexes as they are not stored in the database. Instead, its a feature (in-memory structures) of the storage server software that is designed to eliminate disk I/O. They identify locations where the requested records are not, instead of the other way around.
They store minimum and maximum column values for disk storage units. Because SQL predicates are passed to the storage servers when Smart Scans are performed, the storage software can check the predicates against the Storage Index metadata (maximum and minimum values) before doing the requested I/O. Any storage region that cannot possibly have a matching row is skipped.
In addition to the minimum and maximum values, there is a flag indicating whether any of the records in a storage region contain nulls.
You can read more about exadata storage index in the following link
http://www.oracle.com/technetwork/issue-archive/2011/11-may/o31exadata-354069.html

Anyway, before we drill into how storage indexes tremendously helped in this case, here are some basic info on the update: its text, execution plan, and runtime statistics over the last two weeks:

SQL> select sql_text from dba_hist_sqltext where sql_id='bxcvv372pkr7m';

UPDATE T1 SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL

SQL> select * from table ( DBMS_XPLAN.DISPLAY_AWR('bxcvv372pkr7m',3865564104) );

PLAN_TABLE_OUTPUT
---------------------------------------------------
SQL_ID bxcvv372pkr7m
--------------------
UPDATE T1 SET EXP_DATE = SYSDATE WHERE P_TIME
IS NOT NULL AND EXP_DATE IS NULL

Plan hash value: 3865564104
--------------------------------------------------------------------------------------------------
| Id | Operation                                             | Name            |
--------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                         |                     |
|   1 | UPDATE                                            | T1               |
|   2 |   TABLE ACCESS STORAGE FULL | T1                 |
--------------------------------------------------------------------------------------------------

This above plan (to achieve an easier display, I have removed several fields from it) is showing us that full table scan is being used while running the update, and that smart scan can be used here, theoretically.
Selecting basic statistics on Table T1 shows that it has almost 20M rows over 1M blocks (block size is 8k).

SQL> select num_rows,blocks,last_analyzed from dba_tables a where table_name='T1';

NUM_ROWS     BLOCKS      LAST_ANAL
-------------------    ------------      -----------------
19309303           1096332         28-NOV-13

And here are some statistics on its execution taken from dba_hist_sqlstat (b.t.w., dba_hist_sqlstat is an excellent place to grab overtime statistics information on sql executions):

select trunc(b.begin_interval_time) btime ,
      a.plan_hash_value plan_hs,
      sum(a.executions_delta) execs,
      sum(trunc(a.elapsed_time_delta/ 1000000/decode(a.executions_delta,0,1,a.executions_delta) ,3) ) tm_sec,
      sum(trunc(a.rows_processed_delta/decode(a.executions_delta,0,1,a.executions_delta) ,2) ) nrows,
      sum(trunc( a.buffer_gets_delta/decode(a.executions_delta,0,1,a.executions_delta) ,2)) buff_avg,
      sum(a.PHYSICAL_READ_BYTES_DELTA) pysrd,
      sum(a.PHYSICAL_READ_REQUESTS_DELTA) pysrq,
      sum(a.DISK_READS_DELTA) dkrd,
      sum(a.IO_INTERCONNECT_BYTES_DELTA) intercn,
      sum(a.IO_OFFLOAD_RETURN_BYTES_DELTA) offld,
      sum(a.IO_OFFLOAD_ELIG_BYTES_DELTA) elig
from dba_hist_sqlstat a , dba_hist_snapshot b
where sql_id='bxcvv372pkr7m'
and a.snap_id=b.snap_id
and b.begin_interval_time > sysdate -14
group by trunc(begin_interval_time), plan_hash_value
order by trunc(begin_interval_time) desc;

btime                plan_hs  execs   tm_sec nrows  buff_avg    pysrd                pysrq     dkrd          intercn      offld elig
------                 ------         ------     ------         ------              ------ ------ ------ ------ ------
01-DEC-2013 3865564104   1   1751   16752   4327717   32936501248   104370   4020569   32936501248   0   0
28-NOV-2013 3865564104   1   2511          0    3873633   30624784384   59219     3738377   30624784384   0   0
27-NOV-2013 3865564104   1   3042   21454   5210873   41140527104   74379   5022037   41140527104   0   0
26-NOV-2013 3865564104   1   1504   23270   3220407   25268518912   50146   3084536   25268518912   0   0
25-NOV-2013 3865564104   1   1730          0    1954352   15447539712   27075   1885686   15447539712   0   0
24-NOV-2013 3865564104   1   1712   17479   4329434   33956782080   77921   4145115   33956782080   0   0
21-NOV-2013 3865564104   1   2492   16112   4303976   34013880320   71177   4152085   34013880320   0   0
20-NOV-2013 3865564104   1   2749   14181   4352718   33895473152   81683   4137631   33895473152   0   0
19-NOV-2013 3865564104   1   2591   18819   5365660   42522451968   76118   5190729   42522451968   0   0

As indicated by IO_INTERCONNECT_BYTES_DELTA, exadata storage tier had to pass huge amount of IO to the database tier, and as indicate by the zero values of IO_OFFLOAD_RETURN_BYTES_DELTA, although plan is showing "TABLE ACCESS STORAGE FULL", there was no exadata smart scan/offloading here.

So why is that? Why does Oracle not using Exadata powerful offloading abilities?

This is because in order for smart scan to happen, few basic prerequisites need to be fulfilled:

1) There must be a full scan of an object (and indeed there is)
2) The object must be stored on Exadata storage (and indeed it is)
3) The scan must use Oracle’s Direct Path Read mechanism (and it does not) => So this unfulfilled prerequisite disables the use of smart scan here!!!!

How do I know direct path read was not used?
Lets look at DBA_HIST_ACTIVE_SESS_HISTORY (b.t.w, its another great place to historically analyze what was going on while SQL was executed). It will show us what Oracle was doing while running this update over the last two weeks:

SQL> select nvl(event,'CPU') event,count(*)
from DBA_HIST_ACTIVE_SESS_HISTORY
where sql_id='bxcvv372pkr7m'
and sample_time > sysdate - 14
group by event
order by count(*) desc;

EVENT                                                              COUNT(*)
---------------------------------------------------------------- ----------
cell multiblock physical read                                          2018
CPU                                                                                    94
cell single block physical read                                           37
read by other session                                                            1

And indeed, what we see here is that most of the time Oracle was waiting on "cell multiblock physical read", which is the regular full scan of exadata (also known as "db file scattered read" in non exeadata machines).
If a smart scan would have been used here I would expect to see the "cell smart table scan" wait event.

OK, so lets undo the third restriction. Lets force Oracle use direct path read.
How do we force it? By telling oracle, using hints for example, to run the update on parallel:

SQL> select sql_text from dba_hist_sqltext where sql_id='88s54njj4n3mj';

UPDATE /*+ full(A) parallel(A,8) */ T1 A SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL

Lets run it with the new hints and see what Oracle will do this time:
When Oracle is using Storage Index, the only way to verify it is by checking the statistics "cell physical IO bytes saved by storage index" is increasing.

SQL> select name,value
from v$mystat s, v$statname n
where s.statistic# = n.statistic#
and n.name in ('cell physical IO bytes saved by storage index',
        'cell physical IO interconnect bytes returned by smart scan');

NAME                                                                                               VALUE
----------------------------------------------------------------                      ----------
cell physical IO bytes saved by storage index                                   0
cell physical IO interconnect bytes returned by smart scan               0

The statistics is showing us that up till now, no storage index offload was used by this session.
Now, lets run the update and see whether this statistic will increase as expected.

SQL> UPDATE /*+ full(A) parallel(A,8) */ T1 A SET EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL;

SQL> select name,value
from v$mystat s, v$statname n
where s.statistic# = n.statistic#
and n.name in ('cell physical IO bytes saved by storage index',
        'cell physical IO interconnect bytes returned by smart scan');

NAME                                                                                               VALUE
----------------------------------------------------------------                      ----------
cell physical IO bytes saved by storage index                                   3106807808
cell physical IO interconnect bytes returned by smart scan              4052928

Yes! the above update was using storage index. It is indicating that 3,106,807,808 bytes were saved by storage index. Wow!!! That's a lot!!!
And see other statistics. Now, after storage index was used, only 4052928 bytes were returned from the storage tier to the database tier, through the interconnect.
This is a dramatic saving.

The update has finished in 8 seconds!!! I will say that again folks, as you may not hear me right.
Updating 15k rows in 20M rows table now took only 8 second!!!
No more 1 hour update. Big wow! I call this a huge saving, a tremendous improvement, great relief.

Lets view that execution plan this time:

SQL> select * from table ( DBMS_XPLAN.DISPLAY_cursor) ;
PLAN_TABLE_OUTPUT
------------------------------------
SQL_ID 43x29vnth9z4x, child number 0
-------------------------------------
UPDATE /*+ full(A) parallel(A,8) */ T1 A SET
EXP_DATE = SYSDATE WHERE P_TIME IS NOT NULL AND EXP_DATE IS NULL

Plan hash value: 4279346641
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                     | Name                    |     TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                       |                   |
|   1 | UPDATE                                             | T1              |
|   2 |   PX COORDINATOR
|   3 |    PX SEND QC (RANDOM)                | :TQ10000 |    Q1,00 | P->S    | QC (RAND) |
|   4 |     PX BLOCK ITERATOR                    |                  |    Q1,00 | PCWC |                       |
|* 5 |      TABLE ACCESS STORAGE FULL| T1             |    Q1,00 | PCWP |                       |
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - storage(:Z>=:Z AND :Z<=:Z AND ("EXP_DATE" IS NULL AND "P_TIME" IS NOT NULL))
       filter(("EXP_DATE" IS NULL AND "P_TIME" IS NOT NULL))

The plan is showing us that parallel execution did happen, and that Smart Scan is possible now, as the predicate information is displaying the storage line.

And finally, lets observe which wait events occurred this time while running the update:

SQL> select nvl(event,'CPU') event,count(*)
from v$active_session_history
where sql_id='43x29vnth9z4x'
and sample_time > sysdate - 1/24
group by event
order by count(*) desc;

EVENT                                                                          COUNT(*)
---------------------------------------------------------------- ----------
cell smart table scan                                                     8

As expected, we now see the wait event "cell smart table scan", which proves that Smart Scan was used now.

Needless to say that after this change locking situation stopped occuring.
Mission completed. Well done :)

That's all for now. Hope you've enjoyed this post. As always, comments are most welcomed.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

www.dboptimize.co.il

meravkedem.blogspot.com
merav@dboptimize.co.il

Exadata, Oracle and its Optimizer - The Dark Effect of Wrong Statistics

2013-05-02T02:45:00.003-07:00

Exadata or not, sometimes Oracle is still Oracle, optimizer is still optimizer, and math is still math.

Here is the story: Instance is Oracle 11.2.0.3, running on an Exadata machine (X2-2 half Rack).
User is complaining query is taking too long to finish (about 5 minutes).
User has also said that last time he run that query, the range scanned was much wider (about a year), nevertheless no performance problems had occurred. Query had finished rather fast the previous time it run... (this is a point to remember for later on).

Let's look at the query. For the sake of this post, I have simplified it a bit but the idea remains the same.
Basically it is just a simple select, with 4 months range predicates on a date column (SCAN_DATE), and another predicates on some non unique id column (POLICY_NUM).

     SELECT COUNT(SCAN_DATE)
     FROM my_user.my_table a
     WHERE scan_date >= TO_DATE('01/01/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
     AND scan_date < TO_DATE('01/05/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
     AND ( policy_num ='183' OR policy_num='000000183');

Let's run it and see what happens:

09:51:12 SQL> SELECT COUNT(SCAN_DATE)
FROM my_user.my_table A
WHERE SCAN_DATE >= TO_DATE('01/01/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
AND SCAN_DATE < TO_DATE('01/05/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
and ( POLICY_NUM ='183' OR policy_num='000000183');

COUNT(SCAN_DATE)
----------------
0

Elapsed: 00:04:43.87      => NOTICE THE HIGH ELAPSED TIME

Execution Plan
----------------------------------------------------------
Plan hash value: 877559244

----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows| Bytes | Cost (%CPU)|Time |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |                                    | 1 | 16 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE |                                    | 1 | 16 |    |    |
|* 2 | TABLE ACCESS BY INDEX ROWID | MY_TABLE                | 1 | 16 | 7 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN                      | SCAN_DATE_INDX | 9 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("POLICY_NUM"='000000183' OR "POLICY_NUM"='183')
3 - access("SCAN_DATE">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
AND "SCAN_DATE"<TO_DATE(' 2013-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4622469 consistent gets      => NOTICE THE HIGH CONSISTENT GETS
375620 physical reads       => AND HIGH PHYSICAL READS
8320 redo size
533 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

The above execution plan is showing index range scan is used (bolded line). It is also stated in the plan that optimizer thinks only 9 rows will be returned by the index. This looked suspicious to me, since table has millions of rows.

Let's view the execution in OEM SQL Monitoring (this is a great and most recommended feature!!!).

Indeed it shows that although estimated rows is 9 (and this is what making the optimizer wrongly choose his decision), the actual returned rows are 10M.
Exadata or not Exadata, optimizer is still optimizer, and math is still math.
Scanning 10M rows through an index is obviously not the best thing to do (and this is an understatement).
Just think about it. Even if IO latency is very very good, lets say 0.5ms (which is considered to be very good latency for random reads), multiplying 0.5ms by 10M seeks will still take forever.

So what is making Oracle choose a wrong decision?

10:21:32 SQL> select num_rows,last_analyzed
from dba_tables
where owner='MY_OWNER'
and table_name='MY_TABLE';

NUM_ROWS       LAST_ANAL
------------------     -----------------
168,562,490          26-OCT-12

Table was last analyzed on 26.10.2012? Six months ago? This doesn't sound promising. But let's continue and look what are the statistics of those two relevant columns, especially low value and high value..
(btw, I have simplified the select on the low/high_value to make the query below more readable and short. To be able to see the real values on these two column you need to use some conversion on it.)

10:28:29 SQL> SELECT column_name,data_type,num_distinct,density,last_analyzed,low_value,high_value
10:28:29 18   from DBA_TAB_COLUMNS a
10:28:29 19   where table_name='MY_TABLE'
10:28:29 20   and column_name in ('POLICY_NUM','SCAN_DATE');

COLUMN_NAME DATA_TYPE NUM_DISTINCT    DENSITY LAST_ANAL LOW_V                HI_V
---------------            ----------            ------------ ---------- --------- ---------- -------------------- --------------------
POLICY_NUM      VARCHAR2    3922917    5.1066E-06 26-OCT-12                                          8888
SCAN_DATE       DATE              19276420 .00002758   26-OCT-12           1981-1-1 0:0:0      2012-10-22 2:51:48

Look what we found. The highest value of SCAN_DATE column is 22.10.2012. Obviously this is because last time table was analyzed was in 26.10.2012. No wonder Oracle thought there won't be many rows returned for the requested predicate. The requested predicate is requesting rows when scan date between 1.1.2013 - 1.5.2013. This date range is higher thna the high_value of scan_date column:

    WHERE SCAN_DATE >= TO_DATE('01/01/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
     AND SCAN_DATE < TO_DATE('01/05/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')

Now let's see how many rows are actually in the table. Notice the hints I have used, and how fast Oracle returned the result. Those hints are forcing Oracle to use index fast full scan with parallel query (degree 4), which is much more appropriate in this case than a simple index range scan, since table has millions of rows (Note: Exadata offloading is also relevant here, and I have added a special section for that at the end of this post).

10:04:49 SQL> SELECT /*+ index_ffs (a,scan_date_indx) parallel_index(a,scan_date_indx,4) */
COUNT(SCAN_DATE)
FROM my_owner.my_table A;

COUNT(SCAN_DATE)
----------------
       186,863,326

Elapsed: 00:00:05.44

Table has 186M rows. Now let's see how many rows will be returned for the requested dates:

10:05:11 SQL> SELECT /*+ index_ffs (a,scan_date_indx) parallel_index(a,scan_date_indx,4) */ COUNT(SCAN_DATE)
FROM my_owner.my_table A
WHERE SCAN_DATE >= TO_DATE('01/01/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
and scan_date < to_date('01/05/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS');

COUNT(SCAN_DATE)
----------------
        10,213,081

Elapsed: 00:00:04.95

Indeed, 10M rows are returned here. This is not a surprise, since we have already seen the huge gap between estimated rows and actual rows in OEM's SQL Monitoring screen.

Here is the time to say that after finding there is a statistics problem here, I have explored the issue a bit more with the customer, to figure out what was the case when query run much faster (remember this was part of the complaint?). It appeared that query run much faster when the rows were selected from year 2012. Customer has informed me that since they started running the query on year 2013, query was taking forever..
And now we know why..
Let's explore it a bit more, and see how many rows are returned for 2012:

10:06:07 SQL> SELECT /*+ index_ffs (a,scan_date_indx) parallel_index(a,scan_date_indx,4) */ COUNT(SCAN_DATE)
10:06:07   2 FROM my_owner.my_table A
10:06:07   3 WHERE SCAN_DATE >= TO_DATE('01/01/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
10:06:07   4 and scan_date < to_date('01/05/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS');

COUNT(SCAN_DATE)
----------------
        10,381,170

Elapsed: 00:00:04.24

Let's see what Oracle thinks about the cardinality of this predicate:

10:07:16 SQL> SELECT /*+ index_ffs (a,scan_date_indx) parallel_index(a,scan_date_indx,4) */ COUNT(SCAN_DATE)
10:07:16   2 FROM my_owner.my_table A
10:07:16   3 WHERE SCAN_DATE >= TO_DATE('01/01/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
10:07:16   4 and scan_date < to_date('01/05/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS');

COUNT(SCAN_DATE)
----------------
        10381170

Elapsed: 00:00:04.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1948420453
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                         | Name           | Rows | Bytes | Cost (%CPU)| Time     |    TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |                        |     1 |     7 |   155K (1)| 00:31:05 |        |      |            |
|   1 | SORT AGGREGATE                                 |                         |     1 |     7 |                  |          |        |      |            |
|   2 |   PX COORDINATOR                                |                         |        |        |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                      | :TQ10000       |     1 |     7 |            |          | Q1,00 | P->S | QC (RAND) |
|   4 |     SORT AGGREGATE                              |                         |     1 |     7 |            |          | Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR                        |                        | 9473K|    63M|   155K (1)| 00:31:05 | Q1,00 |PCWC||
|* 6 |       INDEX STORAGE FAST FULL SCAN| SCAN_DATE_INDX | 9473K|    63M|   155K (1)| 00:31:05 | Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - storage("SCAN_DATE">=TO_DATE(' 2012-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "SCAN_DATE"<TO_DATE('
              2012-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("SCAN_DATE">=TO_DATE(' 2012-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "SCAN_DATE"<TO_DATE('
              2012-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
         12 recursive calls
        108 db block gets
     588111 consistent gets
     585699 physical reads
          0 redo size
        537 bytes sent via SQL*Net to client
        520 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

As expected, when the given date range is for 2012 (i.e SCAN_DATE >= TO_DATE('01/01/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS') and SCAN_DATE < to_date('01/05/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS') ),
the estimated returned rows Oracle is showing (9473K) is much more accurate.
And why does the query returns much faster when the given date range was for 2012?
Because Oracle knows scanning this index will bring 10M rows, so it smartly decides not to use this index. Instead, it uses the index on POLICY_NUM, as shown below.
Notice also the monitor hint. This will enable us to view the execution in SQL Monitoring, since by default, only long running queries (above 5 seconds) or parallel queries are shown.

09:49:11 SQL> set autotrace on
09:49:34 SQL>
09:49:46 SQL> SELECT /*+ monitor */ COUNT(SCAN_DATE)
09:49:47   2   FROM my_owner.my_table A
09:49:47   3 WHERE SCAN_DATE >= TO_DATE('01/01/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
09:49:47   4 AND SCAN_DATE < TO_DATE('01/05/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
09:49:47   5 and ( POLICY_NUM   ='183' OR policy_num='000000183');

COUNT(SCAN_DATE)
----------------
               0

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1090044264
-------------------------------------------------------------------------------------------------
| Id | Operation                     | Name            | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                                     |     1 |    16 |    62   (0)| 00:00:01 |
|   1 | SORT AGGREGATE                             |                                     |     1 |    16  |                |          |
|   2 |   INLIST ITERATOR                              |                                      |         |          |                |          |
|* 3 |    TABLE ACCESS BY INDEX ROWID| MY_TABLE                 |     4   |    64 |    62   (0)| 00:00:01 |
|* 4 |     INDEX RANGE SCAN                      | INDX_POLICY_NUM |    79 |         |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("SCAN_DATE">=TO_DATE(' 2012-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "SCAN_DATE"<TO_DATE(' 2012-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   4 - access("POLICY_NUM"='000000183' OR "POLICY_NUM"='183')

Statistics
----------------------------------------------------------
          1 recursive calls
          0 db block gets
          8 consistent gets
          3 physical reads
          0 redo size
        533 bytes sent via SQL*Net to client
        520 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

So the only question remains open is why does statistics are not updated, as this is an Oracle 11.2.0.3 instance, and statistics gathering through dbms_stats should be done automatically.
I suspected locked statistics. Let's check that:

10:08:41 SQL> select owner, table_name, stattype_locked
from dba_tab_statistics
where table_name='MY_TABLE' ;

OWNER      TABLE_NAME      STATTYPE_LOCKED
----------       ---------------        -------------------------
IMAGE      MY_TABLE       ALL

As suspected, statistics are locked!!! No wonder last time table was analyzed was 6 months ago. For date fields, this is a bad thing. You can lock statistics if and only if it has no effect the optimizer decision. And clearly this is not the case here. Optimizer decision was effected becuase high_value on the date column had a wrong value (much lower value than it should).
Resolution here is simple. Unlock statistics, re-gather statistics, and... Voilà!! . Query "suddenly" runs much faster...

Let's see what happens to "high_value" statistics, after unlocking and re-gathering statistics on MY_TABLE:

12:41:32 SQL> SELECT owner,column_name,data_type,num_distinct,density,last_analyzed,num_nulls,
low_value,high_value
    from DBA_TAB_COLUMNS a
   where table_name='MY_TABLE'
and column_name in ('POLICY_NUM','SCAN_DATE');

COLUMN_NAME     DATA_TYPE NUM_DISTINCT    DENSITY LAST_ANAL NUM_NULLS LOW_V                HI_V
--------------- ---------- ------------ ---------- --------- ---------- -------------------- --------------------
POLICY_NUM      VARCHAR2        5634560 9.9753E-06     28-APR-13          0                                   8888
SCAN_DATE        DATE                  5644288 0 .000045952   28-APR-13          0   1981-1-1 0:0:0       2013-4-28 12:1:31

Brilliant. HIGH_VALUE now has the correct value, so let's re-run the original query and see how Oracle execute it this time:

12:42:21 SQL> SELECT COUNT(SCAN_DATE)
FROM my_owner.my_table A
WHERE SCAN_DATE >= TO_DATE('01/01/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
AND SCAN_DATE < TO_DATE('01/05/2013 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
and ( POLICY_NUM   ='183' OR policy_num='000000183');

COUNT(SCAN_DATE)
----------------
               0

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1090044264
-------------------------------------------------------------------------------------------------
| Id | Operation                                                 | Name                            | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                                       |     1 |    18 |    52   (0)| 00:00:01 |
|   1 | SORT AGGREGATE                              |                                         |     1 |    18 |                |                   |
|   2 |   INLIST ITERATOR                                 |                                       |        |         |                 |                |
|* 3 |    TABLE ACCESS BY INDEX ROWID   | MY_TABLE                    |     3 |    54 |    52   (0) | 00:00:01 |
|* 4 |     INDEX RANGE SCAN                       | INDX_POLICY_NUM |    61 |          |     5   (0) | 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("SCAN_DATE">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "SCAN_DATE"<TO_DATE(' 2013-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   4 - access("POLICY_NUM"='000000183' OR "POLICY_NUM"='183')

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          8 consistent gets
          0 physical reads
          0 redo size
        533 bytes sent via SQL*Net to client
        520 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Excellent. As expected, after Oracle sees the right statistics, it now knows the highest value for SCAN_DATE is 28.4.13 and not 26.10.12, therefore it can smartly choose much more suitable execution plan. This time Oracle decides not to range scan the index SCAN_DATE_INDX (since it knows such range will be wide), but instead, it chooses to range scan INDX_POLICY_NUM.
This time the result is returned in no time.

Oracle is no mystery. Just need to understand it, and know how it works.
:)

And now, if you can just bare with me a bit longer, here is some cool exadata staff to notice:
Since we forced an index fast full scan with parallel, Oracle was able to use Exadata smart scan offloading (btw, to enable the possibility of smart scan use, full scan and direct read are a must). Since the only column used in this query is scan_date, offloading is extremely effective here: more than 4GB were offloaded (i.e database didn't need to process those 4G as most of them were already filtered by the storage). This can also be seen in OEM, as "cell offload efficiency: 97%":

This smart scan can be also verified through v$sesstat:

SQL> select b.name,a.value from v$sesstat a, v$statname b, v$session c
where a.statistic# = b.statistic#
AND A.SID=C.SID
AND ( B.NAME like '%storage%' or b.name like '%offload%')
and a.sid=1542
order by a.value desc;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload                     0
cell simulated physical IO bytes returned by predicate offload            0
cell simulated physical IO bytes eligible for predicate offload           0
cell physical IO bytes saved by storage index                             0

Notice that before running the query, "cell physical IO bytes eligible for predicate offload" is 0.
Now, let's run the query again.

SQL> SELECT /*+ index_ffs (a,scan_date_indx) parallel_index(a,scan_date_indx,4) */ COUNT(SCAN_DATE)
2 FROM IMAGE.INS_DOC_S A
3 WHERE SCAN_DATE >= TO_DATE('01/01/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS')
4 and scan_date < to_date('01/05/2012 0:0:0', 'DD/MM/YYYY HH24:MI:SS');

COUNT(SCAN_DATE)
----------------
        10381208

Now let's see if statistics values have changed:

SQL> select b.name,a.value from v$sesstat a, v$statname b, v$session c
where a.statistic# = b.statistic#
AND A.SID=C.SID
AND ( B.NAME like '%storage%' or b.name like '%offload%')
and a.sid=1542
order by a.value desc;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload            4798046208
cell simulated physical IO bytes eligible for predicate offload           0
cell simulated physical IO bytes returned by predicate offload            0
cell physical IO bytes saved by storage index                             0

After we execute the query, "cell physical IO bytes eligible for predicate offload" had jump to 4798046208. Meaning, smart scan did happen, and this offload is all due to this query.
Cool and effective!
:)

That's all for now. Hope you've enjoyed this post.
As always, comments are most welcomed.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

No, It is not always the database fault..

2012-07-12T09:37:00.001-07:00

When things are starting to get wrong, Database ( i.e DBA ) is always the first tier to blame. I am sure you have faced this situation, probably much more than once or twice.
Guilty as charged, until proven innocent, right?

Well, it is not that nothing can't go wrong in the database, but I wish it would stop be treated as the black sheep of the family.

Recently, I had been in a situation where application was constantly complaining for bad response time, day after day. First thing to explore was whether application changed its behaviour, since most of the time that is the cause for performance issues. Surprisingly enough, it did not. No new SQLs, no major changes in execution count.
Second, I went to look for database changing its behaviour. But it didn't. No changes in execution plan (usually major cause for performance degradation). As much as I have looked, from the database side, it looked like database didn't change its behavior.

So what could be wrong?

Storage can go wrong for example, and that was the case here.

Looking for problems in the wrong place will not assist in solving them. But just saying "it is not the database" probably won't help. You will probably always have to prove it...
Average physical read/write time is what oracle experiencing when dealing with IO. But underneath these numbers, two more tiers are involved. Storage and File System.

Well, after checking average physical read time and average physical write time of several files in this database, it was clear (unfortunately, just to me for a while) that writes became much slower, at exactly the time when the application have started to complain.

Don't think it was easy. For many days Storage guys continues to say "Nothing is wrong with the storage" (how many times have you heard this? I wonder), but finally, when numbers became so high, it couldn't be ignored any more.

At last, spotlight went to spot on the right place (and stopped pointing to the wrong place, the database). Well, probably just until the next complaint will rise again...but I guess I can't win it all..

Here is a query that can be run in Oracle 10g and above. It might look scary, but in short, it's just go over dba_hist_filestatxs (where file statistics are held overtime), and since numbers there are incremented every snapshot (it doesn't hold the deltas), I have used LAG analytic function to subtract current value from preceding value (i.e, current snapshot from preceding snapshot), and I did it for every statistics I wanted to observe.
That's it.

select *   from (
   select
      b.begin_interval_time begin_interval_time ,
      nullif((phywrts - ( lag(phywrts,1,0) over( order by a.snap_id ) )),0) delta_phywrts,
     trunc( (writetim*10 - ( lag(writetim*10,1,0) over( order by a.snap_id ) ) ) /
             nullif((phywrts - ( lag(phywrts,1,0) over( order by a.snap_id ) )),0) ,3 ) avg_write_ms,
      nullif((phyrds - ( lag(phyrds,1,0) over( order by a.snap_id ) )),0)   delta_phyrds,
     trunc( (readtim*10 - ( lag(readtim*10,1,0) over( order by a.snap_id ) ) ) /
             nullif((phyrds - ( lag(phyrds,1,0) over( order by a.snap_id ) )),0) ,3 ) avg_read_ms,
      nullif((singleblkrds - ( lag(singleblkrds,1,0) over( order by a.snap_id ) )),0) delta_singleblkrds,
      trunc( (singleblkrdtim*10 - ( lag(singleblkrdtim*10,1,0) over( order by a.snap_id ) ) ) /
             nullif((singleblkrds - ( lag(singleblkrds,1,0) over( order by a.snap_id ) )) ,0),3 ) avg_singleblkrds_ms
   from dba_hist_filestatxs a ,
   dba_hist_snapshot b
   where filename = 'my_file.dbf'
   and a.snap_id=b.snap_id
)
where to_char(begin_interval_time,'HH24') between 7 and 11  -- in case you wish to see only specific hours
and to_char(begin_interval_time,'HH24') not in ('FRIDAY','SATURDAY ') -- in case you wish to filter nonworking days
order by begin_interval_time;

Below is part of the real life output, where major changes in average write time can clearly be seen, especially since 24.5. I have taken statistics of a specific data file, each time between 09:30-12:00, in normal working days (not in a weekend). Snapshot interval is 15 minutes. (For simplification, although above query shows reads and writes statistics, below I am showing only writes, since this is the issue relevant for this post).

on 7.5, average write time was between 2-7 ms.
on 16.5, average write time was between 6-11 ms.
on 21.5, average write time was between 6-14 ms.
on 24.5, average write time was between 12-21 ms.
on 13.6, average write time was very very bad. around 40 ms write time. This is 10 times slower than a month ago!
Finally, on 6/7, average write time went back to normal. went down to around 5 ms.

Pay attention to the increasing values of the 3rd column, avg_write_ms, which shows the average physical write time of a specific datafile, for every snapshot.

BEGIN_INTERVAL_TIMEDELTA_PHYWRTSAVG_WRITE_MS
============================================
07-MAY-12 09:30:0519703.395
07-MAY-12 09:45:1020074.489
07-MAY-12 10:00:1216792.829
07-MAY-12 10:15:1721203.037
07-MAY-12 10:30:1923552.169
07-MAY-12 10:45:2316092.094
07-MAY-12 11:00:247507.533
07-MAY-12 11:15:267372.713
07-MAY-12 11:30:287793.465
07-MAY-12 11:45:296252.448
16-MAY-12 09:30:485726.73
16-MAY-12 09:45:496455.395
16-MAY-12 10:00:515768.072
16-MAY-12 10:15:5272010.513
16-MAY-12 10:30:5462211.495
16-MAY-12 10:45:557457.208
16-MAY-12 11:01:006929.205
16-MAY-12 11:15:0192010.793
16-MAY-12 11:30:038398.116
16-MAY-12 11:45:057558.172
21-MAY-12 09:30:5421568.409
21-MAY-12 09:45:57201614.627
21-MAY-12 10:00:59187710.436
21-MAY-12 10:15:0322246.753
21-MAY-12 10:30:06114512.043
21-MAY-12 10:45:1086912.105
21-MAY-12 11:00:1277411.989
21-MAY-12 11:15:148578.191
21-MAY-12 11:30:167378.71
21-MAY-12 11:45:1882612.421
24-MAY-12 09:30:2766114.19
24-MAY-12 09:45:2971512.881
24-MAY-12 10:00:3080015.637
24-MAY-12 10:15:3271612.835
24-MAY-12 10:30:3389519.307
24-MAY-12 10:45:3779919.148
24-MAY-12 11:00:3871621.983
24-MAY-12 11:15:4184121.878
24-MAY-12 11:30:4472415.138
24-MAY-12 11:45:4665315.773
13-JUN-12 09:30:152460824.476
13-JUN-12 09:45:172424931.427
13-JUN-12 10:00:212016439.09
13-JUN-12 10:15:231980738.68
13-JUN-12 10:30:241957238.088
13-JUN-12 10:45:261881241.594
13-JUN-12 11:00:291900340.993
13-JUN-12 11:15:332166044.534
13-JUN-12 11:30:342324238.171
13-JUN-12 11:45:362105542.248
06-JUL-12 09:30:5030576.319
06-JUL-12 09:45:5134065.733
06-JUL-12 10:00:5238808.103
06-JUL-12 10:15:5343245.381
06-JUL-12 10:30:5454094.906
06-JUL-12 10:45:5527145.036
06-JUL-12 11:00:5627234.697
06-JUL-12 11:15:5724347.781
06-JUL-12 11:30:5838044.91
06-JUL-12 11:45:5922435.737

Ladies and gentlemen of the jury, what is your verdict?
We have found the defendant NOT GUILTY, your honor.
To sum it all up, as stated in the beginning, Database is not always to blame. This time the jury found it innocent.
:)
First thing to check is the application. If you see that applications is behaving more or less the same (no new SQLs, same execution count per SQL), check the database.
And if the database is behaving more or less the same, it is simply not there.
Something else is the source of the problem.

As always, comments are most welcome..

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Bind variables in Oracle 11.2

2012-03-20T08:20:00.003-07:00

A bit of an history before I begin...
Oracle 10.2 had exposed bind variables values passed to a SQL statement via a new view named V$SQL_BIND_CAPTURE. Nevertheless, to limit the overhead, Oracle captured the bind at most every 15 minutes for a given cursor.

In other words, you were not able to see bind values for each SQL execution.
In some cases, especially when binds values have major influence on SQL execution, knowing the bind values passed to a given execution could be very important.

Oracle 11.2 has extended this ability.V$SQL_MONITOR was expanded and a new column BIND_XML was added to it.
V$SQL_MONITOR is one of my favourite additions to 11g. It displays very useful information on SQL statement executions and it is automatically started when a SQL statement runs parallel or when it has consumed at least 5 seconds of CPU or I/O time. (My only complaint about this view is that it is not kept in history. No dba_hist_sql_monitor yet. Maybe someone in Oracle will read this post and add it to their TODO list...).

Anyway, with the addition of BIND_XML new column, this means that in 11gR2 we will be able to see bind variables values for each execution of such SQL statements. Sure, it is still only partitial picture of bind variables world (since we will have bind information only for those statements who run in parallel or run longer that 5 seconds), but it is still more than we had before...

And as examples always make life much easier, here is a simple demonstration:

SQL> var a1 number;
SQL> var a2 varchar2(30);
SQL> var a3 varchar2(40);
SQL>
SQL> exec :a1 := 1000;

PL/SQL procedure successfully completed.

SQL> exec :a2 := '%A%';

PL/SQL procedure successfully completed.

SQL> exec :a3 := '01-JAN-2010';

PL/SQL procedure successfully completed.

SQL> set timing on

SQL> select count(*)
2 from dba_tables t, dba_objects o
3 where o.object_id > :a1
4 and t.table_name like :a2
5 and o.created > to_date(:a3,'DD-MON-YYYY')
6 and o.object_type='TABLE'
7 and o.owner=t.owner
8 and o.object_name=t.table_name;

COUNT(*)
----------
2163

SQL> set long 1000
SQL> set lines 130
SQL> select xmltype(binds_xml) from v$sql_monitor
2 where sql_text like 'select count(*)%from dba_tables t, dba_objects o%';

XMLTYPE(BINDS_XML)
--------------------------------------------------------------------------------
<binds>
<bind name=":A1" pos="1" dty="2" dtystr="NUMBER" maxlen="22" len="2">1000</bind>
<bind name=":A2" pos="2" dty="1" dtystr="VARCHAR2(32)" maxlen="32" csid="38" len="3">%A%</bind>
<bind name=":A3" pos="3" dty="1" dtystr="VARCHAR2(128)" maxlen="128" csid="38" len="11">01-JAN-2010</bind>
</binds>

As you can see, since running this SQL took a bit more than 5 seconds, its execution as well as all bind variable values passed for this execution can be instantly observed in v$sql_monitor.

Note #1: It is possible to change the 15 minutes bind capture interval by setting an undocumented parameter “_cursor_bind_capture_interval”, (though it can have a large impact on system performance), i.e

ALTER SYSTEM SET "_cursor_bind_capture_interval" = 30; -- capture every 30 seconds

Note #2: v$sql_monitor requires a licenses for Oracle Diagnostics & Tuning Packs

Comments are most welcome..

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

AWR Formatter - worth a try

2011-10-21T08:43:00.000-07:00

I have bumped into one of Tom Kyte's posts, recommending an AWR Formatter plug-in, written by Tyler Muth.
As Tom Kytes said: "It's pretty cool - works as a Chrome plugin - and it makes an AWR report a little more 'friendly' to use. It creates hot links for many of the wait events (so you know what they mean) and it summarizes up a lot of stuff - making the AWR report a lot more "interactive". Check it out and give him feedback on it if you have time."

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Bad screen design, bad performance

2011-10-11T08:27:00.000-07:00

Recently one of my customers had loaded a new version into production. Immediately performance issues have started to rise. I have spotted a new query that became one of the top-3 heaviest SQLs running in the instance. Certainly not something that can be ignored…

The query was something like:

Select distinct ….

From T1

Where code = :1

And ID like :2

Order by …;

This query was being executed 10k-20k a day, with average execution time 1.5 seconds, and had an intensive CPU usage. That summed up to 4-8 hours a day of cpu usage.

Next thing I went to v$sql_bind_capture, to examine the values sent to ID parameter. To my surprise I saw that most of the time (more than 95%) application was sending the complete ID without adding "%" as I would have expect it to be since query is written with LIKE operator " and ID like :2" and not written with equal operation on ID. That looked very weird. Why using LIKE without using % , right?

As a result of this use of LIKE operator in the query Oracle couldn't apply an index unique scan, and forced to use an index range scan + sort unique thus became a performance issue.

Let's examine the facts again:

1. The combination of (CODE, ID) is unique since a unique index was built on those 2 columns.

2. Application rarely uses wild card "%" when passing values to ID. Most of the time they only send a complete number, i.e …and ID like '12345' , and appose to and ID like '12345%' , so in fact only one unique row will be returned.

3. Query is using distinct and order by on the returned rows. In case no wild card is sent, both distinct and order by can be omitted from the query since only one unique row will be returned anyway.

From what I have seen it was clear to me that performance became here an issue due to some bad design of something. It looked like someone was trying to generalize a use of a query.

I thought to myself: If you want to send a complete number, use equal operator. If you wish to apply a wild card, use a like operator. Certainly do not use like for both cases.

Well, the time for a little investigation had come. After a little research with the application, it appeared there was a new screen allowing the user to search for ID. The screen has a combo box which its default value was set to "Begins with" (which sent the LIKE operator), so every time a user was searching for ID the same query with LIKE was sent in. User is passing 12345%, LIKE is being used (right). User is passing 12345, again, LIKE is being used (wrong).

I have asked the application to change three things:

1. Split this query into two different queries. One with LIKE and the other with equal operator "=".

2. Query with equal operator should omit DISTINCT + ORDER BY.

3. Redesign the screen, so default combo value will be set to "equal" (as this was the major use of the users), and in this case query with equal operation will be used.

Query with equal operator should look like:

Select ….

From T1

Where code = :1

And ID = :2;

Indeed, after 2 weeks application had passed a fix to production and a performance relief could be noticed. The new query average execution time went down to 0.004 second (remember, query using LIKE had average execution time of 1.5 sec). And that summed up to only 40-80 seconds a day (as opposed to 4-8 hours when using LIKE).

Of course, query with LIKE operator is still being used, but only when really needed, and that is how it should be.

To sum it all up, what we saw here was bad screen design which lead to bad performance…

There is no doubt on my mind. As I said before and will say again, DBAs should work together with application. DBAs should definitely participate during design phases. Otherwise this type of scenario would occur over and over again…

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Thinking of adding a new index? Tune SQL's text before

2011-07-31T09:47:00.000-07:00

Creating an index can certainly do marvelous things to SQLs. Even after all those years in the field, it is still always thrilling for me to witness huge performance improvement seen after applying a suitable index to an unturned query. The performance gain is simply spectacular and the notices slump in resource consumption this query is taking (I/O, CPU, etc) is immediate.

Nevertheless, adding an index does have its cost. Good things never come for free. Every index occupies storage, and each index has a side effect on DMLs, causing some overhead (Oracle need to updated index block, therefore must read them before, from memory or from disk).

I often see cases when indexes are created too often, tables end up with dozen indexes and more, and yet, SQL are still not tuned.

Moreover, before applying an index, you should always ask yourself whether this SQL worth adding an index. Sometimes it could be SQL which runs only few times a day, and the way it performs is good enough for its customers. It does a Full Table Scan? so what. If it runs only once a day, during night, and takes 2 minutes, than this could be just fine as it is. No need to add extra index just for this once a day execution.

So basically, every situation needs to be carefully examined, then according to the required needs, decide whether it is worthwhile adding an index to improve SQL performance.

Another important advice before applying a new index is to tune the text itself (assuming you can change the text and it is not coming from some third-party product), i.e:

1. Understand what the SQL is aimed to do. Ask yourself (and go back to the application if necessary) what is it that this SQL is doing. Being able to say it in your own words makes it easier to understand it, then to rewrite it.

2. Make sure there are no unnecessary joins. This is extremely important. You will be surprise to know how many times I have identified unnecessary joins in SQLs written by application, seeing join with large tables without a reason. Just imagine what a needless join with a table that has 18M rows can do to SQL performance (small clue: really bad things).

3. Make sure all join predicates exist (to avoid Cartesian joins)

4. Make sure all needed filter predicates exists (where clauses match the relevant conditions). Apply filters which yield the smallest result set as possible.

Of course there are many more issues with writing a tuned SQL, though this post will not cover them all.

Then, if, and only if, text is 100% tuned, go ahead and search whether a suitable index can improve query performance.

Yet, remember that though you are tuning a specific SQL, a new index can and probably will influence other SQLs (not just DMLs but also other queries) for better or for worse. It's always a tradeoff.

Therefore, before you adjust the index to your system:

Try to identify what will be its influence on other SQLs running on the system: which SQLs will be influence and how will they be influence by the adding the new index (there are tools which do that). Of course, whenever possible, first apply the index in your testing environment, which resembles production as much as possible (load, schema, data), and examine index influence on your specific SQL, on other queries, and on DMLs.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Beyond Database Tuning - Part 2 (or adjusting schema to application needs)

2011-06-10T11:58:00.000-07:00

In my previous post "Beyond Database Tuning" I have shown a situation where application was running a very heavy SQL (~2 hours) each day, while in fact, after speaking with the application, I understood that this SQL is simply not needed and its output is only functioning as a loop counter in a Java code.

What run inside this loop was another heavy SQL which is the subject of this post.

Inner SQL had looked like that:

      SELECT …
         FROM T1 s,
                   (SELECT max(max(seq)) over (partition by id order by id ) maxseq,
                                   id
                       FROM T1 s2
                    WHERE seq is not null
                    GROUP BY id) maxresults
     WHERE s.id = maxresults.id AND
             s.seq = maxresults.maxseq AND
             s.col_a = 0 AND
             s.col_b = :1 AND
            s.col_c < :2 AND
             s.col_d = :3 AND
             rownum < 100

Application was constantly running this very heavy SQL hour after hour day after day. It accessed a very big table T1 that has 146M rows on 2.1M 8k blocks.
Each execution took 35 minutes on average. Since each execution took so long, only 20-70 executions were able to run each day.

After a short time examining this SQL I knew that as it is now, tuning capabilities were very limited. Most of the time was spent on building the inline view with the analytic function (marked in red). Oracle had to first build it all before being able to then join it with T1, therefore it couldn't benefit from the count stopkey resulting from "rownum < 100".

Again, I went to speak with the application to fully understand the logic behind it.

It appeared that every time application was doing something on id, a new row was inserted into table T1 with id and seq=max(seq) + 1. This running number was per ID (compare to a running number on whole table), thus an Oracle sequence couldn't be used here. I have also found that application was mostly interested in max(seq) for each id, nevertheless, it didn’t hold this value anywhere.

It was clear to me that the way schema was designed simply didn’t match application needs.
This is when I knew that schema design must be changed. There has to be a table holding max(seq) for a given id, and this change would surly lead to performance boost.

Therefore I have created a new table T1_MAXSEQ ( ID number, SEQ number), with unique index on (ID,SEQ). Now, something has to keep this table updated and fully match values in table T1.
I found that application was already using a before insert trigger on T1, which set the correct value of SEQ to max(seq) + 1 for any given ID. Since DMLs were not intensive on T1, I have decided to alter this trigger so it would also insert a new row into T1_MAXSEQ when a new ID is inserted into T1, and update an existing row to SEQ+1 for an existing ID.

Schema change (new table and trigger update) was minimal and transparent to the application. All was left for them to do was to rewrite the SQL as I have instructed, so that it will now join T1 with T1_MAXSEQ instead of using inline view to find max(seq) for every IDs.

SQL should now look like that:

      SELECT …
         FROM T1 s,
                     T1_MAXSEQ maxresults
      WHERE s.id = maxresults.id AND
             s.seq = maxresults.maxseq AND
             s.col_a = 0 AND
             s.col_b = :1 AND
            s.col_c < :2 AND
             s.col_d = :3 AND
             rownum < 100

By adjusting schema to application needs and then rewriting this SQL, execution went down from 35 minutes to 1 sec, thus application was able to execute this SQL 25k times a day (compare to only 20-70 before).
Oracle is now using the new small table T1_MAXSEQ  as the outer table in the nested loop, thus capable of benefit from count stopkey (execution plan before and after the change can be seen in the link below).
Heavy overload on the instance as well as on machine and disks was stopped. SQL performance was improved dramatically, allowing huge increase in application capacity and productivity.

This can be visully seen in the following pdf: http://www.dboptimize.co.il/oracle_adjust_schema_to_application_needs.pdf

Again, just from a short conversation with the application and by asking the right questions I manage to understand where the problem resides and expand my ways to solve this problem.

I will say it again (and probably not for the last time): Most of the performance issues I have seen in instances were application issues and not instance configuration issues. Knowing your application is one of the most important steps in each tuning process.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Beyond database tuning

2011-05-04T13:47:00.000-07:00

Let me share with you something that happened to me today.
I saw this very heavy SQL, running several times a day, taking up to 2 hours each time, consuming lot of database resources (CPU & IO).
This select was doing a count on a very big table (150M rows) joined with a smaller table (5M rows) and had some non selective filter predicates in its where clause.
Tuning capability didn’t look so promising for this SQL.
I have decided to go and ask the application for the purpose of this SQL, and investigate what it does with its result (count).
What I found was quite a shock. They code was basically doing that:

1. select count(*) from T1, T2 where ... /* This brings back a number after 2 hours processing. Lets call it X */

2. devide X/100 (to get number of loops to be made)

3. for 1 in 1.. x/100 {

select ... from T1,T2 where ... and rownum =< 100 /* rowset to be processed. 100 each time */

}

As a result, row processing (the SELECT inside the loop) couldn't not start till the count query is finished.
In other words, program is wasting two hours for nothing waiting for this count to finish, while in fact this count is unnecessary, causing unneeded overhead, and is just bad code writing.
Application should remove the count query, and simply run select in a loop till no_data_found is returned. It should logically look like that (pseudo):

cont_processing = 1
while ( cont_processing ) {
select ... from T1,T2 where ... and rownum =< 100 ; -- just the inner loop
if no_data_found then cont_processing =0
}

I know. This is pure code tuning, and has nothing to do with SQL tuning. Is it DBA responsibility? Some will say yes, and some will say no. But, if I wouldn't have gone to the application, and examine the code by myself, this extremely heavy SQL will continue to run in the database, interfering others, causing performance degradation.

What is the conclusion from this real life story?
I believe that DBA and application should work together. It is always proving itself. So my advice to you is to talk with your application representative as much as possible. Know the source of the SQL you see running in your database. Learn their code, and remember: Knowledge is power, and when you know more, you expand your options for solving a problem.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Tuning a frequently executed SQL

2011-04-23T10:43:00.000-07:00

When we search for untuned SQLs we can look for SQL that its execution time is objectively long, but we should also search for SQL which has high amount of short executions, or in other words frequently executed SQL. When dealing with frequently executed SQLs, even a small improvement in one execution (average execution time) will yield a performance improvement, since this "small" improvement of one execution is multiplied by its many executions.

In other words, when looking at v$sql, look for the ones with high executions:
select * from v$sql order by executions desc;
or for the ones with high average execution time:
select * from v$sql order by elapsed_time/executions desc;

I was visiting a customer which one of his top three heaviest SQLs was a query executed 1.5M a day, with average execution of 0.013 sec. Allegedly, when we look at its average execution time, this SQL won't be our first candidate for tuning, but when we look at its overall consumption (execution time of all its executions), well, that would be another story.

I have tuned this query by simply changing the order of index columns, causing average execution to go down to 0.005 sec (2.6 times faster).

After that overall resource consumption of the query went down dramatically. Complete tuning flow can be visually seen in the following pdf:

http://www.dboptimize.co.il/Oracle_frequently_executed_sql.pdf

Note: The example shown is for Oracle, thought the methodology of tuning fast and frequently executed query is relevant for all databases: Oracle, MSSQL, DB2, Sybase, Mysql, etc.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Abnormal program behavior and massive log file sync

2011-04-23T01:14:00.000-07:00

Historical performance data is extremely important when tuning application. This gives us the ability to instantly know when things start go wrong, and the ability to investigated over the past what start causing things to go wrong. Often I am feeling like a modern Sherlock Holmes, searching for criminals :)

OEM does this, and so does several third party tools (such as Precise). Of course this can be done by manual scripts also.

Here is an example:

One of my customers was replicating data from mainframe to Oracle, and was complaining that replication has started working very slowly. With this information I went comparing program's current behavior to program's normal behavior bases on overtime activity information collected and saved for this instance. Indeed I saw that suddenly the program have started waiting much longer than it use to on "log file sync". This was clue #1.

When looking at a wider time frame of program's activity I have spotted the exact time when things start going wrong. This was clue #2.

Ok, something was changed. Now I needed to find out what was changed that was causing replication program to start waiting longer and longer for "log file sync".

As we all know one of the reasons for "log file sync" is massive commits. I went looking for overtime information on Commits, and saw that the amount of commits were doubled compared to their usual amount per day (3.4M instead of 1.8M). This was clue #3. Suspicious....

I needed to look for the cause of this massive increased in commits. First, to answer - when do we do commits? After DMLs of course. So let's look for suspicious DMLs (new DMLs, DMLs with high executions, DMLs with increased executions, etc). This was clue #4.

Now we are getting closer to solving the mystery. When adding clue #4 (suspicious DMLs) to clue #2 (time when things start going wrong), it wasn't hard to find the criminal to blame. A massive new UPDATE had started at that time, executing over 2.6M executions a day, causing the massive commits, and the increase of "log file sync".

Now all I have to do was to go back to the application and ask them few questions, such as: what was this SQL, why was it added, why was it executing so many times a day.

You will be surprise to know that the answer was "by mistake". Well, they remove this UPDATE and things went back to normal.

All this investigation was finished by less than an hour. No doubt it would have been much harder and much longer to find the root cause of the abnormal program behavior without overtime information saved on application activity and instance activity.

All this investigation and conclusion can be visually seen in the following link: http://www.dboptimize.co.il/Oracle_abnormal_program_behavior_Redo_log.pdf

Remember: from my experience at least 80% of instance performance problems are caused by application... Much less are caused by wrong instance tuning (init parameters etc). It is always beneficial to know your application and talk to people who are responsible to it.

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

Identify heaviest plan_line_id for a given execution plan (Oracle 11g)

2011-04-10T07:39:00.000-07:00

I think the most mysterious question to me (well, as far as Oracle was the subject) was always that:
For a given SQL and its execution plan, where does Oracle spend most of its time?

Oracle 11g revealed this mystery by exposing Real Time SQL Monitoring, v$sql_monitor, and v$sql_plan_monitor. However, these performance views will only show information either for relatively long running queries ("when it has consumed at least 5 seconds of CPU or I/O time") or for all parallel queries. Suppose you have SQL with shorter execution time, running many times? You won't be able to find it there...
And before Oracle 11g was released, things were a bit harder...

In the following session I will demonstrate how to find, for a particular execution plan, in which execution plan step and on which object Oracle spends most of it time.

Oracle used is 11.1.0.7
Application is running the following SQL 600k times each day:

select *
from t1
where ID= :1
order by SEQ;

Let's find sql_id of this SQL:

SQL> select sql_id,plan_hash_value,sql_text from v$sql where sql_text like 'select * from t1 where ID= :1 order by SEQ';

SQL_ID PLAN_HASH_VALUE SQL_TEXT
------------- --------------- --------------------------------------------------------------
b652k1fnuqz5w 406502908 select * from t1 where ID= :1 order by SEQ

Let's find execution plan used:

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor('b652k1fnuqz5w'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID b652k1fnuqz5w, child number 0
-------------------------------------
select * from t1 where ID= :1 order by SEQ

Plan hash value: 406502908

----------------------------------------------------------------------------------
| Id | Operation                                                | Name               | Rows | Bytes | Cost |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                    |        |          |     1 |
|   1 | TABLE ACCESS BY INDEX ROWID| T1              |     4 |   512 |     1 |
|* 2 |   INDEX RANGE SCAN                       | T1_IX1     |     4 |          |     1 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=:1)
filter("ID"=:1)

Note
-----
- cpu costing is off (consider enabling it)

24 rows selected.

Execution plan is very simple. Oracle is using INDEX RANGE SCAN on T1_IX1, applying predicate "ID = :1", then, it is using rowids stored in the index to fetch requested data " select * ".

How can we tell where does Oracle spend its time?

Let's have a look at v$active_session_history. This magnificent view was introduced in 10g, though in Oracle 11g several valuable columns were added to it. Execution plan step information is one example. For each sample and for each row added to this view (which reflect session's activity), it states to which step this activity is related.

Therefore, if we select all samples for the given sql_id, and group all those returned rows by
sql_plan_operation, sql_plan_option, sql_plan_line_id, current_obj# and event, we will be able to see the breakdown of sql_plan_line_id and find the heaviest step in the plan.

The step with the most samples will be the heaviest execution plan step, and our tuning efforts should be focused there.

Of course, for a simple plan it's quite easy, but go figure an execution plan with several dozens of steps without knowing which step is problematic.

With this technique mystery can be revealed.

SQL> select sql_plan_operation operation, sql_plan_options options, sql_plan_line_id line_id, current_obj# obj#, event, count(*)
   2 from v$active_session_history
   3 where sql_id='b652k1fnuqz5w'
   4 and sql_plan_hash_value=406502908
   5 group by sql_plan_operation, sql_plan_options, sql_plan_line_id, current_obj#,event;

OPERATION          OPTIONS                LINE_ID       OBJ#   EVENT                         COUNT(*)
---------------              ---------------                ----------   ----------    -------------------------         ----------
INDEX                     RANGE SCAN                 2       3657730    db file sequential read             44
TABLE ACCESS    BY INDEX ROWID         1        302302     db file sequential read           110

We can see that most of the time is spent on fetching the data (select *) from the table. 110 samples out of 154 (44+110) samples (71.4%) were captured on TABLE ACCESS BY INDEX ROWID, while only 44 samples out of 156 samples (28.6%) were captured on INDEX RANGE SCAN. In other words fetching * from the table using rowid brought by the index is very expensive.

Let's examine table (current_obj#=302302):

SQL> select table_name, num_rows, blocks
   2 from dba_tables
   3 where (owner,table_name) =
   4 ( select owner,object_name from dba_objects where object_id=302302);

TABLE_NAME  NUM_ROWS     BLOCKS
---------------------- -------------------    ------------
T1                                   25019183        479813

Let's examine index (current_obj#=3657730):

SQL> select i.table_name,i.index_name,i.column_name,i.column_position ,c.num_distinct
   2 from dba_ind_columns i,dba_tab_columns c
   3 where (index_owner,index_name) =
   4 ( select owner,object_name from dba_objects where object_id=3657730)
   5 and i.table_owner = c.owner
   6 and i.table_name = c.table_name
   7 and i.column_name = c.column_name
   8 order by column_position;

TABLE_NAME     INDEX_NAME  COLUMN_NAME   COLUMN_POSITION NUM_DISTINCT
---------------------- ---------------------   --------------------------- ---------------------------    --------------------------
T1                             T1_IX1                ID                                 1                                    5733888
T1                             T1_IX1                SEQ                              2                                    236

Looking at the index definition and statistics of the used index T1_IX1(ID,SEQ), we see that the index is well suited for this statement (where id = :1 order by seq).

Saying that, we need to go back with that to the Application and find out whether they do need to " select * " from the table (table T1 has 31 columns).
In case needed columns list is much narrow than "select *", application should eliminate "select *" use, and replace it with only needed columns. Then, if only a small set of columns are fetched, those columns can be added to the index (covering index), and Oracle can fetch needed data only from the index, without accessing the table.

Few notes:
1) v$active_session_history is part of Oracle Diagnostics Pack, therefore requires Oracle Diagnostics Pack license. v$sql_monitor and v$sql_plan_monitor are part of Oracle Tuning Pack, therefore requires Oracle Tuning Pack license.
2) Regardless the cost of fetching all columns, selecting * is never recommended. Suppose that someone added a column to the table without notifying all applications and one application is using select * in its code. Suddenly an extra unexpected column is returned back to the application.

3) sql_plan_operation and sql_plan_options are new columns added to v$active_session_history on version 11g. These two columns do not exist in v$active_session_history in version 10g. Nevertheless, there are ways to located heavy steps in 10g also. This will be covered on next session.
To be continued...

Merav Kedem,

Founder of

DB OPTimize

Oracle Performance Tuning & DBA Consulting

http://www.dboptimize.co.il/
merav@dboptimize.co.il

BEGIN_INTERVAL_TIME	DELTA_PHYWRTS	AVG_WRITE_MS
===================	=============	============
07-MAY-12 09:30:05	1970	3.395
07-MAY-12 09:45:10	2007	4.489
07-MAY-12 10:00:12	1679	2.829
07-MAY-12 10:15:17	2120	3.037
07-MAY-12 10:30:19	2355	2.169
07-MAY-12 10:45:23	1609	2.094
07-MAY-12 11:00:24	750	7.533
07-MAY-12 11:15:26	737	2.713
07-MAY-12 11:30:28	779	3.465
07-MAY-12 11:45:29	625	2.448

16-MAY-12 09:30:48	572	6.73
16-MAY-12 09:45:49	645	5.395
16-MAY-12 10:00:51	576	8.072
16-MAY-12 10:15:52	720	10.513
16-MAY-12 10:30:54	622	11.495
16-MAY-12 10:45:55	745	7.208
16-MAY-12 11:01:00	692	9.205
16-MAY-12 11:15:01	920	10.793
16-MAY-12 11:30:03	839	8.116
16-MAY-12 11:45:05	755	8.172

21-MAY-12 09:30:54	2156	8.409
21-MAY-12 09:45:57	2016	14.627
21-MAY-12 10:00:59	1877	10.436
21-MAY-12 10:15:03	2224	6.753
21-MAY-12 10:30:06	1145	12.043
21-MAY-12 10:45:10	869	12.105
21-MAY-12 11:00:12	774	11.989
21-MAY-12 11:15:14	857	8.191
21-MAY-12 11:30:16	737	8.71
21-MAY-12 11:45:18	826	12.421

24-MAY-12 09:30:27	661	14.19
24-MAY-12 09:45:29	715	12.881
24-MAY-12 10:00:30	800	15.637
24-MAY-12 10:15:32	716	12.835
24-MAY-12 10:30:33	895	19.307
24-MAY-12 10:45:37	799	19.148
24-MAY-12 11:00:38	716	21.983
24-MAY-12 11:15:41	841	21.878
24-MAY-12 11:30:44	724	15.138
24-MAY-12 11:45:46	653	15.773

13-JUN-12 09:30:15	24608	24.476
13-JUN-12 09:45:17	24249	31.427
13-JUN-12 10:00:21	20164	39.09
13-JUN-12 10:15:23	19807	38.68
13-JUN-12 10:30:24	19572	38.088
13-JUN-12 10:45:26	18812	41.594
13-JUN-12 11:00:29	19003	40.993
13-JUN-12 11:15:33	21660	44.534
13-JUN-12 11:30:34	23242	38.171
13-JUN-12 11:45:36	21055	42.248

06-JUL-12 09:30:50	3057	6.319
06-JUL-12 09:45:51	3406	5.733
06-JUL-12 10:00:52	3880	8.103
06-JUL-12 10:15:53	4324	5.381
06-JUL-12 10:30:54	5409	4.906
06-JUL-12 10:45:55	2714	5.036
06-JUL-12 11:00:56	2723	4.697
06-JUL-12 11:15:57	2434	7.781
06-JUL-12 11:30:58	3804	4.91
06-JUL-12 11:45:59	2243	5.737