Friday, June 10, 2011

Beyond Database Tuning - Part 2 (or adjusting schema to application needs)

In my previous post, "Beyond Database Tuning", I showed a situation where the application was running a very heavy SQL (~2 hours) each day, while in fact, after speaking with the application team, I understood that this SQL was simply not needed and its output only functioned as a loop counter in the Java code.
What ran inside this loop was another heavy SQL, which is the subject of this post.
The inner SQL looked like this:



      SELECT …
        FROM T1 s,
             (SELECT MAX(MAX(seq)) OVER (PARTITION BY id ORDER BY id) maxseq,
                     id
                FROM T1 s2
               WHERE seq IS NOT NULL
               GROUP BY id) maxresults
       WHERE s.id = maxresults.id AND
             s.seq = maxresults.maxseq AND
             s.col_a = 0 AND
             s.col_b = :1 AND
             s.col_c < :2 AND
             s.col_d = :3 AND
             rownum < 100



The application was constantly running this very heavy SQL, hour after hour, day after day. It accessed a very big table, T1, with 146M rows on 2.1M 8K blocks.
Each execution took 35 minutes on average, and since each execution took so long, only 20-70 executions could run each day.


After a short time examining this SQL, I knew that, as written, the tuning options were very limited. Most of the time was spent building the inline view with the analytic function. Oracle had to build that view in full before it could join it to T1, so it couldn't benefit from the COUNT STOPKEY operation resulting from "rownum < 100".
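
To illustrate the problem, here is a simplified, schematic plan shape (my assumption about the optimizer's choices, not the actual plan from this system; the real before/after plans are in the PDF linked below):

      -- Simplified, illustrative plan shape (not the actual plan):
      COUNT STOPKEY
        HASH JOIN
          VIEW                        -- inline view: must be fully built,
            WINDOW BUFFER             -- which means aggregating and
              SORT GROUP BY           -- sorting all 146M rows of T1
                TABLE ACCESS FULL T1
          TABLE ACCESS FULL T1

The COUNT STOPKEY sits above the join, so it can only stop fetching joined rows; it cannot prevent the full materialization of the view underneath it.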


Again, I went to speak with the application team to fully understand the logic behind it.


It turned out that every time the application did something with an id, a new row was inserted into table T1 with that id and seq = max(seq) + 1. This running number was per ID (as opposed to a running number over the whole table), so an Oracle sequence couldn't be used here: a sequence produces one global series of numbers, not a separate series per ID. I also found that the application was mostly interested in max(seq) for each id; nevertheless, it didn't store this value anywhere.
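
A minimal sketch of the rule (in this application, seq was actually assigned in a trigger, as described below; the statement itself and the extra column names, taken from the query above, are illustrative):

      -- Per-ID running number: each insert for a given id gets
      -- max(seq) + 1 within that id (illustrative sketch only).
      INSERT INTO t1 (id, seq, col_a, col_b, col_c, col_d)
      SELECT :id, NVL(MAX(seq), 0) + 1, :a, :b, :c, :d
        FROM t1
       WHERE id = :id;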


It was clear to me that the way the schema was designed simply didn't match the application's needs.
This is when I knew that the schema design had to change. There had to be a table holding max(seq) for a given id, and this change would surely lead to a performance boost.


Therefore I created a new table, T1_MAXSEQ (ID number, SEQ number), with a unique index on (ID, SEQ). Now, something had to keep this table updated and fully in sync with the values in table T1.
I found that the application was already using a BEFORE INSERT trigger on T1, which set the correct value of SEQ to max(seq) + 1 for any given ID. Since DML on T1 was not intensive, I decided to alter this trigger so it would also insert a new row into T1_MAXSEQ when a new ID is inserted into T1, and update the existing row to SEQ + 1 for an existing ID.
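
A minimal sketch of the new table and of what the altered trigger could look like (the trigger name and body are my assumptions, since the original trigger code is not shown in this post; the sketch also ignores concurrent inserts of the same new ID, which was acceptable here because DML on T1 was light):

      CREATE TABLE t1_maxseq (
        id  NUMBER,
        seq NUMBER
      );

      CREATE UNIQUE INDEX t1_maxseq_uk ON t1_maxseq (id, seq);

      CREATE OR REPLACE TRIGGER t1_bi  -- hypothetical name
      BEFORE INSERT ON t1
      FOR EACH ROW
      DECLARE
        l_seq NUMBER;
      BEGIN
        -- Existing id: bump its max(seq) and reuse it for the new T1 row.
        UPDATE t1_maxseq
           SET seq = seq + 1
         WHERE id = :new.id
        RETURNING seq INTO l_seq;

        -- New id: start its own series at 1.
        IF SQL%ROWCOUNT = 0 THEN
          l_seq := 1;
          INSERT INTO t1_maxseq (id, seq) VALUES (:new.id, l_seq);
        END IF;

        :new.seq := l_seq;
      END;
      /

A side benefit of this shape is that the trigger no longer has to derive max(seq) from T1 itself; it reads and maintains the small T1_MAXSEQ table instead.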

The schema change (new table and trigger update) was minimal and transparent to the application. All that was left for them to do was to rewrite the SQL as I instructed, so that it would join T1 with T1_MAXSEQ instead of using the inline view to find max(seq) for every ID.


The SQL should now look like this:


      SELECT …
        FROM T1 s,
             T1_MAXSEQ maxresults
       WHERE s.id = maxresults.id AND
             s.seq = maxresults.seq AND
             s.col_a = 0 AND
             s.col_b = :1 AND
             s.col_c < :2 AND
             s.col_d = :3 AND
             rownum < 100



By adjusting the schema to the application's needs and then rewriting this SQL, execution time went down from 35 minutes to 1 second, and the application was able to execute this SQL 25k times a day (compared to only 20-70 before).
Oracle now uses the new small table T1_MAXSEQ as the outer table of a nested loop, and can therefore benefit from COUNT STOPKEY (the execution plans before and after the change can be seen in the link below).
The heavy load on the instance, as well as on the machine and disks, was eliminated. SQL performance improved dramatically, allowing a huge increase in application capacity and productivity.


This can be visually seen in the following PDF: http://www.dboptimize.co.il/oracle_adjust_schema_to_application_needs.pdf


Again, just from a short conversation with the application team, and by asking the right questions, I managed to understand where the problem resided and to broaden my options for solving it.

I will say it again (and probably not for the last time): most of the performance issues I have seen in instances were application issues, not instance configuration issues. Knowing your application is one of the most important steps in any tuning process.


Merav Kedem,
Founder of
DB OPTimize
Oracle Performance Tuning & DBA Consulting
