Monthly Archives: September 2011

You are browsing the site archives by month.

DB2 LUW Research

I was set a task by one of the company directors to find out about the DB2 ecosystem but there seems to be very little information out there that is freely accessible, or anybody that hints that they have it to buy so I have decided to collect my own. So this is a tester to see how willing you guys are to impart some information, equally once you have filled it in then you should be able to see the answers and have the info too.

So the first question was:

We use DB2 LUW on Linux SUSE. The next of the starter questions for the “DB2 ecosystem” was:

Here at Holiday Extras we use DB2 LUW for OLAP and data warehousing. The final question for this

Thank you for taking the time to fill in this questionnaire. There maybe more questions later and I would really appreciate it if you could keep an eye out and fill them in too.

DB2 Automated Maintenance Vs Automated Maintenance Scripts

I am the first to admit when I am wrong and accept the consequences but IBM some times do not make it easy to work out what you are meant to be doing and answers of  ”it depends” are not entirely helpful, equally I understand that I know what I know about DB2 and I am more than willing to learn. Those of you that have come to my blog before have probably seen the articles that I have done before on stored procedures that you could automate to Reorganise tables and indexes, Runstats on the reorganised tables and then finally rebind the stored procedures. These recently have started to overrun and affect production systems that they were not meant too, so a new way of working needs to be found, therefore we are back to the automated maintenance found in the DB2 ESE product itself or editing the scripts, but from my research automated maintenance does not really do things in the right “order”.

The automated maintenance provided with DB2 from what I understand allows two time periods online window and an offline window, and in essence three different types of work method “do something” or “tell someone who’s bothered” or “do something and tell someone who’s bothered”. In the online window you can carry out runstats and activities that can be carried out on tables without talking them offline. In the offline window DB2 will carry out regorgs of tables in an offline classic reorg. At no point will it will it rebind the stored procedures and in no way are these joined up e.g. if a table is reorganised it will then have the runstats done on it unless DB2 formulas behind the scenes decide too. Where as the scripts and stored procedures I created will do everything in order, but is it needed.

I was listening to the excellent free webinar (live) from DBI on “DB2 LUW Vital Statistics – What you need to know” (replay download at the bottom) and listening to guest John Hornibrook explain how and what you can set in DB2 to gather statistics was elightening and I learned so all good. Having been researching the automated maintenance I thought of a question “Do you need to runstats after a table / index reorg?”, the host thought that he knew the answer, but I think John threw him a little bit of a curve ball by responding with (something like) “well the data has not changed but the locations and distribution on disk have changed” (would have liked to get the exact quote but no sound on replay I downloaded!), well I was even more confused. I would have loved to have submitted a follow up question but they drew it to a close in short order after that. My next question would have been “Will a table be marked for runstats after it has been reorganised?”.

So on the theme of that question I thought IBM developer works might know and if did have some very useful information on it Automatic table maintenance in DB2, Part 1 and Automatic table maintenance in DB2, Part 2. These articles are very good and explain how automatic table maintenance works, but equally left me with questions. A line in the Part 2:

“If you reorganize the table and do not update the table statistics by issuing a RUNSTATS command, the statistics will still indicate that the table contains a high percentage of overflow rows, and REORGCHK will continue to recommend that the table be reorganized”

But in Part 1 on runstats there is a list of decisions DB2 will make as to wether it needs to runstats:

  1. Check if the table has been accessed by the current workload.
  2. Check if table has statistics. If statistics hare never been collected for this table, issue RUNSTATS on the table. No further checks performed.
  3. Check whether UDI counter is greater than 10% of the rows. If not, no action on the table.
  4. Check whether UDI counter is greater than 50% of the rows, issue RUNSTATS on the table if UDI counter is greater than 50% of the rows.
  5. Check if the table is due for evaluation. No further action performed if the table is not due for evaluation. An internal table is used to track if tables are due for evaluation.
  6. RUNSTATS if the table is small.
  7. if table is large (more than 4000 pages), sample the table to decide whether or not to perform RUNSTATS.
So this seems that a table might not get runstat’ed if it did not fall into these criteria and then it would keep being targeted for reorganisation. Another thing that intrigued me was that:

“All scheduled reorganizations (and other automatic maintenance operations, like automatic runstats) are maintained in a queue. When the corresponding maintenance window begins, reorganizations are performed one after another until the end of the window”

So if your tables are large or your window when your tables can not be accessed is short then not a lot of work will be done. It is not multi threaded like the stored procedures that I wrote, but it does have one advantage that the reorganisation phase is to a window, something that is not built into my scripts. Equally the stored procedures have their disadvantages as the reorganisation is IO heavy and the runstats is CPU heavy, so if you have multiples of these things going off all could be at different stages and become quite a load on the server.

I think that the solution is that automatic maintenance is useful just to keep your runstats ticking over during the week because as explained by John this automation is very “light” and also can be set to evaluate before a query is run, but for reorganisation I think I am going to write a new version of the scripts and stored procedures that I blogged about before and build in time windows that work will be carried out under because it is a more joined up way of doing things and also will include the rebind which is essential for DB2 knowing the best execution plan for stored procedures.

I would love to know your experience with automatic maintenance or other methods of keeping your reorganisations and runstats up to date so please feel free to comment on this posting.