DB2 uses 300 tasks (DB2 writers; it could be more now with later DB2 releases) to write modified (aka "dirty") pages to the DB2 table spaces (either permanent tables or DB2 work files). From a DB2 perspective this is "efficient", but from a resource-sharing perspective there are huge bursts of writes to disk followed by several seconds of very few writes to these table spaces. DB2 is NOT being a good sharer of resources. These resources include:
The base UCB (there's only one base address for a partition of a table space).
Aliases... there are at most 128 aliases in use for an LCU in most customer configurations. DB2 (and zFS) are the ONLY applications that exhaust both the aliases for the LCU and all other aliases in the management group! (That's a SuperPAV implementation construct.)
FICON channels (there's a limit to the number of open exchanges on a channel). Each DB2 writer causes an open exchange at the start of the SSCH, and it remains open until the SSCH completes (assuming no physical disconnect, which is almost always the case). The maximum number of open exchanges depends on the channel technology. I believe zHPF and 16Gbit channels had a 128 open exchange limit. Not sure about 32Gbit channels but I suspect it's increased.
DSS host adapter ports. Most customers have 1-1 mapping of channels to DSS HA ports but this need not always be the case.
DSS PPRC ports. In my experience this is where most of the contention occurs. The PPRC ports for Metro Mirror, which keep the DB2 tables synchronously updated, are usually fewer in number than the FICON ports.
What I've seen at several IntelliMagic customers is very high write disconnect time for SMS storage pools containing DB2 table spaces (both the permanent DB2 table spaces and the DB2 work files). The overall PPRC send response times for ALL writes are normally double what is expected, even though these DB2 writes are a small percentage of the writes to disk using PPRC.
DB2 customers insulate themselves from DB2's own bad sharing behavior by using zHyperWrite for DB2 logs (DB2 transactions can be waiting for the commit to the log to complete). Since most of the bottleneck is at the PPRC ports, the DB2 log writes perform well (and are usually on separate LCUs from the DB2 table spaces). That DOES NOT mean the 300 DB2 writers are free to consume ALL of the base address, PAV (alias), channel and PPRC resources for short bursts of time (short compared with the 15-minute or so RMF interval over which most I/O activity is averaged).
There is no workaround for this issue. The I/O response time pain is there consistently whenever the DB2 writers are active. IBM customers are fortunate that it lasts for a modest duration within an RMF interval, with the rest of the interval unencumbered by DB2 write activity to the table spaces.
The proposed solution is to "meter" the writes from the CF or local buffer pools so as to reduce contention but still free up modified space. The DS8K firmware team has long employed a metering scheme for destaging modified data to disk. Their advantage is that the code knows the hardware... DB2 does not. Still, I'm confident a metering scheme would work: run 'n' writers for 'd' time duration and then check to see if modified data has been reduced (if it's increased add 2x'n' more DB2 writers, if it's about the same add 1x'n' more DB2 writers, and if it's reducing nicely "stay the course"). You can add some history to the buffer pool being managed by saving the peak # of DB2 writers used concurrently, so the next iteration of dirty page cleanup can start with this number of writers. I'm sure the DB2 folks (software and HW performance) can come up with a better scheme than this one. The point is that it's not very complex, and other software (namely, DS8K) has already had metered writes for many years and many generations of HW and SW.
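A minimal sketch of the metering loop described above, written in Python for readability. It is not Db2 code. The class name `WriteMeter`, the step of 10 writers, and the 5% band used to decide "about the same" are all assumptions made for illustration; only the 300-writer ceiling and the add-2n / add-n / stay-the-course rules come from the text.

```python
from dataclasses import dataclass


@dataclass
class WriteMeter:
    """Hypothetical controller for the number of concurrent writers."""
    step: int = 10            # 'n': writers added per adjustment (assumed value)
    max_writers: int = 300    # ceiling, from the 300 writers cited in the text
    tolerance: float = 0.05   # band treated as "about the same" (assumed value)
    peak: int = 0             # peak concurrency reached in the latest cleanup

    def start(self) -> int:
        """Begin a cleanup with the remembered peak, or 'n' if there is none."""
        initial = min(self.peak or self.step, self.max_writers)
        self.peak = initial   # track the peak afresh for this cleanup
        return initial

    def adjust(self, writers: int, dirty_before: int, dirty_after: int) -> int:
        """Choose the writer count for the next interval of duration 'd'."""
        band = self.tolerance * dirty_before
        if dirty_after > dirty_before + band:
            writers += 2 * self.step      # modified data grew: add 2 x n
        elif dirty_after >= dirty_before - band:
            writers += self.step          # about the same: add 1 x n
        # otherwise modified data is shrinking nicely: stay the course
        writers = min(writers, self.max_writers)
        self.peak = max(self.peak, writers)
        return writers


meter = WriteMeter()
w = meter.start()                 # 10 writers on the first cleanup
w = meter.adjust(w, 1000, 1200)   # dirty pages grew        -> 30
w = meter.adjust(w, 1200, 1210)   # roughly unchanged       -> 40
w = meter.adjust(w, 1210, 800)    # shrinking nicely        -> 40
next_start = meter.start()        # next cleanup starts at the saved peak, 40
```

The dirty-page counts passed to `adjust` are made-up sample values. A real implementation would also need a rule for backing off again, which the scheme above leaves open.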
The benefit is to the IBM customer workload as a whole and NOT DB2 by itself. This has been the main sticking point for "doing nothing". If this algorithm impacts benchmarks but works well in the field you can always add a hidden zparm value which disables the metering. :-)
The benefit is not saturating several levels of software (base addresses, alias addresses, open exchanges) and hardware (FICON channels, DSS host adapter ports and DSS PPRC ports) for short periods of time. It's painful for everything else (including DB2 I/O) albeit for short periods of time (in my experience).
In my experience every enterprise DB2 customer has this issue. For sure, every Fortune 500 and likely every Fortune 2000 customer experiences it. DB2 customers don't complain because writes to the DB2 table spaces are asynchronous to transaction completion, and the short-duration nature of the write "saturation" makes it hard to point the finger at the inefficiency of the 300+ DB2 writers as the cause.
Most of the impact I see is Monday-Friday first shift. During nightly batch DB2 usually behaves differently, so it's hard to determine (without a GTF trace) how bad the 300+ DB2 writers are for other applications' performance.
Dear Joseph, Thank you for submitting this enhancement request. While we view all the enhancement requests we receive as valuable and we would like to implement all of them, in an effort to have maximum transparency we are not considering the ones that aren't currently appearing on our 12 month product roadmap. This is not to say that these enhancements may not still be implemented at some point in the future as we do evaluate our requirements on the basis of customer demand and technical impact, to deliver maximum value to our customers.
We would like to point out that Db2 13 has implemented more gradual GBP write behavior based on prior customers' (and your) suggestions. We appreciate your input to the Db2 for z/OS development team and we will continue to look into improving the write behavior of Db2.
Sincerely,
The Db2 for z/OS Team
Joseph, Thank you for your reply. I shared your feedback with our SME and will notify you when they respond with their next update.
Sincerely
The Db2 for z/OS team
Janet,
I checked (as best I could) the DB2 level. So far all the DB2 levels I can verify with IBM hardware are DB2 12. I found one environment with DB2 13 but this one is almost all HDS boxes. The write disconnect times are poor in this environment as well.
I'm not sure if there's something else in DB2 13 (a zparm?) that also needs to be tweaked along with your granular GBPOOLT checking. I can say that DB2 work files exhibit the same batched arrivals of writes I see in the permanent table spaces. The work files use only local buffer pools (as I understand it) and not group buffer pools.
Hello Joseph, Thank you for submitting this Aha! Idea. Please let us know if this observation comes from Db2 12 or 13. We did implement the granular GBPOOLT checking in Db2 13 to address this issue.
Sincerely,
The Db2 for z/OS Team