Among the infrastructure changes in DB2 10 is the way that PGFIX(YES) buffer pools are managed on a System z10* or zEnterprise* 196 (z196) processor. The System z10 processor with z/OS* 1.10 introduced 1 MB page frames. As the amount of memory increases, large page frames help z/OS improve CPU performance. IBM laboratory testing has measured CPU improvements of a few percent for specific transaction workloads. To exploit large frames, define the buffer pools with PGFIX(YES) and specify the LFAREA z/OS system parameter. LFAREA is the amount of real storage that z/OS uses for large frames.
RELEASE(DEALLOCATE)
The RELEASE(DEALLOCATE) option has been part of the DB2 \BIND/REBIND command for a long time, but DB2 10 makes the function more useful. The dramatic Database Services Address Space Parameters (DBM1) virtual-storage constraint relief in DB2 10 achieved with rebind makes it possible to make more use of RELEASE(DEALLOCATE). This change saves up to 10 percent of CPU time for high-volume transactions with short-running SQL, without changing applications or Data Definition Language (DDL).
Distributed Applications
For Distributed Data Facility (DDF) work, after rebinding packages with RELEASE(DEALLOCATE), the customer must issue the MODIFY DDF PKGREL(BINDOPT) command to allow DB2 to use RELEASE(DEALLOCATE) processing for packages bound with RELEASE(DEALLOCATE). DDF inactive thread processing (CMTSTAT=INACTIVE) takes advantage of new high-performance database-access threads (DBATs) to increase distributed DBAT thread reuse. Implementing the MODIFY DDF PKGREL(COMMIT) command can commit behavior when, for example, you need to allow utilities to break in. With DB2 10 for Linux*, UNIX* and Windows* 9.7 Fix Pack 3, the Call Level Interface (CLI) and JDBC packages are bound with RELEASE(DEALLOCATE) by default. RELEASE(DEALLOCATE) assumes the use of well-behaved applications that adhere to required locking protocols and issue frequent commits.
DB2 10 further improves overall Distributed Relational Database Architecture (DRDA) application performance for result sets from SELECT statements that contain the FETCH FIRST 1 ROW ONLY clause. Simply combine the OPEN, FETCH and CLOSE requests into a single network request. DB2 10 also offers improved DDF performance through restructuring distributed processing on the server, particularly the interaction between the DDF and DBM1 address spaces.
Migrating to DB2 10 Conversion Mode, rebinding packages, exploiting large page frames and exploiting RELEASE(DEALLOCATE) can save up to 10 percent of the CPU for transaction workloads and up to 20 percent for native SQL-procedure applications. Much greater CPU savings is possible for queries that contain large numbers of index Stage 1 predicates, as well as IN-list predicates. Also, more Stage 2 predicates can be pushed down to Stage 1.
Capacity Improvements
Since most of the thread storage has been moved above the bar, DB2 10 can support more threads than DB2 9, thereby making it possible to reduce the number of DB2 data-sharing members, or at least hold the number of members constant while increasing transaction throughput.
If you were previously unable to increase the MAXKEEPD zparm value due to a DBM1 virtual-storage constraint in DB2 9, you may be able to increase MAXKEEPD since the local-statement cache is moved above the bar. Increasing the constraint value may reduce prepares for more SQL statements and provide additional CPU savings.
I/O Improvements
Other CPU improvements apply in more specific scenarios. Some of these are related to I/O improvements since I/Os are one of the significant consumers of CPU time. These include an improved, dynamic prefetch sequential-detection algorithm. List prefetch support for indexes helps minimize index I/Os when scanning a disorganized index. The number of log I/Os is also reduced and long-term page fixing of the log buffers saves CPU time as well.
zIIP and zAAP Exploitation
Buffer pool prefetch and deferred write Supervisor Request Blocks (SRBs) ordinarily aren’t big CPU consumers, but DB2 10 makes this specific processing eligible for System z* Integrated Information Processors (zIIPs); zIIP processing can reduce the number of billed MIPS a company uses. This CPU savings is more significant when using index compression.
As in DB2 9, DB2 10 can direct up to 80 percent of CPU-intensive parallel-query processing to run on an available zIIP. DB2 10 makes more queries eligible for query parallelism, which can result in more zIIP exploitation. In DB2 10, portions of the RUNSTATS utility are eligible to run on a zIIP.
XML schema validation and nonvalidation parsing of XML documents are eligible for zIIP or System z Application Assist Processing (zAAP). If XML parsing is done under DDF enclave threads, it’s eligible for zIIP. If the XML parsing is done under a batch utility, it’s eligible for zAAP.
Query Performance
Range-list index scan is a new type of access path DB2 10 uses to significantly improve the performance of certain scrolling-type applications where the returned result set is only part of the complete result set. The alternative in DB2 9 was to use multi-index access (index ORing), which isn’t as efficient as single-index access. Prior to DB2 10, list prefetch couldn’t be used for matching IN-list access. In DB2 10, list prefetch can be used for IN-list table access, ACCESSTYPE=’IN’.
The process of putting rows from a view or nested table expression into a work file for additional query processing is called physical materialization, which is an overhead. Additionally, it limits the number of join sequences that can be considered and can limit the administrator’s capability to apply predicates early in the processing sequence. The join predicates on materialization work files are also not indexable. In general, avoiding materialization is desirable. In DB2 10, materialization can be avoided in additional areas, particularly for view and table expressions involved in outer joins.
The processing of Stage 1 and nonindex matching predicates has also been enhanced. DB2 now processes non-Boolean predicates more efficiently when accessing an index and Stage 1 data-access predicates. You don’t need to rebind your static applications to take advantage of some of these optimization improvements. However, a rebind is required to take full advantage. More complex queries with many predicates show higher improvement. Queries that scan large amounts of data also show higher CPU savings.
DB2 10 also contains some SQL-sorting enhancements. DB2 10 introduces hash support for large sorts, which potentially reduces the number of merge passes needed to complete these sorts. Hashing can be used to identify duplicates on input or to sort, if functions such as DISTINCT or GROUP BY are specified. Some additional cases also exist where, when FETCH FIRST N ROWS is used, DB2 10 can avoid the sort process altogether.
Insert Performance
The performance of high concurrent inserts is better in DB2 10, with tests showing a typical range of 5- to 40- percent reduction in CPU time when compared with DB2 9 performance. Higher CPU performance is achieved when processing sequential inserts. The amount of improvement also depends on the table-space type, with Universal Table Spaces seeing higher improvements.
When a series of inserts are sequential with respect to the cluster key–but the key is less than the highest key in the index–a new page-selection algorithm helps minimize getpages, which can help reduce CPU cost.
DB2 10 contains some referential integrity-checking improvements on inserts that may result in reduced CPU utilization and I/O processing.
Utilities and Large Objects
Generally speaking, DB2 utility CPU performance in DB2 10 is equivalent to that of DB2 9, but DB2 9 already introduced performance improvements of up to 50 percent CPU savings compared to DB2 8 for various utilities processing.
DB2 10 also contains numerous large object (LOB) enhancements and one of them applies to Conversion Mode, namely LOB materialization avoidance. For LOBs, IBM has observed up to a 16-percent reduction in CPU consumption for DDF Inserts.
Add Up the Savings
All of the CPU savings described here apply to Conversion Mode. Some of the performance improvements are available upon installation of DB2 10, without additional changes, while others require very minimal changes, such as a change to installation parameters. Some require a System z10 processor and perhaps changing the buffer-pool parameters. Most require appropriate rebinds to fully realize the benefits. Additional CPU benefits can be obtained when migrating to New Function Mode, which will be outlined in a future article.