Sunday, December 27, 2015

Quantifying Data Movement Costs

A typical IBM mainframe customer moves multiple terabytes of OLTP data from z Systems to distributed servers every day.
A recent IBM analysis found that this activity, often called extract, transform, and load (ETL), can consume 16% to 18% of a customer's total MIPS. For some, the figure approaches 30%.


The table below shows the results of the data movement study, which focused on two large banking customers, one in Europe and one in Asia, that each routinely moved their OLTP data off-platform for analysis.

             Distributed Core Consumption   Total MIPS Consumption
EU Bank      28%                            16%
Asian Bank   8%                             18%

Therefore, moving data to a separate analytical platform clearly consumes a lot of resources. But what does this mean in terms of dollars and cents?

To quantify the cost of this intensive ETL activity, IBM conducted a separate, laboratory-based study that mirrored the way the example banks moved their data from their z Systems environment onto an x86 server (in this case, a pre-integrated competitor V4 eighth-unit single database node). A four-year amortization schedule was used to spread out the cost of the system (hardware, software, maintenance, and support), along with network, storage, and labor expenses.
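
The amortization step described above can be sketched as follows. This is only a rough illustration that assumes a straight-line schedule, which the study does not confirm. The function name amortized_annual_cost and every dollar figure are hypothetical placeholders, because the study's component costs are not given here.

```python
def amortized_annual_cost(one_time_costs, recurring_annual_costs, years=4):
    """Straight-line amortization (an assumption): spread one-time costs
    evenly over the schedule, then add the costs that recur every year."""
    if years <= 0:
        raise ValueError("years must be positive")
    one_time_per_year = sum(one_time_costs.values()) / years
    return one_time_per_year + sum(recurring_annual_costs.values())


# Hypothetical placeholder figures in USD, NOT taken from the study.
one_time = {"hardware": 400_000, "software": 200_000}
recurring = {
    "maintenance_and_support": 50_000,
    "network": 10_000,
    "storage": 20_000,
    "labor": 120_000,
}

annual = amortized_annual_cost(one_time, recurring)
total_over_schedule = annual * 4
print(f"annual: ${annual:,.0f}, four-year total: ${total_over_schedule:,.0f}")
```

Dividing the annual figure by the volume of data moved in a year would then give a unit cost per GB.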







The result was a unit cost, per GB or per ETL job, of moving data off the z Systems platform. These metrics were used to compute the cost of moving 1 TB of data each day using a simple z Systems software stack, including the IBM z/OS® operating system, IBM DB2® for z/OS, and various DB2 tools. The data would be moved from an IBM z13 to an operational data store (ODS) and then on to three data marts.


As shown in the figure, the study projected total data movement costs of more than $10 million over the four-year period. The study assumed four cores on the z13 running at 85% utilization and 12 cores on each of the x86 servers running at 45% utilization. In this scenario, ETL activity consumed 519 MIPS and used 10 x86 cores per day.
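
For a rough sense of scale, the reported total can be turned back into an implied unit cost. The sketch below is a back-of-the-envelope calculation, not a figure from the study. It assumes decimal terabytes (1 TB = 1000 GB), data moved on all 365 days of each year, and exactly $10 million as the total. Since the study says "more than" $10 million, the results are lower bounds. They are also expressed per GB of source data, even though that data is then copied to the ODS and three data marts.

```python
# Figures reported in the study.
TOTAL_COST_USD = 10_000_000  # "more than $10 million", so a lower bound
TB_PER_DAY = 1
YEARS = 4

# Assumptions made for this sketch, not stated in the study.
GB_PER_TB = 1000      # decimal terabytes
DAYS_PER_YEAR = 365   # data moves every calendar day, leap day ignored

total_days = DAYS_PER_YEAR * YEARS
total_gb = TB_PER_DAY * GB_PER_TB * total_days

cost_per_gb = TOTAL_COST_USD / total_gb
cost_per_day = TOTAL_COST_USD / total_days

print(f"{total_gb:,} GB of source data moved over {YEARS} years")
print(f"at least ${cost_per_gb:.2f} per GB")
print(f"at least ${cost_per_day:,.0f} per day")
```

Under these assumptions, the implied cost works out to at least $6.85 per GB of source data, or roughly $6,849 per day.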


The primary focus of this ETL study was the cost of extracting and loading data, not transforming it. So the true cost to the banks, or any company, would be substantially higher than is shown here if you added the expense of data transformation using tools such as IBM DataStage®, Ab Initio, Informatica, or others.
