UNIVERSAL INTEGRATION SOLUTION
4 December 2001
by Don Estes
The simplest form of application integration involves doing what we do now: sending data back and forth between applications via the exchange of batch data files or real-time messages. XML will definitely address this problem. But the real task for robust application integration is not replicating data in multiple databases, but rather storing it in one place while making it readily available to all qualified users. Note that these users could be on the same platform, on different platforms at the same physical location, or scattered around the world. Seen this way, replicating data might be desirable for performance tuning, just as we denormalize relational databases to gain performance, but it would never be necessary for functional reasons.
There is a serious problem with replicating data stores and data queries on multiple platforms. In addition to the obvious problems of databases getting out of synchronization with one another and the time delays in replication, there is the much worse problem of exactly duplicating, for each instance, the business rules associated with the data. For example, consider a data warehousing system in which the data from the mainframe database has simply been exported to a database server. The data is all there, exactly as on the mainframe, but when users query that data, the server returns results that differ from those of the supposedly equivalent mainframe queries. Worse, the discrepancies may seem intermittent, sometimes right and sometimes wrong, for reasons that are not obvious to the end user. This leads to a fundamental lack of trust in queries against the server.
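To make the intermittent discrepancy concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the Account record, the status codes, and the rule that a closed account still counts as open until its closing date has passed. It is not taken from any real system; it only shows how a rule that was missed during replication produces results that agree on some days and disagree on others.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Account:
    """Hypothetical record, identical on the mainframe and on the server."""
    account_id: str
    status: str                       # "A" = active, "C" = closed (assumed codes)
    closed_on: Optional[date] = None  # effective closing date, if any


def open_accounts_primary(accounts: List[Account], as_of: date) -> List[str]:
    """Assumed trusted rule on the primary host: a closed account still
    counts as open until its closing date has passed."""
    return [
        a.account_id
        for a in accounts
        if a.status == "A" or (a.closed_on is not None and a.closed_on > as_of)
    ]


def open_accounts_replica(accounts: List[Account], as_of: date) -> List[str]:
    """Re-implemented rule on the server: the closing-date exception was
    missed, so as_of is never consulted."""
    return [a.account_id for a in accounts if a.status == "A"]


accounts = [
    Account("1", "A"),
    Account("2", "C", date(2001, 12, 31)),  # closing date still in the future
    Account("3", "C", date(2001, 11, 1)),   # closing date already passed
]

# Same data, different answers on 4 December 2001.
print(open_accounts_primary(accounts, date(2001, 12, 4)))  # ['1', '2']
print(open_accounts_replica(accounts, date(2001, 12, 4)))  # ['1']
```

Run the same two queries with a date in mid-January 2002 and they agree again, which is exactly the "sometimes right and sometimes wrong" behavior that erodes the end user's trust.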
These discrepancies can always be resolved, of course. The operational issue is how long it takes to diagnose each discrepancy, resolve it, and ensure that other, rarer discrepancies are not waiting to reveal themselves. More importantly, as the business rules evolve on the primary platform, the replicated rules must change in precisely the same way, in synchronization.
Whatever the difficulty in precisely replicating query business rules, there is greater concern regarding the precise replication of update business rules. This is because the damage caused by an incorrect update is usually greater than the damage caused by an incorrect query. For this reason, it is not unusual for an organization to allow direct query access to data but route all update requests through the standard transaction processors that contain complex but trusted validation logic.
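The split between direct queries and routed updates can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular product: the class names TransactionProcessor and DataAccess, the in-memory dictionary standing in for the database, and the "no negative balance" rule are all hypothetical.

```python
from typing import Dict


class TransactionProcessor:
    """Stand-in for the existing update path and its trusted validation logic."""

    def __init__(self, store: Dict[str, int]) -> None:
        self._store = store

    def apply(self, account_id: str, amount: int) -> int:
        # Assumed validation rules; a real processor would hold many more.
        if account_id not in self._store:
            raise ValueError(f"unknown account: {account_id}")
        new_balance = self._store[account_id] + amount
        if new_balance < 0:
            raise ValueError("update would make the balance negative")
        self._store[account_id] = new_balance
        return new_balance


class DataAccess:
    """Reads go straight to the data; every update goes through the processor."""

    def __init__(self, store: Dict[str, int]) -> None:
        self._store = store
        self._processor = TransactionProcessor(store)

    def query_balance(self, account_id: str) -> int:
        return self._store[account_id]  # direct query access

    def update_balance(self, account_id: str, amount: int) -> int:
        return self._processor.apply(account_id, amount)  # routed update


store = {"A1": 100}
access = DataAccess(store)
print(access.query_balance("A1"))        # 100
print(access.update_balance("A1", -30))  # 70
```

The point of the design is that the validation rules exist in exactly one place, so a caller on any platform cannot bypass them when changing data.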
All of this complexity stems from a simple lack of trust in the replicated processes. Yet whenever we undertake a project to replicate existing processing, we optimistically presume that we can exactly duplicate what we have now. The reality is that this is much more difficult than we ever expect, and it is made more difficult still if the programmers attempt to "clean up" some of the queries in the process.
The practical solution to this problem is the creation of trusted data components (TDCs), a new tier in an n-tier application architecture. These components reuse existing queries, obviating the need for duplication of query logic, yet expose them to user requests from any platform via a simple-to-use mechanism. Having a single point of maintenance also reduces maintenance costs while ensuring that the result will be trusted by all users.
The physical implementation may take one of three forms. It may consist of a gateway to the primary host, a gateway to the secondary host once the replicated rules are proven, or both. Having a dual pathway has several interesting implications. First, if there is a valid reason to go to the primary source in any given instance, that pathway can be forced. Second, the component can provide diagnostics, if a variance is suspected, by querying both pathways and comparing the results. Third, in the case of an outage in one pathway or a need for load balancing, the TDC can be switched to the other without causing a disruption at the end-user application. Finally, when a maintenance upgrade is applied to the primary source, the TDC can be switched to the primary pathway until the corresponding upgrade of the secondary pathway has been proven.
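A dual-pathway TDC along these lines might be sketched as below. This is a minimal illustration, not a definitive implementation: the class TrustedDataComponent, the Pathway enum, the method names, and the use of ConnectionError to signal an outage are all assumptions made for the example. Each gateway is modeled as a plain function that takes a request string and returns a result.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional


class Pathway(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


Gateway = Callable[[str], Any]  # hypothetical gateway: request in, result out


class TrustedDataComponent:
    """Single point of access that hides which host answers a request."""

    def __init__(self, primary: Gateway, secondary: Gateway,
                 default: Pathway = Pathway.SECONDARY) -> None:
        self._gateways: Dict[Pathway, Gateway] = {
            Pathway.PRIMARY: primary,
            Pathway.SECONDARY: secondary,
        }
        self.default = default

    def query(self, request: str, force: Optional[Pathway] = None) -> Any:
        """Use the forced pathway if given; otherwise use the default and
        fail over to the other pathway on an outage."""
        first = force if force is not None else self.default
        try:
            return self._gateways[first](request)
        except ConnectionError:
            if force is not None:
                raise  # the caller insisted on this pathway
            other = (Pathway.PRIMARY if first is Pathway.SECONDARY
                     else Pathway.SECONDARY)
            return self._gateways[other](request)

    def diagnose(self, request: str) -> Dict[str, Any]:
        """Query both pathways and report whether the results agree."""
        p = self._gateways[Pathway.PRIMARY](request)
        s = self._gateways[Pathway.SECONDARY](request)
        return {"primary": p, "secondary": s, "match": p == s}

    def switch_to(self, pathway: Pathway) -> None:
        """Redirect all unforced traffic, e.g. during a maintenance upgrade."""
        self.default = pathway
```

An end-user application would call only query and never learn which host answered. An administrator would call switch_to(Pathway.PRIMARY) while an upgrade of the secondary pathway is being proven, and diagnose whenever a variance is suspected.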
-- Don Estes, Senior Consultant, Cutter Consortium
[For more on XML and enterprise application integration, see the August 2001 issue of Cutter IT Journal, available from Cutter Information Corp. at +1 800 492 1650 or +1 781 641 9876, fax +1 800 888 1816 or +1 781 648 1950, or e-mail service@cutter.com.]

