Cutter Consortium
13 November 2007

The Data Steward: Bridging Business and IT

The data steward role is not a business role in the "take customer orders" or "prepare invoices" sense. It is clearly a role that bleeds over into what was traditionally the IT space. IT and business must come together, and the data steward is one of many roles that will help accomplish that. At first glance, the following tasks look as though they would be IT prerogatives, but they are in fact essential business responsibilities:

Finding and correcting dirty data. A big part of the data steward's job is auditing the data, which means finding dirty data and correcting it. This process requires knowing how to extract data from all types of files and databases, proficiency with data profiling tools, enough rudimentary SQL to customize data audit queries, and intimate familiarity with the business rules when spot-checking data domains and determining how to correct invalid values. While operational businesspeople don't normally know how to do these activities, a data steward should know how to perform them, be trained to do so, or work with a data quality analyst on the IT side to get them done.
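As a rough illustration of the kind of rudimentary audit query mentioned above, the following sketch profiles the values found in one column. The table name, column name, and sample rows are all hypothetical, and an in-memory SQLite database stands in for whatever source system a steward would actually audit.

```python
import sqlite3

# Hypothetical sample data; a real audit would run against the source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, status_code TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "D"), (4, None), (5, "A")],
)

# A simple profiling query: how often does each value occur in the column?
# Missing (NULL) values show up as their own group.
audit_rows = conn.execute(
    "SELECT status_code, COUNT(*) "
    "FROM customer "
    "GROUP BY status_code "
    "ORDER BY COUNT(*) DESC, status_code"
).fetchall()

for value, count in audit_rows:
    print(value, count)
```

A frequency listing like this is only a starting point: it shows which values exist, and the steward's knowledge of the business rules determines which of them are dirty.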

Providing data validation rules and data-cleansing algorithms. Data validation rules are business rules for the data. Say a data element can only contain the values A, B, C, G, I, and J and not D, E, or F. Unless the data steward makes this explicit, an IT person would have no way of knowing that the business rule doesn't allow every letter of the alphabet. Consequently, he would never know that if the file contains a D, E, or F, he must catch it with a program edit-check and reject it as an error. A data steward on the business side must provide those rules, and because many times it isn't as simple as rejecting invalid values, she must also provide the data-cleansing algorithms for turning bad data values into good ones. She may say that if there is a "D," then the IT person should look at another data element and -- depending on some other criteria -- convert it to either a "B" or an "I." The IT person wouldn't have known how to fix the problem, but it's the data steward's job to know how.
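The example above might translate into code along the following lines. This is a minimal sketch: the allowed values come from the example, but the function names, the related data element, and the specific criterion for choosing between "B" and "I" are hypothetical stand-ins for whatever rule the data steward would actually supply.

```python
# Allowed values taken from the example; everything else here is hypothetical.
ALLOWED_CODES = {"A", "B", "C", "G", "I", "J"}


def validate_code(code):
    """Edit-check: return True if the code is permitted by the business rule."""
    return code in ALLOWED_CODES


def cleanse_code(code, related_value):
    """Apply a steward-supplied cleansing rule to a single value.

    Assumed rule for illustration: a "D" becomes "B" when the related
    data element is "X", otherwise "I". Any other invalid code has no
    cleansing rule, so None is returned and the caller can reject it.
    """
    if validate_code(code):
        return code
    if code == "D":
        return "B" if related_value == "X" else "I"
    return None


# Each record is (code, related data element).
records = [("A", "X"), ("D", "X"), ("D", "Y"), ("E", "X")]
cleansed = [cleanse_code(code, related) for code, related in records]
print(cleansed)  # → ['A', 'B', 'I', None]
```

The point of the sketch is the division of labor: the set of allowed values and the conversion rule are business knowledge, while the surrounding code is something an IT person can write once those rules are stated explicitly.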

Ensuring data integrity. Ensuring data integrity means preventing (or rooting out) violations among dependent entities or attributes. For example, a person's date of birth should obviously precede his date of death. If it doesn't, it's a data integrity issue, and the data steward is responsible for resolving it.
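A check for the date-of-birth example could look something like this sketch. The record layout and field names are hypothetical, and it assumes that a missing date of death means the person is still living and that a death on the same day as the birth is not treated as a violation; a steward might well define the rule differently.

```python
from datetime import date


def find_integrity_violations(people):
    """Return the ids of records whose date of death is earlier than the date of birth."""
    violations = []
    for person in people:
        died = person["date_of_death"]
        # A missing date of death is assumed to mean the person is living.
        if died is not None and died < person["date_of_birth"]:
            violations.append(person["id"])
    return violations


# Hypothetical sample records.
people = [
    {"id": 1, "date_of_birth": date(1920, 5, 1), "date_of_death": date(1999, 3, 2)},
    {"id": 2, "date_of_birth": date(1985, 7, 9), "date_of_death": None},
    {"id": 3, "date_of_birth": date(1970, 1, 1), "date_of_death": date(1965, 1, 1)},
]

violations = find_integrity_violations(people)
print(violations)  # → [3]
```

Finding the violating record is the easy part. Deciding which of the two dates is wrong, and what the correct value should be, remains the data steward's responsibility.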

I welcome your comments on this Advisor and encourage you to send your insights on the market in general to me at lmoss@cutter.com.

-- Larissa T. Moss, Senior Consultant, Cutter Consortium
