It’s All About That Data

Posted October 23, 2018 | Leadership | Technology | Amplify

In our third piece, we hear from Michael Müller, an architect consultant in BI for almost 20 years. Like ten Napel, his view is that managing change is essential for a digital business initiative. This position emerges from his experience with BI projects where “Babylonian confusion” reigns, as business and IT lack a common vocabulary and an ability to communicate clearly about data needs and structures. Müller posits that digital business shares these same characteristics, but at a much larger scale because of the nature of big data. He offers an overview of a pathway for creating and maintaining conceptual-level data models that may be useful to implementers of digital business environments.

A digital business thrives on its data. The business must create value from the data not only to succeed, but also to close the gap with successful competitors, especially new competitors that have had the opportunity to build a greenfield solution and use their data more extensively to increase revenue and business value.

A less expensive way than a greenfield approach to achieve more customer value is to build new processes or use existing data from the data warehouse in new, innovative ways. Unfortunately, there is often very little knowledge about the underlying data. Although business users tend to have a clear picture of the information stored in the data, they are often unable to express its precise meaning in a way that is understandable to IT or other departments. IT, on the flip side, is often unable to translate its data concepts in a way that is understandable to business users. This situation breeds a great deal of miscommunication and misunderstanding. Thus, we need a common vocabulary to end the pattern of discussions that result in Babylonian confusion.

This Babylonian confusion is well known to anybody implementing or developing a data warehouse. In a data warehouse, a conceptual data model offers the opportunity to overcome the lack of a common vocabulary. A high-level conceptual data model delivers a common language for describing the lifecycle of data. With this model, each interaction of the customer journey can be seen along the data trail, delivering a deeper understanding of how the data moves through IT systems. There’s a greater focus on the big picture, describing the necessary results and artifacts rather than giving step-by-step instructions of how tasks and processes should be done.

This big picture approach makes clear the relevant information and necessary data. A conceptual data model helps declare the requirements for a business intelligence (BI) system and the requirements of the IT systems. This model allows the team to analyze strengths and weaknesses of various possible solutions. These solutions can be tested before being implemented. With a clear map of the company data, it becomes easier to find a way to integrate external data and easier to spot whether the external data generates revenue and business value.

A conceptual data model shifts the focus to valuable metadata. In the past, conceptual data models were often costly and poorly maintained. The effort involved in keeping the model up to date was simply too high. To achieve success with a conceptual data model, we must cut away the excess and integrate the model with the existing metadata. Harvesting the metadata from existing systems saves time. Constant checking of the model and the corresponding systems is required to ensure future accuracy.

Managing Change Is Essential

Digital business means changing the current business model by delivering customer value through a new combination of the digital and physical worlds. This novel way of creating customer value is done with data (i.e., machine-processed information), connecting the physical objects in the same way as power lines. But unlike a power grid, the connection through data is often not well documented. Since data is not a physical object, people’s understanding of its purpose and content can vary widely. Therefore, a greenfield approach looks very promising because all participants typically have the same clear picture of the result. Planning is done in parallel across all areas to yield the best solution with a clear architecture. And on top of all that, the project has the full attention of high-level management, something that the cost alone will guarantee.

Now, everything may look positive on the surface, at least until the change necessitated by the solution becomes evident. Then there is a high risk that the needed change is not compatible with the existing solution or, worse, with the existing understanding of the business. Thus, it is not only the solution that must be maintained over time; an understanding of the underlying picture by all must be ensured as well.

In the end, attempting a greenfield approach comes with the same problems as an evolutionary approach. Throughout evolutionary change, a common picture must also be maintained. This problem has been at the root of BI systems for several years now. Every five to seven years, someone tries a greenfield approach in BI. But only a handful of companies have managed to maintain and extend a solution for more than 10 years — those who have found a common picture. These new, successful BI systems started with a common picture and a common language. One of the essential success factors is avoiding fruitless general discussion and instead focusing on facts and results, while allowing differences of opinion. Because these differences reveal a lot about “what is really going on,” having them out in the open helps change perspectives.

A Data Warehouse Reveals a Common Picture

In data warehouse development, a glossary addresses the need for a common language. This glossary has evolved in recent years into a conceptual data model. A conceptual data model defines business objects in the same way as a glossary but has the advantage of also showing the relationships between business objects. A conceptual model is a high-level model that allows all IT systems to redefine their part of the conceptual model by creating their own logical/physical data models. The design of a data warehouse usually calls for the integration of the existing IT systems’ data in such a high-level conceptual data model. The resulting model is then used to generate a first version of a logical data model for the data warehouse. The generation of this first version is a huge part of data warehouse automation because some part of data warehouse development is done automatically.

A Conceptual Data Model Provides the Vocabulary

A conceptual data model is a very good starting point for implementing a digital business. In such a model, there are business objects like “customer” or “product” along with dependent business objects like “order” (see Figure 1). Dependent business objects are usually the result of a business transaction; they link the content of the transaction to the assets (business objects). Business users easily understand this logic. A conceptual model is an information model — a model made for human consumption rather than for computers but still formal enough that computer-readable data can come out of it.

Figure 1 — A simplified conceptual data model of an order.

An individual instance of a business object is identified by a business key (also shown in Figure 1). Business keys are easily found by asking business users, “How do you refer to a specific customer/order/product?” Technical keys — keys used in the IT system for identification — usually exist as well. There might be more than one technical key because a particular business object is used in several IT systems. Put all the keys into the model because they may provide insight into future problems that might need solving at some point. Gathering all this information is part of getting to know “what is really going on.”
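As a minimal sketch of how business objects, business keys, and technical keys might be recorded together, the following uses a small Python data class. The class, the system names ("crm", "webshop", "erp"), and the key names are hypothetical illustrations and are not taken from Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessObject:
    """One business object in the conceptual model (illustrative structure)."""
    name: str
    business_key: str  # how business users refer to a specific instance
    technical_keys: dict = field(default_factory=dict)  # IT system -> key column
    depends_on: list = field(default_factory=list)      # empty for core objects


# Hypothetical entries; real names come from interviews and harvested metadata.
customer = BusinessObject("customer", "customer_number",
                          technical_keys={"crm": "cust_id", "webshop": "user_uuid"})
product = BusinessObject("product", "article_number",
                         technical_keys={"erp": "mat_id"})
order = BusinessObject("order", "order_number",
                       technical_keys={"erp": "ord_id"},
                       depends_on=["customer", "product"])

model = {obj.name: obj for obj in (customer, product, order)}


def objects_with_multiple_technical_keys(model):
    """Objects identified differently in several systems (possible future problems)."""
    return sorted(name for name, obj in model.items() if len(obj.technical_keys) > 1)


def dependent_objects(model):
    """Business objects that result from a transaction and link to other objects."""
    return sorted(name for name, obj in model.items() if obj.depends_on)
```

Listing every technical key next to the business key makes it visible where one business object is identified differently across systems.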

Ultimately, there may be problems surrounding definition. For example, sales may define a customer as someone who has placed an order in the last six months, while marketing may say everybody interested in our products is a customer. The resolution is to show the difference by using precise definitions, such as:

  • A marketing customer is somebody in our customer database.

  • A sales customer is somebody in our database who has placed an order in the last six months.
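The two definitions above can be written down as explicit rules. The sketch below is only an illustration: the sample data and function names are hypothetical, and "six months" is approximated as 183 days, which is an assumption rather than part of the definition.

```python
from datetime import date, timedelta

# Hypothetical sample data: the customer database and (customer, order date) pairs.
customer_db = {"C1", "C2", "C3"}
orders = [("C1", date(2018, 9, 1)), ("C2", date(2017, 12, 1))]


def is_marketing_customer(customer_id):
    """Marketing definition: somebody in the customer database."""
    return customer_id in customer_db


def is_sales_customer(customer_id, today, window_days=183):
    """Sales definition: in the database and ordered within the window.

    window_days=183 is an assumed approximation of "six months".
    """
    if customer_id not in customer_db:
        return False
    cutoff = today - timedelta(days=window_days)
    return any(cid == customer_id and cutoff <= placed <= today
               for cid, placed in orders)


def customer_counts(today):
    """Report both counts side by side instead of forcing one definition."""
    return {
        "marketing": sum(is_marketing_customer(c) for c in customer_db),
        "sales": sum(is_sales_customer(c, today) for c in customer_db),
    }
```

Keeping both rules side by side lets a report show the number of marketing customers and the number of sales customers without anyone having to give up a definition.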

Honor the Differences Between Departments

Such precise definitions avoid long discussions that attempt to come up with one all-encompassing definition and allow differences and various focuses to become clear. This is good; different departments have different jobs to do. Appreciate business users by honoring their view on their own area of expertise. And, to return to our customer example, if customer care has a third definition of customer, that’s fine, too. Everybody now has the vocabulary in hand to define the differences. Eventually, these differing definitions will bring some inherent problems out into the open and might help close age-old trenches between departments. At the very least, senior management of these various departments, when identifying a need for change arising out of differing definitions, will know what to change.

Moreover, these definitions automatically deliver the important attributes to a conceptual data model. They will be needed in the BI system to make proper reporting possible for different departments on, for example, the number of marketing customers, the number of sales customers, or the number of support customers. Laying out the differences and providing everyone with the same understanding as well as a vocabulary to address the differences clears the way for making and implementing strategic decisions. Without a common understanding, the results are often disastrous because people feel threatened and only work with their current definitions. Proper reporting when a common understanding is lacking requires people to find their own precise definitions, a process that takes time — time that is usually not available. If differing definitions are accepted and people understand the contrasts, they usually find a good way to integrate the definitions into their thinking and accept and work with them. Because they see the benefits, the change becomes something they desire and will find the time to implement.

Harvest Existing Knowledge

The discussion of the conceptual data model cannot start with the open-ended question, “What are your business objects?” That kind of approach usually results in the business asking the typical question, “Well, what business objects are there?” (meaning that business users prefer to choose from available data rather than pick just one). To avoid this standoff, IT must prepare by understanding which business objects might be available and by using the language of the business. The knowledge about the business objects is there, if one knows where to look.

Before going in depth with the business, look at the important existing systems. Look at existing documentation or old requirements documents, but don’t take them at face value. Remember that implementation usually brings in a new perspective, a change that is seldom documented after the fact. Instead, harvest the metadata from the existing database, creating a data model, and collect valuable information. The data model created from the existing databases and the information you obtain from the actual data by profiling it are of even greater value than existing documentation. Even lightweight profiling on the attribute level can provide significant insight. By looking at an attribute and its type (e.g., numeric, alpha, alphanumeric), examining the range of values across all records, and counting how many different values exist, one can grasp the meaning of this attribute. By looking at the number of different values and the number of null values, one can find candidates for a key. With this kind of profiling, even undocumented databases can be understood.
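The attribute-level profiling described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the table is held in memory as a dictionary of columns, the column names and values are invented, and the type check is deliberately crude (for example, a decimal such as "12.5" would be classified as alphanumeric).

```python
def infer_kind(values):
    """Crude type guess for non-null values: numeric, alpha, or alphanumeric."""
    texts = [str(v) for v in values]
    if not texts:
        return "unknown"
    if all(t.isdigit() for t in texts):
        return "numeric"
    if all(t.isalpha() for t in texts):
        return "alpha"
    return "alphanumeric"


def profile_column(values):
    """Type, value range, distinct count, and null count for one attribute."""
    non_null = [v for v in values if v is not None]
    return {
        "kind": infer_kind(non_null),
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


def key_candidates(table):
    """Columns with no nulls and as many distinct values as rows."""
    candidates = []
    for column, values in table.items():
        p = profile_column(values)
        if p["rows"] > 0 and p["nulls"] == 0 and p["distinct"] == p["rows"]:
            candidates.append(column)
    return candidates


# Hypothetical table extract, one list of values per column.
table = {
    "cust_id": [101, 102, 103, 104],
    "country": ["DE", "DE", "NL", None],
    "email": ["a@x.org", "b@x.org", "c@x.org", "d@x.org"],
}
```

Here `key_candidates(table)` flags `cust_id` and `email`; whether either is really a business or technical key still has to be confirmed with the people who know the system.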

From the information gathered, a conceptual data model can be built within days. Check the accuracy of the model with the product owner of the analyzed IT system. If you find constraints within the model, write them down. Keep a link to the existing system and its data model so that before releasing every new version of the IT system, a check of the actual model against the conceptual model can be implemented and run. Ensuring that the conceptual data model stays up to date is a very good way to make sure future development will not go astray.
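A release-time check of the actual model against the conceptual model could look roughly like the sketch below. It assumes both models have already been reduced to a mapping from entity name to a set of attribute names and that physical column names have been mapped to conceptual attribute names beforehand; the entities and attributes shown are hypothetical.

```python
def compare_models(conceptual, actual):
    """List differences between two models given as {entity: set of attributes}."""
    findings = []
    for entity in sorted(conceptual.keys() | actual.keys()):
        expected = conceptual.get(entity)
        found = actual.get(entity)
        if found is None:
            findings.append(f"{entity}: in conceptual model but missing in system")
        elif expected is None:
            findings.append(f"{entity}: in system but not in conceptual model")
        else:
            for attr in sorted(expected - found):
                findings.append(f"{entity}.{attr}: missing in system")
            for attr in sorted(found - expected):
                findings.append(f"{entity}.{attr}: not in conceptual model")
    return findings


# Hypothetical models: the conceptual one is maintained by hand,
# the actual one would be harvested from the system's database catalog.
conceptual = {
    "customer": {"customer_number", "name"},
    "order": {"order_number", "order_date"},
}
actual = {
    "customer": {"customer_number", "name", "loyalty_tier"},
    "order": {"order_number"},
    "voucher": {"code"},
}

findings = compare_models(conceptual, actual)
```

An empty result means the release can proceed; any finding is a prompt either to update the conceptual model or to question the change.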

Deliver Insight and Collect More Knowledge Through Collaboration

Once prepared with the insight from profiling, all participants will find the discussion about the conceptual data model insightful; this will help ensure buy-in. Putting the conceptual data model on a collaborative platform will help as well. On this platform, everybody can add comments, ask questions, and be automatically notified about changes. Don’t use this as a replacement for meetings; rather, employ this to prepare for meetings by answering a few questions: Were people active? What questions did they ask? On which parts did they comment?

In this way, work happens on all sides, even between meetings. Such collaboration should help generate excitement for your digital business initiative. Through the input received via the collaboration platform and the creation of the conceptual data model, you’ve already done most of the conceptual and design work on your data warehouse. The data warehouse then creates the necessary insight for evolving your digital business in an ever-changing world.

Add the Customer View

The conceptual data model is static. It says nothing about the processes that collect the data or the order in which the transactions are carried out. It covers only the results of transactions and only your own current data.

All the dependent business objects are customer touchpoints with your company. These touchpoints provide a picture of a customer’s current documented interaction with the company, a very high-level view of how a customer experiences the company. A customer’s interaction with the company resembles a process with each touchpoint acting as a task that results in data — your current dependent business objects. Participating business people are likely to think of possible immediate improvements to the customer experience now that they can follow through the data.

However, the picture is not yet complete. Only those touchpoints that have generated data are documented. Some touchpoints, such as when a potential customer first becomes aware of a company or its products, hardly result in any data, and no information is stored. To complete the customer picture, we must determine which interactions are not being documented and how to measure our performance on those interactions. What is not documented cannot be measured, and what cannot be measured cannot be improved. If a currently undocumented touchpoint is important, we need to determine whether we can obtain the data from external sources, and, if not, whether there is another way to collect it. With this knowledge, we can now add other (external) data to our conceptual data model. What is the customer’s journey? What are the touchpoints during the phases of awareness, favorability, consideration, intent to purchase, and conversion?

Create a High-Level Business Model

A complete customer view provides a full picture of customer touchpoints, from first awareness until that person is sadly no longer a customer. The customer journey is not a process with a known beginning or endpoint. It is just single steps, some of which are dependent on prior steps. For example, a customer needs to complete the step “buy product” before the step “contact customer care” can happen. It must be clear where in the lifecycle of being/becoming a customer each step happens. And for each step, it must be evident whether there is company data or not; if data is there, the step directly links to the conceptual data model. If there is no data, a decision must be made about whether this data will be collected in the future (or maybe there is sufficient external data available in order to avoid additional development).
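The journey steps described above, with their dependencies and their link to the conceptual data model, might be captured as in this sketch. The step names other than "buy product" and "contact customer care", the "after sales" phase, and the business object names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JourneyStep:
    """One touchpoint in the customer journey (illustrative structure)."""
    name: str
    phase: str                                    # position in the customer lifecycle
    requires: list = field(default_factory=list)  # steps that must have happened first
    business_object: Optional[str] = None         # link to the conceptual model, if data exists


# Hypothetical journey; "after sales" is an assumed phase name.
steps = [
    JourneyStep("see advertisement", "awareness"),
    JourneyStep("buy product", "conversion", business_object="order"),
    JourneyStep("contact customer care", "after sales",
                requires=["buy product"], business_object="support_ticket"),
]


def undocumented_steps(steps):
    """Steps without company data: decide whether to collect or buy this data."""
    return [s.name for s in steps if s.business_object is None]


def can_happen(step, completed):
    """A step is possible once all of its prerequisite steps are completed."""
    return all(req in completed for req in step.requires)
```

Steps returned by `undocumented_steps` are the ones for which a decision about future data collection or external data is still open.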

This process model of the customer journey will be put on the collaboration platform so that everybody can see and collaborate on improving the customer process and associated data collection. The model provides a living picture of the status quo, which can be modeled in Business Process Model and Notation (BPMN) 2.0, since it allows for complex processes where the tasks are neither mandatory nor executed in a certain order. BPMN 2.0 also allows data objects as the output of a task.

Find Possible Improvements and Evolve

The conceptual data model, updated with the complete customer journey, makes it easy to find potential improvements in the business model and to align them with company strategy. The model can help determine whether a new development will bring the desired results. The initial Babylonian confusion is gone. Even though each participant still has a personal view on the business, he or she understands the differing views of other participants. Everyone can now communicate in a common language and work together to find the best possible solutions. Resistance and pushback should be low because people are heard. The company can function as a whole, utilizing its complete brainpower to find the fastest way to become a digital business. It can quickly achieve its first successes by aiming for the lowest hanging fruit; these first successes create the willpower to achieve and the budget to fund larger changes.

Integrate the Change

Change is likely to result in the addition of external data sources. As soon as this external data is on the company servers, it should be added to the conceptual data model as well as to the originating task/touchpoint. These changes will be visible on the collaboration platform. As the entire company sees the business model and the data behind it evolving, people will generate different ideas on how to proceed. That is good: we need the best ideas, so test them, prove them, and apply them.

If the change affects existing IT systems, it should be visible in the customer journey as well as in the conceptual data model. The change will first appear as planned development; later, it will replace parts of the model as the new system goes into production.

When it comes to the Internet of Things, in particular, it is common for a product to generate information that varies depending on the version of the firmware. This should be documented in the conceptual data model, too, along with any limitations (e.g., limited usage of the data because of legal reasons). Restrictions are often among the first things forgotten.

Get a Living Model of Your Business

Now the customer journey and the conceptual data model are alive and evolving, rather than becoming outdated. In the past, a conceptual model often failed because changes were not integrated back into the model. To avoid this pitfall, the checks with the conceptual data model and the customer journey must be integrated into the release management of the software development process for the company’s IT systems. This small effort produces a huge impact.

The two high-level models — the conceptual data model and the customer journey, which contain a lot of information — are very good requirements documents. The information can be used as metadata to the new or evolving IT systems, especially in development of BI systems, where this metadata has been used in the last few years to automate BI development. In my experience, this can decrease development time of BI systems by as much as 25%.


Digital business is all about data, so maintaining a conceptual data model provides a company with a vocabulary to address this data, thereby enabling individual employees to work with the data. Sharing the analysis work and the resulting conceptual model with the data warehouse development process saves time and effort. The more data is used, the more collaboration and input for change affect the data, and the more it becomes a living, evolving asset. Binding the conceptual data model to customer touchpoints creates insights into the business model. This business model is known to everyone and evolves with the business over time. There are two important factors to success: people and metadata.

With knowledge and a common vocabulary, people have two powerful tools to do their job better. Involving people and listening to them ensure their participation. Having a very clear picture of the current situation may sometimes be painful; however, accepting the status quo as the best possible solution so far — not imposing solutions — ensures collaboration on improving the current situation, a process that is now easier because of the new tools.

In the past, conceptual models often failed because of their tendency to become outdated. Using metadata for a vertical integration between the conceptual business models, the logical IT models, and the actual IT system solves this outdating problem. As a bonus, the metadata can be used to speed up the development process, especially in BI where data warehouse automation has been successfully implemented. The key factor for data warehouse automation is the metadata provided by conceptual models. Digital business is indeed all about data, and by creating a common vocabulary for the data of your organization and by visualizing the customer journey, you can finally talk about it and be understood.


About The Author
Michael Müller has been a consultant in business intelligence (BI) for nearly 20 years. He focuses on data modeling, data architecture, data vault modeling, and data warehouse automation. Currently, Mr. Müller is expanding into architectures and collaboration on data strategy/requirements for BI. He is on the board of directors of the German-language Data Vault User Group. He can be reached at