Last week Vincent explained the BDUF Pitfall, and this week we’ll continue with #4: Incorrectly applied Canonical Data Model (CDM).
CDM is one of the silver bullets often fired in SOA projects. It should address miscommunication, ease integration and reduce integration costs. It surely can facilitate all of this, but an attempt to use a CDM can also turn your SOA project into an endless discussion, for three reasons: the model tries to cover too much, it is not aligned with the business, or it is built without design principles.
A Canonical Data Model (sometimes called CIM: Common Information Model) defines the business entities relevant for a specific integration domain, their relations and their semantics. What added value does a well-defined and correctly used CDM bring to the table? First of all, it facilitates a common understanding of what a business entity really is. For example, is the ‘Customer’ business entity a person or an organization? Or is ‘Customer’ a role that can be played by a ‘Person’ or ‘Organization’ entity? In the same way, it facilitates a common understanding of the relations between business entities. This common understanding eases communication between departments and, on a broader scope, between organizations, as illustrated by the SID model of TM Forum. Lastly, integration costs can be significantly reduced if the systems to be integrated speak the same language and use the same concepts (language in this case is not a programming language, but a shared understanding of what a business entity is and what the relations between these entities are).
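To make the ‘Customer’ question concrete, here is a minimal sketch of the second reading, where a customer is a role played by a party. All class and field names (`Person`, `Organization`, `Customer`, `customer_id`, `display_name`) are hypothetical and chosen for illustration only; they are not taken from SID or any other standard model.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Person:
    given_name: str
    family_name: str


@dataclass(frozen=True)
class Organization:
    legal_name: str


# A party is anything that can play a role: a person or an organization.
Party = Union[Person, Organization]


@dataclass(frozen=True)
class Customer:
    """Customer modeled as a role played by a party, not as a party itself."""
    customer_id: str
    party: Party


def display_name(customer: Customer) -> str:
    """Return a printable name regardless of which kind of party plays the role."""
    party = customer.party
    if isinstance(party, Person):
        return f"{party.given_name} {party.family_name}"
    return party.legal_name
```

With this shape, the same person could also play a supplier or employee role without being modeled twice. Whether that flexibility is worth the extra indirection is exactly the kind of decision the CDM should settle once for all integrations.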
What are the pitfalls when attempting to create and use a CDM? CDM creators often try to boil the ocean and include each and every piece of information used in the organization. This explodes the number of entities to be modeled and turns the CDM initiative into an endless exercise. A CDM is intended to be used in the integration domain and should therefore only include entities that are relevant in that domain. Another pitfall refers back to SOA Pitfall #10: Not Invented Here Syndrome, and concerns CDMs that are developed from the ground up. Models that could be reused are ignored, even though various reusable domain models are available (SID, UDEF and AFD). Some are industry specific, but even then definitions for customer, contract, etc. can often be reused. The next pitfall is the big flat CDM without any structuring. This makes the model hard to use and understand, even when you only need to interact with a small part of it, and it slows down adoption of the model. Adoption is also slowed down by inconsistencies in the naming conventions and modeling patterns used. One of the biggest pitfalls is not consulting domain experts when defining the CDM entities and their relations. A CDM, just like any IT artifact, should support the business. Therefore it is crucial to ensure that the model reflects the business and is not a pure IT view. And lastly there is the pitfall of CDMs based on vendor models or current applications. A CDM should model business concepts and therefore should not be bound to vendors or current applications. Both vendors and systems come and go; your business hopefully survives them.
To prevent failure of the CDM the following guidelines can help:
A CDM is intended to be used for integration and therefore should only cover entities that are used on the integration layer. Entities that are not exchanged between systems should not be part of the CDM. The bright red area of the figure on the right illustrates the information that should be part of the CDM.
Defining a CDM is a challenging exercise, but following these guidelines should help you win the challenge. Not using a CDM can introduce extra complexity in the SOA, because there will be many point-to-point connections on the data level. As stated in pitfall #6 - SOA does not solve complexity automatically, a CDM is one of the items that can reduce complexity.
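The point-to-point argument can be made concrete with a quick count. The sketch below is a simplification under a stated assumption: every system exchanges data with every other system in both directions, and each direction needs its own translation. Real landscapes are usually sparser, so treat the numbers as an upper bound. The function names are hypothetical.

```python
def point_to_point_mappings(systems: int) -> int:
    """Directed translations needed when every system maps to every other."""
    return systems * (systems - 1)


def canonical_mappings(systems: int) -> int:
    """Each system maps to and from the canonical model: two per system."""
    return 2 * systems


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, point_to_point_mappings(n), canonical_mappings(n))
```

Under this assumption the two approaches break even at three systems (6 mappings each), and with ten systems the point-to-point count is 90 against 20 for the canonical approach.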
Next week Viktor will take us to #3...
Very well said.
One good example of an industry standard is ACORD XML (see http://www.acord.org), which defines generic insurance data standards. Entities that are exchanged between two or more systems should be part of the CDM. The diagram is a little misleading.