Databases are currently not data integration friendly simply because they are not designed to be data integration friendly! Currently, data integration is not a database design consideration. Most data models are designed independently, which results in heterogeneous, disparate data models. Each independently designed heterogeneous data model, when instantiated as a database, forms an island of disparate data or, if you wish, an independent information silo. For example, ten heterogeneous data models, when instantiated, result in ten isolated islands of disparate data. These islands of disparate data and information silos are dominant in our current data architectures.

However, the root cause of this data isolation has not been previously defined. Without an understanding of the root cause, several data integration methods have been developed over the past 25 years. Now that we understand the root cause of this data isolation, we know that none of the prior art data integration methodologies correct the actual problem. Since it is our database design methods of defining disparate heterogeneous data models that cause data isolation in our data architectures, the correction of the root cause needs to be a correction of the data modeling methods.

The Data Reintegration Methodology is a long overdue modification to the current data modeling methodology. Amazingly, eliminating the data isolation caused by our heterogeneous data models results in the integration of these previously isolated data models. That is, all heterogeneous data models that are created or enhanced using our patented Data Reintegration Methodology become a part of a single network of integrated data models. Likewise, the databases instantiated from the integrated data models are also integrated, provided that these databases are properly populated with data as defined within the Data Reintegration Methodology.
Under these conditions, every Data Reintegration database becomes a part of a single network of integrated databases. Any database that is enhanced using the Data Reintegration Methodology is now integrated with any other database so enhanced. With the Data Reintegration Methodology, we are simply removing the data isolation artifact of our prior art data modeling and permanently re-integrating the data that never should have been isolated!

You are invited to leave comments on this blog. For more detailed information, contact us and request the Data Reintegration Methodology whitepaper.

Data Reintegration is a trademark of Strategic Insights Inc. The Data Reintegration Methodology is patented by U.S. patent no. 7,979,475 and other pending patents. © Copyright Strategic Insights, Inc. 2012. All rights reserved.

Comments

02/04/2012 04:41
Kubilay Tsil Kara • Agree, they are not integration friendly because of the type of shortcuts taken in past database designs to satisfy other shortcuts taken in application designs. De-duping and data quality are just words spun out to describe bad design and architecture; there is nothing wrong with the data, it is the logic that is flawed.
02/06/2012 20:20
I think this most often happens when a corporation or agency creates a new sub-corporation or sub-agency that requires much of the same data as the parent, but slightly different in format or content. Instead of sharing the established data as an enterprise asset, the 'new' group replicates the established data, makes changes / enhancements then struggles to stay in sync with the parent schema. Tah-Da - information silo created!
02/07/2012 11:35
Most enterprise data products are organic, growing from an unexamined requirement through small data sets to larger, monolithic silos. By the time these untamed animals become somebody's pet (empire building, @Malcolm), they are usually rigid and are fraught with esoteric definitions and implied meta-data.
Thank you for your well written comments. To tell you the truth, 7 years ago, I would have said something similar.
02/12/2012 12:26
---- What causes islands of disparate data?
02/13/2012 23:10
A large US-based computer game concern has acquired companies in Europe and Asia. These companies all have their own independently developed databases, with data models and data in a number of languages including English, French, German, Mandarin and Cantonese. As many of the non-native English speaking companies cater for the English speaking market, they typically have a combination of languages in the one database. Many of the companies sell on-line, and whilst many customers have bought from more than one of those companies, each company has identified them independently. Much of the data is out of date, as the customer data often only gets updated when the customer buys another game. Some of the companies make no attempt to match a customer from one purchase to the next and simply keep a new copy of customer data for each purchase. Some of the companies list game titles in one table with a many-to-many relationship to the gaming platform. Others consider the game title to be the concatenation of the name and platform, so there is no hard link between "GameX Xbox" and "GameX Playstation". Add to this product variants such as "GameX Deluxe Edition", and some games may even be completely renamed for different platforms. (Ever used one of those movie services where they recommend to you the DVD version even though you have just watched the Blu-ray version?)
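The title and platform mismatch described in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the company names, the rows, the `KNOWN_PLATFORMS` list and the `split_title` helper are all hypothetical and do not come from any real system. It shows that concatenated titles can be linked to a title/platform model only when the platform can be recognized as a suffix, and that variants such as "GameX Deluxe Edition" remain unresolved.

```python
# Hypothetical data; every name and value here is invented for illustration.
KNOWN_PLATFORMS = ("Xbox", "Playstation")

# Company A: titles and platforms are separate and linked many-to-many.
company_a_links = {("GameX", "Xbox"), ("GameX", "Playstation")}

# Company B: each product is a single string, title and platform concatenated.
company_b_products = ["GameX Xbox", "GameX Playstation", "GameX Deluxe Edition"]


def split_title(product):
    """Split a concatenated product name into (title, platform).

    Returns (product, None) when no known platform suffix is found.
    """
    for platform in KNOWN_PLATFORMS:
        suffix = " " + platform
        if product.endswith(suffix):
            return product[: -len(suffix)], platform
    return product, None


company_b_links = {split_title(p) for p in company_b_products}

# Rows that can now be linked across the two companies.
shared = company_a_links & company_b_links

# Rows that still have no hard link to a platform.
unresolved = sorted(title for title, platform in company_b_links if platform is None)

print(shared)
print(unresolved)
```

Suffix matching is a deliberately naive choice. It would not help with games that are completely renamed per platform, which is exactly the case the comment warns about.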
02/13/2012 23:29
@Robert,
Ian: Your hypothetical reminds me of some of my past projects!
02/13/2012 23:51
Ok, can't help jumping in here...
02/21/2012 05:57
Can't agree more, Doug.
02/21/2012 06:04
Like Michele already said: "Can't agree more, Doug". That's why I gave Doug's analysis a 'Like', although if I look at the root of the problem I should have voted 'DISlike', lol.
02/21/2012 06:14
Hi Rob
02/23/2012 07:58
Inconsistent data management policies, insecure managers and directors (even IT pros), and absolutely no knowledge of integrated data management policies are among the most common causes. The old "this is my data" prejudice...
03/04/2012 22:00
In my experience, lack of data ownership is the lead cause, and in some cases abandonment of data ownership. Service Orientated Architecture can solve some of these issues if managed from a data level. This can only work if the quality of the data is part of the data owner’s job description. Data ownership is not a sideline job.
03/04/2012 22:04
The key reason can be classified in to four ?
03/25/2012 07:34
In larger corporations, I think that the teams building solutions simply don't know that data is already governed by the enterprise. Without a comprehensive metadata strategy, how would they know what data standards exist or where identical data is stored?
03/25/2012 09:25
Most large companies assign budgets on a departmental basis, as they do goals. In such cases, departmental managers are only interested in meeting their goals and in many cases will only finance IT projects that further this goal.
03/25/2012 19:35
There are a few problems. But the biggest problem is the business itself and the architecture that it wants to enforce. If a business had general architecture principles (as TOGAF proposes), one could imagine that there would be a principle like "Redundancy of data must be avoided".
What causes islands of disparate data? As Doug said above - Projects.
What causes islands of disparate data? Lack of Data Stewardship.
While much of what has been said faulting IT management and policies is true, the old fact that computerized systems merely model the business is a strong factor here. How many businesses run as consistently and rigidly as good data governance advocates? In many cases that I've seen, different departments constantly are 'doing their own thing' either for expediency or political reasons. Often the underlying enterprise we're trying to model is riddled with inconsistencies and different views of their information needs. How can we hope to accurately model such a world with a consistent view? It's by definition wrong from the get-go!
What causes islands? Growth in companies, information, software, IT, the need for more and faster information, etc. A company that was growing rapidly in the 1960s probably opted for mainframes, then came servers, then PCs, or maybe not quite in that order, but you get the drift. There is the big issue of data quality! Also, vendors have provided departmental or business process solutions that are targeted only at one specific subject or process; these off-the-shelf solutions were cheap, and faster than going through the mainframes or trying to consolidate data through traditional IT departments. Then there is data quality! I actually remember the trend being mainframes, then "departmental" data stores, then came the data warehouse, then came business intelligence, then came CRM, then ERP, then back to centralization, returning to business unit specific, etc. etc. It is difficult for most companies to swallow the cost of centralizing millions of terabytes of data so that access is efficient. Then there is data quality!!!!! There are also the "big bang" conversions, migrations, centralizing, de-centralizing, data warehousing, data marts (islands), etc. Then there is data quality!!!! The real answer is simply progress ... and never understanding the value of "good" data and what it takes to it.
04/16/2012 05:46
Businesses are always looking for the silver-bullet solution to their technology woes, and this usually ends up with IT having to support and integrate disparate systems. One healthcare organization I know has separate systems for Hospital Patient Care, Urgent Care Facilities and Physician Practices, none of which are totally integrated and each of which has its own way of identifying a Patient. Recently this same organization wisely contracted to implement a brand new system to handle all these business areas, but it will still have to support the legacy systems until the final switch is pulled years down the road. Businesses seek and depend on reliable and integrated technology solutions to help them function in a highly competitive environment, which usually means they acquire the closest thing that meets their requirements, leaving IT with the daunting task of integrating it with everything else. Disparate databases usually map to disparate applications, purchased by desperate organizations looking for the final answer to their prayers.
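The patient identification problem in this comment is often handled with an identifier crosswalk, and a minimal Python sketch of that idea follows. Everything in it is an assumption made for illustration: the system names, the local IDs, the sample patients, and the `match_key` and `build_crosswalk` helpers are hypothetical. Matching on name plus date of birth is far too weak for real patient data and is used here only to keep the example short.

```python
from collections import defaultdict

# Hypothetical records: (source system, local patient id, name, date of birth).
records = [
    ("hospital", "H-1001", "Ann Lee", "1980-02-14"),
    ("urgent_care", "UC-77", "ann lee", "1980-02-14"),
    ("physician", "P-5", "Ann  Lee", "1980-02-14"),
    ("hospital", "H-1002", "Bob Ray", "1975-09-30"),
]


def match_key(name, dob):
    """Naive match key: lower-cased, whitespace-normalized name plus date of birth."""
    return (" ".join(name.lower().split()), dob)


def build_crosswalk(rows):
    """Group each system's local patient id under a shared match key."""
    crosswalk = defaultdict(dict)
    for system, local_id, name, dob in rows:
        crosswalk[match_key(name, dob)][system] = local_id
    return dict(crosswalk)


crosswalk = build_crosswalk(records)

for key, ids in crosswalk.items():
    print(key, ids)
```

A crosswalk like this leaves each legacy system untouched, which fits the situation described, where the old systems must stay in service until the new one takes over.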
05/18/2012 04:54
I have visited your blog, and the solutions are all good. The natural state of data is to be integrated; proper use of data models, making them universal, is the answer.
Author: Robert Mack, Ph.D.