Data Transformation in the Real World
In real-world data warehouse solutions that involve disparate operating systems, you almost never find relatively clean data.
April 21, 2000
Be warned: In real-world data warehouse solutions that involve disparate data sources, you almost never find relatively clean data like that in the Northwind database. I've extracted legacy system data that has no domain, entity, or referential integrity. I once found parts of people's names in a column that was supposed to contain dates. In another case, I couldn't find any way in the operational system to relate individual billing accounts to customers, even though these items are clearly related in a business sense.
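To make the three kinds of integrity concrete, here is a minimal profiling sketch in Python. It is only an illustration under stated assumptions: the `customers` and `accounts` tables, their column names, the sample rows, and the `YYYY-MM-DD` date format are all hypothetical and do not come from any real legacy system. It flags rows with missing or duplicate keys (entity integrity), non-date values in a date column (domain integrity), and accounts that point at no known customer (referential integrity).

```python
from collections import Counter
from datetime import datetime


def is_valid_date(value, fmt="%Y-%m-%d"):
    """Domain check: True only if value is a string that parses as a date in fmt."""
    if not isinstance(value, str):
        return False
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def profile(customers, accounts):
    """Return the rows that violate each kind of integrity."""
    # Entity integrity: every customer needs a non-null, unique key.
    key_counts = Counter(c["customer_id"] for c in customers)
    bad_keys = [c for c in customers
                if c["customer_id"] is None or key_counts[c["customer_id"]] > 1]

    # Domain integrity: the date column must actually hold dates.
    bad_dates = [a for a in accounts if not is_valid_date(a["opened_on"])]

    # Referential integrity: every account must point at an existing customer.
    known_ids = {c["customer_id"] for c in customers
                 if c["customer_id"] is not None}
    orphans = [a for a in accounts if a["customer_id"] not in known_ids]

    return {"entity": bad_keys, "domain": bad_dates, "referential": orphans}


# Hypothetical sample rows, invented for illustration only.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 1, "name": "Ann B."},    # duplicate key
    {"customer_id": None, "name": "Raj"},    # missing key
]
accounts = [
    {"account_id": 10, "customer_id": 1, "opened_on": "1999-03-14"},
    {"account_id": 11, "customer_id": 7, "opened_on": "Smith, J"},  # orphan, name in date column
]

report = profile(customers, accounts)
for kind, rows in report.items():
    print(kind, len(rows))
```

Running a profile like this before designing any transformations gives a rough count of how much cleanup each source will need. In practice the checks would run against the extracted source tables rather than in-memory lists.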
The ease of this fictitious project can help introduce you to some concepts, but it's in no way representative of the data quality, integration, and management challenges you'll likely face in most data warehousing efforts. A data mart solution and the OLAP tools you might connect to it for reporting and analysis represent only the tip of the iceberg in a data warehouse. Lurking below the surface are data quality and life-cycle analyses; data integration processes; extraction, transformation, and loading (ETL) processes; design and implementation considerations; and metadata management. Don't let these underlying pieces catch you off guard.