BI Without the Data Warehouse, Part 2
Michelle discusses the benefits of building a data warehouse rather than analyzing raw transactional data.
September 8, 2008
I have a strong prejudice against using one database instance for transactional data capture and data analysis and reporting. Past experience has shown me that it’s not just a matter of properly sizing the database and server to manage both workloads. Optimizing a database for transactional performance can be counterproductive for BI tasks and vice-versa. Despite recent advances in database management system technology and functionality and server processing power, it still might not be possible to balance two such disparate workloads in one environment. However, you can compromise; instead of using your OLTP database for BI functions, you could replicate (using SQL Server transactional replication) multiple data sources to a single operational data store, and offload the BI processing from the OLTP servers. Or you could set up regularly scheduled snapshots of the transactional database(s) for analysis.
Related: BI Without the Data Warehouse, Part 1
In my previous commentary, I discussed the pros and cons of the concept of business intelligence (BI) with the data warehouse. This week, let’s step back and take a look at the other side of this discussion—why not build a data warehouse, or at least a functional data mart? If your source data is coming from an enterprise system such as customer relationship management (CRM), shop floor production, or sales force automation, once you look under the hood, you’ll quickly discover that these systems are built for data entry, not data retrieval. As a DBA, you’ll find that you’re challenged to figure out which attributes to use for such-and-such a report. Offering such a database schema to BI analysts is a nightmarish scenario.
Putting the Power in Your Users' Hands
BI is demand-driven, and the demand must come from the business users. Business users’ needs dictate the technical solution for a BI project, not the other way around. If the IT group attempts to drive BI projects, oftentimes it fails to deliver the information business users need. A 2004 survey from Intelligent Enterprise indicated that 55 percent of the business users questioned felt that the tools and information provided by their data warehouse failed to hit the mark—more than half of the 20 percent of data warehouse projects that succeeded, according to the 2002 Gartner Group study. The BI development cycle is, by nature, open-ended, interactive, and iterative. Business people, such as front-line decision-makers and senior executives, are responsible for defining, creating, and using BI information. Therefore, it makes sense for these business users to own the process that will result in a BI solution, rather than trying to force-fit IT personnel into the role of a data analyst.
Is there a way to combine these two approaches (i.e., using the raw, transactional data at a very fine grain, and using the summarized, cleansed, and reorganized data that’s traditional to a data warehouse)? Quite possibly, yes. It would require a change in mindset and procedure, but it could be done. To do so, you must first create an enterprise data warehouse by doing nothing more than continuous data loading directly from your data sources (e.g., transactional databases and unstructured data sources).
In reality, this data warehouse would be more like an operational data store because these up-front operations would execute only the load portion of the extraction, transformation, and loading (ETL) process. You can run reports and BI analysis on this data set and look for gaps between perception and reality. Then, when the data’s on its way out of this repository and is being placed into data marts and cubes, you can execute the extraction and transformation portions of the ETL process. Data delivered to the data marts and data cubes would be cleansed and interpreted for conventional reporting. Analysis of data held in the initial repository, compared to analysis of processed data from the data marts and cubes, could also yield information about the discrepancies between the two systems. In this fashion, “what really is” meets “what they think it is” and the ensuing gap is an opportunity to bring the two worlds into alignment.
Planning Your Data Warehouse
The results of BI analysis are only as good as the data on which it’s based. If you have absolute confidence in the integrity and veracity of the data you’re analyzing, or if you intend to use raw, unfiltered data as a way to expose business process shortcomings, then go for it. Otherwise, consider planning your conventional data warehousing project today.
About the Author
You May Also Like