Data Warehousing Step by Step
Avoid common, but serious, data warehouse disasters
December 20, 1999
In the year since we launched SQL Server Magazine in February 1999, our readers have been clamoring for more data warehousing and business intelligence (BI) coverage. Start-ing with this issue, we'll provide data warehousing and BI information on a regular basis. We'll work with data warehousing experts to bring you information you can use whether you're a data warehousing pro or just learning what a cube is.
We kick off coverage this month with two articles that tackle the data warehousing design problem from slightly different perspectives. And we've included a third piece for advanced audiences who are trying to deal with the complex problem of slowly changing dimensions.
The first paragraph of Mark Scott's "7 Steps to Data Warehousing," page 32, offers the following insightful advice , which you'll want take to heart during your data warehousing endeavors: "You must understand what questions users will ask [of the data warehouse] because the purpose of a data warehouse system is to provide decision-makers the accurate, timely information they need to make the right choices."
I'll take this statement one step further. BI is about letting the right people ask the right questions at the right time, then applying the answers to achieve a competitive advantage for the organization. Doing this properly often requires self-service data analysis, which eliminates the need for report gophers.
Figure 1, page 26, illustrates the self-service data analysis concept. Joe Analyst is an information consumer with a series of questions to ask. The existing reports don't fully answer his data analysis questions, so Joe often needs to bring in his report gopher, Tim IT, to help. Tim's an IT jock who knows how to tame the existing querying and reporting tools and source system data integration problems, so the database system can answer Joe Analyst's questions. Unfortunately, this process can take hours, days, or weeks depending on the information request's complexity and how many other people Tim IT is playing report gopher for.
Now, let's look at Figure 2 to see how self-service data analysis can create data nirvana for the information consumer. An OLAP cube replaces the report gopher. Rather than brokering information requests for information consumers, Tim IT now spends time designing data-presentation layers that can use OLAP cubes and advanced BI-oriented end-user tools. Joe Analyst can now ask questions and get answers without involving the IT department. When built and managed properly, this setup can provide BI tools that let Joe quickly and easily browse the information, while thinking up new questions as he goes along—a key BI concept. Business analysts often don't know which questions to ask until they have answers to preliminary questions. IT departments can't prepare a set of reporting requirements if they don't know all the questions the business analysts need to ask. You can use BI to solve this problem by creating self-service analysis centers for the analyst information consumers.
Brian Lawton and Don Awalt's "Data Warehousing: Back to Basics," page 28, introduces key project-planning concepts associated with data warehousing. First, you need to build a data warehousing team with the right representatives from the end-user and development communities. Second, you need to tightly manage the project's scope, so you'll know which questions the data warehouse needs to answer. Third, you need to design for the enterprise, but build the warehouse incrementally so you're continuously providing the business analyst community measurable results.
You rarely need to build a data warehousing project plan completely from scratch. The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses by Ralph Kimball, Laura Reeves, Margy Ross, and Warren Thornthwaite (John Wiley & Sons, 1998) is an excellent resource. The book provides an online project template that covers the entire life cycle of designing, developing, and deploying a data warehouse. The project plan includes a staffing key that will help you understand what types of business and technology roles you'll need at each step of the data warehousing process.
Memorize the article "Top 10 Reasons Data Warehouses Fail," published by The Data Warehousing Institute at http://www.dw-institute.com/mistakes/dw-consultant.html. If you've never built a data warehouse, you can learn from others' mistakes listed in this document. Make sure your project plans and methodology address these top 10 reasons, and you'll avoid most of the common, but show-stopping, data warehousing disaster areas.
Joe Luedtke's "Implementing Slowly Changing Dimensions," page 37, is a dose of reality if you think building a data warehouse is nothing more than installing OLAP Services and popping open the PivotTable wizard in Excel 2000. As with any other technology solution, doing simple data warehousing tasks is easy, and solutions become more complex to model as problems become more difficult. The problems arise when the business information in your data warehouse changes over time and when you need to track different levels of historical information. You might find that changing business information is trickier to deal with than managing historical data and changes in online transaction processing (OLTP) systems. Slowly changing dimensions is a standard phrase to describe this data-changing problem. For more tips on this subject, you'll find dozens of books with excellent coverage, including The Data Warehouse Lifecycle Toolkit. Also, see Microsoft's 25-page white paper "Managing Slowly Changing Dimensions Using Microsoft SQL Server OLAP Services" at http://www.microsoft.com/sql/productinfo/dimensions.htm.
Establishing the self-service data analysis concept as the cornerstone of the BI process will give your organization a competitive advantage and save you time and headaches. Next month, look for more data warehousing and BI basics.
About the Author
You May Also Like