Optimizing Queries for Fact Tables
Help a DBA devise a scheme that lets him easily optimize two queries. The indexing scheme or optimization technique that you come up with should be flexible enough to help with the queries use a table.
May 22, 2007
Here's the solution to the May Reader Challenge. (Read the full Challenge, "Check for Objects Created Without SET Options Enabled.")
Solution:
Richard can use the metadata functions COLUMNPROPERTY and OBJECTPROPERTY to determine the state of the ANSI settings for columns and objects respectively. For example, by using COLUMNPROPERTY, he can determine whether a particular character or binary column was created with ANSI_PADDING enabled. The COLUMNPROPERTY function takes the object identifier, column name, and property name and returns one or zero as the value. The query in Web Listing 1 shows the columns that were created with ANSI_ PADDING OFF.
Similarly, Richard can use the OBJECTPROPERTY metadata function to determine whether the QUOTED_IDENTIFIER or ANSI_NULLS option was enabled at the time stored procedures, functions, and triggers were created. Web Listing 2 shows the query that can help Richard determine the settings and list the objects.
Richard can now use the above queries along with RAISERROR to signal an error condition if any rows were returned. Web Listing 3 shows the modified batches.
June Challenge:
Test your SQL Server savvy in this month's Reader Challenge. Submit your solution in an email message to [email protected] by June 14, 2007. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We'll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.
Problem:
Mark is a database developer in a company that sells scientific applications. Some of the applications collect measurements from various scientific instruments and store the results in a data warehouse repository for reporting purposes. The measurements are stored in fact tables. The characteristics of one specific fact table are as follows:
Measurements for a specific instrument are always inserted into the fact table.
The fact table can contain up to 20 measurements for each specific instrument.
Each row can contain values for, at most, any two measurements. This means only a few columns will be populated in each row.
The fact table contains the creation time for each row.
The fact table is created as a heap, and rows are typically bulk inserted from the application.
Sample schema and data for the fact table is shown in Web Listing 4.
Mark is building several reports that will utilize the data from the fact table just described. Most of the reports are built with the latest measurements for either a particular instrument or all of the instruments. For example, Mark writes a query that will list the latest measurements for a particular instrument. Listing 1 shows this query.
Mark also writes a query that lists the distinct number of facts for a specific measurement and each instrument by year/month:
SELECT f.InstrId, convert(char(6), f.CreateTime as CreateTime, 112), count(DISTINCT f.Msr1) as "# facts" FROM dbo.Instrstats as f WHERE f.Msr1 Is NOT NULL GROUP BY f.InstrId, CONVERT(char(6), f.CreateTime, 112)
Your challenge is to help Mark devise a scheme that lets him easily optimize both these queries. The indexing scheme or optimization technique that he comes up with should be flexible enough to help with the queries using the table.
About the Author
You May Also Like