Analyzing a Query from the Inside Out

Microsoft's SQL Server development team takes a close look at a query that works from the inside out to return information from tables that are related only indirectly.

ITPro Today logo in a gray background | ITPro Today


Our organization's database has three related tables: Category, Products, and Transactions. All of the database transactions have a ProductID that's associated with the Products table. The Products table in turn has a CategoryID that's associated with the Category table. Table 1 shows the tables' relationships to each other.

I want to use a T-SQL SELECT statement to pass the variable @customerID and return a list of all category names and a value of the most recent transaction associated with each category, if one exists, sorted by the specified customer ID. I can use the TransactionID to determine which transaction is the most recent. What's the most efficient T-SQL SELECT statement I can use to return the specific information I want?

This question is complex because no direct relationship exists between the CategoryName and the CustomerID values. The values are linked only through their relationships with the Products table. So, logically, you must expand the Transactions table to include the category information. You can then select the most recent value for the subset of the data relating to the customer in question. Listing 1 shows a T-SQL SELECT statement that might return the list of category names you want. Let's break down Listing 1 into its parts so that you can understand what each section is doing.

At callout C in Listing 1, the code matches categories to products (the first join). Then, for each match, the code asks for matching transactions for that customer. The code uses an outer join to retain product information if it doesn't find any matching transactions.

After the code obtains the matching transactions, it performs the SELECT statement that callout B shows and returns the most recent transaction for every product in every category. Next, the code looks for the most recent transaction in each category, as the SELECT statement at callout A shows. The code disregards the individual products because you don't need them in the list of category names. You needed the products information only to make the connection from category to transaction and customer. Finally, the code searches the Transactions table to obtain the value of the most recent transaction per category.

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like