Diving into MDX Queries
Sample MDX queries introduce techniques for slicing answer sets.
This month, we'll re-review some key OLAP and MDX terminology, then dive into a few sample MDX queries that come with SQL Server OLAP Services. We'll also solve last month's MDX puzzle. The queries this month introduce several important techniques for sorting and slicing MDX answer sets.
Let's start with a review of OLAP terminology. A cube is the multidimensional equivalent of a relational table. Cubes contain one or more measures, and two or more dimensions. A measure is just a numeric quantity such as sales or inventory count. A dimension is a hierarchical group of like members that describe a measure.
For example, if the measure is Sales, typical dimensions are Time, Product, and Customer. The Time dimension has members such as Quarter 1 or Apr 98. The dimension members roll up into a hierarchy such as days, weeks, months, quarters, and years.
When you view the value of a measure, you check its value at a particular intersection point in the cube where a member from every single dimension is specified. In this case, the term member means a distinct value within a dimension. By specifying a member from each of the dimensions in a cube, you identify a set of cube coordinates that uniquely defines a single cell. For example, you might be able to see that the value of the Sales measure is 10,000 where Time = Q4 1998, Product = Snack Mix, and Customer = "Joe Smith." This value means Joe Smith bought 10,000 units of Snack Mix in Q4 1998. A member from every dimension is always specified in an MDX query, even if you haven't explicitly referenced one member. If you don't explicitly reference a member for every dimension, OLAP Services assumes you're referring to the default member. In this example, Snack Mix is the selected member within the Product dimension. If you didn't reference a specific member, OLAP Services chooses All Products, the default member of the Products dimension. But you can change the default member of a dimension using the OLAP Manager. In this case, you're checking the value of Sales where Time = Q4 1998, Product = All Products, and Customer = "Joe Smith."
Solving the MDX Puzzle from last month requires you to understand the concept of default members. We asked how you would change OLAP Services in the sample query in Listing 1 to return information about 1998 instead of 1997. The sample query returns Unit Sales for 1997 because 1997 is the default member of the Time dimension. You can easily override the default value of 1997 by specifying which Time member to use. This capability seems obvious, but it's an important concept. Every MDX query fully specifies a complete set of dimensional coordinates (also called a position, or tuple) in the cube, whether or not you have explicitly referenced a member in the dimension.
Last month's puzzle didn't specifically mention a member in the Time dimension, so OLAP Services processes our MDX query in the same way as if we had explicitly used the default Time member, which is 1997. You can use the WHERE clause to filter or slice the cube to limit the result set by changing the default member of a dimension. The WHERE clause slices the cube down to just the portion you are interested in. With this capability in mind, you can see that the sample query shown in Listing 1 is semantically identical to the MDX query shown in Listing 2, page 61, which has a WHERE clause that slices the sales cube so that you see only 1997 data. Both queries return the same result set in Screen 1, page 61.
To solve the March MDX Puzzle, you add a WHERE clause to filter 1998 data, as Listing 3, page 61, shows. The result set from Listing 3 in Screen 2 may seem strange because no Unit Sales values are returned. But the query you wrote is correct, the Sales cube simply doesn't have any data for 1998.
Adding the WHERE clause changes the context of the query by specifying a tuple. A tuple is a group of members, one from each dimension, that defines a cross-section of the cube. Sometimes the WHERE clause tuple is referred to as a slicer. In SQL, the MDX WHERE clause filters the contents of a query, as in Screen 2. When an explicit WHERE clause is absent, you determine a tuple, or position, in the cube by using the default member of each dimension.
The query in Listing 4 creates a 2D result with the measure called Unit Sales on the column's axis. The row's axis contains members from the Promotion Media dimension at the hierarchy level called Media Type. Because you have only two dimensions, you can easily use standard row and column terminology to visualize the answer, which SQL users are familiar with. This isn't always the case when there are more than three dimensions in a query.
SQL users may notice that the order keyword is in the SELECT clause rather than the WHERE clause. This new position provides the ability to order the values displayed on each axis independently of the order used to display the values on other axes. In Listing 4, the order syntax tells OLAP Services to display the members within the Media Type level of Promotion Media sorted in descending order by the value of Unit Sales for Each Media Type using a different set of rules. To do this, we use the break hierarchy, BDESC, keyword, which we'll describe shortly.
The order function accepts three parameters: the set of members to order, the value to sort by, and the sort order (i.e., descending or ascending). In the case in Listing 4, the column axis contains the set consisting of members from the Time dimension, and we're asking to order the set by Unit Sales in descending order. Screen 3, page 62, shows that we ordered the result set from Listing 4, but it may not be what you'd expect. The values in the result set are not in descending order of Unit Sales, even though the order is what we think we asked for. By default, the order function will not change the natural hierarchical order of levels; it will change only the order of members within a particular level.
In Screen 3, the Quarter level within the Time dimension is the highest level of the hierarchy for the default Time member of 1997. Q4 has the highest total for Unit Sales, so Q4 is the first value you see. Month is the next level after Quarter within the Time hierarchy, so now we'll order the months by Unit Sales within Q4. December, the twelfth month, has the highest Unit Sales for any month in the fourth quarter of 1997, the next value you see followed in order by the other months in Q4. MDX then determines that Q1 has the next highest value for Unit Sales at the Quarter level of the Time dimension, so we show the value for Q1 followed in order by all the months in Q1.
If you want to break the hierarchical order and do a true sort based solely on descending order of Unit Sales, you must replace the keyword DESC, descending order, with BDESC. This variation of the order function returns the members in strict descending order as you may have expected. The dual functionality of order with DESC and BDESC is necessary to efficiently process a variety of subtotal reports. Without server syntax to specify our request, a client-side report writer must do the processing, which can be incredibly inefficient.
About the Authors
You May Also Like