Mastering OLAP: Using English Query with OLAP
Bring business information directly to decision-makers
January 26, 2000
Using English-language sentences to interact with computers isn't a new idea. About 20 years ago, I used an English-language recognition program called Eliza, a small BASIC program that posed as a psychoanalyst. Eliza kept up a dialog and made people sort out their problems by expressing their feelings. You typed in an English statement, and Eliza reformulated it as a question. Eliza dissected common sentence structures, identified verbs and nouns, and turned statements into questions. When Eliza couldn't reformulate a sentence as a question, it responded with a general question, such as "And how do you feel about that?" Although Eliza fooled many people into thinking that they were communicating with a computer, it didn't understand the English semantics; it understood only simple syntactical rules.
English recognition has come a long way since Eliza. You can use English-language recognition with the SQL Server English Query component to generate a Multidimensional Expression (MDX) query. Also, with a Visual Basic (VB) 6.0 application, you can integrate English-language recognition into OLAP applications. (See Web Dev, "The Amazing English Query Tool," April 1999, for more about English Query.)
To follow this article's demonstration, you can install the SQL Server English Query component (a beta version) from the Microsoft Developer Network (MSDN) CDs, February 2000 edition. If you have an MSDN subscription, you can also download the SQL Server English Query component from the MSDN Web site http://msdn.microsoft.com/subscriptions/ resources/subdwnld.asp. You need to use a beta version because the English Query version included with SQL Server 7.0 can't generate MDX. If you don't have a subscription to MSDN, you'll need SQL Server 2000, which is a beta version at the time of this publication and will probably be available this summer. You also need SQL Server 7.0 or SQL Server 2000 OLAP Services running on your system.
The combination of English Query and OLAP is almost a marriage made in heaven. OLAP's promise is to easily and effectively bring business information to decision-makers. OLAP's limitation is that business decision-makers who need OLAP often don't have the experience or inclination to use complicated query tools to perform functions such as filtering, sorting, and time series functions. Imagine your chief financial officer (CFO) saying to a computer, "Give me the year-to-date sales for all products sold in Europe broken down by month," and the computer responding by displaying the correct grid of numbers. In a few more years, this feat will be possible, and with some limitations, it's possible today.
Why Use OLAP and English Query
When you apply English Query and other technologies, such as data mining, multidimensional (OLAP) databases have an advantage over relational databases. You use business rules and terminology to structure multidimensional databases; by looking at an OLAP cube's structure, you can discover how a business organizes its entities. In fact, the OLAP cube's meta data contains business terminology. For example, you can see a fiscal calendar's structure, how product families break down into products, and how sales regions are organized. Such information is difficult or impossible to determine by analyzing an equivalent OLTP database. A good relational data model can represent the structure of business entities; it's the structure of business entities that isn't inherent in a relational model.
English Query uses the OLAP cube structure to create an appropriate vocabulary of entities and relationships. English Query requires entities and relationships to make sense of the English questions you ask it. To make an English Query vocabulary model, let's use the standard FoodMart Sales cube that comes with SQL Server 7.0. If you're using SQL Server 2000, the FoodMart 2000 Sales cube will also work for this demonstration. First, run the Microsoft English Query program in the SQL English Query folder in the Start menu. When the English Query development environment starts, it will prompt you for a new project, as Screen 1 shows. Type in Sales for the project name, pick a new or an empty folder on your hard disk, and type its path name for the location. Select the folder on the left that's labeled English Query Projects and the icon on the right that's labeled OLAP Project Wizard. When you click Open, SQL Server prompts you for an OLAP Server and database. Here you type in the name of the machine that is running OLAP Services or LOCALHOST if you're running OLAP Services on your local PC. Select FoodMart or FoodMart 2000 database.
The next window prompts you to choose the OLAP cubes to include in your English Query project. Select Sales, and click the right arrow to move Sales into the list box on the right, as Screen 2 shows. When you click OK in this window, English Query will read in information about the Sales cube and present the entities and relationships it discovers. The entities appear in a tree view that you can expand to view the relationships English Query discovered. As you can see in Screen 3, many of the relationships aren't checked. The project wizard relies on you to check which relationships make sense. For this example, I suggest you check all the relationships for all entities, then click OK to enter the English Query design mode.
To view a portion of the model, you can drag an entity, such as customer, to the main window, right-click it, and select Explode to view an entity-relationship of the customer entity, as Screen 4 shows.
English Query does an excellent job of determining basic entities and relationships in a cube. However, it can't determine how two dimensions are related through a measure. For example, nothing in the FoodMart cube's meta data says that customers buy products, so you need to add this relationship to the English Query model. To do so, first close the Customer entity by right-clicking on it in the main window and selecting Implode. Then drag the Product to the main window. You'll see three boxes in the main window: Customer, Time, and Product. One relationship, Customers have products, should appear between the boxes. To add the new relationship, drag Customer to a position on top of Product, which brings up a New Relationship window. Next click Add, which is to the right of the empty Phrasings list. In the Select Phrasing window, select Verb Phrasing, and click OK. In the next Verb Phrasing window, drop down the Sentence type combo box and select Subject Verb Object. Then fill in the Subject with customers, the Verb with buy, and the Direct Object with products, as Screen 5 shows. Click OK. With the addition of this sentence type, English Query will now recognize other forms of the verb buy and several synonyms such as acquire, accept, and approve of. English query knows these synonyms because it includes a dictionary of thousands of English words with their synonym relationships. You can also modify the dictionary in the English Query development environment.
Compiling and Using the Model
Let's try the model. In the model's design mode, click Start (green triangle) on the toolbar; the SQL Server English Query component will compile the model, then let you type in English queries. English Query will analyze and restate each query as it understands it. With the restated query, English Query displays the resulting MDX query. For example, type in the following statement, and press Submit:
List the ten products with the most unit sales.
English Query restates this English sentence as Show the 10 products with the highest total unit sales and responds with the MDX query:
SELECT {Measures.[Unit Sales]} ONColumns,topcount([Product].[ProductName].members, 10,Measures.[Unit Sales]) ON RowsFROM [Sales]
To make sure the MDX query accurately states what you're asking for, click on the toolbar icon that looks like a grid. You'll then see what you suspected all along; Special Wheat Puffs is the top-selling product. This result leads to a follow-up question. Type in:
What customers bought those products?
In the follow-up question, English Query restates the query exactly as you typed it and responds to this question with the following MDX query:
SELECT {} ON Columns, extract(filter([Customers].[Name].members*top- count([Product].[Product Name].members, 10, Measures.[Unit Sales]), NOT isempty(Measures.[Unit Sales])), [Customers]) ON RowsFROM [Sales]
English Query remembers the criteria to find the top 10 products and uses them instead of the phrase these products.
The resulting MDX query isn't for the faint of heart. It starts by doing a cross join (as the shorthand notation asterisk specifies) of customer name members with the top 10 product name members based on unit sales. The query then filters out the empty elements and reverses the cross join by extracting only the customer name members.
This query's complexity underscores a danger of using the English Query component: If English Query creates the wrong MDX query, you might not recognize the problem even when you're looking at it. Questions worded in English can be ambiguous, which can result in more than one answer. But don't dismiss the technology because of this concern. English recognition is still a quick way to ask many types of complicated questions. When you've found something significant in your data, be sure to look at the restatement and closely inspect the generated MDX query.
Integrating the Model into OLAP with VB
Now that this English Query model works, you can integrate it into your OLAP application with VB. But before you exit the English Query development environment, you need to compile the model into an English Query Domain (.eqd) file. To compile the model, pull down the Build menu and select Build. If nothing happens, it's OK because you already built the model when you tested it with sample questions. Now you can exit the development environment and start VB.
To start a new VB project with English Query, you need to add a project reference to the English Query type library. (If you downloaded the code sample for this column, it loaded the project reference for you.) Drop down VB's Project Menu, and select References to add the project reference. Then find and check the box labeled Microsoft English Query Type Library 2.0. Click OK. Then you're ready to use the model in VB.
The English Query development environment Test window lets the user type in an English query. The VB sample program has similar functionality, as Screen 6 shows. Listing 1 shows the source code for the program. Click Go, and you'll see the restatement of the question and the resulting MDX query. For this sample program, change the file name and the directory to match the location of your English Query Domain file (sales.eqd).
This VB sample program doesn't execute the MDX, but adding this capability wouldn't be too difficult to do. Refer to Mastering OLAP, "Writing Applications with ADO MD" (November 1999), for information on how to use ADO MD to execute MDX statements.
In Listing 1, the first step in using the English Query programming model with this sample program is to create a Session object and initialize it with a domain file. Call the InitDomain method with a file path to the compiled domain file. A more fully featured program can build its own model, but this isn't necessary because the English Query development environment is available for building domain files. If you're developing a program that lets you enter multiple sequential English questions, you'll want to create the English Query Session object once, then reuse it for each question. For brevity in this example, create the session for each question, then set the ClarifySpellingErrors property to False. This setting tells English Query to correct spelling errors without asking you first.
Next you parse the English question and get the response object. The English Query component uses the response object to return the information associated with the question. The program handles the situation in which the English query is successfully converted into an MDX query and an error occurs.
Another response type that the sample program doesn't handle is nlResponse UserClarify. The English Query component uses this response type to ask the user questions to clarify the query. For example, if you set ClarifySpellingErrors to True and a misspelled word is present, you might get an nlResponseUserClarify response. This response object will contain several possible spellings for the user to choose from. The program can then prompt the user for the correct spelling, notify English Query, and continue parsing the English request. See the Help file included with English Query for more details about how to implement this feature.
If the response is nlResponse Com-mand, English Query parsed the English request successfully and generated a command. So, in this example, you'll find an MDX command. In some cases, English Query will generate SQL in addition to or instead of MDX. If English Query generates SQL, then you need to use the SQL query against the star schema tables that built the OLAP cube. To determine whether the query is SQL, check the value of the DBMSType property of the QueryCmd object. The DBMSType will be equal to nlDBMSSQLServer if the query is SQL and will be equal to nlDBMSMSOlap if the query is MDX. SQL can answer some English questions more efficiently than MDX. For example, MDX can answer a question such as What customers bought drink products? only with a large non-empty cross join. English Query will recognize this type of question and return both SQL and MDX, but English Query will return the SQL answer first in the list of QueryCmd Objects. Some questions English Query can answer only in SQL, in which case English Query won't generate MDX. English Query answers in SQL when your English question references a column in your star schema that doesn't exist in the OLAP cube.
Although this is a whirlwind tour of English Query, I hope it gives you a good starting point for using it in your applications. English Query is a great way to interact with OLAP Services. Even if you don't make English Query available to your end users, you might want to use it to generate MDX (an endless source of good sample MDX queries). If you have ideas or questions for future columns, please send email to [email protected].
About the Author
You May Also Like