Integrate SQL Server 2000 With XML

Learn how to combine the FOR XML T-SQL extension with ADO.NET to improve your Web server’s performance.

asli bilgin

October 30, 2009

15 Min Read
ITPro Today logo

asp:feature

LANGUAGES: VB

TECHNOLOGIES: SQLServer 2000 | XML

 

Integrate SQL Server 2000 With XML

Learn how to combine the FOR XML T-SQL extension withADO.NET to improve your Web server's performance.

 

By Asli Bilgin

 

XML comprises a large chapter of the .NET Framework'sdata-access story. In fact, XML serves as the only persistence format for datausing ADO.NET. Quite fittingly, SQL Server, as a .NET Enterprise Server, makesgreat strides with native support for querying and updating data using XML. Likealmost all .NET technologies, it uses the Web as a fortifier, leveragingInternet standards. SQL Server supports the HTTP protocol for data transfer,XML for storing data, and XSD for data schemas.

 

Microsoft provides many ways to work with XML data usingSQL Server. You could use satellite XML technologies such as XPath queries andXSL templates, but in this article I'll show you how to leverage FOR XMLT-SQL extensions, in conjunction with the SELECT statement, to retrievepublishers and their titles from a sample pubs database. You'll use the FORXML EXPLICIT mode to hand-craft your own custom XML schema, and you'll seehow you can pass an XML stream to Visual Studio .NET using a stored procedure.

 

Additionally, I'll demonstrate how to draw on the classeswithin the System.Data and System.Xml namespaces to write andread XML from SQL Server. You also will learn how to stream XML data from SQLServer using the XmlTextReader class.

 

Leveraging cached data is a great way to improve your Webserver's performance. In this article, you'll see how to push the XmlTextReaderstream into an ADO.NET DataSet object. Because ADO.NET DataSetspersist as XML, they make ideal caching mechanisms for the XML data retrievedfrom SQL Server.

 

Finally, you'll use ASP.NET Web Forms to display thecached XML file in master and detail DataGrid controls by binding to DataSetand DataView objects.

 

Retrieve XML Data

One way to retrieve XML data from SQL Server is to use theFOR XML extension, which you append to the SELECT statement.Traditionally, SQL Server returns relational data in a tabular format. When youappend the FOR XML clause to the end of your SELECT statement,SQL Server returns data as XML. You then can take this XML stream and persistit as an XML document. Alternately, you can store the XML results in memory asan ADO.NET DataSet.

 

Why would you want to use the FOR XML extension?Representing your data hierarchically rather than relationally has manyadvantages. First, data packaging is cleaner. XML lets you display your datawith hierarchy-friendly controls, such as TreeView, and it makes datatransport easier thanks to its text-based nature with different platforms andfirewalls. XML also enjoys wide industry acceptance within the technology andbusiness sectors, making it a logical choice for interoperability.

 

In addition, XML not only works well with hierarchicalcontrols, but it is the only persistence data format within ADO.NET, whichenables DataSets and DataReaders to grab results quickly andreadily with little or no conversion. Those of you familiar with classic ADO'sshaped RecordSets will see that the FOR XML statement is a greatreplacement for classic ADO's SHAPE command, which uses a proprietaryformat for displaying parent and child data. Finally, XML is great for caching.Caching XML for your ASP.NET pages can improve the performance of your Webapplication significantly. You can combine XML data caches with ASP.NET's newpage and fragment caching. For example, consider a situation where you areretrieving data from SQL Server to populate navigation bars. Most likely, thatdata is not going to change too much over time. You can use XML files to holdthe data on the Web server rather than calling the database every time.Drop-down lists also are likely candidates for caching.

 

You append the FOR XML extension to a SELECTstatement using this syntax:

 

   FOR XML { RAW | AUTO | EXPLICIT }

                [ , XMLDATA]

                [ ,ELEMENTS ]

                [ , BINARYBASE64 ]

 

You can't use the FOR XML clause without specifyingone of these modes: RAW, AUTO, or EXPLICIT. The latterthree keywords - XMLDATA, ELEMENTS, and BINARY BASE64 -are optional. The XMLDATA keyword returns both schema and data as yourXML results; ELEMENTS is useful for generating XML data that uses nestedXML elements; and BINARY BASE64 lets you encode binary information, suchas images, for text-based transport. Newcomers to XML often hold the commonmisconception that because XML is a text-based format, it is unable to packagebinary information such as Word documents. Using BASE64 encoding, however, youcan provide the means to transport binary data.

 

Next, let's explore different queries that retrieve XMLdata using T-SQL extensions. We will use the Northwind and pubs database, whichship with SQL Server 2000.

 

RAW Mode

You can't specify element names using the RAW mode.Instead, you get XML data using generic tags. With tags, you have no control over the naming and you also can't control thestructure. The tag forces the column names and values intoattributes rather than making them child elements within the element.

 

Additionally, the RAW mode flattens your data.Because all your data rows are tagged with the tag, youcan't take advantage of the hierarchical nature of XML. Consider thisstatement:

 

SELECT customers.customerid, customers.companyname,

orders.orderid, orders.orderdate

FROM customers, orders

WHERE customers.customerid = orders.customerid

FOR XML RAW

 

The previous statement returns the data from the Northwinddatabase, shown here:

 

orderid="10643"orderdate="1997-08-25T00:00:00"/> orderid="10692" orderdate="1997-10-03T00:00:00"/>   As you can see, your customerand their order(s) don't share a parent-child relationship.   There's not much value in flattening data that's meant tobe hierarchical. You can think of this data as a classic ADO RecordSetconverted to XML using the adPersistXml option. It's relational, nothierarchical, XML data.   If you use the Query Analyzer to test your queries, besure to display the results in text rather than a grid because the grid viewtruncates large result sets in an unintelligible fashion. Additionally, be sureto increase the characters displayed by the Query Analyzer: Select Tools > Options from the menu, navigate to the Results tab, and adjust the Maximum Characters PerColumn setting to a larger value; I like to use a value of 3,000.   AUTO Mode As its name indicates, the AUTO mode is intelligentenough to generate XML data automatically based on the structure of your SELECTstatement and the joins you use. The AUTO mode, along with the ELEMENTSkeyword, improves your XML's readability because it enables you to use nestedelements. This is nice for simple queries, but if you want to get more complex,you will be confronted with a few limitations.   The AUTO mode doesn't enable you to have more thanone child for an element. For example, you can't create a structure like this:           Instead, experiment with the AUTO mode using thefollowing query. You will have to insert some data into the CustomerDemographicstable first (or you can download and run the script that I have provided to dothis):   SELECT customers.customerid, customers.companyname, orders.orderid, orders.orderdate , customerdemographics.CustomerDescFROM customers, orders, customerdemographicsWHERE customers.customerid = 'ALFKI'  ANDcustomers.customerid = orders.customerid ANDcustomers.customerid = customerdemographics.customertypeidFOR XML AUTO, ELEMENTS   Here is the XML obtained when running the previous query:     ALFKI   Alfreds Futterkiste        10643     1997-08-25T00:00:00            Istanbul           You can see by the results that SQL Server doesn'tinterpret the CustomerDemographics table as a child of the Customers table andinstead tucks it under the Orders table. This happens because SQL Server relieson the order of your tables in the FROM clause. In this case, theCustomerDemographics table follows the Orders table, causing the nesting yousee in the query.   Although the AUTO mode doesn't work correctly withmore complex queries, its support for single child nesting is sufficient fornow. Consider this statement:   SELECT customers.customerid, customers.companyname, orders.orderid, orders.orderdate, [order details].productidFROM customers, orders, [order details] WHERE customers.customerid = 'ALFKI'  ANDcustomers.customerid = orders.customerid ANDorders.orderid = [order details].orderidFOR XML AUTO, ELEMENTS   See Figure 1 for the resultsof running this query.     ALFKI   Alfreds Futterkiste        10643     1997-08-25T00:00:00                28                       39                       46           Figure 1. Using the ELEMENTS keyword, the AUTO mode isuseful for working with single-level element nesting.   Using the AUTO mode does have other caveats.Although you have some control over the structure, you can't combine attributesand elements for column data. You have two choices to circumvent thislimitation: You can use all attributes (which is the default), which containsthe column data in attributes as in the RAW mode; or you can use allelements using the ELEMENTS keyword. In addition, AUTO does notsupport aggregate functions and grouping within your SELECT statement.   The EXPLICIT Mode Although EXPLICIT mode is the most complicated ofthe three modes, it is your best bet when producing an XML document forconsumption by an ADO.NET DataSet. It employs a proprietary schemamanagement model that's difficult to understand at first, but the extra paincomes with a reward because EXPLICIT gives you complete control over thelayout of your final XML document.   Using VB .NET, create a new ASP.NET Web Application namedEmployeeList. From the Server Explorer in VS .NET, drill down to the node thatrepresents the SQL Server you wish to work with. Right-click on the Stored Procedures node under the Pubs Database node and select the New Stored Procedure option (alternately you cancreate the stored procedure using SQL Server Enterprise Manager). Replace thecode in the code designer window with the code in Figure 2.   CREATE PROCEDURE dbo.GetEmployeesAS  SELECT 1 AS Tag, NULL AS Parent, NULL AS [PubEmp!1!RootNode!element], NULL AS   [Publisher!2!PublishingHouse!element], NULL AS [Publisher!2!PublisherID!element], NULL AS [Employee!3!EmployeeID!element], NULL AS [Employee!3!EmployeeName!element], NULL AS [Employee!3!EmpPublisherID!element] UNION ALL  SELECT 2 AS Tag, 1 AS Parent, NULL AS [PubEmp!1!RootNode!element], p.pub_name  + ' (' +p.city + ')' AS [Publisher!2!PublishingHouse!element], p.pub_id AS [Publisher!2!PublisherID!element], NULL AS [Employee!3!EmployeeID!element], NULL AS [Employee!3!EmployeeName!element], NULL AS [Employee!3!EmpPublisherID!element] FROM publishers pUNION ALL  SELECT 3 AS Tag, 2 AS Parent, NULL AS [PubEmp!1!RootNode!element], p.pub_name  + ' (' +p.city + ')' AS PublishingHouse, p.pub_id AS PublisherID, e.emp_id As EmployeeID, e.fname + ' ' + e.lname AS EmployeeName , e.pub_id AS EmpPublisherIDFROM employee e, publishers pWHERE e.pub_id = p.pub_idORDER BY [Publisher!2!PublishingHouse!element], [Employee!3!EmployeeID!element] FOR XML EXPLICITFigure 2. The FOR XML EXPLICIT clause is constructedas a series of SELECT statements, which should be designed carefully to ensurethe XML data is returned appropriately.   Make sure you set the appropriate execute permissions onthe GetEmployees stored procedure. (Note: You must have the EnterpriseArchitect version of VS .NET to create stored procedures. If you don't, you canuse the SQL Server Query Analyzer to create your stored procedure. If you areusing integrated Windows Authentication to connect to your database, you'llhave to set a couple security settings. Be sure the MACHINE_NAME/ASPNET userhas a valid login to your SQL Server database, where the MACHINE_NAME is thename of the machine on which you are running VS .NET.)   Although the code in Figure 2 looks intimidating, theresults are quite basic: The query simply retrieves a list of employees and thepublishing houses they work for.   The first SELECT statement sets up the root node,padding the columns with null values. The SELECT statement must have aTag value that uniquely identifies the results of SELECT block. The Tagcolumn must be a number, and it's generally easier to read if you useconsecutive numbers. The Parent column is null, indicating that it is the rootnode.   The second SELECT statement sets up the Publishers node. It references the root using theParent column, which references the Tag column of the parent result set. As youmight guess, the third SELECT statement contains the employeeinformation.   With the EXPLICIT mode, you can keep stringing onnodes to the XML document without being limited to the linear singleparent-child structure you have with the AUTO mode. It's easy to getcreative, using the Parent references to create multiple children nodes under asingle parent.   Why the exclamation points? As I mentioned earlier, the EXPLICITmode uses a proprietary schema-creation engine. One of its conventions is touse exclamation points as delimiters for creating column names. Here is thebasic syntax:   ElementName!TagNumber!AttributeName!Directive   Take a look at one of the column names from Figure 2:   Employee!3!EmployeeName!element   The ElementName variable indicates the name of theelement node in the final XML document. In this case, you have an Employeeelement. TagNumber references the SELECT block that retrieves thedata for that value. It makes sense that you use the SELECT block taggedas 3 because this is the one that retrieves employee data. AttributeNameindicates the name of the attribute (or subelement) for the column data. Thefourth variable, element, serves as a directive. This indicates that youwant your column data as subelements, which are named according to the valueyou specify in AttributeName. If you don't specify a directive, thecolumn data would appear as attributes of the element you specified in the ElementNamevariable.   It would be nice if FOR XML enabled you to map yourschema to an XSD file; after all, it is the W3C standard. Unfortunately,however, you must create the schema manually. Later in this article, you'lldiscover other ways to work with mapping schemas using SQL Server.   Work With ADO.NET and ASP.NET There are advantages to using FOR XML to cache dataon the Web server, and I'll show you how to bind this cache to a server-sideASP.NET control such as a DataGrid.   Continuing with the EmployeeList example, now add a buttoncontrol to a Web form in the project. Change the button's Text propertyto Cache Data. This button will contain the code that generates thecached XML file. Then, from Server Explorer, drag the GetEmployeesstored procedure and drop it onto the Web form. This creates the relevant SqlConnectionand SqlCommand objects. Next, open the code-behind file for the Web formand add this line:   Imports System.Xml   Now, find the button's click event handler that uses theXmlTextReader object of the System.Xml library to read the stream of XMLdata from the stored procedure, and add this code:   SqlConnection1.Open()Dim xrEmployees As XmlTextReader = _ SqlCommand1.ExecuteXmlReaderDim ds As New DataSet()xrEmployees.Read()ds.ReadXml(xrEmployees) ' RetrieveAppBinPath is a custom function to' retrieve the local bin directory. ds.WriteXml(RetrieveAppBinPath() & "CachedEmployees.xml")   This code uses the ExecuteXmlReader method tocreate an XmlTextReader object. The Read method pulls the XMLdata off the wire. You then load the data into a DataSet object. The DataSethas methods that enable you to persist the XML data to a file. Make sure to setpermissions on the bin directory for your machine's ASP.NET user. Then executethe project.   The page should create a new XML file on the Web server.You can view this file from the Solution Explorer by selecting the Show All Files icon and drilling into the bindirectory. Open the CachedEmployees.xml file and examine the contents. Thestructure of this data will help you better understand the schema-creationmechanism of the FOR XML EXPLICIT extension. You can toggle between therelational and hierarchical view of the data by clicking on the XML and Data tabs,respectively. Figure 3 shows you the Data view of the XML file. You alwaysshould specify the file path of the final application explicitly, otherwiseyour file will be stored in the default Windows System directory. In thisexample, I use the private method RetrieveAppBinPath() to get a handleon this project's application directory.  
Figure 3. When you examine the resulting XML file in VS .NET, you canget a good sense of how the FOR XML EXPLICIT clause works.   You use the DataSet's ReadXml method to bindthe master DataGrid control to the cache: Set the DataSourceproperty of the DataGrid to the Publisher table.   Bind a second DataGrid to the Employee table usingthe DataNavigateUrlField to relate the two DataGrids together.Use the RowFilter property of a DataView object to capture the PublisherIdfrom the query string. The DataView enables you to bind the second DataGridto the employees belonging to the publishing house selected by the user. Figure4 and Figure 5 show screenshots of the ASP.NET application generated by thisarticle's downloadable code.  
Figure 4. The master DataGrid is generated from a cached XML file.  
Figure 5. The detail DataGrid is generated from a cached XML file.   The FOR XML extension is a great way to retrieverelational data quickly as an XML document. Unfortunately, this has somelimitations. For example, you can't use it with a recursive schema. Youwouldn't be able to use the employee table to show a hierarchy of employees andtheir managers because the ReportsTo column in the Employee table points backto itself.   But the lack of XSD support is the FOR XMLextension's biggest limitation. The proprietary schema mapping of the EXPLICITmode is powerful, but it's cumbersome and tedious. Ideally, the EXPLICITmode should enable you to point to an XSD file to map the schema rather thancrafting it by hand.   One more T-SQL extension is the OPENXML function.This enables you to insert XML data into a SQL Server database. It's great forbatch inserts, but unfortunately, it doesn't support updates and deletions.   There are other, impressive ways of retrieving andupdating XML data with SQL Server. The latest Web Releases of SQL Serverinclude these new features. Updategrams leverage the XSD rather than aproprietary mechanism. This new feature, along with XPath queries, templates,and HTTP access to SQL data are fodder for future articles. Once you understandthe variety of ways you can work with SQL Server data using XML technologies,the better you will be able to find a solution that best meets your needs.   The sample code in thisarticle is available for download.   Asli Bilgin is anMCSD serving as a .NET Technical Evangelist for Dell Professional Services (http://www.dell.com). She consults for Fortune100 companies in architecting distributed enterprise solutions using Microsofttechnologies such as SQL Server, ASP.NET, and VB.NET, and she is on the charterspeaker board for INETA. She also speaks at Microsoft conferences and recentlyco-authored MasteringVisual Basic .NET Database Programming (Sybex). E-mail her at mailto:[email protected].  Tell us what you think! Please send any comments about thisarticle to [email protected] include the article title and author.      

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like