Using XML Bulk Load to Load ADO-Generated XML Data
Have you hit a brick wall trying to use XML Bulk Load to upload ADO-generated data? Try this nifty solution.
February 20, 2002
Editor's Note: Send your XML questions to Rich Rollman at [email protected].
I have an XML document generated from a Recordset through the ADO Save method. I want to use XML Bulk Load to upload the data in the XML document into another database, but the upload fails when the XML contains date fields. How can I use XML Bulk Load to load ADO-generated date fields?
Your upload fails because XML Bulk Load requires a mapping schema to process date fields correctly. ADO writes dates in ISO 8601 format, which is the representation that XML Data Reduced (XDR) uses. Because SQL Server 2000 can't convert dates from ISO 8601 format into SQL Server's internal format, you need a mapping schema that instructs XML Bulk Load to convert the date format before loading the data into SQL Server.
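To make the format difference concrete, here is a small Python sketch that takes a date string of the kind ADO writes and rewrites it in the space-separated ODBC canonical layout (yyyy-mm-dd hh:mi:ss). It only illustrates the two textual forms. It isn't what XML Bulk Load does internally, and the sample value and function name are assumptions made for this example.

```python
from datetime import datetime


def iso8601_to_odbc_canonical(value: str) -> str:
    """Rewrite an ISO 8601 timestamp (with the 'T' separator) as
    'YYYY-MM-DD hh:mm:ss'. Illustrative only."""
    parsed = datetime.fromisoformat(value)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


# Assumed sample value in the style of an ADO-persisted OrderDate attribute.
ado_value = "1996-07-04T00:00:00"
print(iso8601_to_odbc_canonical(ado_value))  # prints 1996-07-04 00:00:00
```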
Building a mapping schema is usually easy. However, the unique structure of the XML output from ADO's Save method can pose a particular challenge. Figure 1 shows a sample XML document that ADO generates when it persists a Recordset that holds the result of the query
SELECT Top 5 OrderID, OrderDate FROM Orders
which uses the Northwind database. The XML document contains an inline schema, namespace-qualified elements, and an unnecessary tag around the row elements that contain the data. The rs:data element in Figure 1 is unnecessary because it introduces a level of hierarchy in the XML representation when no hierarchical relationship exists in the data itself. Because of these features, you must construct a set of mapping schemas that correctly specify how XML Bulk Load should load the data and convert the date field values. Let's look at the construction process.
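If you want to see this structure for yourself, the following Python sketch parses a trimmed document in the shape that ADO's Save method produces and prints the namespace of each piece. The sample is a hand-written illustration, not a copy of Figure 1: the inline schema is shortened, only two rows are included, and the row values are assumed.

```python
import xml.etree.ElementTree as ET

# Hand-written, shortened sample in the style of ADO-persisted XML
# (an assumption for illustration; not the article's Figure 1).
ADO_XML = """\
<xml xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
     xmlns:dt='uuid:C2F41010-65B3-11d1-A29F-00AA00C14882'
     xmlns:rs='urn:schemas-microsoft-com:rowset'
     xmlns:z='#RowsetSchema'>
  <s:Schema id='RowsetSchema'>
    <s:ElementType name='row' content='eltOnly'>
      <s:AttributeType name='OrderID' rs:number='1'>
        <s:datatype dt:type='int'/>
      </s:AttributeType>
      <s:AttributeType name='OrderDate' rs:number='2'>
        <s:datatype dt:type='dateTime'/>
      </s:AttributeType>
    </s:ElementType>
  </s:Schema>
  <rs:data>
    <z:row OrderID='10248' OrderDate='1996-07-04T00:00:00'/>
    <z:row OrderID='10249' OrderDate='1996-07-05T00:00:00'/>
  </rs:data>
</xml>
"""

root = ET.fromstring(ADO_XML)

# ElementTree reports namespaced names as '{namespace-uri}local-name'.
child_tags = [child.tag for child in root]
rows = [dict(row.attrib) for row in root.iter("{#RowsetSchema}row")]

print("root:", root.tag)
for tag in child_tags:
    print("child:", tag)
for row in rows:
    print("row:", row)
```

The output shows three different namespaces below a root element that has none, which is why the walkthrough ends up with more than one mapping schema.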
Listing 1, page 44, shows the process's first step: building the top-level mapping schema that includes references to the other schemas. The top-level schema contains one declaration for the xml element, which is the root of the document that ADO generated. (Note that the XML specification reserves names that begin with the string xml, in any combination of uppercase and lowercase letters, for future use. XML parsers nevertheless let you use xml because the specification doesn't define such usage as an error. Even so, you should avoid using xml for names in your own XML documents.) The sql:is-constant annotation on the xml element declaration tells XML Bulk Load that the root element doesn't map directly to the database, so XML Bulk Load ignores the xml element when it processes the XML document.
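Listing 1 isn't reproduced here, so the Python sketch below holds a guess at what a top-level mapping schema of this kind could look like and checks that it is well-formed XML. Treat it as an illustration under assumptions rather than the author's listing: the imported file names (SchemaMap.xsd and DataMap.xsd) are made up, the value "1" for sql:is-constant is one way of writing true, and the schema hasn't been run through XML Bulk Load.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SQL_NS = "urn:schemas-microsoft-com:mapping-schema"

# Hypothetical top-level mapping schema. SchemaMap.xsd and DataMap.xsd
# are invented file names for the two imported schemas.
TOP_LEVEL_SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
            xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
            xmlns:rs="urn:schemas-microsoft-com:rowset">
  <xsd:import namespace="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
              schemaLocation="SchemaMap.xsd"/>
  <xsd:import namespace="urn:schemas-microsoft-com:rowset"
              schemaLocation="DataMap.xsd"/>
  <!-- The root element is a constant wrapper: it maps to no table. -->
  <xsd:element name="xml" sql:is-constant="1">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="s:Schema"/>
        <xsd:element ref="rs:data"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

schema = ET.fromstring(TOP_LEVEL_SCHEMA)  # raises ParseError if malformed
root_decl = schema.find(f"{{{XSD_NS}}}element[@name='xml']")
is_constant = root_decl.get(f"{{{SQL_NS}}}is-constant")
imports = {
    imp.get("namespace"): imp.get("schemaLocation")
    for imp in schema.findall(f"{{{XSD_NS}}}import")
}

print("xml is-constant:", is_constant)
print("imports:", imports)
```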
The xml element contains two subelements: s:Schema and rs:data. The next step in building a set of mapping schemas is to account for these subelements. The s:Schema element—an inline XDR schema—defines the data types, constraints, and ADO-specific metadata for the row element and its attributes, which hold the data from the Recordset. Because XML Bulk Load attempts to load all data from the XML document into the database, you need to ensure that XML Bulk Load doesn't treat the inline schema as data. Moreover, because the current XML Bulk Load implementation doesn't support inline schemas, XML Bulk Load generates an error if an inline mapping schema is present. Fortunately, Microsoft anticipated the user's need to ignore parts of XML documents when using XML Bulk Load by providing the sql:mapped annotation in mapping schemas. When you specify sql:mapped with the value false on any element or attribute declaration in a mapping schema, XML Bulk Load doesn't attempt to store the associated data in the database.
In most cases, adding the sql:mapped annotation to the Schema element's declaration in the top-level mapping schema causes XML Bulk Load to ignore the Schema element. But in this case, ADO specifies the Schema element from a namespace different from the xml root element's namespace. (See xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882' on the xml element in Listing 1.) The uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882 namespace, which corresponds to the prefix s, is an old version of the Uniform Resource Identifier (URI) for the XDR namespace. The example uses mapping schemas in XML Schema Definition (XSD) format, the World Wide Web Consortium (W3C) standard schema language. Because a single XSD schema document declares elements in only one target namespace, you need a separate mapping schema to declare the Schema element. Listing 2 shows this mapping schema, called the Schema mapping schema because it defines the Schema element and its content. Because your goal is to instruct XML Bulk Load to ignore the Schema element and its contents, you again use the sql:mapped annotation on the Schema element's declaration.
Your next step is to construct the Schema element's content model specification (the elements that the Schema element can contain). If you needed data from the Schema element's subelements, you would need to define all the subelements so that you could create a mapping to the database. But because you need only ensure that XML Bulk Load ignores the Schema element and whatever it contains, you can take advantage of the any tag in XSD schemas. The any tag declares that the Schema element might contain any element from any namespace. After you build the Schema mapping schema, you reference it from the top-level mapping schema by using XSD's import instruction. Import lets you associate a schema with a particular namespace and include the declarations from that schema into the top-level schema. In this example, associating the Schema mapping schema from Listing 2 with the namespace uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882 lets XML Bulk Load successfully ignore the Schema element in the ADO output.
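As a rough picture of the Schema mapping schema just described, the Python sketch below embeds one possible version and reads back the three details that matter: the target namespace, the sql:mapped value, and the wildcard. This isn't Listing 2. The processContents and occurrence attributes on the any element are my own choices, and the schema hasn't been tested with XML Bulk Load.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SQL_NS = "urn:schemas-microsoft-com:mapping-schema"

# Hypothetical mapping schema for the s:Schema element. Its target
# namespace is the old XDR namespace URI that ADO uses for the prefix s.
SCHEMA_MAPPING = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
            targetNamespace="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882">
  <!-- sql:mapped="false": don't store this element in the database. -->
  <xsd:element name="Schema" sql:mapped="false">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Accept any element from any namespace as content. -->
        <xsd:any namespace="##any" processContents="skip"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

doc = ET.fromstring(SCHEMA_MAPPING)
decl = doc.find(f"{{{XSD_NS}}}element[@name='Schema']")
wildcard = decl.find(f".//{{{XSD_NS}}}any")

summary = {
    "targetNamespace": doc.get("targetNamespace"),
    "mapped": decl.get(f"{{{SQL_NS}}}mapped"),
    "wildcard_namespace": wildcard.get("namespace"),
}
print(summary)
```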
Handling the rs:data element is a bit different from working with s:Schema. Like the xml element in the top-level mapping schema, rs:data uses the sql:is-constant annotation with the value true to indicate that rs:data doesn't map directly to the database. Typically, using the sql:is-constant annotation would be enough to instruct XML Bulk Load to ignore the rs:data element. But like the Schema element described in the previous paragraph, the rs:data element is from the urn:schemas-microsoft-com:rowset namespace, which is different from the namespace of the root element. Therefore, you need to build a separate mapping schema for the rs:data element. Listing 3 shows the rs:data element's mapping schema, which you can label the data mapping schema. The data element's declaration again specifies the sql:is-constant annotation to indicate that the rs:data element doesn't map directly to the database. The content model for rs:data specifies that an arbitrary number of row elements might be contained within the rs:data element. Then, the top-level mapping schema uses XSD's import instruction to reference the data-mapping schema.
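A data mapping schema along these lines might look like the one embedded in the Python sketch below, which also confirms the two properties the paragraph describes: the constant wrapper and the unbounded row content. Again, this is a sketch and not Listing 3. The file name RowMap.xsd is invented, and the #RowsetSchema namespace bound to the prefix z is the one ADO normally uses for row elements, which the text above doesn't spell out.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SQL_NS = "urn:schemas-microsoft-com:mapping-schema"

# Hypothetical mapping schema for rs:data. RowMap.xsd is an invented
# file name for the schema that declares the row element.
DATA_MAPPING = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
            xmlns:z="#RowsetSchema"
            targetNamespace="urn:schemas-microsoft-com:rowset">
  <xsd:import namespace="#RowsetSchema" schemaLocation="RowMap.xsd"/>
  <!-- rs:data is only a wrapper, so it is marked as constant. -->
  <xsd:element name="data" sql:is-constant="1">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Any number of row elements may appear inside rs:data. -->
        <xsd:element ref="z:row" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

doc = ET.fromstring(DATA_MAPPING)
data_decl = doc.find(f"{{{XSD_NS}}}element[@name='data']")
row_ref = data_decl.find(f".//{{{XSD_NS}}}element[@ref='z:row']")

data_is_constant = data_decl.get(f"{{{SQL_NS}}}is-constant")
row_max_occurs = row_ref.get("maxOccurs")
print("data is-constant:", data_is_constant)
print("row maxOccurs:", row_max_occurs)
```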
The final step in constructing the set of mapping schemas is to build the row element's mapping schema. But like the s:Schema and rs:data elements, the row element is in a different namespace from its containing element (rs:data) and therefore requires a different mapping schema. Thankfully, this schema is the last one you need to load ADO data.
In the mapping schema that Listing 4 shows—called the row mapping schema—the row element contains a sql:relation annotation that maps row elements to the Orders table, which Listing 5 defines. You map the OrderID and OrderDate attributes to Orders table columns of the same name by using the sql:field annotation. Because the attribute names directly match the column names in the Orders table, the default mapping typically would perform the desired mapping. However, you use the sql:field annotation here to explicitly specify the mapping and to illustrate how the field would be specified if the names didn't directly correspond. The OrderDate attribute also contains the sql:datatype annotation to indicate that a data format conversion is required and to instruct XML Bulk Load to perform a conversion from the XSD dateTime format (ISO8601 standard) to the SQL Server 2000 datetime format. XML Bulk Load automatically performs the conversion upon execution. Finally, you use the XSD import instruction in the data-mapping schema to reference the row-mapping schema.
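To show how these annotations fit together, the Python sketch below embeds a possible row mapping schema and extracts the table and column mappings from it. It isn't Listing 4. The #RowsetSchema target namespace is the one ADO normally uses for row elements, the xsd:int type for OrderID and the sql:datatype value datetime are assumptions, and the helper name column_mappings is invented for this example.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SQL_NS = "urn:schemas-microsoft-com:mapping-schema"

# Hypothetical mapping schema for the row element.
ROW_MAPPING = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
            targetNamespace="#RowsetSchema">
  <!-- Each row element becomes one record in the Orders table. -->
  <xsd:element name="row" sql:relation="Orders">
    <xsd:complexType>
      <xsd:attribute name="OrderID" type="xsd:int"
                     sql:field="OrderID"/>
      <!-- sql:datatype requests the date format conversion. -->
      <xsd:attribute name="OrderDate" type="xsd:dateTime"
                     sql:field="OrderDate" sql:datatype="datetime"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def column_mappings(schema_text):
    """Return (table, {attribute: (column, sql datatype or None)})."""
    doc = ET.fromstring(schema_text)
    row = doc.find(f"{{{XSD_NS}}}element[@name='row']")
    table = row.get(f"{{{SQL_NS}}}relation")
    columns = {}
    for attr in row.iter(f"{{{XSD_NS}}}attribute"):
        columns[attr.get("name")] = (
            attr.get(f"{{{SQL_NS}}}field"),
            attr.get(f"{{{SQL_NS}}}datatype"),
        )
    return table, columns


table, columns = column_mappings(ROW_MAPPING)
print("table:", table)
for name, (field, datatype) in columns.items():
    print(f"{name} -> {field} (sql:datatype={datatype})")
```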
Using the set of mapping schemas you've constructed, you can load a persisted ADO Recordset in XML format by using XML Bulk Load. Listing 6 shows the VBScript code that executes the XML Bulk Load. The code creates an XML Bulk Load object by using XML Bulk Load Version 2, included with XML for SQL Server 2000 Web Release 2 (available at http://msdn.microsoft.com/code/default.asp?url=/code/sample.asp?url=/msdn-files/027/001/602/msdncompositedoc.xml). Then, the code builds a connection string to access the database. To run the code, you need to modify the connection string to specify the database on your local system and the credentials for accessing the database. You can use the SQL script from Listing 5 to create the Orders table that the mapping schemas reference.
After setting the location of an error log to which XML Bulk Load will write error messages, the script calls the Execute method, passing it two parameters: the filename of the top-level mapping schema and the filename of the XML document that contains the data to be loaded. Because the top-level mapping schema, named BLSchema.xsd, uses import statements to import the other schemas you constructed, referencing it causes XML Bulk Load to load all schemas in the set. After the Execute method finishes, you can use Query Analyzer to verify that the data was loaded successfully and that proper conversion of OrderDate took place.