Generate XML Schemas Programmatically in .NET

Leverage classes in the System.Xml.Schema namespace and gain control over the schema generation process.

Dan Wahlin

October 30, 2009


XML Schemas present an excellent way to describe the structure and types associated with an XML document (for more information, see "Using XML Schemas"). As a result, schemas are used ubiquitously throughout the .NET Framework in everything from Web services to DataSets to XML resource files. The .NET Framework even includes a tool named xsd.exe that automates several schema tasks: it can create schemas, convert between schema types (XDR to XSD), generate strongly typed DataSets, and create specialized schema-based classes. Although tools such as xsd.exe can reduce development time in many cases, there may be situations where you need more control over your schemas. In this article I'll demonstrate how you can gain complete control over the schema generation process by leveraging classes in the System.Xml.Schema namespace. To get the most out of this article, you should have a good understanding of XML schemas.

Although the System.Xml.Schema namespace plays an important role in .NET, several of its classes are used to support other classes, such as XmlSchemaCollection. As a result, you may not be aware of all the great schema-oriented features this namespace contains. Classes within the System.Xml.Schema namespace can be quite useful when an existing schema needs to be edited or when a schema needs to be created from scratch based on a given database structure, object hierarchy, or XML document.

There are over 60 classes within the System.Xml.Schema namespace that represent virtually every aspect of schemas - from regular expression pattern tags to complex and simple types. The main class that you'll use to start creating a customized schema is named XmlSchema; it can be used to create the root of a schema document (and associated namespaces, attributes, and so on) plus add elements, complex types, and more. Other important classes in this namespace include XmlSchemaElement, XmlSchemaAttribute, XmlSchemaComplexType, and XmlSchemaSimpleType - to name a few.

 

Create a Schema Generator

Now that you've been introduced to a few of the main classes in the System.Xml.Schema namespace, let's examine how to create a custom class named SchemaBuilder that's capable of generating a schema from an existing XML document. SchemaBuilder contains a single public method named BuildSchema, whose signature looks like this:

public string BuildSchema(string xml,NestingType type) {}

BuildSchema is capable of creating two different styles of schemas: Russian doll style (nested complex types that mirror the XML document structure) and globally declared complex types that allow for better type reuse. The style of schema to build is determined by passing a value of an enumeration named NestingType to BuildSchema:

public enum NestingType {
    RussianDoll,
    SeparateComplexTypes
}

In addition to passing the NestingType enumeration, the caller of the BuildSchema method also passes either a string containing the XML document to base the schema upon, or a path/URL pointing to an existing XML document.

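Upon being called, BuildSchema creates a schema root element similar to the one shown here (the attribute values correspond to the property settings in Figure 1):

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    attributeFormDefault="unqualified"
    version="1.0">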

The code to accomplish this task is shown in Figure 1. If you look through the code you'll see that it creates a new XmlSchema object and then sets various properties such as Version and ElementFormDefault to write attributes on the root element. The qualified schema namespace (held in a constant named SCHEMA_NAMESPACE) is added by creating an instance of the XmlSerializerNamespaces class, located in the System.Xml.Serialization namespace.

XmlSchema schema = new XmlSchema();
schema.ElementFormDefault = XmlSchemaForm.Qualified;
schema.AttributeFormDefault = XmlSchemaForm.Unqualified;
schema.Version = "1.0";
//Add additional namespaces using the Add() method shown
//below if desired
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("xsd", SCHEMA_NAMESPACE);
schema.Namespaces = ns;


Figure 1. When creating a schema root element, set properties on the XmlSchema class such as Version and ElementFormDefault. This example adds a qualified schema namespace, sets the schema version, and sets the elementFormDefault and attributeFormDefault attributes.
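
As a rough, self-contained sketch of the step in Figure 1, the following builds only the schema root and serializes it so the resulting attributes can be inspected. The class and method names (SchemaRootSketch, BuildEmptySchema) are hypothetical, and the namespace URI assumes that the article's SCHEMA_NAMESPACE constant holds the standard W3C XML Schema namespace.

```csharp
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

public static class SchemaRootSketch
{
    // Assumption: this is the value behind the article's SCHEMA_NAMESPACE constant.
    private const string XsdNamespace = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical helper: creates a schema with no elements and returns it as text.
    public static string BuildEmptySchema()
    {
        XmlSchema schema = new XmlSchema();
        schema.ElementFormDefault = XmlSchemaForm.Qualified;
        schema.AttributeFormDefault = XmlSchemaForm.Unqualified;
        schema.Version = "1.0";

        // Bind the "xsd" prefix so the root is written as xsd:schema.
        XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
        ns.Add("xsd", XsdNamespace);
        schema.Namespaces = ns;

        using (StringWriter sw = new StringWriter())
        {
            schema.Write(sw);
            return sw.ToString();
        }
    }
}
```

Calling SchemaRootSketch.BuildEmptySchema() and printing the result is a quick way to confirm which attributes each property produces before any element definitions are added.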

After the schema root element is created, the code determines how the XML document (used as the basis for the schema) should be loaded. This check relies upon the XmlDocument class (in the System.Xml namespace), as shown in Figure 2.

//Begin parsing source XML document
XmlDocument doc = new XmlDocument();
try {
    //Assume string XML
    doc.LoadXml(xml);
}
catch {
    //String XML load failed. Try loading as a file path
    try {
        doc.Load(xml);
    }
    catch {
        return "XML document is not well-formed.";
    }
}
XmlElement root = doc.DocumentElement;

Figure 2. Perform a few simple tests to determine whether an XML string or a file path/URL was passed. The LoadXml method is tried first, on the assumption that a string of XML was passed. If that call fails, the Load method is tried with the same value as a path/URL.
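
One possible variation on Figure 2, sketched here as an assumption rather than as the article's method, is to look at the first non-whitespace character instead of relying on a failed parse. Markup has to begin with "<", so anything else is treated as a path or URL. The helper name LoadFromStringOrPath is hypothetical.

```csharp
using System;
using System.Xml;

public static class XmlSourceLoader
{
    // Hypothetical helper: not part of the article's SchemaBuilder class.
    public static XmlDocument LoadFromStringOrPath(string xmlOrPath)
    {
        if (xmlOrPath == null) throw new ArgumentNullException(nameof(xmlOrPath));

        XmlDocument doc = new XmlDocument();
        if (xmlOrPath.TrimStart().StartsWith("<", StringComparison.Ordinal))
        {
            // Looks like markup: parse the string itself.
            doc.LoadXml(xmlOrPath);
        }
        else
        {
            // Otherwise treat the value as a file path or URL.
            doc.Load(xmlOrPath);
        }
        return doc;
    }
}
```

With this approach a malformed XML string surfaces as an XmlException from LoadXml, while a bad path surfaces as a file or network error from Load, so the caller can report the two cases separately.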

After the XML document is loaded and the document's root element is found, the process of creating the different schema definitions is started by passing the root node (named root) to a private method named CreateComplexType:

XmlSchemaElement elem = CreateComplexType(root);

Here's the signature for CreateComplexType:

private XmlSchemaElement CreateComplexType(XmlElement el){}

CreateComplexType is the workhorse of the SchemaBuilder class. It recursively walks through the source XML document and identifies all of the element and attribute nodes that should be added into the schema document. Figure 3 shows a portion of the CreateComplexType method code that identifies the XML elements and attributes in the XML document and creates corresponding schema types.

//Create complexType
XmlSchemaComplexType ct = new XmlSchemaComplexType();
if (el.HasChildNodes) {
    //Loop through children and place in schema sequence tag
    XmlSchemaSequence seq = new XmlSchemaSequence();
    foreach (XmlNode node in el.ChildNodes) {
        if (node.NodeType == XmlNodeType.Element) {
            if (namesArray.BinarySearch(node.Name) < 0) {
                namesArray.Add(node.Name);
                namesArray.Sort(); //Needed for BinarySearch()
                XmlElement tempNode = (XmlElement)node;
                XmlSchemaElement sElem = null;
                //If node has children or attributes then
                //create a new complexType container
                if (tempNode.HasChildNodes ||
                    tempNode.HasAttributes) {
                    //Recursive call
                    sElem = CreateComplexType(tempNode);
                } else {
                    //No complexType needed...add SchemaTypeName
                    sElem = new XmlSchemaElement();
                    sElem.Name = tempNode.Name;
                    if (tempNode.InnerText == null ||
                        tempNode.InnerText == String.Empty) {
                        sElem.SchemaTypeName =
                            new XmlQualifiedName("string",
                                SCHEMA_NAMESPACE);
                    } else {
                        //Try to detect the appropriate
                        //data type for the element
                        sElem.SchemaTypeName =
                            new XmlQualifiedName(CheckDataType
                                (tempNode.InnerText),
                                SCHEMA_NAMESPACE);
                    }
                }
                //Detect if node repeats in XML so
                //we can handle maxOccurs
                if (el.SelectNodes(node.Name).Count > 1) {
                    sElem.MaxOccursString = "unbounded";
                }
                //Add element to sequence tag
                seq.Items.Add(sElem);
            }
        }
    }
    //Add sequence tag to complexType tag
    if (seq.Items.Count > 0) ct.Particle = seq;
}
if (el.HasAttributes) {
    foreach (XmlAttribute att in el.Attributes) {
        XmlSchemaAttribute sAtt = new XmlSchemaAttribute();
        sAtt.Name = att.Name;
        sAtt.SchemaTypeName =
            new XmlQualifiedName(CheckDataType(
                att.Value), SCHEMA_NAMESPACE);
        ct.Attributes.Add(sAtt);
    }
}

Figure 3. This code walks through the source XML document and creates schema definitions that match up with elements and attributes. If elements are found to have child nodes, CreateComplexType is recursively called to walk through all the descendants.

The code starts by creating a new XmlSchemaComplexType object. Then, it checks whether the current element in the XML document (the root element when the method is initially called) has any child nodes by reading the HasChildNodes property of the XmlElement class. If children are found, a new XmlSchemaSequence object is created to hold the child element definitions. After the sequence object is created, each child is enumerated and processed.

Because XML is extensible, it's quite possible that several children of a given parent node have the same name. For example, a parent node may have multiple child nodes that share one element name. Since each child node needs to be defined only once in the schema, an ArrayList named namesArray is used to track child node names that have been defined; this prevents duplicates from showing up in the schema. Each child (that isn't a duplicate) has an associated XmlSchemaElement object that is created to represent it in the schema. This XmlSchemaElement object is added to the sequence tag with the following code:

seq.Items.Add(sElem);
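
The duplicate tracking described above can also be sketched with generic collections. This is an illustrative alternative, not the article's code: it swaps the sorted ArrayList and BinarySearch for a Dictionary that counts how often each child name occurs, and it types every child as xsd:string to stay short. The names ChildSequenceSketch and BuildSequence are hypothetical.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

public static class ChildSequenceSketch
{
    // Assumption: the value behind the article's SCHEMA_NAMESPACE constant.
    private const string XsdNamespace = "http://www.w3.org/2001/XMLSchema";

    public static XmlSchemaSequence BuildSequence(XmlElement parent)
    {
        // First pass: count each child element name and remember
        // the order in which names first appear.
        var counts = new Dictionary<string, int>();
        var order = new List<string>();
        foreach (XmlNode node in parent.ChildNodes)
        {
            if (node.NodeType != XmlNodeType.Element) continue;
            int seen;
            if (counts.TryGetValue(node.Name, out seen))
            {
                counts[node.Name] = seen + 1;
            }
            else
            {
                counts[node.Name] = 1;
                order.Add(node.Name);
            }
        }

        // Second pass: emit one schema element per distinct name.
        var seq = new XmlSchemaSequence();
        foreach (string name in order)
        {
            var child = new XmlSchemaElement();
            child.Name = name;
            // Simplification: no data type detection in this sketch.
            child.SchemaTypeName = new XmlQualifiedName("string", XsdNamespace);
            if (counts[name] > 1) child.MaxOccursString = "unbounded";
            seq.Items.Add(child);
        }
        return seq;
    }
}
```

Counting in a first pass means each repeated name is flagged for maxOccurs without issuing an XPath query per child, at the cost of walking the child list twice.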

As attributes are encountered, they're also enumerated and an associated XmlSchemaAttribute object is created to represent the individual attribute. Each XmlSchemaAttribute object is added to the initial XmlSchemaComplexType object (discussed earlier) through its Attributes collection.
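
The attribute step can be tried in isolation with a sketch like the one below. It adds one precaution that is an assumption on my part rather than something the article does: XmlDocument exposes namespace declarations (xmlns and xmlns:prefix) through the same Attributes collection, so the sketch skips them instead of declaring them as schema attributes. The names AttributeSketch and DescribeAttributes are hypothetical, and every attribute is typed as xsd:string for brevity.

```csharp
using System.Xml;
using System.Xml.Schema;

public static class AttributeSketch
{
    // Assumption: the value behind the article's SCHEMA_NAMESPACE constant.
    private const string XsdNamespace = "http://www.w3.org/2001/XMLSchema";

    public static XmlSchemaComplexType DescribeAttributes(XmlElement el)
    {
        var ct = new XmlSchemaComplexType();
        foreach (XmlAttribute att in el.Attributes)
        {
            // Namespace declarations are not data attributes, so skip them.
            if (att.Prefix == "xmlns" || att.Name == "xmlns") continue;

            var sAtt = new XmlSchemaAttribute();
            sAtt.Name = att.LocalName;
            // Simplification: no data type detection in this sketch.
            sAtt.SchemaTypeName = new XmlQualifiedName("string", XsdNamespace);
            ct.Attributes.Add(sAtt);
        }
        return ct;
    }
}
```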

 

Handle Schema Nesting

Once the complex type element and related sequence element are created, an element representing the parent node is created and the complex type is assigned to the element (see Figure 4). This code generates the proper nesting of complex types based on the NestingType enumeration value passed to the BuildSchema method.

//Now that complexType is created, create element and add
//complexType into the element using its SchemaType property
XmlSchemaElement elem = new XmlSchemaElement();
elem.Name = el.Name;
if (ct.Attributes.Count > 0 || ct.Particle != null) {
    //Handle nesting style of schema
    if (generationType == NestingType.SeparateComplexTypes) {
        string typeName = el.Name + "Type";
        ct.Name = typeName;
        complexTypes.Add(ct);
        elem.SchemaTypeName =
            new XmlQualifiedName(typeName, null);
    } else {
        elem.SchemaType = ct;
    }
} else {
    if (el.InnerText == null ||
        el.InnerText == String.Empty) {
        elem.SchemaTypeName =
            new XmlQualifiedName("string", SCHEMA_NAMESPACE);
    } else {
        elem.SchemaTypeName =
            new XmlQualifiedName(CheckDataType(
                el.InnerText), SCHEMA_NAMESPACE);
    }
}
return elem;

Figure 4. The code shown here attaches the complex type created earlier to its parent element. The type of nesting desired by the client is generated by checking the NestingType enumeration value passed to the BuildSchema method. If the complex types are to be separated (as opposed to nested), each complex type is added to an ArrayList named complexTypes, which is later enumerated to add each complex type definition into the schema.
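
To see how the two nesting styles differ in the finished schema, a small stand-alone sketch can build the same definition both ways. Everything here is illustrative: the element names book and title, the type name bookType, and the NestingSketch class are hypothetical, and the schema has no target namespace so that the named type can be referenced without a prefix.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public static class NestingSketch
{
    public enum NestingType { RussianDoll, SeparateComplexTypes }

    // Assumption: the value behind the article's SCHEMA_NAMESPACE constant.
    private const string XsdNamespace = "http://www.w3.org/2001/XMLSchema";

    public static XmlSchema Build(NestingType style)
    {
        var schema = new XmlSchema();
        var ns = new XmlSerializerNamespaces();
        ns.Add("xsd", XsdNamespace);
        schema.Namespaces = ns;

        // A complex type holding a single string child named "title".
        var title = new XmlSchemaElement();
        title.Name = "title";
        title.SchemaTypeName = new XmlQualifiedName("string", XsdNamespace);
        var seq = new XmlSchemaSequence();
        seq.Items.Add(title);
        var ct = new XmlSchemaComplexType();
        ct.Particle = seq;

        var book = new XmlSchemaElement();
        book.Name = "book";
        schema.Items.Add(book);

        if (style == NestingType.SeparateComplexTypes)
        {
            // Global, named type that the element refers to by name.
            ct.Name = "bookType";
            book.SchemaTypeName = new XmlQualifiedName("bookType");
            schema.Items.Add(ct);
        }
        else
        {
            // Russian doll: anonymous type nested inside the element.
            book.SchemaType = ct;
        }
        return schema;
    }

    public static string ToXsd(XmlSchema schema)
    {
        using (var sw = new StringWriter())
        {
            schema.Write(sw);
            return sw.ToString();
        }
    }
}
```

Printing NestingSketch.ToXsd(NestingSketch.Build(...)) for each enumeration value shows the trade-off: the Russian doll form keeps the type definition inside the element, while the separate form produces a reusable top-level complexType.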

 

Handle Data Types

You may have noticed a call to a method named CheckDataType back in Figure 3. This method attempts to determine what data type should be assigned to an element or attribute type definition based on the element's inner text or attribute's value. Figure 5 shows the code for CheckDataType; it can easily be extended to support other data type checks as needed.

private string CheckDataType(string data) {
    //Int test
    try {
        Int32.Parse(data);
        return "int";
    } catch {}

    //Decimal test
    try {
        Decimal.Parse(data);
        return "decimal";
    } catch {}

    //DateTime test
    try {
        DateTime.Parse(data);
        return "dateTime";
    } catch {}

    //Boolean test
    if (data.ToLower() == "true" ||
        data.ToLower() == "false") {
        return "boolean";
    }

    return "string";
}

Figure 5. The CheckDataType method attempts to determine what data type should be assigned to a schema element or attribute definition.
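
On versions of .NET that provide the TryParse methods, the same sequence of checks can be written without using exceptions for control flow. The sketch below is one possible variant, not the article's implementation: the names DataTypeSniffer and GuessXsdType are hypothetical, and it parses with the invariant culture so results don't change with the machine's regional settings. As in Figure 5, the date check is looser than the xsd:dateTime lexical format.

```csharp
using System;
using System.Globalization;

public static class DataTypeSniffer
{
    // Hypothetical counterpart to the article's CheckDataType method.
    public static string GuessXsdType(string data)
    {
        if (string.IsNullOrEmpty(data)) return "string";

        int intValue;
        if (int.TryParse(data, NumberStyles.Integer,
                CultureInfo.InvariantCulture, out intValue))
            return "int";

        decimal decimalValue;
        if (decimal.TryParse(data,
                NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
                CultureInfo.InvariantCulture, out decimalValue))
            return "decimal";

        DateTime dateValue;
        if (DateTime.TryParse(data, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out dateValue))
            return "dateTime";

        bool boolValue;
        if (bool.TryParse(data, out boolValue))
            return "boolean";

        return "string";
    }
}
```

The order of the checks matters: the integer test runs before the decimal test so that whole numbers are reported as int rather than decimal.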

After all elements in the source XML document are processed, control returns to the BuildSchema method, which finishes the schema by adding the root element definition to the XmlSchema object (refer back to Figure 1). Adding the root element definition involves referencing the XmlSchema object's Items collection, as shown in Figure 6. After the root element definition is added to the schema root tag, the schema is compiled by calling the XmlSchema object's Compile method to see if any errors exist. Assuming no errors are found, the schema is written to a StringWriter, and the resulting string is returned from the BuildSchema method.

//Add root element definition into the XmlSchema object
schema.Items.Add(elem);
//Reverse elements in ArrayList so root complexType
//appears first where applicable
complexTypes.Reverse();
//In cases where the user wants to separate out the
//complexType tags loop through the complexType ArrayList
//and add the types to the schema
foreach (object obj in complexTypes) {
    XmlSchemaComplexType ct = (XmlSchemaComplexType)obj;
    schema.Items.Add(ct);
}

//Compile the schema and then write its contents
//to a StringWriter
try {
    schema.Compile(
        new ValidationEventHandler(ValidateSchema));
    StringWriter sw = new StringWriter();
    schema.Write(sw);
    return sw.ToString();
} catch (Exception exp) {
    return exp.Message;
}

Figure 6. After all the elements and associated complex types have been created, the root element is added to the XmlSchema object's Items collection through the Add method. The schema is then compiled to see if any errors exist. If none are found, it's returned from the BuildSchema method.
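
In .NET 2.0 and later, XmlSchema.Compile is marked obsolete in favor of the XmlSchemaSet class. A minimal sketch of the same compile-then-write step using XmlSchemaSet might look like the following. The names SchemaCompileSketch and CompileAndWrite are hypothetical, and collecting messages in a list is just one way to surface problems to the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Schema;

public static class SchemaCompileSketch
{
    // Compiles the schema, records any warnings or errors in "problems",
    // and returns the schema serialized as a string.
    public static string CompileAndWrite(XmlSchema schema, IList<string> problems)
    {
        var set = new XmlSchemaSet();
        // With a handler attached, problems are reported here
        // instead of being thrown as exceptions.
        set.ValidationEventHandler +=
            (sender, e) => problems.Add(e.Severity + ": " + e.Message);
        set.Add(schema);
        set.Compile();

        using (var sw = new StringWriter())
        {
            schema.Write(sw);
            return sw.ToString();
        }
    }
}
```

A caller would typically check whether the problems list is empty before trusting the returned schema text.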

 

Putting it Together

By using classes found in the System.Xml.Schema namespace, you can see that it's possible to create dynamic schemas from existing XML documents. This same process can be extended to create customized schemas for other sources, such as database tables, classes, and so on. By leveraging the schema classes shown here, any type of XML schema can be generated for use in applications.

For more information on schemas, check out the document "XML Schema Part 0: Primer" or the SoftArtisans Knowledge Base article "Working with XML Schemas: Comparing DTDs and XML Schemas." To view a live demo of the GenerateSchema class in action, visit the XML for ASP.NET Developers website.

Note: To run the downloadable code with version 1.1 of .NET you need to set the validateRequest attribute to false in the web.config file (or on the Page directive). See the .NET SDK for more details.

The sample code in this article is available for download.

 
