Writing SAX Applications

Dino Esposito discusses writing a Visual Basic (VB) application that works with Simple API for XML (SAX).

Dino Esposito

April 2, 2001


In our ongoing exploration of Simple API for XML (SAX), let's look at how to write a Visual Basic (VB) application that works with SAX. We won't consider .NET and VB.NET here because the .NET Framework provides its own classes for reading XML. I'll cover .NET XML features in future columns.

To use SAX from VB, first add a reference to the Microsoft XML Parser (MSXML) 3.0 type library to your VB project. From the Project menu, select References, and then select Microsoft XML v3.0 as the library. To start SAX parsing a given XML file, add an interactive control (e.g., a button) to a form and put the following code in its Click event handler:

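    Dim parser As New SAXXMLReader
    Dim contentHandler As New ContentHandlerImpl

    ' Hook the handler up to the parser, then parse the file.
    Set parser.contentHandler = contentHandler
    ' App.Path normally has no trailing backslash, so add one before the file name.
    parser.parseURL App.Path & "\foo.xml"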
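The reader reports a missing or malformed file as an ordinary VB run-time error, so the click handler usually wants an error trap around the parse call. A minimal sketch, assuming a button that still has VB's default name, Command1 (the article doesn't name the control):

```vb
Private Sub Command1_Click()
    ' Command1 is an assumed control name; substitute your own button's name.
    Dim parser As New SAXXMLReader
    Dim contentHandler As New ContentHandlerImpl

    On Error GoTo ParseFailed
    Set parser.contentHandler = contentHandler
    parser.parseURL App.Path & "\foo.xml"
    Exit Sub

ParseFailed:
    ' Parsing stops at the first error; Err.Description carries the message.
    MsgBox "Parsing stopped: " & Err.Description, vbExclamation
End Sub
```

Any items the content handler processed before the error remain processed, because SAX delivers events as it reads.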

The SAXXMLReader object, which the Microsoft XML v3.0 library provides, represents the SAX parser. The ContentHandlerImpl object represents the content handler that you must write. In VB, a SAX content handler is a class that implements the IVBSAXContentHandler interface. Thus, you need to add a new class module named ContentHandlerImpl to the project and include the following line at the beginning of the file:

Implements IVBSAXContentHandler

Then you need to define handlers for all the events that the interface makes available; VB won't compile the class until every member of the interface has a procedure, even an empty one. Of these events, two deserve particular attention: startElement and characters.

    Private Sub IVBSAXContentHandler_startElement( _
        strNamespaceURI As String, _
        strLocalName As String, _
        strQName As String, _
        ByVal oAttributes As MSXML2.IVBSAXAttributes)
        ' code here
    End Sub

    Private Sub IVBSAXContentHandler_characters( _
        strChars As String)
        ' code here
    End Sub
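For reference, the rest of the interface can be satisfied with empty procedures. The sketch below lists the remaining IVBSAXContentHandler members as they appear in the MSXML 3.0 type library. In practice it is safer to let the VB code window generate them (pick IVBSAXContentHandler in the Object list, then each member in the Procedure list) than to type them by hand.

```vb
' Remaining IVBSAXContentHandler members, deliberately left empty.
Private Property Set IVBSAXContentHandler_documentLocator( _
    ByVal RHS As MSXML2.IVBSAXLocator)
End Property

Private Sub IVBSAXContentHandler_startDocument()
End Sub

Private Sub IVBSAXContentHandler_endDocument()
End Sub

Private Sub IVBSAXContentHandler_startPrefixMapping( _
    strPrefix As String, strURI As String)
End Sub

Private Sub IVBSAXContentHandler_endPrefixMapping( _
    strPrefix As String)
End Sub

Private Sub IVBSAXContentHandler_endElement( _
    strNamespaceURI As String, _
    strLocalName As String, _
    strQName As String)
End Sub

Private Sub IVBSAXContentHandler_ignorableWhitespace( _
    strChars As String)
End Sub

Private Sub IVBSAXContentHandler_processingInstruction( _
    strTarget As String, strData As String)
End Sub

Private Sub IVBSAXContentHandler_skippedEntity( _
    strName As String)
End Sub
```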

The startElement event fires when the parser reaches an element's opening tag. The handler receives the namespace URI, the local name (the tag name without a prefix), and the qualified name, which includes the namespace prefix, if any. The fourth argument is an object that exposes all the attributes that the element has. The startElement event doesn't tell you anything about the text between the opening and the closing XML tag. To access that text, you handle the characters event.
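To make the fourth argument concrete, here is a small sketch of a startElement handler that walks the attribute collection. Writing to the Immediate window with Debug.Print is just an assumption for illustration; the article's own handler does something different with the event.

```vb
Private Sub IVBSAXContentHandler_startElement( _
    strNamespaceURI As String, _
    strLocalName As String, _
    strQName As String, _
    ByVal oAttributes As MSXML2.IVBSAXAttributes)

    Dim i As Long

    ' Attribute indexes are zero-based, so the last one is length - 1.
    For i = 0 To oAttributes.length - 1
        Debug.Print strLocalName & " @" & oAttributes.getQName(i) & _
            " = " & oAttributes.getValue(i)
    Next i
End Sub
```

When an element has no attributes, length is 0 and the loop body never runs.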

A common problem that you face with SAX parsers is state management. In a typical scenario, you want to process the text of certain elements. Unfortunately, the characters event doesn't tell you anything about the element to which that text belongs. The usual workaround is to keep state in variables that outlive a single event, such as globals, so that you can track the last element processed and decide whether you're interested in its characters. Consider the following XML file:

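    <clients>
        <client>
            <firstname>Joe</firstname>
            <lastname>Users</lastname>
        </client>
        <client>
            <firstname>Jack</firstname>
            <lastname>Whosthisguy</lastname>
        </client>
    </clients>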

Suppose that you want to extract only the clients' names. You must store the most recently started element's name in a global (g_strTagName) and read that name back in the characters event handler:

    Private Sub IVBSAXContentHandler_startElement( _
        strNamespaceURI As String, _
        strLocalName As String, _
        strQName As String, _
        ByVal oAttributes As MSXML2.IVBSAXAttributes)
        ' Remember which element the parser has just entered.
        g_strTagName = strLocalName
    End Sub

In addition, you have to ensure that you reset the g_strTagName global when the parser finishes with an element. You need a second global (g_strBuf) if you want to concatenate the first and last name into a single string.

    Private Sub IVBSAXContentHandler_characters(strChars As String)
        ' A first name starts a new entry.
        If g_strTagName = "firstname" Then
            g_strBuf = strChars & " "
        End If
        ' A last name completes the entry: show it and clear the buffer.
        If g_strTagName = "lastname" Then
            g_strBuf = g_strBuf & strChars
            Form1.List1.AddItem g_strBuf
            g_strBuf = ""
        End If
    End Sub

In this example, each time the handler receives a last name, the application adds the combined string to a list box.
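The reset of g_strTagName mentioned earlier belongs in the endElement handler. Without it, the line breaks and indentation that follow a closing lastname tag are typically delivered to the characters event as well, while g_strTagName still says "lastname", and the list box picks up blank entries. A minimal sketch:

```vb
Private Sub IVBSAXContentHandler_endElement( _
    strNamespaceURI As String, _
    strLocalName As String, _
    strQName As String)

    ' Forget the current element so that text arriving after the closing tag
    ' (for example, the line break before the next tag) is ignored.
    g_strTagName = ""
End Sub
```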

A SAX parser is extremely fast because it makes only one pass through the XML document from top to bottom. For efficiency reasons, the parser doesn't track what it did in the previous step. Thus, tracking state is completely up to you.
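One possible variation, offered only as a sketch: keep the state in private variables of the handler class instead of globals, collect text in the characters event, and act on it in endElement. This also copes with a parser that delivers one element's text in several characters calls, which SAX permits. The names m_strText and m_strFirstName are illustrative and don't come from the listings above. The other interface members still need their empty procedures.

```vb
' Declarations section of the ContentHandlerImpl class module,
' after the Implements IVBSAXContentHandler line.
Private m_strText As String       ' text collected for the current element
Private m_strFirstName As String  ' first name waiting for its last name

Private Sub IVBSAXContentHandler_startElement( _
    strNamespaceURI As String, _
    strLocalName As String, _
    strQName As String, _
    ByVal oAttributes As MSXML2.IVBSAXAttributes)
    ' A new element begins, so start collecting text afresh.
    m_strText = ""
End Sub

Private Sub IVBSAXContentHandler_characters(strChars As String)
    ' Append rather than assign: text may arrive in more than one piece.
    m_strText = m_strText & strChars
End Sub

Private Sub IVBSAXContentHandler_endElement( _
    strNamespaceURI As String, _
    strLocalName As String, _
    strQName As String)
    Select Case strLocalName
        Case "firstname"
            m_strFirstName = Trim$(m_strText)
        Case "lastname"
            Form1.List1.AddItem m_strFirstName & " " & Trim$(m_strText)
            m_strFirstName = ""
    End Select
    ' Discard the collected text so that it can't leak into the next element.
    m_strText = ""
End Sub
```

Because the decision is made in endElement, which receives the element name directly, no variable has to remember which element is current.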
