Putting XPath into Perspective
Explore the basics of XPath, a query language for addressing, sorting, and filtering the elements and the text of an XML document.
January 29, 2001
Think of XML Path Language (XPath) as a general-purpose query language for addressing, sorting, and filtering both the elements and the text of an XML document. Version 3.0 of Microsoft XML Parser (MSXML) supports XPath through both the Extensible Stylesheet Language Transformations (XSLT) processor and two methods that extend the standard World Wide Web Consortium (W3C) Document Object Model (DOM): selectNodes and selectSingleNode. MSXML supports the W3C version 1.0 recommendation for XPath, which dates back to November 1999.
The XPath notation is declarative. An XPath expression is a path that identifies information with the given characteristics within an XML document. The path defines a pattern, and the resulting selection includes all the nodes that match the pattern. You express the selection with a notation that emphasizes the hierarchical relationship among the nodes in the XML document's tree, similar to the notation you're accustomed to using with folders and files. For example, the XPath expression "customer/address" means find every "address" element that is a child of a "customer" element.
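As a minimal sketch of the "customer/address" pattern, the following uses Python's standard xml.etree.ElementTree module, which supports only a subset of XPath and stands in here for MSXML's selectNodes. The sample document and its element names are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the article.
XML = """
<customers>
  <customer id="1">
    <name>Ann</name>
    <address>12 Oak St</address>
  </customer>
  <customer id="2">
    <name>Bob</name>
    <address>7 Elm Ave</address>
  </customer>
</customers>
"""

root = ET.fromstring(XML)  # the <customers> element is the context

# Select every <address> that is a child of a <customer> child of the context.
addresses = [node.text for node in root.findall("customer/address")]
print(addresses)
```

The pattern reads like a relative file path: first step into "customer", then into "address". All matching nodes are returned, not just the first one.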
In XPath, contexts are particularly important. An XPath context is the root of the XML subtree where the query occurs. (The file-system counterpart of the context is the current directory.) A context is the single node against which the pattern matching operates, so the result of a query depends on the context against which you execute it. The concept of context adds flexibility to the query process: a query can retrieve nodes from one context, and any of those nodes can then serve as the context for further queries. Within MSXML, the context is the Node object on which you call either the selectNodes or selectSingleNode method.
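To illustrate how the context changes the result, here is a small sketch that runs the same pattern against three different context nodes. It again assumes Python's xml.etree.ElementTree (a partial XPath implementation) in place of MSXML, and the order document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two subtrees that both contain <address>.
XML = """
<order>
  <customer><name>Ann</name><address>12 Oak St</address></customer>
  <supplier><name>Acme</name><address>1 Mill Rd</address></supplier>
</order>
"""

root = ET.fromstring(XML)
customer = root.find("customer")   # first query: pick a new context node
supplier = root.find("supplier")

# The same pattern, evaluated against three different contexts.
from_root = [n.text for n in root.findall("address")]
from_customer = [n.text for n in customer.findall("address")]
from_supplier = [n.text for n in supplier.findall("address")]

print(from_root, from_customer, from_supplier)
```

Evaluated from the order element, "address" matches nothing because no address is a direct child of that node; evaluated from the customer or supplier node, it matches that subtree's own address. This mirrors how a relative file path resolves differently depending on the current directory.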
An XPath query, also known as a pattern, is a string that mimics a path through the XML document nodes. A few shortcuts exist, though, and those of you familiar with the MS-DOS notation will recognize them. For example, a pattern that begins with a period and forward slash (./) refers to the current context. A pattern prefixed with a forward slash (/) uses the root of the document tree as the context for the query.
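The "./" shortcut can be sketched as follows, once more assuming Python's xml.etree.ElementTree rather than MSXML. One caveat of this stand-in: ElementTree does not accept a leading "/" when a query is called on an element, so the root-based query is approximated here by using the document's root element as the context. The sample XML is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<customers><customer><address>12 Oak St</address></customer></customers>"
)
customer = doc.find("customer")

# "./address" names the current context explicitly; "address" implies it.
explicit = [n.text for n in customer.findall("./address")]
implicit = [n.text for n in customer.findall("address")]

# Approximation of a root-based query: start from the document's root element.
from_doc_root = [n.text for n in doc.findall("./customer/address")]

print(explicit, implicit, from_doc_root)
```

Both spellings of the relative pattern return the same nodes, which is the point of the shortcut: the leading "./" only makes the starting context explicit.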
XPath works hand in hand with XSLT. More often than not, you simply need the ability to identify and process a group of related nodes. Although XSLT is powerful when it comes to applying templates of code to nodes, XPath supplies the underlying means to identify those nodes. As XML evolves into one of the key languages for communicating with database servers, XPath could become a key language for querying XML datasets. XML data streams are hierarchical by nature. When you use them to represent only relational data such as recordsets, you're using half their potential. XPath is a made-to-measure language for querying both XML-based recordsets and generic XML streams of data.
On the long road to standardization, XPath seems like the first significant step toward a universal query language to keep up with the universal protocol (HTTP), the universal data description language (XML), and the universal method-call technique (Simple Object Access Protocol, or SOAP). As usual, time will tell.