Practical XML: XPath

XPath is a way to navigate through an XML document, modeled on filesystem paths. An XPath expression consists of a series of “node tests” that describe how to traverse from one element to another within a document. For example, given the following XML, the path “/foo/bar/baz” selects the single “baz” node:

<foo>
    <bar name='argle'>
        Argle content
    </bar>
    <bar name='bargle'>
        Bargle content
        <baz>Baz content</baz>
    </bar>
</foo>

Like a filesystem path, XPaths can be relative to a given node: the path “bar/baz” could be evaluated from node “foo” to produce the same result as the absolute path above. Also like a filesystem path, “.” refers to the current node, and “..” refers to the parent of the current node. This means that you could start from the first “bar” node, and use the relative path “../bar/baz” to again select the “baz” node — passing through the second “bar” node on the way.

Unlike filesystem paths, however, an XPath can specify a predicate: a logical test that's applied to the set of nodes selected at that point in the path. For example, the XML fragment above has two nodes named “bar”. If you want to ensure that you select only the first, you could use a positional predicate: “/foo/bar[1]”. Or, you could use a predicate that selects nodes with a specific value for the attribute name: “/foo/bar[@name='argle']”. You can also combine predicates: “/foo/bar[2][@name='argle']” is legal, although it won't select any nodes from the sample document.

The XPath specification will tell you all of the variants of an XPath expression. This article is about how to evaluate XPaths in Java. So here's the code to execute the first example above:

Document dom = // however you get the document
XPath xpath = XPathFactory.newInstance().newXPath();
String result = xpath.evaluate("/foo/bar/baz", dom);

Pretty simple, eh? Like everything in the javax.xml hierarchy, it uses a factory: XPath is an interface, and the actual implementation class will depend on your JRE (the Sun 1.5 JDK bundles an implementation derived from Apache Xalan). Also, like the other packages in javax.xml, the complexity (and the code required) increases as you start to use more of XPath's features.
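To make the fragment above runnable, here's a minimal sketch that parses the sample document from a string and evaluates a few of the paths discussed so far. The class and method names are my own invention for illustration, and I've stripped the whitespace from the sample XML so that the string results are predictable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class BasicXPathExample {
    // The article's sample document, minus whitespace (an assumption made for this sketch).
    static final String XML =
        "<foo>"
        + "<bar name='argle'>Argle content</bar>"
        + "<bar name='bargle'>Bargle content<baz>Baz content</baz></bar>"
        + "</foo>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            // checked exceptions are wrapped so that callers stay short
            throw new IllegalStateException("unable to parse sample XML", e);
        }
    }

    // The context is the Document for absolute paths, or any node for relative ones.
    static String eval(String path, Object context) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(path, context);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + path, e);
        }
    }

    public static void main(String[] args) {
        Document dom = parse();
        System.out.println(eval("/foo/bar/baz", dom));                  // absolute path
        System.out.println(eval("bar/baz", dom.getDocumentElement()));  // relative to "foo"
        System.out.println(eval("/foo/bar[@name='argle']", dom));       // attribute predicate
        System.out.println(eval("/foo/bar[2][@name='argle']", dom));    // legal, selects nothing
    }
}
```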

Result Types

The first piece of complexity is what, exactly, you get when evaluating an XPath expression. The simple answer is that it returns the list of nodes that match the path and for which all predicates evaluate true, in document order. Yet the example above returns a string value. And what happens with a path like “/foo/bar”, which selects two nodes?

The answer is that the XPath specification requires that any result be convertible into a single string value. The value of a single node is the text contained within that node and its descendants, and the string value of a list of nodes is the string value of the first node in that list. In many cases, the string value of an expression is all that you need, which is why the basic evaluate() method exists.
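The string-value rules are easy to check for yourself. This is a small sketch (hypothetical class name, whitespace-free copy of the sample document) that evaluates paths selecting two nodes, one node with a child element, and no nodes at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringValueExample {
    static final String XML =
        "<foo>"
        + "<bar name='argle'>Argle content</bar>"
        + "<bar name='bargle'>Bargle content<baz>Baz content</baz></bar>"
        + "</foo>";

    // Returns the string value of whatever the path selects.
    static String stringValue(String path) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(path, dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + path, e);
        }
    }

    public static void main(String[] args) {
        // two nodes selected: only the first contributes to the string value
        System.out.println(stringValue("/foo/bar"));
        // one node: its own text followed by the text of its descendants
        System.out.println(stringValue("/foo/bar[2]"));
        // nothing selected: the string value is empty
        System.out.println(stringValue("/foo/nothing").length());
    }
}
```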

If you need more control over your results, such as getting the actual nodes selected, there's a form of evaluate() that lets you specify the result type; the valid types are defined as constants in XPathConstants. For example, to return the selected nodes:

NodeList nodes = (NodeList)xpath.evaluate("/foo/bar", dom, XPathConstants.NODESET);

Note that the result type name is NODESET but the actual return type is NodeList. Chalk this up to different spec editors and reuse on the part of the API designers: NodeList is from the DOM Level 1 spec, and lives in the org.w3c.dom package.
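NODESET isn't the only constant in XPathConstants. As a rough sketch of the others (the class and method names are mine, not part of any library), NUMBER comes back as a Double, BOOLEAN as a Boolean, and NODE as a single org.w3c.dom.Node:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ResultTypeExample {
    static final String XML =
        "<foo>"
        + "<bar name='argle'>Argle content</bar>"
        + "<bar name='bargle'>Bargle content<baz>Baz content</baz></bar>"
        + "</foo>";

    // Evaluates the expression and returns the raw result for the requested type.
    static Object eval(String expr, QName resultType) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, dom, resultType);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + expr, e);
        }
    }

    static double countBars() {
        return (Double) eval("count(/foo/bar)", XPathConstants.NUMBER);
    }

    // A nodeset converts to true when it contains at least one node.
    static boolean hasBaz() {
        return (Boolean) eval("/foo/bar/baz", XPathConstants.BOOLEAN);
    }

    // NODE returns only the first selected node, in document order.
    static String firstBarName() {
        Element bar = (Element) eval("/foo/bar", XPathConstants.NODE);
        return bar.getAttribute("name");
    }
}
```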

The term “nodeset” is confusing in several ways. First, it is most definitely not a java.util.Set: XPath always returns nodes in document order.

Nor is it a java.util.List. Being defined by the DOM spec means that NodeList provides the methods getLength() and item() rather than anything from the Java collections framework. The Practical XML library provides several work-arounds ranging from a simple asList() method to NodeListIterable, a wrapper that allows NodeList objects to be used directly with JDK 1.5 for-each loops.
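If you don't want to pull in a library, copying the nodes into a java.util.List takes only a few lines. The sketch below is a plain-JDK stand-in that I wrote for this example; it is not the Practical XML implementation of asList(), and the class name is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListExample {
    static final String XML =
        "<foo><bar name='argle'>Argle content</bar><bar name='bargle'>Bargle content</bar></foo>";

    // Copies a DOM NodeList into a List so that it works with for-each loops.
    static List<Node> asList(NodeList nodes) {
        List<Node> result = new ArrayList<Node>(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i));
        }
        return result;
    }

    // Selects every "bar" element and collects its "name" attribute, in document order.
    static List<String> barNames() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/foo/bar", dom, XPathConstants.NODESET);
            List<String> names = new ArrayList<String>();
            for (Node node : asList(nodes)) {
                names.add(((Element) node).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("unable to select nodes", e);
        }
    }
}
```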

Namespaces

Namespaces are perhaps the biggest source of pain in working with XPath. To see why, consider the following XML. It looks a lot like the example at the top of this article, with the addition of a default namespace. You might think that the example XPath expressions would work unchanged. You'd be wrong.

<foo xmlns='http://www.example.com'>
    <bar name='argle'>
        Argle content
    </bar>
    <bar name='bargle'>
        Bargle content
        <baz>Baz content</baz>
    </bar>
</foo>

The reason: XPath is fully namespace aware, and node tests match both localname and namespace. If you don't specify a namespace in the path, the evaluator assumes that the node doesn't have a namespace. Making life difficult, there's no way to explicitly specify a namespace as part of the node test; you must instead use a “qualified name” (prefix:localname), and provide an external mapping from prefix to namespace.

So, to select node baz, we first have to add a prefix to each node in the expression: “/ns:foo/ns:bar/ns:baz”. And then, we have to give the evaluator a NamespaceContext object to tell it the namespace bound to that prefix:

xpath.setNamespaceContext(
        new SimpleNamespaceResolver("ns", "http://www.example.com"));

SimpleNamespaceResolver is a class from the Practical XML library. It exists because NamespaceContext is another interface used by XPath but defined for a larger audience, and a correct implementation — even for a single binding — runs to several dozen lines of code. SimpleNamespaceResolver handles a single binding along with the required default bindings; the Practical XML library also provides NamespaceResolver, which allows multiple bindings.
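To give a feel for what such a class involves, here's my own sketch of a single-binding NamespaceContext, written from the rules in the NamespaceContext JavaDoc. It is not the Practical XML source, and SingleBindingContext and NamespaceExample are names I made up. Note that the parser has to be made namespace aware, which is not the default for DocumentBuilderFactory.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical single-binding context, plus the fixed "xml" and "xmlns" bindings.
class SingleBindingContext implements NamespaceContext {
    private final String prefix;
    private final String nsUri;

    SingleBindingContext(String prefix, String nsUri) {
        this.prefix = prefix;
        this.nsUri = nsUri;
    }

    public String getNamespaceURI(String p) {
        if (p == null) throw new IllegalArgumentException("prefix may not be null");
        if (p.equals(prefix)) return nsUri;
        if (p.equals(XMLConstants.XML_NS_PREFIX)) return XMLConstants.XML_NS_URI;
        if (p.equals(XMLConstants.XMLNS_ATTRIBUTE)) return XMLConstants.XMLNS_ATTRIBUTE_NS_URI;
        return XMLConstants.NULL_NS_URI;   // unbound prefix
    }

    public String getPrefix(String uri) {
        if (uri == null) throw new IllegalArgumentException("namespace URI may not be null");
        if (uri.equals(nsUri)) return prefix;
        if (uri.equals(XMLConstants.XML_NS_URI)) return XMLConstants.XML_NS_PREFIX;
        if (uri.equals(XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) return XMLConstants.XMLNS_ATTRIBUTE;
        return null;                       // unbound namespace
    }

    public Iterator<String> getPrefixes(String uri) {
        String p = getPrefix(uri);
        List<String> result = (p == null)
                ? Collections.<String>emptyList()
                : Collections.singletonList(p);
        return result.iterator();
    }
}

class NamespaceExample {
    static final String XML =
        "<foo xmlns='http://www.example.com'><bar name='bargle'><baz>Baz content</baz></bar></foo>";

    static String eval(String path) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);   // off by default
            Document dom = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new SingleBindingContext("ns", "http://www.example.com"));
            return xpath.evaluate(path, dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + path, e);
        }
    }
}
```

With this in place, NamespaceExample.eval("/foo/bar/baz") returns an empty string, while the prefixed path “/ns:foo/ns:bar/ns:baz” returns the content of the “baz” node.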

One last thing to remember about namespaces and qualified names: the prefix doesn't matter. It's just a way to find the actual namespace URI in a lookup table. The prefixes in your XPath expression don't have to match those in the source document. As shown above, the XPath expression has to use a prefix even when the source XML doesn't. Similarly, the source XML could have a node named “foo:bar”, and the XPath expression could use “baz:bar”. As long as both prefixes resolve to the same URI, the XPath will return the correct element.
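Here's a quick sketch of that last point. The source document uses the prefix “foo”, and the expression uses whatever prefix you pass in. The namespace context is a deliberately cut-down anonymous class that binds only that one prefix and skips the reverse lookups, which is a shortcut I'm assuming is acceptable for a demo but not for production code. The class name is hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixIndependenceExample {
    static final String NS_URI = "http://www.example.com";
    static final String XML = "<foo:bar xmlns:foo='" + NS_URI + "'>content</foo:bar>";

    // Selects the root element using the caller's choice of prefix in the expression.
    static String eval(final String xpathPrefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document dom = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return xpathPrefix.equals(prefix) ? NS_URI : XMLConstants.NULL_NS_URI;
                }
                // demo shortcut: reverse lookups are not implemented
                public String getPrefix(String namespaceURI) {
                    return null;
                }
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.<String>emptyList().iterator();
                }
            });
            return xpath.evaluate("/" + xpathPrefix + ":bar", dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate with prefix " + xpathPrefix, e);
        }
    }
}
```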

Variables

So far, all of the examples have used literal strings containing the XPath expression. In a real application it's tempting to build such strings from client data, and doing so opens the door for hackers: an XPath expression built directly from client data is subject to “XPath injection,” similar to the better-known attacks against SQL. And, just as most SQL injection attacks can be thwarted through use of parameterized prepared statements rather than literal SQL, XPath expressions can be protected through the use of variables.

XPath variables appear in an expression as a dollar-sign followed by a name (which may or may not have a namespace prefix). For example, “/foo/bar[@name=$myvar]”. To provide evaluation-time values for variables, you must attach a variable resolver to your XPath object:

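xpath.setXPathVariableResolver(new XPathVariableResolver()
    {
        public Object resolveVariable(QName variableName)
        {
            // assumption: "variables" is a Map<QName,Object> that you've populated
            return variables.get(variableName);
        }
    });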

Every time the XPath evaluator finds a variable reference, it will call the resolver for a value. How you implement the resolver is up to you; a simple Map is usually sufficient. While the XPath spec says that an unbound variable is an error condition, the JDK 1.5 implementation simply assumes that it's blank — and normally, that means that the expression won't select anything.

One point bears repeating: the resolver is given a qualified name. The variable in “//bar[@name=$myvar]” is not the same as that in “//bar[@name=$ns:myvar]”: the former has no namespace, while the latter does. If you use variables with namespaces, you must also use a NamespaceContext to resolve their prefixes.
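As a fuller sketch of the Map-based approach, the following class (its name and method are hypothetical) binds a single un-namespaced variable and uses it in a predicate. Because the value is passed as data rather than spliced into the expression text, a classic injection string simply fails to match anything.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class VariableExample {
    static final String XML =
        "<foo><bar name='argle'>Argle content</bar><bar name='bargle'>Bargle content</bar></foo>";

    // Returns the content of the "bar" element whose name attribute equals the argument.
    static String findByName(String name) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            // new QName("myvar") has no namespace, matching the unprefixed $myvar
            final Map<QName, Object> variables = new HashMap<QName, Object>();
            variables.put(new QName("myvar"), name);

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setXPathVariableResolver(new XPathVariableResolver() {
                public Object resolveVariable(QName variableName) {
                    return variables.get(variableName);
                }
            });
            return xpath.evaluate("/foo/bar[@name=$myvar]", dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate for " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findByName("argle"));
        // treated as a literal attribute value, so nothing is selected
        System.out.println(findByName("x' or '1'='1").length());
    }
}
```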

Functions

Functions in an XPath expression look a lot like functions in other programming languages: a name, followed by a list of arguments in parentheses. An argument could be literal text, another function call, or (in many cases) an XPath expression. For example, the following expression uses the built-in translate() function to select a node and return its content as uppercase — provided that it contains only US-ASCII characters:

translate(string(/foo/bar/baz), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

This expression is, quite frankly, ugly: it buries the node selection in a mass of text. It's also flawed, in that it translates a limited set of characters into another limited set of characters. If we need to uppercase text, it would be better to use the Java method String.toUpperCase(), which will do a locale-correct translation. And with add-on functions we can do just that … although after you see the work involved, you might have second thoughts about doing so.
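You can see the limitation by running translate() against text that falls outside its character lists. In this sketch (hypothetical class and method names), the argument is any XPath expression that yields a string; the accented character in the second call passes through unchanged because it isn't in the lowercase list.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TranslateExample {
    static final String XML = "<foo><bar><baz>Baz content</baz></bar></foo>";
    static final String LOWER = "abcdefghijklmnopqrstuvwxyz";
    static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Wraps an XPath string expression in translate(); only US-ASCII letters are mapped.
    static String upperAscii(String stringExpr) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            String expr = "translate(" + stringExpr + ", '" + LOWER + "', '" + UPPER + "')";
            return XPathFactory.newInstance().newXPath().evaluate(expr, dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + stringExpr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(upperAscii("string(/foo/bar/baz)"));
        // \u00e9 is e-acute: it has no entry in LOWER, so it is left alone
        System.out.println(upperAscii("'caf\u00e9'"));
    }
}
```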

The first part of implementing a user-defined function is easy: implement the function. Actually, it's not so easy: XPath functions are instances of the XPathFunction interface, which provides a single evaluate() method. You have to figure out the type of each argument and handle it appropriately. For this example, I want to handle both a nodeset and a literal string:

final XPathFunction myFunc = new XPathFunction()
{
    public Object evaluate(List args)
    throws XPathFunctionException
    {
        Object arg = args.iterator().next();
        if (arg instanceof NodeList)
        {
            NodeList nodes = (NodeList)arg;
            if (nodes.getLength() == 0)
                return "";
            else
                return nodes.item(0).getTextContent().toUpperCase();
        }
        else if (arg instanceof String)
        {
            return ((String)arg).toUpperCase();
        }
        else
            throw new XPathFunctionException("invalid argument: " + arg);
    }
};

For the example, I created the function as an anonymous inner class. In a real application, I strongly recommend creating normal, named classes, and unit tests for them.

Once you've implemented your functions, you have to create an XPathFunctionResolver to tell the XPath evaluator to use them:

final QName myFuncName = new QName("foo", "uppercase");

xpath.setXPathFunctionResolver(new XPathFunctionResolver()
    {
        public XPathFunction resolveFunction(QName functionName, int arity)
        {
            if (myFuncName.equals(functionName) && (arity == 1))
                return myFunc;
            else
                return null;
        }
    });

xpath.setNamespaceContext(new SimpleNamespaceResolver("ns", myFuncName.getNamespaceURI()));

There's a lot of stuff happening here, so let's jump into the middle: the function resolver itself. Whenever the XPath evaluator sees a reference to a function that's not defined by the XPath spec it calls its resolver, passing the name of the function and the number of arguments that actually appear in the expression (don't blame me for that parameter name; it comes from the JavaDoc). In our case, we can handle only a single argument; if the expression contains more or fewer, we return null to indicate that the function can't be resolved (and this results in the evaluator throwing an exception).

Note that the function name is passed as a QName. All user-defined functions must live in a namespace, and must be referenced using a qualified name. And this means that, in addition to creating a function resolver, you must also bind the namespace.

After you've done all this, you can use an expression that references your function:

xpath.evaluate("ns:uppercase(/foo/bar/baz)", dom)
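Putting the pieces together, here's a self-contained sketch of the whole arrangement: function, resolver, namespace binding, and evaluation. The namespace URI is made up for the example, the namespace context is a cut-down anonymous class rather than SimpleNamespaceResolver, and I'm assuming the JDK's bundled evaluator with its default settings, under which user-defined functions are permitted.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionException;
import javax.xml.xpath.XPathFunctionResolver;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UppercaseFunctionExample {
    static final String FUNC_NS = "urn:example:functions";   // made-up namespace URI
    static final QName FUNC_NAME = new QName(FUNC_NS, "uppercase");

    // Uppercases either a literal string or the text of the first node in a nodeset.
    static final XPathFunction UPPERCASE = new XPathFunction() {
        public Object evaluate(List args) throws XPathFunctionException {
            Object arg = args.get(0);
            if (arg instanceof NodeList) {
                NodeList nodes = (NodeList) arg;
                return (nodes.getLength() == 0)
                        ? ""
                        : nodes.item(0).getTextContent().toUpperCase();
            }
            if (arg instanceof String) {
                return ((String) arg).toUpperCase();
            }
            throw new XPathFunctionException("invalid argument: " + arg);
        }
    };

    static String eval(String expression) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(
                            "<foo><bar><baz>Baz content</baz></bar></foo>")));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setXPathFunctionResolver(new XPathFunctionResolver() {
                public XPathFunction resolveFunction(QName name, int arity) {
                    return (FUNC_NAME.equals(name) && arity == 1) ? UPPERCASE : null;
                }
            });
            // demo shortcut: binds only "ns" and skips reverse lookups
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "ns".equals(prefix) ? FUNC_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String namespaceURI) {
                    return null;
                }
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.<String>emptyList().iterator();
                }
            });
            return xpath.evaluate(expression, dom);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + expression, e);
        }
    }
}
```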

As you might guess, the Practical XML library has a few classes to make this process easier. There's an implementation of FunctionResolver, as well as an AbstractFunction, and several example function implementations.

To be honest, I believe that XPath functions are often more trouble than they're worth. While there can be a lot of value in functions that are used in predicates, I would think twice before writing a function that does some aggregation of a nodeset. Instead, think about using an XPath to create the nodeset, then executing the function on that nodeset as part of your Java execution flow.

XPathExpression

All of the examples so far have called the XPath.evaluate() method. As you might guess, this has to compile the expression into an internal form before using it. If you're going to be reusing the expression, it makes sense to compile it once:

XPathExpression compiled = xpath.compile("/foo/bar/baz");
// ...
String value = compiled.evaluate(dom);

The XPathExpression object provides all of the evaluate() methods from XPath, but nothing else. You must fully configure the XPath object before you compile an expression from it.
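Reuse pays off when the same expression is applied to many documents. A minimal sketch, with a hypothetical class and method name, that compiles once and evaluates against each document passed in:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CompiledExpressionExample {
    // Parses each XML string and returns the string value of /foo/bar/baz for each.
    static List<String> bazValues(String... xmlDocs) {
        try {
            // compiled once, before the loop
            XPathExpression compiled =
                    XPathFactory.newInstance().newXPath().compile("/foo/bar/baz");
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

            List<String> values = new ArrayList<String>();
            for (String xml : xmlDocs) {
                Document dom = builder.parse(new InputSource(new StringReader(xml)));
                values.add(compiled.evaluate(dom));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate compiled expression", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bazValues(
                "<foo><bar><baz>one</baz></bar></foo>",
                "<foo><bar><baz>two</baz></bar></foo>"));
    }
}
```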

XPathWrapper

Although the XPath interfaces provided with the JDK are individually simple, I find them painful in practice, particularly when using namespaces. The fact that most of this pain comes from boilerplate code led to the development of XPathWrapper. This class internalizes all of the ancillary objects used by XPath, compiles its expression for reuse, and is constructed using the “builder” pattern:

new XPathWrapper("/ns1:foo/ns2:bar[@ns2:name=$myvar]")
    .bindNamespace("ns1", "http://foo.example.com")
    .bindNamespace("ns2", "http://bar.example.com")
    .bindVariable("myvar", "argle")
    .evaluateAsString(dom2);

JXPath

While this article has focused on the JDK's implementation of XPath, it is not the only implementation available. Of the alternatives, Apache Commons JXPath is notable because it allows XPath expressions to be used with arbitrary object graphs, not just XML. Out of the box it supports bean-style objects, JDK collections, and the context objects provided to J2EE servlets.

While all that is useful, what's nicer is that JXPath is easy to extend. For example, you can implement a set of functions as methods of a single class, and none of them needs to be attached to a namespace. Once you register the class, JXPath will use reflection to find the function.

You can also easily extend JXPath to access arbitrary structured data types. I used it for a project where we stored content in the database as a collection of IDs: each unique string had its own ID, significantly reducing the storage of duplicate data. Since the actual data tables were just collections of IDs, we used JXPath to implement human-readable queries, by adding an override to JXPath's node test and navigation code.

XPath as Debugging Tool

One of the problems of working with a DOM document is tracing issues back to their source. It's one reason that I recommend validating at the time of parsing, because once the parsing is done you've lost any source line or column numbers.

However, you can create an XPath expression that will uniquely identify any node in an XML document, using predicates to differentiate nodes with the same name. There are some caveats when doing so: in particular, keeping track of namespaces and providing any new bindings to the calling code.

Not surprisingly, the Practical XML library provides methods to do just this: DomUtil.getAbsolutePath() returns a path with positional predicates, which can be used to select the same node in the future. There's also DomUtil.getPath(), which returns a path with attribute predicates. This variant is unlikely to be useful for selecting the same node later, but I find it very useful as a debugging tool.
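To show the idea behind a positional path, here's a simplified sketch of my own. It is not the DomUtil implementation, the names are hypothetical, and it ignores namespaces entirely, so it only round-trips for documents without them. It walks from the element up to the root, counting preceding siblings with the same name at each level.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AbsolutePathExample {
    static final String XML =
        "<foo>"
        + "<bar name='argle'>Argle content</bar>"
        + "<bar name='bargle'>Bargle content<baz>Baz content</baz></bar>"
        + "</foo>";

    // Builds a path such as /foo[1]/bar[2]; assumes the document uses no namespaces.
    static String absolutePath(Node node) {
        StringBuilder path = new StringBuilder();
        for (Node cur = node; cur instanceof Element; cur = cur.getParentNode()) {
            int position = 1;
            for (Node sib = cur.getPreviousSibling(); sib != null; sib = sib.getPreviousSibling()) {
                if ((sib instanceof Element) && sib.getNodeName().equals(cur.getNodeName())) {
                    position++;
                }
            }
            path.insert(0, "/" + cur.getNodeName() + "[" + position + "]");
        }
        return path.toString();
    }

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("unable to parse sample XML", e);
        }
    }

    static Node select(Document dom, String path) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(path, dom, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("unable to evaluate " + path, e);
        }
    }

    static String pathToBaz() {
        return absolutePath(select(parse(), "/foo/bar/baz"));
    }

    // True if evaluating the generated path selects the node it was generated from.
    static boolean roundTrips() {
        Document dom = parse();
        Node baz = select(dom, "/foo/bar/baz");
        return select(dom, absolutePath(baz)).isSameNode(baz);
    }
}
```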

For More Information

The XPath 1.0 spec is actually pretty readable as W3C specs go.

The Practical XML library provides utilities for working with XML in many different ways.

All of the code fragments from this article are found in a single example program.

Copyright © Keith D Gregory, all rights reserved