Parse an XML Response with Java and Dom4J

Previously we’ve explored how to parse XML data using NodeJS as well as PHP.  Continuing on the trend of parsing data using various programming languages, this time we’re going to take a look at parsing XML data using the dom4j library with Java.

Now, dom4j is not the only way to parse XML data in Java.  There are many other options, including the SAX parser.  Everyone will have their own opinion on which one to use.
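For comparison, here is a minimal sketch of the SAX route mentioned above, using only the SAX parser bundled with the JDK rather than dom4j.  The class name SaxLeafText and the method leafTexts are hypothetical names made up for this illustration.  It collects the text of every element that has no child elements, which is the same information the dom4j code in this tutorial prints.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of the tutorial's project.
class SaxLeafText {

    // Returns the text of every element without child elements, in document order.
    static List<String> leafTexts(String xml) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean sawChild = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0);  // start collecting text for this element
                sawChild = false;   // a freshly opened element has no children yet
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!sawChild) {
                    result.add(text.toString());  // no child element was seen, so this is a leaf
                }
                sawChild = true;    // the enclosing element now has at least one child
                text.setLength(0);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<business><company>Code Blog</company>"
                + "<employees><employee><firstname>Nic</firstname></employee></employees></business>";
        leafTexts(xml).forEach(System.out::println);
    }
}
```

SAX pushes events at you as it reads, so the handler has to track state itself.  With dom4j you get a tree you can walk after the fact, which is what the rest of this tutorial does.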

To keep up with my previous two XML tutorials, we’re going to use the following XML data saved in a file called data.xml at the root of the project:

<?xml version='1.0'?>
<business>
    <company>Code Blog</company>
    <owner>Nic Raboy</owner>
    <employees>
        <employee>
            <firstname>Nic</firstname>
            <lastname>Raboy</lastname>
        </employee>
        <employee>
            <firstname>Maria</firstname>
            <lastname>Campos</lastname>
        </employee>
    </employees>
</business>

With our XML content figured out, let’s make sure we structure our project like the following:

  • project root
    • src
      • xmlparser
        • MainDriver.java
    • libs
      • dom4j-1.6.1.jar
    • build.xml
    • data.xml

Based on our project structure, you can probably tell that we’re going to be using Apache Ant for building.  Say what you want about Ant, but I’m one of many who still use it.  Feel free to switch to Apache Maven or another build tool to better meet your needs.

We’re now ready to crack open our src/xmlparser/MainDriver.java to start adding our parse logic.

package xmlparser;
 
import java.io.*;
import java.util.*;
import org.dom4j.*;
import org.dom4j.io.*;
 
public class MainDriver {
 
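    public static void main(String[] args) {
        // Entry point; filled in further down
    }
    
    public static void printRecursive(Element element) {
        // Walks the element tree and prints text; filled in further down
    }
 
    public static Document readFile(String filename) throws Exception {
        // Loads and parses the XML file; filled in further down
    }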
 
}

To further explain our intentions, the readFile(String filename) function will load the data.xml file and return it as a Document object for further parsing.  The printRecursive(Element element) function will visit each element of the XML and print its text if it contains only text.  All levels of the XML will be visited.

So let’s start with readFile(String filename):

public static Document readFile(String filename) throws Exception {
    SAXReader reader = new SAXReader();
    Document document = reader.read(new File(filename));
    return document;
}

There is not much to the code above: SAXReader parses the file and hands back a Document.  In fact, I pulled most of it from the dom4j quick-start code.

The printRecursive(Element element) function is where things get more complex:

public static void printRecursive(Element element) {
    for(int i = 0, size = element.nodeCount(); i < size; i++) {
        Node node = element.node(i);
        if(node instanceof Element) {
            Element currentNode = (Element) node;
            if(currentNode.isTextOnly()) {
                System.out.println(currentNode.getText());
            }
            printRecursive(currentNode);
        }
    }
}

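Some of the code above was taken from the dom4j quick-start, and the rest is custom work.  We look at each child node of the element and, for every child that is itself an element, recurse into it.  When an element has no child elements, the loop finds nothing to recurse into and the call simply returns.  We only print an element's text when isTextOnly() reports that it has no child elements.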
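If you want to try the same depth-first walk without adding the dom4j jar, here is a rough sketch using the DOM API that ships with the JDK instead.  This is a swapped-in API, not dom4j, and the names DomTextCollector, collectRecursive and leafTexts are hypothetical.  It also differs in two small ways: it collects values into a list rather than printing them, and it would record the root element's text too if the root had no child elements.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; uses the JDK's DOM API, not dom4j.
class DomTextCollector {

    // Visits child elements depth-first and records the text of elements without child elements.
    static void collectRecursive(Element element, List<String> out) {
        NodeList children = element.getChildNodes();
        boolean hasChildElement = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collectRecursive((Element) node, out);
            }
        }
        if (!hasChildElement) {
            out.add(element.getTextContent());
        }
    }

    // Parses an XML string and returns the leaf element texts in document order.
    static List<String> leafTexts(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> out = new ArrayList<>();
            collectRecursive(root, out);
            return out;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version='1.0'?>\n"
                + "<business>\n"
                + "    <company>Code Blog</company>\n"
                + "    <owner>Nic Raboy</owner>\n"
                + "</business>";
        leafTexts(xml).forEach(System.out::println);
    }
}
```

Returning a list instead of printing makes the traversal easier to reuse and to check in a unit test.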

Finally, we’re looking at the main(String[] args) function to bring it all together:

public static void main(String[] args) {
    try {
        Element root = readFile("data.xml").getRootElement();
        printRecursive(root);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

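Just like that, we’ve printed the text of each leaf element in our XML document.  For the data.xml above, that is six lines: Code Blog, Nic Raboy, Nic, Raboy, Maria, and Campos.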

In case you’re interested in the build.xml code, it can be seen below:

<project>
 
    <property name="lib.dir" value="libs" />
    <property name="jar.dir" value="build/jar" />
    <property name="jar.name" value="XMLParser.jar" />
 
    <path id="classpath">
        <fileset dir="${lib.dir}" includes="**/*.jar"/>
    </path>
 
    <target name="clean">
        <delete dir="build"/>
    </target>
 
    <target name="compile" depends="clean">
        <mkdir dir="build/classes"/>
        <javac srcdir="src" destdir="build/classes" classpathref="classpath"/>
    </target>
 
    <target name="build" depends="compile">
        <mkdir dir="build/jar"/>
        <jar destfile="${jar.dir}/${jar.name}" basedir="build/classes">
            <zipgroupfileset dir="libs" includes="*.jar"/>
            <manifest>
                <attribute name="Main-Class" value="xmlparser.MainDriver"/>
            </manifest>
        </jar>
    </target>
 
    <target name="run">
        <java jar="${jar.dir}/${jar.name}" fork="true"/>
    </target>
 
    <target name="buildandrun" depends="build, run" />
 
</project>

To test the project, run ant buildandrun from your command prompt or Terminal, assuming you have Apache Ant configured correctly.

The dom4j library is very thorough, so I recommend having a look at the Javadocs that go with it.

Published at DZone with permission of Nic Raboy, DZone MVB. See the original article here.
