StAXON - JSON via StAX

By Christoph Beck · Feb. 08, 12 · Interview

XML is for dinosaurs, right? Everybody uses JSON these days. So you do, don’t you? But what about things like XSD, XSLT, JAXB, XPath, etc – is it all evil?

In this article, I’d like to introduce the StAXON project (APL2), which tries to give you the best of both worlds: JSON outside, but XML inside. One benefit of this is that you can integrate JSON with powerful XML-related technologies for free.

StAXON lets you read and write JSON using the Java Streaming API for XML (javax.xml.stream), also known as StAX. More specifically, StAXON provides implementations of the

  • StAX Cursor API (XMLStreamReader and XMLStreamWriter)
  • StAX Event API (XMLEventReader and XMLEventWriter)
  • StAX Factory API (XMLInputFactory and XMLOutputFactory)

for JSON.

You may know the Jettison project, which also has XMLStreamReader and XMLStreamWriter implementations. However, StAXON aims to provide a more comprehensive and consistent solution and tries to avoid some of the issues users are having with Jettison.

Anyway, let’s get started and see what this “anti-aging substance” for XML can do.

Setup

Add the following dependency to your Maven POM file:

<dependency>
    <groupId>de.odysseus.staxon</groupId>
    <artifactId>staxon</artifactId>
    <version>1.0</version>
</dependency>

or get the latest StAXON JAR from the Downloads page and add it to your classpath.

Mapping Convention

The purpose of StAXON’s mapping convention is to generate a more compact JSON. It borrows the "$" syntax for text elements from the Badgerfish convention but attempts to avoid needless text-only JSON objects:

  • Element names become object properties:
    <alice/> <–> {"alice":null}
  • Attributes go in properties whose names begin with "@":
    <alice charlie="david"/> <–> {"alice":{"@charlie":"david"}}
  • Text-only elements go to a simple key/value property:
    <alice>bob</alice> <–> {"alice":"bob"}
  • Otherwise, text content is mapped to the "$" property:
    <alice charlie="david">bob</alice> <–> {"alice":{"@charlie":"david","$":"bob"}}
  • Nested elements go to nested properties:
    <alice><bob>charlie</bob></alice> <–> {"alice":{"bob":"charlie"}}
  • A default namespace declaration goes in the element’s "@xmlns" property:
    <alice xmlns="http://foo.com"/> <–> {"alice":{"@xmlns":"http://foo.com"}}
  • A prefixed namespace declaration goes in the element’s "@xmlns:<prefix>" property:
    <p:alice xmlns:p="http://foo.com"/> <–> {"p:alice":{"@xmlns:p":"http://foo.com"}}

Note that the “ugly” '$'-fields will only appear when mapping XML elements that take both attributes (including namespace declarations) and text, and that '@'-attributes will only appear for elements that carry attributes or namespace declarations.
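
To make the rules above concrete, here is a minimal sketch that applies them to a DOM tree using only the JDK. It is a hand-rolled illustration, not StAXON code: the class and method names (MappingSketch, toJson, value) are made up for this example, mixed content is not handled, and repeated child elements are not wrapped into arrays.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: a hand-rolled rendering of the mapping rules, not StAXON code.
class MappingSketch {

    /** Parses an XML string and renders its root element as JSON. */
    static String toJson(String xml) {
        try {
            // The default factory is not namespace-aware, so xmlns declarations
            // show up as ordinary attributes named "xmlns" or "xmlns:<prefix>".
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{" + quote(root.getNodeName()) + ":" + value(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    /** Renders the JSON value for one element. */
    static String value(Element element) {
        StringBuilder members = new StringBuilder();
        StringBuilder text = new StringBuilder();

        // Attributes become "@name" properties.
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            member(members, "@" + attribute.getNodeName(), quote(attribute.getNodeValue()));
        }
        // Child elements become nested properties; text is collected separately.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                member(members, child.getNodeName(), value((Element) child));
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        String content = text.toString().trim();
        if (members.length() == 0) {
            // Empty element -> null, text-only element -> plain string.
            return content.isEmpty() ? "null" : quote(content);
        }
        if (!content.isEmpty()) {
            member(members, "$", quote(content)); // text next to attributes goes to "$"
        }
        return "{" + members + "}";
    }

    static void member(StringBuilder members, String name, String jsonValue) {
        if (members.length() > 0) {
            members.append(',');
        }
        members.append(quote(name)).append(':').append(jsonValue);
    }

    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // prints {"alice":{"@charlie":"david","$":"bob"}}
        System.out.println(toJson("<alice charlie=\"david\">bob</alice>"));
    }
}
```

The "$" property is only emitted in the branch where an element already has other members, which is why plain text-only elements stay compact.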

Core API

As StAXON is merely a StAX implementation, the only StAXON-specific API you have to care about is a thin layer for configuration. Everything else is pure StAX.

  • JsonXMLInputFactory extends XMLInputFactory and is used to create JSON stream/event readers
  • JsonXMLOutputFactory extends XMLOutputFactory and is used to create JSON stream/event writers
  • JsonXMLConfig provides a shared configuration interface for JsonXMLInputFactory and JsonXMLOutputFactory
  • JsonXMLConfigBuilder provides a fluent API to build JsonXMLConfig configuration instances

If you know StAX, you’ll notice that there’s little new: just obtain a reader or writer from StAXON and you’re ready to go.

Writing JSON

Create a JSON-based writer:

XMLOutputFactory factory = new JsonXMLOutputFactory();
XMLStreamWriter writer = factory.createXMLStreamWriter(System.out);

Write your document:

writer.writeStartDocument();
writer.writeStartElement("customer");
writer.writeStartElement("name");
writer.writeCharacters("John Doe");
writer.writeEndElement();
writer.writeStartElement("phone");
writer.writeCharacters("555-1111");
writer.writeEndElement();
writer.writeEndElement();
writer.writeEndDocument();
writer.close();

With an XML-based writer, this would have produced something like

<customer><name>John Doe</name><phone>555-1111</phone></customer>

However, with our JSON-based writer, the output is

{"customer":{"name":"John Doe","phone":"555-1111"}}
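
Since the writing code only talks to the XMLStreamWriter interface, it can be put into a method that does not care which factory created the writer. The sketch below shows this with the JDK's own XML writer as a stand-in, so it runs without StAXON on the classpath; the class and method names (WriteCustomerSketch, writeCustomer, asXml) are made up for this example. Passing a writer obtained from JsonXMLOutputFactory instead should be the only change needed to get the JSON output shown above.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class WriteCustomerSketch {

    /** Emits the customer document through whatever StAX writer it is given. */
    static void writeCustomer(XMLStreamWriter writer) throws XMLStreamException {
        writer.writeStartDocument();
        writer.writeStartElement("customer");
        writer.writeStartElement("name");
        writer.writeCharacters("John Doe");
        writer.writeEndElement();
        writer.writeStartElement("phone");
        writer.writeCharacters("555-1111");
        writer.writeEndElement();
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush(); // make sure buffered output reaches the underlying stream
        writer.close();
    }

    /** Runs the routine against the JDK's XML writer and returns the output. */
    static String asXml() {
        try {
            StringWriter out = new StringWriter();
            writeCustomer(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("writing failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints an XML declaration followed by the <customer> element.
        System.out.println(asXml());
    }
}
```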

Reading JSON

Create a JSON-based reader:

String json = "{\"customer\":{\"name\":\"John Doe\",\"phone\":\"555-1111\"}}";

XMLInputFactory factory = new JsonXMLInputFactory();
XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(json));

Read your document:

assert reader.getEventType() == XMLStreamConstants.START_DOCUMENT;
reader.nextTag(); 
assert reader.isStartElement() && "customer".equals(reader.getLocalName());
reader.next();
assert reader.isStartElement() && "name".equals(reader.getLocalName());
reader.next();
assert reader.hasText() && "John Doe".equals(reader.getText());
reader.nextTag();
assert reader.isEndElement();
reader.next();
assert reader.isStartElement() && "phone".equals(reader.getLocalName());
reader.next();
assert reader.hasText() && "555-1111".equals(reader.getText());
reader.nextTag();
assert reader.isEndElement();
reader.next();
assert reader.isEndElement();
reader.next();
assert reader.getEventType() == XMLStreamConstants.END_DOCUMENT;
reader.close();
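
In real code you would rarely step through events one by one as above. A loop over the cursor is more typical, and because it only depends on XMLStreamReader, the same loop works for any StAX source. The following sketch uses the JDK's XML reader on the equivalent XML document as a stand-in for a JSON-based reader; the names ReadCustomerSketch, readFields and fromXml are made up for this example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ReadCustomerSketch {

    /** Collects the text of every text-only element, keyed by element name. */
    static Map<String, String> readFields(XMLStreamReader reader) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        String current = null; // name of the element we are currently inside, if any
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    current = reader.getLocalName();
                    break;
                case XMLStreamConstants.CHARACTERS:
                    if (current != null && !reader.isWhiteSpace()) {
                        // Text may arrive in several chunks, so append rather than replace.
                        fields.merge(current, reader.getText(), String::concat);
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    current = null;
                    break;
                default:
                    break;
            }
        }
        reader.close();
        return fields;
    }

    /** Runs the loop against the JDK's XML reader. */
    static Map<String, String> fromXml(String xml) {
        try {
            return readFields(XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml)));
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("reading failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromXml(
                "<customer><name>John Doe</name><phone>555-1111</phone></customer>"));
    }
}
```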

Factory Configuration

The JsonXMLInputFactory and JsonXMLOutputFactory classes can be configured via the standard setProperty(String, Object) API. The factory classes define several constants for properties they support.

However, the JsonXMLConfig interface provides a convenient way to hold the configuration of both input and output factories:

JsonXMLConfig config = new JsonXMLConfigBuilder().
    virtualRoot("customer").
    prettyPrint(true).
    build();
XMLInputFactory inputFactory = new JsonXMLInputFactory(config);
...
XMLOutputFactory outputFactory = new JsonXMLOutputFactory(config);
...

Virtual Roots

Set the virtualRoot configuration property to strip the root element from the JSON representation, e.g.

{
  "name" : "John Doe",
  "phone" : "555-1111"
}

As XML requires a single root element, but JSON documents often don’t have one, this is an important feature required to read and write existing JSON formats.

Mastering Arrays

What about JSON arrays? Unfortunately, there’s nothing like this in XML. And to be honest, this causes most of the trouble when writing JSON via an XML API like StAX. Simply omitting the array boundaries would lead to non-unique JSON properties, which is usually not desired.

StAXON provides several ways to deal with JSON arrays. At the core is the idea of using XML processing instructions to tell the writer that an array is about to start: the <?xml-multiple?> processing instruction maps a sequence of XML elements with the same name to a JSON array.

The processing instruction optionally takes the array element tag name (with prefix) as data. There’s no end array hint as StAXON detects the end of an array sequence and closes it automatically.

Consider the following JSON document:

{
  "alice" : {
    "bob" : [ "edgar", "charlie" ],
    "peter" : null
  }
}

In order to get a "bob" array instead of two separate "bob" properties, we need to provide XML events corresponding to

<alice>
  <?xml-multiple?>
  <bob>edgar</bob>
  <bob>charlie</bob>
  <peter/>
</alice>

I.e., with the cursor API, you would just insert

writer.writeProcessingInstruction(JsonXMLStreamConstants.MULTIPLE_PI_TARGET); // <?xml-multiple?>

to start an array.
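
For reference, the complete event sequence for the alice document might look like the sketch below. It runs against the JDK's XML writer as a stand-in, so the output is the XML shown above rather than JSON. The literal target "xml-multiple" is an assumption based on the <?xml-multiple?> comment; with StAXON you would use the JsonXMLStreamConstants.MULTIPLE_PI_TARGET constant instead. The class name MultiplePiSketch is made up for this example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class MultiplePiSketch {

    /** Writes the alice document, announcing the "bob" sequence with a PI. */
    static String write() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("alice");
            // Array hint: the elements that follow belong to one array.
            writer.writeProcessingInstruction("xml-multiple");
            writer.writeStartElement("bob");
            writer.writeCharacters("edgar");
            writer.writeEndElement();
            writer.writeStartElement("bob");
            writer.writeCharacters("charlie");
            writer.writeEndElement();
            // No end-of-array hint: a different element name ends the sequence.
            writer.writeEmptyElement("peter");
            writer.writeEndElement(); // closes alice
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("writing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```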

Initiating Arrays with Element Paths

Sometimes it is not desired, or even impossible, to generate <?xml-multiple?> processing instructions to control arrays. This may be the case if the actual writing isn’t done by your code but by some other framework like JAXB, and you only provide a stream writer.

For such a scenario, wouldn’t it be nice to be able to tell the writer beforehand which elements should trigger a JSON array? This is where the XMLMultipleStreamWriter and XMLMultipleEventWriter wrappers step in.

E.g., to specify a sequence of bob elements below root element alice as a multiple path:

writer = new XMLMultipleStreamWriter(writer, true, "/alice/bob");

The boolean parameter specifies whether our paths include the root node (alice). That is, we could also use

writer = new XMLMultipleStreamWriter(writer, false, "/bob");

To wrap all bob fields into arrays (not just alice children), we can use a relative path, without a leading slash:

writer = new XMLMultipleStreamWriter(writer, false, "bob");

Now we (or some legacy code, framework, …) may write our document, and the writer will take care to trigger the bob array for us.
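
The difference between absolute and relative paths can be pictured with a small matcher. This is only my reading of the description above expressed in plain Java, not StAXON's matching code: the class MultiplePathSketch and its matches method are hypothetical, element paths always include the root here (as with the true flag), and treating a relative path as a suffix match is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class MultiplePathSketch {

    /**
     * @param multiplePath configured path, e.g. "/alice/bob" (absolute) or "bob" (relative)
     * @param elementPath  element names from the root down to the current element
     * @return true if the current element should be written as part of an array
     */
    static boolean matches(String multiplePath, List<String> elementPath) {
        String actual = "/" + String.join("/", elementPath);
        if (multiplePath.startsWith("/")) {
            // Absolute path: the whole path from the root must match.
            return actual.equals(multiplePath);
        }
        // Relative path: only the trailing segments must match.
        return actual.endsWith("/" + multiplePath);
    }

    public static void main(String[] args) {
        System.out.println(matches("/alice/bob", Arrays.asList("alice", "bob")));   // true
        System.out.println(matches("/alice/bob", Arrays.asList("carol", "bob")));   // false
        System.out.println(matches("bob", Arrays.asList("carol", "dave", "bob")));  // true
    }
}
```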

Triggering Arrays Automatically

Finally, if nothing else works for you, you may also let StAXON fully automatically determine array boundaries. Use this only if you cannot provide <?xml-multiple?> processing instructions and cannot provide the paths of the elements that should be wrapped into JSON arrays.

However, using this method has several drawbacks:

  • The writer basically needs to cache the entire document in memory, eating both space and time.
  • The writer will not be able to produce empty arrays or arrays with a single element.

To enable this feature, set the JsonXMLOutputFactory.PROP_AUTO_ARRAY property to true.
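
Both drawbacks follow from how any such detection has to work. The sketch below is not StAXON's algorithm, just a minimal illustration in plain Java with made-up names (AutoArraySketch, arrayNames): a writer that infers arrays purely from repeated element names must buffer all siblings before it can decide, and a name that occurs once or not at all cannot be told apart from a plain property.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AutoArraySketch {

    /**
     * Decides which child names would be written as arrays, given the
     * buffered names of all children of one element.
     */
    static Set<String> arrayNames(List<String> bufferedChildNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : bufferedChildNames) {
            counts.merge(name, 1, Integer::sum);
        }
        Set<String> arrays = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) { // only repetition reveals an array
                arrays.add(entry.getKey());
            }
        }
        return arrays;
    }

    public static void main(String[] args) {
        System.out.println(arrayNames(Arrays.asList("bob", "bob", "peter"))); // [bob]
        System.out.println(arrayNames(Arrays.asList("bob", "peter")));        // []
    }
}
```

In the second call, a single bob is indistinguishable from an ordinary property, so a one-element array is never produced.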

Triggering Document Arrays

StAXON’s writer implementation allows you to wrap a sequence of documents into a JSON array. To do this, write the <?xml-multiple?> PI before writing anything else:

writer.writeProcessingInstruction(JsonXMLStreamConstants.MULTIPLE_PI_TARGET);
writer.writeStartDocument(); // first array component
...
writer.writeEndDocument();
writer.writeStartDocument(); // second array component
...
writer.writeEndDocument();
...
writer.close();

The writer.close() call is crucial here, as it will close the JSON array.

Using JAXB

Consider a JAXB-annotated Customer class:

@JsonXML(virtualRoot = true, prettyPrint = true, multiplePaths = "phone")
@XmlRootElement
public class Customer {
    public String name;
    public List<String> phone;
}

The @JsonXML annotation is used to configure the mapping details. In the above example, the customer root element is stripped from the JSON representation, phone elements are wrapped into an array and JSON output is nicely formatted, e.g.

{
  "name" : "John Doe",
  "phone" : [ "555-1111" ]
}

 

Now, the JsonXMLMapper class enables dead-simple mapping to and from JSON:

/*
 * Create mapper instance.
 */
JsonXMLMapper<Customer> mapper = new JsonXMLMapper<Customer>(Customer.class);

/*
 * Read customer.
 */
InputStream input = getClass().getResourceAsStream("input.json");
Customer customer = mapper.readObject(input);
input.close();

/*
 * Write back to console
 */
mapper.writeObject(System.out, customer);

Using JAX-RS

StAXON provides the staxon-jaxrs module, which enables your RESTful services to serialize/deserialize JAXB-annotated classes to/from JSON. It includes the following JAX-RS @Provider classes:

  • de.odysseus.staxon.json.jaxrs.jaxb.JsonXMLObjectProvider is used to read and write JSON objects
  • de.odysseus.staxon.json.jaxrs.jaxb.JsonXMLArrayProvider is used to read and write JSON arrays

In order to select the StAXON message body readers/writers for your resource, a @JsonXML annotation is required.

When used with JAX-RS, the @JsonXML annotation can be placed on

  • a model type (@XmlRootElement or @XmlType) to configure its serialization and deserialization
  • a JAX-RS resource method to configure serialization of the result type
  • a parameter of a JAX-RS resource method to configure deserialization of the parameter type

If a @JsonXML annotation is present on both a model type and a resource method or parameter, the latter overrides the model type annotation. If neither is present, StAXON will not handle the resource.

You can find a sample project using Jersey with StAXON here.

Using XPath

XPath is another standard that can be easily adopted for use with JSON.

The Java XPath API (javax.xml.xpath) doesn’t let us provide an XMLStreamReader or similar as a source, but requires a Document Object Model (DOM). Therefore, we need to read our JSON into a DOM first to apply expressions against that DOM. This could be done by performing an XSLT identity transformation to a DOMResult. However, StAXON provides the DOMEventConsumer class to translate XML events to DOM nodes, which should be faster and simpler than leveraging XSLT.

Once we have a DOM, there’s nothing special about applying XPath expressions.

StringReader json = new StringReader("{\"edgar\":\"david\",\"bob\":\"charlie\"}");

/*
 * Our sample JSON has no root element, so specify "alice" as virtual root
 */
JsonXMLConfig config = new JsonXMLConfigBuilder().virtualRoot("alice").build();

/*
 * create event reader
 */
XMLEventReader reader = new JsonXMLInputFactory(config).createXMLEventReader(json);

/*
 * parse JSON into Document Object Model (DOM)
 */
Document document = DOMEventConsumer.consume(reader);

/*
 * evaluate an XPath expression
 */
XPath xpath = XPathFactory.newInstance().newXPath();
System.out.println(xpath.evaluate("//alice/bob", document));

Running the above sample will print charlie to the console.
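
For comparison, the XSLT identity transformation route mentioned earlier needs nothing beyond the JDK. The sketch below feeds an XMLStreamReader into a DOMResult via StAXSource and then evaluates the same XPath expression. It uses the JDK's XML reader on the equivalent XML as a stand-in, on the assumption that a reader created by JsonXMLInputFactory could be passed to toDom in the same way. The names IdentityTransformSketch, toDom and evaluate are made up for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;

class IdentityTransformSketch {

    /** Copies all events from the reader into a DOM tree via an identity transform. */
    static Node toDom(XMLStreamReader reader) throws Exception {
        DOMResult result = new DOMResult();
        // newTransformer() without a stylesheet yields the identity transformation.
        TransformerFactory.newInstance().newTransformer()
                .transform(new StAXSource(reader), result);
        return result.getNode();
    }

    /** Reads the XML into a DOM and evaluates an XPath expression against it. */
    static String evaluate(String xml, String expression) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            return XPathFactory.newInstance().newXPath().evaluate(expression, toDom(reader));
        } catch (Exception e) {
            throw new IllegalArgumentException("evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints charlie
        System.out.println(evaluate(
                "<alice><edgar>david</edgar><bob>charlie</bob></alice>", "//alice/bob"));
    }
}
```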

What else?

In the end, using an XML API to read and write JSON may still look like a compromise, but it may turn out to be a good choice. The availability of a StAX implementation for JSON acts as a door opener to powerful XML-related technologies and makes dual-format (XML and JSON) services easy to build.

There’s more we can do with StAXON: XSD, XSLT, XQuery, XML-JSON/JSON-XML conversions, to name a few. Please check the Wiki for some of those.
