How to Query XML Files Using APIs in Java
With XQuery, quickly filter and aggregate XML content with a simple expression. Learn how to easily query files with XQuery expressions at scale.
Despite some notable shortcomings in the contemporary technology landscape, XML is still a powerful language offering key advantages in complex data storage scenarios.
Compared with a popular data interchange format like JSON, for example, XML’s syntax places a greater emphasis on machine readability over human readability, making its error-checking process more efficient. Most importantly, XML is great at storing unique data types with multiple variables, whereas JSON is optimized for relatively simple and concise object storage. XML’s advantages and disadvantages both stem from the fact that it’s not a dedicated data interchange format like JSON at all; rather, it’s a complex markup language (more similar to HTML) with powerful data interchange capabilities.
When we choose XML to store complex data relationships, we can begin to leverage powerful XML technologies like XQuery to extract and manipulate complex data for various purposes. With a good understanding of XQuery and a consistent, reliable environment to execute our queries against XML files, we can expand the utility of complex XML data.
In this article, we’ll briefly review how XQuery works (with a basic example), and we’ll subsequently learn how to run queries against one or more XML files through a pair of free APIs using complementary Java code examples.
Understanding XQuery
Thanks to XML's characteristic hierarchical data structure, querying content from complex XML files isn’t all that complicated.
At a high level, relationships in XML data are neatly represented in a parent-child structure. This creates an easy path for well-formed, targeted query expressions to navigate through, even when those relationships are highly unique. Any given element in XML syntax (structured like HTML with opening and closing tags, e.g., <example>hello world</example>) can have its own specific attributes and enclose multiple additional elements or data types within it, and we can use XQuery to efficiently access all of it.
Whether we’re attempting to filter and reuse data from an XML file, aggregate specific XML file data for calculations and reports, or simply search for matching data across multiple XML files stored in a single database, XQuery is up to the task. It’s a fundamental piece of the XML technology family (along with its cousins XPath and XSLT), and it offers versatility in a wide range of XML data-handling scenarios.
Simple XQuery Example
The most basic use case for XQuery is data retrieval, so let’s look at a simple example demonstrating how we might use XQuery to find specific data within an XML file.
Below, we have an example XML file storing information about popular movies, with each movie's genre recorded in a category attribute. Within each movie element, we have information including the movie title, director, year of release, and the price of a movie ticket.
<?xml version="1.0" encoding="UTF-8"?>
<movies>
<movie category="ACTION">
<title lang="en">Inception</title>
<director>Christopher Nolan</director>
<year>2010</year>
<ticket_price>10.00</ticket_price>
</movie>
<movie category="COMEDY">
<title lang="en">The Grand Budapest Hotel</title>
<director>Wes Anderson</director>
<year>2014</year>
<ticket_price>12.50</ticket_price>
</movie>
<movie category="DRAMA">
<title lang="en">The Shawshank Redemption</title>
<director>Frank Darabont</director>
<year>1994</year>
<ticket_price>8.00</ticket_price>
</movie>
<movie category="SCIFI">
<title lang="en">Blade Runner 2049</title>
<director>Denis Villeneuve</director>
<year>2017</year>
<ticket_price>15.00</ticket_price>
</movie>
</movies>
By writing declarative expressions in XQuery (using core constructs like for, let, where, order by, and return), we can easily filter through the parent-child relationships in one or more XML files and get the data we need. XQuery leverages XPath to navigate the files in question while applying the above constructs to take specific actions against matching elements.
Let’s say we want to query our above example file to retrieve data about movies that cost more than $10 to see. We could write a simple XQuery string like so:
for $movie in /movies/movie
where number($movie/ticket_price) > 10
return $movie
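This would return information about both “The Grand Budapest Hotel” and “Blade Runner 2049.” “Inception” is excluded because its ticket price of exactly $10.00 does not satisfy the strict greater-than comparison.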
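To check the expected result locally, the same filter can be approximated with the XPath support that ships with the JDK (javax.xml.xpath). This is a minimal sketch, not an XQuery processor: the class name, the method name, and the trimmed copy of the example file are all assumptions made for illustration, and the XPath predicate stands in for the where clause of the query above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is illustrative only.
final class MovieFilterSketch {

    // Trimmed copy of the example file above (title and ticket_price only).
    static final String MOVIES_XML =
        "<movies>"
        + "<movie category=\"ACTION\"><title lang=\"en\">Inception</title>"
        + "<ticket_price>10.00</ticket_price></movie>"
        + "<movie category=\"COMEDY\"><title lang=\"en\">The Grand Budapest Hotel</title>"
        + "<ticket_price>12.50</ticket_price></movie>"
        + "<movie category=\"DRAMA\"><title lang=\"en\">The Shawshank Redemption</title>"
        + "<ticket_price>8.00</ticket_price></movie>"
        + "<movie category=\"SCIFI\"><title lang=\"en\">Blade Runner 2049</title>"
        + "<ticket_price>15.00</ticket_price></movie>"
        + "</movies>";

    // Returns, in document order, the titles of movies whose ticket_price
    // is strictly greater than minPrice.
    static List<String> titlesAbove(String xml, int minPrice) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath predicate equivalent to: where number($movie/ticket_price) > minPrice
        String expression = "/movies/movie[number(ticket_price) > " + minPrice + "]/title";
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints [The Grand Budapest Hotel, Blade Runner 2049]
        System.out.println(titlesAbove(MOVIES_XML, 10));
    }
}
```

Unlike the XQuery expression, this sketch returns only the title text rather than the full movie elements, which keeps the output easy to compare by eye.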
How and Where To Run XQuery Expressions
To execute XQuery expressions against one or more XML files, we need to run our expression through an XQuery processor. There are a few different processors out there.
We can, for example, enter our XQuery expressions directly into a compatible database storing XML files. For local development projects, we can run XQuery expressions from popular programming languages like Java by adding an XQuery processor library. We can even search for and leverage online tools to run XQuery expressions against XML files on a one-off basis.
In addition to these options, we can leverage specialized web APIs to abstract XQuery expression processing away from our local server entirely. If our XML files are stored outside of an XQuery-compatible database, this option represents a highly scalable, low-maintenance, and cost-effective solution for querying our XML content.
Further down the page, we’ll look at two free APIs we can use to query one or multiple XML files with a single XQuery expression through a multipart/form-data request.
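For readers unfamiliar with multipart/form-data, the sketch below shows roughly what a single file part looks like on the wire. It is an illustration only: the client SDK used later builds the real request for us, and the boundary string, the field name "inputFile", and the class and method names here are assumptions rather than documented values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; it builds a request body but sends nothing.
final class MultipartBodySketch {

    // Builds a multipart/form-data body containing exactly one XML file part.
    static byte[] buildBody(String boundary, String fieldName, String fileName, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Each part starts with the boundary line, followed by its own headers
        // and a blank line that separates headers from the payload.
        String partHeader = "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"" + fieldName
            + "\"; filename=\"" + fileName + "\"\r\n"
            + "Content-Type: application/xml\r\n\r\n";
        byte[] headerBytes = partHeader.getBytes(StandardCharsets.UTF_8);
        out.write(headerBytes, 0, headerBytes.length);

        // The raw file bytes form the payload of the part.
        out.write(content, 0, content.length);

        // The closing boundary carries two extra trailing dashes.
        byte[] closing = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        out.write(closing, 0, closing.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] xml = "<movies/>".getBytes(StandardCharsets.UTF_8);
        // "inputFile" is an assumed field name used only for illustration.
        byte[] body = buildBody("xq-boundary", "inputFile", "movies.xml", xml);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

A request with several XML files would repeat the part section once per file before the closing boundary.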
Demonstration
Using the complementary, ready-to-run Java code examples provided below, we can take advantage of two separate APIs optimized to query one or more XML files by passing our file paths and XQuery expressions together in a single request. To authorize our requests, we'll just need a free API key, which allows up to 800 requests per month.
Client SDK Installation
We can begin structuring our API calls by installing the client SDK.
- In our Maven POM file, let’s add a reference to the repository (JitPack is used to dynamically compile the library):
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
- Then, let’s add a reference to the dependency:
<dependencies>
<dependency>
<groupId>com.github.Cloudmersive</groupId>
<artifactId>Cloudmersive.APIClient.Java</artifactId>
<version>v4.25</version>
</dependency>
</dependencies>
Now we can go about calling each individual API function.
Adding Import Classes and Calling Functions
- Using the first set of code examples below, we can call an API optimized to query a single XML document as input. The provided XML document is automatically loaded as the default context; to access elements in the document, we can simply refer to them without a document reference:
// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertDataApi;
//import java.io.File;
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
ConvertDataApi apiInstance = new ConvertDataApi();
File inputFile = new File("/path/to/inputfile"); // File | Input XML file to perform the operation on.
String xquery = "xquery_example"; // String | Valid XML XQuery 3.1 or earlier query expression; multi-line expressions are supported
try {
XmlQueryWithXQueryResult result = apiInstance.convertDataXmlQueryWithXQuery(inputFile, xquery);
System.out.println(result);
} catch (ApiException e) {
System.err.println("Exception when calling ConvertDataApi#convertDataXmlQueryWithXQuery");
e.printStackTrace();
}
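In practice, the "xquery_example" placeholder would be replaced with a real expression. As a minimal sketch, the movie query from earlier could be assembled as a multi-line Java string like this; the class and method names are hypothetical, and the absolute path works without a document reference only because the single-document API loads the input file as the default context.

```java
// Hypothetical helper class for assembling the query text.
final class XQueryStringSketch {

    // Builds the multi-line XQuery expression from the earlier example,
    // with the price threshold supplied by the caller.
    static String expensiveMoviesQuery(int minPrice) {
        return String.join("\n",
            "for $movie in /movies/movie",
            "where number($movie/ticket_price) > " + minPrice,
            "return $movie");
    }

    public static void main(String[] args) {
        // The resulting string could be assigned to the xquery variable above.
        System.out.println(expensiveMoviesQuery(10));
    }
}
```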
- Using the next examples below, we can call an API optimized to query multiple XML documents as input. Please note we can refer to the content of a given document by name (e.g., “movies.xml” or “books.xml”) if we include two named documents. If our input files contain no file name, they will default to file names like “input1.xml”, “input2.xml”, etc.
// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ConvertDataApi;
//import java.io.File;
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
ConvertDataApi apiInstance = new ConvertDataApi();
File inputFile1 = new File("/path/to/inputfile"); // File | First input XML file to perform the operation on.
String xquery = "xquery_example"; // String | Valid XML XQuery 3.1 or earlier query expression; multi-line expressions are supported
File inputFile2 = new File("/path/to/inputfile"); // File | Second input XML file to perform the operation on.
File inputFile3 = new File("/path/to/inputfile"); // File | Third input XML file to perform the operation on.
File inputFile4 = new File("/path/to/inputfile"); // File | Fourth input XML file to perform the operation on.
File inputFile5 = new File("/path/to/inputfile"); // File | Fifth input XML file to perform the operation on.
File inputFile6 = new File("/path/to/inputfile"); // File | Sixth input XML file to perform the operation on.
File inputFile7 = new File("/path/to/inputfile"); // File | Seventh input XML file to perform the operation on.
File inputFile8 = new File("/path/to/inputfile"); // File | Eighth input XML file to perform the operation on.
File inputFile9 = new File("/path/to/inputfile"); // File | Ninth input XML file to perform the operation on.
File inputFile10 = new File("/path/to/inputfile"); // File | Tenth input XML file to perform the operation on.
try {
XmlQueryWithXQueryMultiResult result = apiInstance.convertDataXmlQueryWithXQueryMulti(inputFile1, xquery, inputFile2, inputFile3, inputFile4, inputFile5, inputFile6, inputFile7, inputFile8, inputFile9, inputFile10);
System.out.println(result);
} catch (ApiException e) {
System.err.println("Exception when calling ConvertDataApi#convertDataXmlQueryWithXQueryMulti");
e.printStackTrace();
}
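When several files are submitted, the query has to say which document it is reading. The sketch below assumes the service resolves document names through the standard XQuery doc() function, and it mirrors the default naming described above (input1.xml, input2.xml, and so on) for files without a name. The class and method names are hypothetical.

```java
// Hypothetical helper class for building a multi-document query.
final class MultiDocumentQuerySketch {

    // Returns the name a document would be addressed by: its own file name,
    // or the default "input<position>.xml" when no name is available.
    static String documentName(String fileName, int position) {
        if (fileName == null || fileName.isEmpty()) {
            return "input" + position + ".xml";
        }
        return fileName;
    }

    // Builds a query that lists titles priced above minPrice, most expensive
    // first. Assumes doc("name") is how a named document is referenced.
    static String pricedTitlesQuery(String documentName, int minPrice) {
        return String.join("\n",
            "for $movie in doc(\"" + documentName + "\")/movies/movie",
            "let $price := number($movie/ticket_price)",
            "where $price > " + minPrice,
            "order by $price descending",
            "return $movie/title");
    }

    public static void main(String[] args) {
        System.out.println(pricedTitlesQuery(documentName("movies.xml", 1), 10));
    }
}
```

This version also uses the let and order by clauses, so the price is computed once per movie and reused for both filtering and sorting.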
Now we have a simple method for querying content from our XML files on an individual file-by-file basis, or in bulk using multiple files at once.
Summary
By using APIs to process our XQuery strings instead of built-in functions or open-source libraries, we’ve abstracted the query operation away from our servers, minimizing the amount of code we need to run. We’ve introduced a simple solution for locating and aggregating XML data that we don’t have to worry about updating or maintaining in the future.
Of course, it’s important to note that APIs won’t be viable for every project. As such, we need to use our own judgment to determine when and where web API calls are appropriate in our application architecture.
Opinions expressed by DZone contributors are their own.