DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • How to Convert XLS to XLSX in Java
  • How to Edit a PowerPoint PPTX Document in Java
  • How to Query XML Files Using APIs in Java
  • How To Get the Comments From a DOCX Document in Java

Trending

  • The Cypress Edge: Next-Level Testing Strategies for React Developers
  • Zero Trust for AWS NLBs: Why It Matters and How to Do It
  • Why Documentation Matters More Than You Think
  • Optimize Deployment Pipelines for Speed, Security and Seamless Automation
  1. DZone
  2. Coding
  3. Java
  4. How to Merge Excel XLSX Files in Java

How to Merge Excel XLSX Files in Java

This article discusses programmatically merging XLSX files into a single document as well as open-source and web API solutions for automating this workflow.

By 
Brian O'Neill user avatar
Brian O'Neill
DZone Core CORE ·
Sep. 16, 24 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
6.5K Views

Join the DZone community and get the full member experience.

Join For Free

In this article, we’re going to learn how to increase the efficiency of a common file merging workflow with the help of a web API solution. Specifically, we are going to learn how to merge Excel XLSX files – one of the most common document types we can expect to work within a file-processing automation environment.

Context for Programmatic XLSX File Merging

If we’re writing code to solve problems related to file processing efficiency and automation, there’s a good chance we’re creating and/or expanding applications that deal with large volumes of Excel files. We may, for example, find that there’s no automated workflow currently in place (or, perhaps, an inefficient one) to combine the various unique Excel reports created by each individual department in our organization.  

Monthly, quarterly, and annual reports tend to become disorganized and unmanageable when they are shared with relevant stakeholders as a series of independent files. Resolving this issue by assembling larger multi-worksheet reports often becomes the responsibility of overburdened non-technical employees (i.e., accountants and office administrators) – or it gets handled by too many lines of messy, poorly documented code, which requires constant maintenance and updates to function correctly.

For a common workflow like Excel file merging, the benefits of replacing manual and/or inefficient code processes with a web API boil down to simplicity and efficiency. Calling an external cloud service tends to reduce the amount of code maintenance we need to do, and it reduces the load we place on internal systems by abstracting most of the processing – including local memory allocation, which can be problematic with larger Excel files – to an external server.  On top of that, it helps establish a uniform formatting standard for all our XLSX merging operations, ensuring we can build downstream file processing workflows around consistent, predictable results.  

We’ll look at calling a web API solution with Java code examples towards the end of this article, but first, we’ll talk about XLSX file structure in the context of file merging, and we’ll briefly review a common open-source Java API that can be used locally for the same purpose.

How XLSX File Merging Works

Combining two Excel files programmatically isn’t as complicated as interacting directly with data sets and other multimedia objects stored within the zipped Open Office XML file structure (something I’ve gone over in previous articles). Still, it’s worth understanding the process at a high level before selecting from APIs and open-source libraries that streamline the operation for us.

Each individual worksheet in an XLSX document (or “workbook”) is stored as its own XML file within the xl/worksheets/ folder. The relationship between these files – which, very importantly in the context of file merging operations, includes the exact order worksheets should appear in when we open the XLSX document – is described in a high-level metadata file, typically xl/workbook.xml. This stores the custom worksheet names users view when they open an XLSX document, and it maps those worksheet names to the worksheet contents themselves. This metadata file also contains crucial information about the individual styling and formulas associated with each worksheet, which we’ll need for file merges if we don’t want carefully tailored spreadsheets to reset to plain, unformatted text with missing formulas.

Programmatically merging XLSX files means accessing worksheet information through the zipped XLSX folder path, retrieving all worksheet XML files in the order specified by the high-level metadata file, retrieving styling and formula metadata for each file, and copying those XML files into a new XLSX file. The new XLSX file should carry the worksheet XML references in a specific order that mirrors the order the files were originally loaded; otherwise, we’ll compromise the structure of the original documents when we merge multiple.

Open Source XLSX File Merging in Java

For Java developers interested in doing a little more hands-on coding work, most of the heavy lifting for this operation can be accomplished via Apache POI, a popular Java API for Microsoft Documents. As with most Apache products, the benefits of using Apache POI stem from leveraging a well-documented, open-source solution with simple classes (like XSSFWorkbookin this case). At a high level, you can effectively load Excel files via FileInputStream, represent those files as XSSFWorkbook objects, and create a separate XSSFWorkbook object to store the merged worksheets. Downstream, the copySheet method can read through all source sheets and replicate data contained within them in the new object, and the output can be written to a new workbook via FileOutputStream.  

Beyond the significant hands-on coding this process requires, however, memory allocation may be a concern for large-scale production environments. All data is generally loaded into and managed within JVM heap memory, which can create some problems when larger XLSX files are involved.  As discussed earlier on, one of the primary benefits of leveraging a web API is the ability to abstract memory to an external server, which offloads a considerable portion of the file processing requirements. Conversely, web APIs aren't always a viable option for every environment. We’ll look at a web API solution for handling file merging now.

Demonstration

We’ll look at two different iterations of a web API solution that are optimized to merge XLSX files, and we’ll walk through code examples to call each method correctly. One is designed to handle no more than two files at a time, while the other can handle 10+ files at once. Both are free to use with a free API key.

To structure our API calls, we’ll start by adding the client library to our Maven project.  Let’s first add the JitPack repository reference to our pom.xml file:

XML
 
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>


After that, let’s add the API client dependency reference to pom.xml:

XML
 
<dependencies>
<dependency>
    <groupId>com.github.Cloudmersive</groupId>
    <artifactId>Cloudmersive.APIClient.Java</artifactId>
    <version>v4.25</version>
</dependency>
</dependencies>


Next, let’s add import classes to the top of our file:

Java
 
// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.MergeDocumentApi;


And after that, we’ll initialize the default API client:

Java
 
ApiClient defaultClient = Configuration.getDefaultApiClient();


In our next step, we’ll configure API key authorization.  After we’ve retrieved the authorization instance, we’ll use the setApiKey method to capture our actual key:

Java
 
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");


At this point, we can choose whether to merge 2 Excel documents or 10+ documents in our request. In either case, we’ll replace the example file paths shown in each snippet below with our actual XSLX file paths.

To merge two documents, we can use the below code:

Java
 
MergeDocumentApi apiInstance = new MergeDocumentApi();
File inputFile1 = new File("/path/to/inputfile"); // File | First input file to perform the operation on.
File inputFile2 = new File("/path/to/inputfile"); // File | Second input file to perform the operation on (more than 2 can be supplied).
try {
    byte[] result = apiInstance.mergeDocumentXlsx(inputFile1, inputFile2);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling MergeDocumentApi#mergeDocumentXlsx");
    e.printStackTrace();
}


And to merge 10+, we can use the following code instead:

Java
 
MergeDocumentApi apiInstance = new MergeDocumentApi();
File inputFile1 = new File("/path/to/inputfile"); // File | First input file to perform the operation on.
File inputFile2 = new File("/path/to/inputfile"); // File | Second input file to perform the operation on.
File inputFile3 = new File("/path/to/inputfile"); // File | Third input file to perform the operation on.
File inputFile4 = new File("/path/to/inputfile"); // File | Fourth input file to perform the operation on.
File inputFile5 = new File("/path/to/inputfile"); // File | Fifth input file to perform the operation on.
File inputFile6 = new File("/path/to/inputfile"); // File | Sixth input file to perform the operation on.
File inputFile7 = new File("/path/to/inputfile"); // File | Seventh input file to perform the operation on.
File inputFile8 = new File("/path/to/inputfile"); // File | Eighth input file to perform the operation on.
File inputFile9 = new File("/path/to/inputfile"); // File | Ninth input file to perform the operation on.
File inputFile10 = new File("/path/to/inputfile"); // File | Tenth input file to perform the operation on.
try {
    byte[] result = apiInstance.mergeDocumentXlsxMulti(inputFile1, inputFile2, inputFile3, inputFile4, inputFile5, inputFile6, inputFile7, inputFile8, inputFile9, inputFile10);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling MergeDocumentApi#mergeDocumentXlsxMulti");
    e.printStackTrace();
}


We’ve now invoked the merge method and handled exceptions (we’ll catch ApiException for API-related issues like invalid inputs, network failures, or server errors). We’ll receive the resulting byte[] arrays from mergeDocumentXlsx and/or mergeDcoumentXlsxMulti respectively, and we can wrap up our workflow by creating a new XLSX file or passing those file bytes downstream in our workflow for various other processes.

Conclusion

In this article, we provided some context for Excel XLSX file merging operations, and we briefly reviewed how XLSX file merging occurs according to XLSX file structure. Further, we discussed the pros and cons of using open-source Java APIs for file merging, and we learned how to call a web API using Java code examples that streamline file merging even more efficiently.

API Apache POI Web API XML Java (programming language)

Opinions expressed by DZone contributors are their own.

Related

  • How to Convert XLS to XLSX in Java
  • How to Edit a PowerPoint PPTX Document in Java
  • How to Query XML Files Using APIs in Java
  • How To Get the Comments From a DOCX Document in Java

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!