How to Edit a PowerPoint PPTX Document in Java

Learn how PowerPoint PPTX files are structured and discover popular open-source and web-API solutions for programmatically editing PPTX content in Java.

By Brian O'Neill · Jan. 20, 25 · Tutorial

Building applications that programmatically edit Office Open XML (OOXML) documents like PowerPoint, Excel, and Word files has never been easier. Depending on the scope of their projects, Java developers can use open-source libraries in their code, or plug-and-play API services, to manipulate content stored and displayed in the OOXML structure.

Introduction

In this article, we’ll specifically discuss how PowerPoint Presentation XML (PPTX) files are structured, and we’ll learn the basic processes involved in navigating and manipulating PPTX content. We’ll transition into talking about a popular open-source Java library for programmatically manipulating PPTX files (specifically, replacing instances of a text string), and we’ll subsequently explore a free third-party API solution that can help simplify that process and reduce local memory consumption.

How Are PowerPoint PPTX Files Structured?

Like all OOXML files, PowerPoint PPTX files are structured as ZIP archives containing a series of hierarchically organized XML files. They’re essentially a series of directories, most of which are responsible for storing and arranging the resources we see when we open presentations in the PowerPoint application (or any PPTX file reader).  
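
Because a PPTX file is an ordinary ZIP archive, the JDK's java.util.zip classes can list its parts without any PowerPoint-specific library. The following is a minimal sketch under that assumption; the class name PptxPartLister and the file path passed on the command line are illustrative and not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class, named for illustration only.
public class PptxPartLister {

    /** Returns the name of every entry (part) in a PPTX/ZIP stream, in archive order. */
    public static List<String> listParts(InputStream pptx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(pptx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: PptxPartLister <file.pptx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            listParts(in).forEach(System.out::println);
        }
    }
}
```

Running this against a presentation is a quick way to see the directory layout before deciding which part to edit.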

PPTX archives start with a basic root structure, where a [Content_Types].xml file declares the content types used in the presentation (e.g., slides and multimedia content). The heart of a PPTX document resides in the ppt/ directory, with components like slides (e.g., slide1.xml, slide2.xml, etc.), slide layouts (e.g., templates), slide masters (e.g., global styles and placeholders), and other content (e.g., charts, media, and themes) clearly organized. The relationships between interdependent components in a PPTX file are stored in .rels XML files within _rels directories. PowerPoint and PPTX libraries update these relationship files when slides or other content change; manual edits to the archive do not.

With this file structure in mind, let’s imagine we wanted to manually replace a string of text within a PowerPoint slide without opening the file in PowerPoint or any other PPTX reader. To do that, we would first rename the PPTX file with a .zip extension and unzip its contents. After that, we would check the ppt/presentation.xml file, which lists the slides in order, and we would then navigate to the ppt/slides/ directory to locate our target slide (e.g., slide2.xml). To modify the slide, we would open slide2.xml, locate the text run we needed (typically structured as <a:t>string</a:t> within an <a:r></a:r> element), and replace the text content with a new string. We would then check the ppt/slides/_rels directory to ensure the slide relationships remained intact; after that, we would repackage the contents as a ZIP archive and restore the .pptx extension. All done!
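
The manual steps above can be sketched with the JDK alone: copy the archive entry by entry and rewrite only the target slide part. This is a rough illustration, not a robust editor. It assumes the text to replace sits inside a single <a:t> run, that the match string does not collide with XML markup, and that the replacement needs no XML escaping. The class name PptxZipTextReplacer and the command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, named for illustration only.
public class PptxZipTextReplacer {

    /**
     * Copies a PPTX (ZIP) stream entry by entry. Every part is copied unchanged
     * except slideEntry, whose XML has each occurrence of match replaced.
     */
    public static void replaceInSlide(InputStream pptxIn, OutputStream pptxOut,
                                      String slideEntry, String match, String replacement)
            throws IOException {
        try (ZipInputStream zin = new ZipInputStream(pptxIn);
             ZipOutputStream zout = new ZipOutputStream(pptxOut)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = zin.readAllBytes(); // reads only the current entry
                if (entry.getName().equals(slideEntry)) {
                    // Naive text replacement: assumes the match lies within one <a:t> run.
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = xml.replace(match, replacement).getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 5) {
            System.err.println("Usage: PptxZipTextReplacer <in.pptx> <out.pptx> <slide entry> <match> <replacement>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]));
             OutputStream out = Files.newOutputStream(Paths.get(args[1]))) {
            replaceInSlide(in, out, args[2], args[3], args[4]);
        }
    }
}
```

Text that PowerPoint has split across several runs (for example, after a mid-word formatting change) would not be matched by this approach, which is one reason a dedicated library is usually the safer choice.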

Changing PPTX Files Programmatically in Java

To handle the exact same process in Java, we would have to consider a few different possibilities depending on the context. Obviously, nobody wants to manually map the entire OOXML structure to a custom Java program on the fly — so we’d have to determine whether using an open-source library or a plug-and-play API service would make more sense based on our project constraints.  

If we chose the open-source route, Apache POI would be a great option. Apache POI is an open-source Java API designed specifically to help developers work with Microsoft documents, including PowerPoint PPTX (and also Excel XLSX, Word DOCX, etc.).  

For a project concerned with PPTX files, we would first import relevant Apache POI classes for a PowerPoint project (e.g., XMLSlideShow, XSLFSlide, and XSLFTextShape). We would then load the PPTX file using the XMLSlideShow class, invoke the getSlides() method, filter text content with the XSLFTextShape class, and invoke the getText() and setText() methods to replace a particular string.
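
The steps described above might look like the following sketch. It assumes the Apache POI poi-ooxml artifact is on the classpath, and it only visits top-level text shapes on each slide (text inside tables or grouped shapes is not handled). The class name PoiPptxReplace and the command-line arguments are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

// Hypothetical helper class, named for illustration only.
public class PoiPptxReplace {

    /**
     * Replaces match with replacement in every top-level text shape
     * and returns the number of shapes that were changed.
     */
    public static int replaceText(XMLSlideShow show, String match, String replacement) {
        int changed = 0;
        for (XSLFSlide slide : show.getSlides()) {
            for (XSLFShape shape : slide.getShapes()) {
                if (!(shape instanceof XSLFTextShape)) {
                    continue; // pictures, charts, etc. carry no editable text here
                }
                XSLFTextShape textShape = (XSLFTextShape) shape;
                String text = textShape.getText();
                if (text != null && text.contains(match)) {
                    // setText() rewrites the shape's text, so mixed formatting may be flattened.
                    textShape.setText(text.replace(match, replacement));
                    changed++;
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 4) {
            System.err.println("Usage: PoiPptxReplace <in.pptx> <out.pptx> <match> <replacement>");
            return;
        }
        try (FileInputStream in = new FileInputStream(args[0]);
             XMLSlideShow show = new XMLSlideShow(in)) {
            int changed = replaceText(show, args[2], args[3]);
            try (FileOutputStream out = new FileOutputStream(args[1])) {
                show.write(out);
            }
            System.out.println("Updated " + changed + " text shape(s)");
        }
    }
}
```

Working at the shape level keeps the example close to the getText() and setText() calls named above; preserving per-run formatting would require iterating over paragraphs and text runs instead.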

This would work just fine, but it’s worth noting that the main challenge with an open-source library like Apache POI is memory handling. Apache POI loads the entire presentation into local memory, and although there are some workarounds (e.g., increasing JVM heap size or implementing stream-based APIs), we’re likely to consume significant resources when dealing with large PPTX files at scale.

Leveraging a Third-Party API Solution

If we can’t handle a PPTX editing workflow locally, we might benefit from a cloud-based API solution.   This type of solution offloads the bulk of our file processing to an external server and returns the result, reducing overhead. As a side benefit, it also simplifies the process of structuring our string replacement request. We’ll look at one API solution below.

The example Java code below can be used to call a web API that replaces all instances of a string found in a PPTX document. The API requires an API key, which is available for free, and the request parameters are straightforward to work with.

To structure our API call, we’ll begin by incorporating the client library in our Maven project. We’ll add the following (JitPack) repository reference to our pom.xml:

XML
 
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>


Next, we’ll add the below dependency reference to our pom.xml:

XML
 
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>


With that out of the way, we’ll now copy the import statements below and add them to the top of our file:

Java
 
// Import classes:
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditDocumentApi;
import com.cloudmersive.client.model.ReplaceStringRequest; // request model used in the call below


Now, we’ll use the code below to initialize the API client and configure API key authorization. The setApiKey() method takes our API key string:

Java
 
ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");


Finally, we’ll use the code below to create an instance of the API class, configure the replacement operation, execute the replacement process (returning the edited file as a byte array), and catch/log errors:

Java
 
EditDocumentApi apiInstance = new EditDocumentApi();
ReplaceStringRequest reqConfig = new ReplaceStringRequest(); // ReplaceStringRequest | Replacement document configuration input
try {
    byte[] result = apiInstance.editDocumentPptxReplace(reqConfig);
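    // Printing the array itself would only show its object reference, so report its size instead
    System.out.println("Received edited PPTX: " + result.length + " bytes");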
} catch (ApiException e) {
    System.err.println("Exception when calling EditDocumentApi#editDocumentPptxReplace");
    e.printStackTrace();
}


The JSON below defines the structure of our request; we’ll use this in our code to configure the parameters of our string replacement operation.  

JSON
 
{
  "InputFileBytes": "string",
  "InputFileUrl": "string",
  "MatchString": "string",
  "ReplaceString": "string",
  "MatchCase": true
}


We can prepare a PPTX document for this API request by reading the file into a byte array and converting it to a Base64-encoded string.
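
As a minimal sketch of that preparation step, the JDK's java.nio.file.Files and java.util.Base64 classes are enough. The class name PptxBase64Encoder and the file path passed on the command line are hypothetical; the resulting string is what would populate the InputFileBytes field in the request structure above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper class, named for illustration only.
public class PptxBase64Encoder {

    /** Encodes raw file bytes as a standard (RFC 4648) Base64 string. */
    public static String encode(byte[] fileBytes) {
        return Base64.getEncoder().encodeToString(fileBytes);
    }

    /** Reads the whole file into memory and returns its Base64 representation. */
    public static String encodeFile(Path pptx) throws IOException {
        return encode(Files.readAllBytes(pptx));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: PptxBase64Encoder <file.pptx>");
            return;
        }
        String encoded = encodeFile(Paths.get(args[0]));
        System.out.println("Base64 length: " + encoded.length());
    }
}
```

Note that Base64 output is roughly a third larger than the original file, which is worth keeping in mind for large presentations.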

Conclusion

In this article, we discussed the way PowerPoint PPTX files are structured and how that structure lends itself to straightforward PowerPoint document editing outside of a PPTX reader. We then suggested the Apache POI library as an open-source solution for Java developers to programmatically replace strings in PPTX files, before also exploring a free third-party API solution for handling the same process at less local memory cost.

As a quick final note — for anyone interested in similar articles focused on Excel XLSX or Word DOCX documents, I’ve covered those topics in prior articles over the years.
