Get or Set PDF Metadata in Java

Shaping the metadata of your PDFs around search-friendly keywords can make your documents easier to find online.

By Brian O'Neill · DZone Core · Feb. 24, 21 · Tutorial

Introduction

Because of their fixed, presentable layout, PDF files are widely used in web applications by both users and businesses. Each of these files contains metadata, which is essentially data about data. PDF metadata holds supplementary information about the document, such as its author, subject, title, and creation date. If a PDF was created by converting an original source document (e.g., DOCX or PPT), additional information, such as the file size and whether the file has been optimized for the Web, is automatically added as well.

So why is PDF metadata relevant to your business? If PDF documents are accessible on your website or application, their metadata helps search engines locate them. By including keywords in the metadata that search engines may pick up, you can make your documents easier to find.

The following APIs let you read or set the metadata of PDF files, so you can edit or use that information to meet your business's needs.

Tutorial

To begin, we need to install the SDK with Maven by adding a reference to the JitPack repository in pom.xml:

```xml
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
```



Then we can add a reference to the dependency:

```xml
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v3.90</version>
    </dependency>
</dependencies>
```



Once the installation is complete, we can add the imports to the top of the controller and configure the API key. If you don’t already have an API key, you can register for a free account on the Cloudmersive website to retrieve it. 

```java
// Imports (place these at the top of the controller class):
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditPdfApi;

// The statements below belong inside a method of the controller.
ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
```
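Hardcoding the key as in the snippet above is fine for a quick test, but you may prefer to keep it out of source control. The following is a minimal sketch of one way to do that, assuming the key is supplied through an environment variable. The variable name `CLOUDMERSIVE_API_KEY` and the `ApiKeyResolver` helper are hypothetical names chosen for this example; neither is defined by the SDK.

```java
import java.util.Map;

final class ApiKeyResolver {
    // Hypothetical variable name picked for this sketch; the SDK does not define it.
    static final String ENV_VAR = "CLOUDMERSIVE_API_KEY";

    // Returns the trimmed key from the given environment map,
    // or fails fast with a readable message when it is missing or blank.
    static String resolve(Map<String, String> env) {
        String key = env.get(ENV_VAR);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Set " + ENV_VAR + " before starting the application");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        String key = resolve(System.getenv());
        // Print only the length so the key itself never reaches the logs.
        System.out.println("Loaded an API key of length " + key.length());
    }
}
```

With this helper in place, the configuration line could become `Apikey.setApiKey(ApiKeyResolver.resolve(System.getenv()));`. Passing the environment in as a map keeps the helper easy to test.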



If you only need to read the metadata of a PDF, the following function does it for you; the only input required is the target PDF file.

```java
EditPdfApi apiInstance = new EditPdfApi();
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
try {
    PdfMetadata result = apiInstance.editPdfGetMetadata(inputFile);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfGetMetadata");
    e.printStackTrace();
}
```



However, if you are looking to edit/set metadata for a PDF document, you will use the following API function instead:

```java
EditPdfApi apiInstance = new EditPdfApi();
SetPdfMetadataRequest request = new SetPdfMetadataRequest(); // SetPdfMetadataRequest |
try {
    byte[] result = apiInstance.editPdfSetMetadata(request);
    // Printing a byte[] directly only shows its object reference, so report the size instead.
    System.out.println("Received " + result.length + " bytes");
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfSetMetadata");
    e.printStackTrace();
}
```
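The set-metadata call above returns the updated PDF as a byte array, so in practice you will usually want to write the result to disk rather than print it. Below is a minimal sketch using only the Java standard library. The `PdfResultWriter` class and its method names are hypothetical helpers for this example, and the header check relies on the fact that PDF files begin with the characters `%PDF-`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PdfResultWriter {
    private static final byte[] PDF_HEADER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Cheap sanity check: a PDF starts with the "%PDF-" marker.
    static boolean looksLikePdf(byte[] bytes) {
        if (bytes == null || bytes.length < PDF_HEADER.length) {
            return false;
        }
        for (int i = 0; i < PDF_HEADER.length; i++) {
            if (bytes[i] != PDF_HEADER[i]) {
                return false;
            }
        }
        return true;
    }

    // Writes the bytes to the target path, creating parent directories if needed.
    static Path save(byte[] pdfBytes, Path target) throws IOException {
        if (!looksLikePdf(pdfBytes)) {
            throw new IllegalArgumentException("Response does not look like a PDF");
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.write(target, pdfBytes);
    }
}
```

Inside the `try` block you could then call something like `PdfResultWriter.save(result, java.nio.file.Paths.get("output.pdf"));`, handling the `IOException` alongside the `ApiException`. The output file name here is only a placeholder.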



For the above operation to succeed, the request must contain the input file and the metadata you want to set. The request body has the following shape:

```json
{
  "InputFileBytes": "string",
  "MetadataToSet": {
    "Successful": true,
    "Title": "string",
    "Keywords": "string",
    "Subject": "string",
    "Author": "string",
    "Creator": "string",
    "DateModified": "2021-02-22T17:38:53.962Z",
    "DateCreated": "2021-02-22T17:38:53.962Z",
    "PageCount": 0
  }
}
```
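To make the shape above more concrete, here is a rough sketch that assembles a body of this form with the Java standard library. It rests on several assumptions that the article does not confirm: that `InputFileBytes` carries the file as a Base64 string (the usual JSON encoding for byte arrays), that `Keywords` is a single comma-separated string, and that a subset of the metadata fields is acceptable. `MetadataPayloadSketch` and its methods are hypothetical names. When you use the SDK, the request object handles serialization for you, and in a real project a JSON library would be a safer choice than string concatenation.

```java
import java.time.Instant;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;

final class MetadataPayloadSketch {
    // Trims entries, drops blanks and duplicates, and joins the rest with ", ".
    static String joinKeywords(List<String> keywords) {
        return keywords.stream()
                .map(String::trim)
                .filter(k -> !k.isEmpty())
                .distinct()
                .collect(Collectors.joining(", "));
    }

    // Minimal escaping for backslashes and quotes only; control characters are not handled.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a JSON body with a subset of the fields shown in the article.
    static String build(byte[] pdfBytes, String title, List<String> keywords, Instant modified) {
        String encoded = Base64.getEncoder().encodeToString(pdfBytes);
        return "{\n"
                + "  \"InputFileBytes\": \"" + encoded + "\",\n"
                + "  \"MetadataToSet\": {\n"
                + "    \"Title\": \"" + escape(title) + "\",\n"
                + "    \"Keywords\": \"" + escape(joinKeywords(keywords)) + "\",\n"
                // Instant.toString() produces an ISO-8601 UTC timestamp.
                + "    \"DateModified\": \"" + modified.toString() + "\"\n"
                + "  }\n"
                + "}";
    }
}
```

Normalizing the keywords before they reach the `Keywords` field keeps the value tidy, which matters because those keywords are what you are hoping search engines will pick up.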



In conclusion, we hope that the tools provided in this tutorial can assist in optimizing PDF metadata for your personal or business requirements.
