Get or Set PDF Metadata in Java

If you tailor the metadata of your PDFs for SEO by adding keywords, you can increase the online searchability of your documents.

By Brian O'Neill · Feb. 24, 21 · Tutorial

Introduction

Due to their fixed and presentable nature, PDF files are widely used in web applications by both users and businesses. Each of these files contains 'metadata', which is essentially data about data. PDF metadata holds supplementary information about the document, such as its author, subject, title, creation date, and more. If a PDF file was created by converting an original source document (e.g. DOCX or PPT), additional information, such as the file size and whether the file has been optimized for the Web, is automatically added as well.

So why is this PDF metadata relevant to your business? If you have PDF documents that are accessible on your website or application, the metadata can help search engines locate them. Therefore, if you tailor the metadata of your PDFs to include keywords that search engines may pick up, you can increase the searchability of your documents.

The following API functions let you read or set the metadata of a PDF file, so that you can edit or use that information to meet your business's needs.

Tutorial

To begin, we first need to install the SDK with Maven by adding a reference to the JitPack repository in pom.xml:

XML

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>



Then we can add a reference to the dependency:

XML

<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v3.90</version>
    </dependency>
</dependencies>



Once the installation is complete, we can add the imports to the top of the controller and configure the API key. If you don’t already have an API key, you can register for a free account on the Cloudmersive website to retrieve it. 

Java

// Import classes (top of the controller file):
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditPdfApi;

// Inside your controller method:
ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
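
Hard-coding the key as shown above is fine for a first test, but in a deployed application you would usually load it from configuration. The following is a minimal sketch of that idea using only the JDK; the class name `ApiKeyResolver` and the environment variable name `CLOUDMERSIVE_API_KEY` are assumptions made for illustration and are not part of the SDK.

```java
import java.util.Map;

/** Sketch: look up the API key from the environment instead of hard-coding it. */
final class ApiKeyResolver {
    // Hypothetical variable name; choose whatever fits your deployment.
    static final String ENV_VAR = "CLOUDMERSIVE_API_KEY";

    /** Returns the trimmed key, or throws if it is missing or blank. */
    static String resolve(Map<String, String> env) {
        String key = env.get(ENV_VAR);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("Set " + ENV_VAR + " before calling the API");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        String key = resolve(System.getenv());
        // Never print the key itself; its length is enough to confirm it loaded.
        System.out.println("API key loaded (" + key.length() + " characters)");
    }
}
```

The resolved value can then be passed to `Apikey.setApiKey(...)` in place of the literal string. Taking the environment as a `Map` parameter keeps the lookup easy to test.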



If you simply wish to gather metadata from the PDF, the following function will perform the action for you—all you need for input is the target PDF file. 

Java

EditPdfApi apiInstance = new EditPdfApi();
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
try {
    PdfMetadata result = apiInstance.editPdfGetMetadata(inputFile);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfGetMetadata");
    e.printStackTrace();
}
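
Because the only input is a file, it can be worth checking locally that the file exists and looks like a PDF before spending an API call on it. The sketch below is one way to do that with the JDK alone (Java 9 or later). It relies on the fact that a well-formed PDF begins with the ASCII marker `%PDF-`; the class and method names are hypothetical and not part of the Cloudmersive SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Sketch: reject obviously wrong inputs before sending them to the API. */
final class PdfPrecheck {
    // A well-formed PDF file begins with the ASCII marker "%PDF-".
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** True if the given bytes start with the PDF marker. */
    static boolean hasPdfHeader(byte[] data) {
        return data.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(data, MAGIC.length), MAGIC);
    }

    /** True if the path is a regular file whose first bytes are the PDF marker. */
    static boolean looksLikePdf(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            return false;
        }
        try (InputStream in = Files.newInputStream(path)) {
            byte[] head = new byte[MAGIC.length];
            int read = in.readNBytes(head, 0, head.length); // available since Java 9
            return hasPdfHeader(Arrays.copyOf(head, read));
        }
    }
}
```

A caller might guard the request with `if (PdfPrecheck.looksLikePdf(inputFile.toPath())) { ... }`. This is only a cheap sanity check and does not validate the rest of the file.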



However, if you are looking to edit/set metadata for a PDF document, you will use the following API function instead:

Java

EditPdfApi apiInstance = new EditPdfApi();
// Request object holding the input PDF and the metadata to set
SetPdfMetadataRequest request = new SetPdfMetadataRequest();
try {
    // The result is the content of the updated PDF
    byte[] result = apiInstance.editPdfSetMetadata(request);
    // Printing a byte[] directly only shows its reference, so report the size instead
    System.out.println("Received " + result.length + " bytes");
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfSetMetadata");
    e.printStackTrace();
}
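
Since `editPdfSetMetadata` returns the updated document as a `byte[]`, you will usually want to save it rather than print it. The following is a minimal sketch using only the JDK; `PdfResultWriter` and its `save` method are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: persist the bytes returned by the API as a PDF file. */
final class PdfResultWriter {
    /** Writes the bytes to target, creating parent directories if needed. */
    static Path save(byte[] pdfBytes, Path target) throws IOException {
        if (pdfBytes == null || pdfBytes.length == 0) {
            throw new IllegalArgumentException("No PDF content to write");
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Creates the file, or replaces its content if it already exists.
        return Files.write(target, pdfBytes);
    }
}
```

With this in place, the `try` block could call something like `PdfResultWriter.save(result, Paths.get("output.pdf"))`, where the output path is again just a placeholder.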



For the above operation to work, the request must carry the input PDF and the metadata you want to set. The request body has the following shape:

JSON

{
  "InputFileBytes": "string",
  "MetadataToSet": {
    "Successful": true,
    "Title": "string",
    "Keywords": "string",
    "Subject": "string",
    "Author": "string",
    "Creator": "string",
    "DateModified": "2021-02-22T17:38:53.962Z",
    "DateCreated": "2021-02-22T17:38:53.962Z",
    "PageCount": 0
  }
}
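
To make the shape above more concrete, here is a rough sketch that assembles such a body by hand with only the JDK. It assumes the file content travels as a Base64 string in `InputFileBytes` (the usual JSON encoding for byte-array fields) and that dates use the ISO-8601 form shown above. The class name and the choice of fields are illustrative only; when you use the SDK, the `SetPdfMetadataRequest` object handles serialization for you.

```java
import java.time.Instant;
import java.util.Base64;

/** Sketch: build a set-metadata request body as a JSON string. */
final class SetMetadataJson {
    /** Escapes quotes, backslashes and control characters for a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a body with a subset of the fields from the shape above. */
    static String build(byte[] pdf, String title, String keywords,
                        String author, Instant modified) {
        // Assumption: the PDF bytes are sent Base64-encoded.
        String encoded = Base64.getEncoder().encodeToString(pdf);
        return "{"
                + "\"InputFileBytes\":\"" + encoded + "\","
                + "\"MetadataToSet\":{"
                + "\"Title\":\"" + escape(title) + "\","
                + "\"Keywords\":\"" + escape(keywords) + "\","
                + "\"Author\":\"" + escape(author) + "\","
                + "\"DateModified\":\"" + modified + "\""
                + "}}";
    }
}
```

Fields such as `Successful` and `PageCount` are left out of the sketch because they look like values reported by the service rather than ones you would set, though that is also an assumption.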



In conclusion, we hope that the tools provided in this tutorial can assist in optimizing PDF metadata for your personal or business requirements.
