Merging PDFs with PDFBox
Merging Portable Document Format (PDF) documents using PDFBox couldn’t be simpler. The developers of PDFBox have taken care of all of the hard work and encapsulated it in one class of their application programming interface (API). All you need to do is use it.
The class I am referring to is the PDFMergerUtility class. This class provides everything you need to take multiple single-page or multi-page PDF documents and merge them into one PDF document. Below I will go over the simple steps of using this class to merge all PDFs located in a directory without having to pass each file as an argument.
The first step is to initialize the class as follows:
PDFMergerUtility mergePdf = new PDFMergerUtility();
With the class initialized we can start to use it to merge our PDFs. The next step in our process is to read and store the two arguments that are passed into our application for later use. When invoking our utility from the command line we expect two arguments: the first is the folder that contains the documents, and the second is the file name of the final merged PDF. We store these arguments as two String variables:
String folder = args[0];
String destinationFileName = args[1];
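Reading args[0] and args[1] directly fails with an ArrayIndexOutOfBoundsException if the utility is started with fewer than two arguments. The following is a minimal sketch of a check that could run first. The class name ArgumentCheck and its validate method are hypothetical and not part of the original utility.

```java
import java.io.File;

// Hypothetical helper; not part of the original utility or of PDFBox.
class ArgumentCheck {

    // Returns null when the arguments look usable,
    // otherwise a short message describing the problem.
    static String validate(String[] args) {
        if (args == null || args.length != 2) {
            return "Usage: <folder> <destinationFileName>";
        }
        if (!new File(args[0]).isDirectory()) {
            return "Path is not a directory: " + args[0];
        }
        if (args[1].trim().isEmpty()) {
            return "Destination file name must not be empty";
        }
        return null;
    }
}
```

In main, one option is to call ArgumentCheck.validate(args), print the message, and exit if the result is not null, before assigning folder and destinationFileName.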
The next step is to get hold of all of the files in the directory that was passed to our utility, whose path we stored in the String variable called folder. For this I wrote a small method that uses the java.io.File class.
private static String[] getFiles(String folder) throws IOException {
    File _folder = new File(folder);
    String[] filesInFolder;
    if (_folder.isDirectory()) {
        filesInFolder = _folder.list();
        return filesInFolder;
    } else {
        throw new IOException("Path is not a directory");
    }
}
The first thing we check is that the path passed to us is in fact a directory. If not, we throw an IOException with the message "Path is not a directory". After we have verified that this is a directory, we use the list() method from the java.io.File class to get its contents. The list() method returns an array with the names of all of the files and directories it contains. We store this in a String array and return the array to the caller.
String[] filesInFolder;
if (_folder.isDirectory()) {
    filesInFolder = _folder.list();
    return filesInFolder;
}
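Because list() returns every entry in the directory, including subdirectories and non-PDF files, and in no guaranteed order, a stricter variant may be useful. The following is a sketch only: the class PdfLister and the method getPdfFiles are hypothetical names, and the choices to match the ".pdf" extension case-insensitively and to sort by name are assumptions, not something the original utility does.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical variant of getFiles(); not part of the original utility.
class PdfLister {

    static String[] getPdfFiles(String folder) throws IOException {
        File dir = new File(folder);
        if (!dir.isDirectory()) {
            throw new IOException("Path is not a directory");
        }
        // Keep only regular files whose name ends in ".pdf" (any letter case).
        String[] names = dir.list((parent, name) ->
                name.toLowerCase(Locale.ROOT).endsWith(".pdf")
                        && new File(parent, name).isFile());
        // list() returns null if the directory cannot be read.
        if (names == null) {
            throw new IOException("Could not list directory: " + folder);
        }
        // Sort by name so the merge order is predictable.
        Arrays.sort(names);
        return names;
    }
}
```

Sorting by name means the pages of the merged document follow the alphabetical order of the source file names, which is often what is wanted when files are numbered.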
Because the final steps of our utility can cause one of two exceptions to be thrown, we enclose them within a try/catch block. The first thing we do inside the try block is store the size of the array in an int variable called numberOfFiles; we will use this in our for loop a little later. Next we store our files in a String[] called, you guessed it, files. Armed with this information, we can go ahead and loop through our array of files. We need the loop because each file has to be added as a source of the PDFMergerUtility using its addSource method.
The for loop is then also where we will be making use of the first of our two variables, numberOfFiles.
for(int i = 0; i < numberOfFiles; i++)
Inside the loop we add each file to the PDFMergerUtility’s sources using the following line of code:
mergePdf.addSource(folder + File.separator + files[i]);
The only steps left are to set the file name and location of the merged document and then call the PDFMergerUtility’s mergeDocuments() method.
mergePdf.setDestinationFileName(folder + File.separator + destinationFileName);
mergePdf.mergeDocuments();
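Since the merged document is written into the same folder that the sources are read from, running the utility a second time would pick up the earlier output as one of its sources. The following sketch shows one way to guard against that. The class SourcePaths and its build method are hypothetical and not part of the original utility. It only builds the list of paths and leaves the call to addSource to the caller.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the original utility or of PDFBox.
class SourcePaths {

    // Builds the full path for each file name in the folder,
    // skipping any entry that has the same name as the merged output.
    static List<String> build(String folder, String[] files, String destinationFileName) {
        List<String> paths = new ArrayList<>();
        for (String name : files) {
            if (name.equals(destinationFileName)) {
                continue; // leftover output from an earlier run
            }
            paths.add(folder + File.separator + name);
        }
        return paths;
    }
}
```

Each returned path could then be passed to mergePdf.addSource(...) in place of the expression built inside the for loop.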
To close off our try block we catch the two possible exceptions that could be thrown by the methods used inside it: a COSVisitorException and an IOException. With this done, our utility is complete! I hope you enjoyed this tutorial and find the utility useful. You can download the complete source here and use it as you see fit. Please feel free to post your comments as to how this utility can be improved and expanded upon.
Published at DZone with permission of Schalk Neethling. See the original article here.