How to Generate Server-Side PDF Reports With Puppeteer, D3, and Handlebars

Looking for a way to create a design-heavy, data-driven, beautifully styled PDF report—server-side with similar tools to what you are already using on the front-end? Stop your search. You’ve come to the right place.

Carlos Lantigua

Aug. 10, 20 · Tutorial


I was in the same boat a few months ago while helping a client with this exact problem. To solve it, I developed a four-step solution using Puppeteer, D3, and Handlebars. In this post, I’ll give you step-by-step instructions for creating server-side PDF reports. Let’s dive in.

An example of a PDF page generated using this method.

In this post, we’ll cover:

Setting up Puppeteer and Handlebars
Creating a generator to make our PDF
Building out a handlebars template
Adding the finishing touches

The Challenges of Creating These PDF Reports:

Because we’re using a template framework to access standard web technologies along with Puppeteer to manage the PDF, we’ll need to think about these things during development:

Pages will manually need to be constrained.
We won’t have access to CSS media props other than “screen.” (no “page-break-after” or the print media type)
We won’t be able to use dev tools to debug irregularities once the PDF is compiled and rendered.
Puppeteer itself adds extra build time and size to your deployments.
Generating a report can take a while depending on file size.

For this example, let’s assume we already have the base of our project up and running with Node/Express and some type of ORM and DB solution. We’re all set to feed our sweet, sweet data into a report.

The Tools We Need to Make This Happen

Handlebars

An HTML templating framework from the Mustache family. It supports partial templates (fancy talk for components) along with custom and built-in helpers to expand on our logic.

    Shell
   
npm install handlebars

Example using partials and built-in blocks

    HTML
   
{{#each poleComparison as |page|}}
<div class="page">
  {{#each page.pairs as |polePair|}}
    {{> comparison-header polePair=polePair }}
        <div class="comparison-tables">
            {{> comparison-body polePair=polePair }}
        </div>
  {{/each}}
  {{> footer @root }}
</div>
{{/each}}
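
The template above loops over pages, and each page holds a list of pairs. Since pages have to be constrained manually, the data needs to be split into page-sized chunks before it reaches the template. Here is a minimal sketch of that step. The paginate helper and the two-pairs-per-page limit are assumptions for illustration, not part of the original project.

```javascript
// Split a flat list into page-sized chunks so each .page container
// receives only as many items as it can hold.
// `paginate` and `pairsPerPage` are hypothetical names for this sketch.
function paginate(items, pairsPerPage) {
    if (!Number.isInteger(pairsPerPage) || pairsPerPage < 1) {
        throw new RangeError("pairsPerPage must be a positive integer");
    }
    const pages = [];
    for (let i = 0; i < items.length; i += pairsPerPage) {
        // Each page object exposes a `pairs` array, matching `page.pairs`
        // in the template.
        pages.push({ pairs: items.slice(i, i + pairsPerPage) });
    }
    return pages;
}

// Example: five pairs, two per page, gives three pages.
const poleComparison = paginate(["a", "b", "c", "d", "e"], 2);
console.log(poleComparison.length); // 3
```

The resulting array can be passed to the template as poleComparison. How many items fit on a page depends on your layout, so expect to tune that number by eye.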

Puppeteer

A Node library that gives us access to a headless Chrome instance for generating the PDF from our compiled Handlebars templates.

    Shell
   
npm install puppeteer

A list of use cases:

Generate screenshots and PDFs of pages.
Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. “SSR” (Server-Side Rendering)).
Create an up-to-date, automated testing environment.
Test Chrome Extensions.

D3 (Data-Driven Documents)

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.

    HTML
   
<script src="https://d3js.org/d3.v5.min.js"></script>

Step One: Setting Up Puppeteer & Handlebars

First, we’ll create a directory for our PDF then import the required modules. This will be a JavaScript file that we’ll place within the server-side structure of our application. We can call this generatePDF.js for convenience.

    JavaScript
   
const puppeteer = require("puppeteer"); 
const hbs = require("handlebars");

Next, we’ll need to let Handlebars compile our template. We will create a compile function that locates the .hbs file and uses Handlebars’ built-in compile method to do this.

    JavaScript
   
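const path = require("path");
const fs = require("fs");
const puppeteer = require("puppeteer");
const hbs = require("handlebars");
 
// Read templates/<templateName>.hbs, compile it, and render it with `data`.
const compile = async (templateName, data) => {
    const filePath = path.join(__dirname, "templates", `${templateName}.hbs`);
    // path.join always returns a string, so check that the file really exists.
    if (!fs.existsSync(filePath)) {
        throw new Error(`Could not find ${templateName}.hbs in generatePDF`);
    }
    const html = await fs.promises.readFile(filePath, "utf-8");
    return hbs.compile(html)(data);
};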

This method allows us to also inject the data that we will be using into our template.

Finally, we’ll want to set up our generatePDF function. Its job will be to open a headless Chromium instance through Puppeteer and convert our template into PDF format.

    JavaScript
   
let browser; 
const generatePDF = async (fileName, data) => {
    try {
        if (!browser) {
            browser = await puppeteer.launch({
                args: [
                "--no-sandbox",
                "--disable-setuid-sandbox",
                "--disable-dev-shm-usage"
                ],
                headless: true,
            })
        }
    } catch (err) {
        ...

We’ve passed some configuration options to our Puppeteer browser that will make it headless and lightweight. We also don’t want multiple browsers open at the same time, because that can cause performance issues when generating multiple reports. That is why the browser variable lives outside the function and is only launched when it doesn’t exist yet.
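
One thing to watch with a shared browser is two reports being requested at nearly the same moment, before the first launch has finished. Here is a rough sketch of one way to guard against that by caching the launch promise rather than the browser itself. The createBrowserProvider helper and its launch parameter are hypothetical names; in this project, launch would be a function that calls puppeteer.launch with the options shown above.

```javascript
// Returns a function that launches the browser once and hands every
// caller the same instance. Caching the promise (not the resolved value)
// means concurrent first calls do not trigger two launches.
// `createBrowserProvider` and `launch` are hypothetical names.
function createBrowserProvider(launch) {
    let browserPromise = null;
    return function getBrowser() {
        if (!browserPromise) {
            browserPromise = Promise.resolve()
                .then(() => launch())
                .catch((err) => {
                    // Forget the failed attempt so the next call can retry.
                    browserPromise = null;
                    throw err;
                });
        }
        return browserPromise;
    };
}
```

Inside generatePDF you would then write something like const browser = await getBrowser(); instead of checking the variable by hand.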

Next, we’ll create a new incognito browser context. We’ll use this instead of the default context because it won’t share cookies or cache with other browser contexts. That isolation is helpful for other features of Puppeteer but isn’t strictly needed for this process. (Recent Puppeteer releases rename this method to createBrowserContext.)

    JavaScript
   
                ],
                headless: true,
            })
        }
        const context = await browser.createIncognitoBrowserContext();
        const page = await context.newPage();
        const content = await compile(fileName, data);
 
    } catch (err) {
        ...
    }
}

Now we’ll set up our content and tell puppeteer to wait until everything is loaded before rendering the PDF.

    JavaScript
   
        const content = await compile(fileName, data);
 
        // Load the compiled HTML directly and wait until network activity
        // (for example, the D3 CDN script) has settled before rendering.
        await page.setContent(content, {
            waitUntil: "networkidle0"
        });
        await page.emulateMedia("screen");
 
    } catch (err) {
        ...
    }
}

* page.setContent takes an HTML string and config options. We won’t be traveling to a URL, so instead of page.goto we hand Puppeteer our compiled HTML directly.

* emulateMedia changes the CSS media type used on the page. We’ll want our media type to reflect the CSS used for screens. (Newer Puppeteer releases name this method emulateMediaType.)

Next, we’ll set our page format so that Puppeteer knows how to render the PDF. Keep in mind that Puppeteer has no concept of where we want to split our actual content (that will be handled later through our template’s CSS).

    JavaScript
   
        await page.emulateMedia("screen");
 
        const pdf = await page.pdf({
            format: "A4",
            printBackground: true,
        });
 
        await context.close();
        return pdf;
 
    } catch (err) {
        ...
    }
}
 
// Export the function so other modules can require it.
module.exports = generatePDF;
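
Note that in the snippet above the context is only closed when everything succeeds. If rendering throws, the context stays open. Here is a small sketch of one way to guarantee cleanup with try/finally. The withPage helper is a hypothetical name, and browser is assumed to be the Puppeteer browser launched earlier.

```javascript
// Run `work(page)` inside a fresh incognito context and always close the
// context afterwards, even if `work` throws.
// `withPage` is a hypothetical helper; `browser` is assumed to expose
// createIncognitoBrowserContext(), as used earlier in this article.
async function withPage(browser, work) {
    const context = await browser.createIncognitoBrowserContext();
    try {
        const page = await context.newPage();
        return await work(page);
    } finally {
        await context.close();
    }
}

// Possible usage inside generatePDF (not executed here):
//   const pdf = await withPage(browser, async (page) => {
//       await page.setContent(content);
//       return page.pdf({ format: "A4", printBackground: true });
//   });
```

Because the cleanup lives in one place, any error from the page still reaches your catch block while the context gets closed either way.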

Step Two: Set Up Our Handlebars Templates

We’ll start by creating our first handlebars template file for our report. Notice that the syntax looks and acts just like regular HTML.

    Plain Text
   
our_report.hbs

    HTML
   
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Our Cool PDF Report</title>
  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0" />
  <script src="https://d3js.org/d3.v5.min.js"></script>
</head>
  <body>
      <div>
        <p>Hello World</p>
      </div>
  </body>
</html>

Let's Bring Our Data Into Our Template

We can use some of Handlebars’ built-in blocks to help us interact with the data that we injected earlier in our compile function. We can use the “with” block to set the context to the data that we need, then an “each” block to iterate over it.

    HTML
   
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Our Cool PDF Report</title>
  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0" />
  <script src="https://d3js.org/d3.v5.min.js"></script>
</head>
  <body>
      <div>
        {{#with Data as |myData| }}
         {{#each myData.text as |text| }}
         <p>{{text}}</p>
         {{/each}}
        {{/with }}
      </div>
  </body>
</html>

Now We Can Add Some Process to Generate Our PDF

Now in our Node app we can use our generatePDF function to create our PDF. This would be the time for you to decide what you ultimately want to do with the report. You could store it in your database, serve it to the client-side, or stash it in an S3 bucket. There’s a lot of freedom here depending on your application’s needs.

    JavaScript
   
const generatePDF = require('./generatePDF.js');
 
const generateReportWithData = reportData => {
    return generatePDF("our_report", reportData);
};
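
If you go the route of serving the report to the client, here is a rough sketch of what that could look like. The sendPdf helper and the fileName parameter are hypothetical names. The function only relies on setHeader and end, so it should work with an Express response as well as a plain Node http response.

```javascript
// Write a generated PDF to an HTTP response as a file download.
// `sendPdf` and `fileName` are hypothetical names for this sketch.
// Assumes `fileName` is a plain ASCII name without quotes.
function sendPdf(res, pdf, fileName) {
    // Depending on the Puppeteer version, page.pdf() may hand back a
    // Buffer or a Uint8Array, so normalise to a Buffer first.
    const body = Buffer.isBuffer(pdf) ? pdf : Buffer.from(pdf);
    res.setHeader("Content-Type", "application/pdf");
    res.setHeader("Content-Length", body.length);
    res.setHeader("Content-Disposition", `attachment; filename="${fileName}"`);
    res.end(body);
}
```

In a route handler, that might be used as sendPdf(res, await generateReportWithData(reportData), "report.pdf").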

If you have different types of reports, you can take this opportunity to toss in a switch statement and some logic to decide which report to generate.
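
As a sketch of what that switch could look like, here is one possible shape. The report types and template names are made up for illustration, and generatePDF is passed in as a parameter so the example runs on its own.

```javascript
// Pick the template for a given report type.
// The type names and template names here are hypothetical.
function templateFor(type) {
    switch (type) {
        case "summary":
            return "summary_report";
        case "comparison":
            return "comparison_report";
        default:
            throw new Error(`Unknown report type: ${type}`);
    }
}

// `generatePDF` is injected so this sketch does not depend on Puppeteer.
// Being async, an unknown type surfaces as a rejected promise.
const generateReportOfType = async (generatePDF, type, reportData) =>
    generatePDF(templateFor(type), reportData);
```

Rejecting unknown types early keeps a typo in the report type from turning into a confusing missing-template error later on.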

Step Three: Build Out a Handlebars Template

Now we can set up our template styling. We’ll create a file called style.hbs. I like to set up global CSS variables to keep me honest with a lot of my styling. pt is a recommended unit for printable documents, and I found that px didn’t always work so well for text. I also found that em units translate better for letter spacing than pixels. This made it easier to match the design kerning/letter-spacing when converting the values.

Building Out The Page Constraints

    Plain Text
   
style.hbs

    CSS
   
:root {
    --font-s-small: 8pt;
    --font-s-normal: 10pt;
    --font-s-mid: 12pt;
    --font-s-large: 14pt;
    /* Kerning */
    --ltr-spc-200: 0.2em;
    --ltr-spc-100: 0.1em;
    --ltr-spc-020: 0.02em;
    --ltr-spc-025: 0.025em;
}

If you recall, we talked about how Puppeteer has no context on where we want to split our document into pages. It will generate the PDF and break up the pages as it sees fit, regardless of where our content sits. This means that our content will just spill over to the next page automatically when it overflows, and we don’t want that, as we would rather be in control. We’ll add some styling to let our HTML body continue forever and a page container that matches the constraints of an A4 formatted page. If you are using a different format, you’ll need to plug that format’s numbers into the height and width of the page container.

    HTML
   
<style>
html, body {
    height: 100%;
    margin: 0;
    padding: 0;
}
.page {
    background: white;
    display: block;
    margin: 0 auto;
    margin-bottom: 8.5em;
    /* Size = A4 */
    width: 21cm;
    height: 29.7cm;
    padding: 5em 30px 0 30px;
    position: relative;
}
</style>
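
If you need to support a format other than A4, here is a small sketch that looks up the width and height to plug into the page container. The pageSizeCss helper is a hypothetical name. The dimensions are the standard portrait sizes for each paper format.

```javascript
// Physical page sizes for a few common formats (width x height, portrait).
const PAGE_SIZES = {
    A4: { width: "21cm", height: "29.7cm" },
    A3: { width: "29.7cm", height: "42cm" },
    Letter: { width: "8.5in", height: "11in" },
    Legal: { width: "8.5in", height: "14in" },
};

// Build the width/height declarations for the .page container.
// `pageSizeCss` is a hypothetical helper name.
function pageSizeCss(format) {
    if (!Object.prototype.hasOwnProperty.call(PAGE_SIZES, format)) {
        throw new Error(`Unsupported page format: ${format}`);
    }
    const size = PAGE_SIZES[format];
    return `width: ${size.width}; height: ${size.height};`;
}

console.log(pageSizeCss("Letter")); // width: 8.5in; height: 11in;
```

Whatever format you pick here should match the format option passed to page.pdf, otherwise the page containers and the PDF pages will drift apart.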

In the File Where You Set Up Puppeteer and Handlebars

Since we created a style.hbs file, we’ll want to register it as a partial so that we can just plug it into our template. This way we won’t have to jam all of our styles into our main template file, and we can reuse the code if we need to.

    JavaScript
   
hbs.registerPartial("style", fs.readFileSync(
    path.join(__dirname, "/path/to/style.hbs"), "utf-8"),
);

Now that it has been registered as a handlebars partial, we can simply bring it into our template.

    HTML
   
{{> style }}
</head>
  <body>
      <div class="page">
        {{#with Data as |myData| }}
         {{#each myData.text as |text| }}
         <p>{{text}}</p>
         {{/each}}
        {{/with }}
      </div>
  </body>

Step Four: Adding in Some D3

We already brought the D3 CDN link into our template header:

    HTML
   
<script src="https://d3js.org/d3.v5.min.js"></script>

Now it’s time to create a partial for our D3 script. We’ll do this using the same method that we used to create the style.hbs partial. Register a d3_script.hbs file as a Handlebars partial.

    JavaScript
   
hbs.registerPartial("d3_script", fs.readFileSync(
    path.join(__dirname, "/path/to/d3_script.hbs"), "utf-8"),
);

Then we can drop it into our main template as needed. Also note the canvas anchor div used in the template to give our D3 a foundation to start from.

    Plain Text
   
my_template.hbs

    HTML
   
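     {{/each}}
    {{/with }}
  </div>
  <div class="canvas"></div>
</body>
 
<script type="text/javascript">
// Example dimensions; adjust these to fit your page layout.
const width = 600;
const height = 400;
 
const svg = d3
    // Select by class to match the <div class="canvas"> anchor above.
    .select('.canvas')
    .append('svg')
    .attr('viewBox', [-width / 2, -height / 2, width, height]);
</script>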
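
To draw anything meaningful, the D3 script also needs the report data. One possible approach, sketched here, is to serialize the data into the inline script through a small Handlebars helper. The toScriptJson function, the json helper name, and the chartData field are all assumptions for illustration.

```javascript
// Serialise a value for embedding inside an inline <script> block.
// Escaping "<" keeps a string such as "</script>" in the data from
// closing the script element early.
// `toScriptJson` is a hypothetical helper name; it expects a value that
// JSON.stringify can serialise (not undefined or a function).
function toScriptJson(value) {
    return JSON.stringify(value).replace(/</g, "\\u003c");
}

// With Handlebars this could be registered as a helper (not run here):
//   hbs.registerHelper("json", toScriptJson);
// and used in the template with triple braces to skip HTML escaping:
//   const chartData = {{{json chartData}}};

console.log(toScriptJson({ label: "</script>", values: [1, 2] }));
// {"label":"\u003c/script>","values":[1,2]}
```

The escaped output is still valid JSON and a valid JavaScript literal, so the script in the template receives the same values that were passed to compile.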

And there you have it. Let me know your thoughts on this solution for creating server-side PDF reports, and if you run into any issues, feel free to contact me.


Published at DZone with permission of Carlos Lantigua. See the original article here.
