Downloading GCP Docs as a PDF

Here's a handy open source Node app that lets you compile and download GCP docs as PDFs.

Google Cloud Platform has done a nice job documenting its products (of which it has many). But to go through the documentation, we have only one option: visit the website and click through all the links in the left-hand navigation pane. Although that's still kind of handy, it can be a bit cumbersome, especially for cases like offline reading.

If you are like me and have felt that pain, then we might have a solution. I have created a small Node.js app to download GCP documentation by product as a PDF. The source code for the app is here. It's fairly simple to use: just clone the repo and run npm link. Once you have done that, you just run the gcp-docs command and answer a few questions.

The app uses Puppeteer, which is a ‘Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol.’ So for a given product, it goes to the docs page, e.g. https://cloud.google.com/iam/docs/, and saves the navigation links in a JSON file. Then it visits all the links and saves each page as a PDF. In the end, it merges all these PDFs into one PDF and deletes the individual files and their containing directory. Here are the output logs from the app:

[Image: output logs from the app]
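
For readers curious what the crawl-and-print flow described above might look like in code, here is a minimal sketch. It is not the app's actual source: the function names pdfNameFor and savePagesAsPdf, the "nav a[href]" selector, and the A4 page format are all assumptions made for illustration. It needs Puppeteer installed (npm install puppeteer), and it leaves out the final merge step, which would need a separate PDF library.

```javascript
// Sketch only; the real app may be structured quite differently.
const fs = require("fs");
const path = require("path");

// Hypothetical helper: build a sortable PDF file name from a docs URL.
function pdfNameFor(url, index) {
  const slug = new URL(url).pathname
    .replace(/^\/+|\/+$/g, "")        // trim leading and trailing slashes
    .replace(/[^a-zA-Z0-9]+/g, "-");  // turn every other separator into a dash
  return `${String(index).padStart(3, "0")}-${slug || "index"}.pdf`;
}

// Hypothetical driver: collect navigation links, then print each page to PDF.
async function savePagesAsPdf(startUrl, outDir) {
  // Loaded lazily so the helper above can be used without Puppeteer installed.
  const puppeteer = require("puppeteer");
  fs.mkdirSync(outDir, { recursive: true });

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(startUrl, { waitUntil: "networkidle2" });

    // Assumed selector: every anchor inside a <nav> element on the docs page.
    const links = await page.$$eval("nav a[href]", (anchors) =>
      anchors.map((a) => a.href)
    );
    const unique = [...new Set(links)];

    // Keep a record of the links, as the article describes.
    fs.writeFileSync(path.join(outDir, "links.json"), JSON.stringify(unique, null, 2));

    for (const [i, link] of unique.entries()) {
      await page.goto(link, { waitUntil: "networkidle2" });
      await page.pdf({ path: path.join(outDir, pdfNameFor(link, i + 1)), format: "A4" });
    }
    return unique;
  } finally {
    await browser.close();
  }
}
```

Calling savePagesAsPdf("https://cloud.google.com/iam/docs/", "./iam-pdfs") would, under these assumptions, leave one numbered PDF per navigation link in ./iam-pdfs, ready to be merged.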

As the logs show, the app excludes a few links, like the starting page of a section, which only contains links to the content of that section. It also excludes links that are not on cloud.google.com, as well as links that have ref or reference in them (there are a lot of them in some products and they are not that useful in my opinion).
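
The exclusion rules above could be expressed as a small filter like the one below. This is a sketch under my reading of those rules, not the app's real code: the name shouldDownload is hypothetical, the check looks only at the URL path, and the rule about section starting pages is left out because it cannot be decided from the URL alone.

```javascript
// Hypothetical filter; returns true when a link should be saved as a PDF.
function shouldDownload(link) {
  let url;
  try {
    url = new URL(link);
  } catch {
    return false; // not an absolute URL, so skip it
  }

  // Rule 1: only keep pages hosted on cloud.google.com.
  if (url.hostname !== "cloud.google.com") return false;

  // Rule 2: skip links with "ref" in the path. Since "reference" contains
  // "ref", one check covers both. Note this is a plain substring match, so
  // it would also skip a path containing a word such as "refresh".
  if (/ref/i.test(url.pathname)) return false;

  return true;
}
```

For example, shouldDownload("https://cloud.google.com/iam/docs/overview") returns true, while a link under /iam/docs/reference/ or a link to another host returns false.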

I hope it helps a few of you, as it helped me.

Credit: A special shout out to Amit Malhotra for polishing the app and getting his hands dirty fixing a lot of bugs.

Topics:
open source, gcp, documentation, pdf, tutorial

Opinions expressed by DZone contributors are their own.
