Transforming HTML with Node.js and jQuery


The npm module jsdom enables you to use jQuery to examine and transform HTML on Node.js. This post explains how.

The Basics

As a tool for processing HTML, Node.js offers an important foundation: it can download or upload data, and it can read from and write to disk [1]. What it lacks is the ability to parse and transform HTML. Luckily, jQuery is ideally suited for this task. The jsdom module implements the HTML DOM on top of Node.js, which is everything that jQuery needs to run on that platform. To install it, use npm, the Node package manager:

npm install jsdom

jsdom is very easy to use:

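var fs = require("fs");
var jsdom = require("jsdom");  // used by call_jsdom

var htmlSource = fs.readFileSync("dummy.html", "utf8");
call_jsdom(htmlSource, function (window) {
    var $ = window.$;

    var title = $("title").text();
    $("h1").text(title);  // fill the empty h1 with the title
    // Log the result; outerHTML is an assumed serializer that omits the doctype
    console.log(window.document.documentElement.outerHTML);
});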

Above, we first read the HTML source from disk into a string, then invoke jsdom with that source. Once everything is finished, jsdom calls us back with a window object. The function call_jsdom ensures that jQuery is already loaded “into” that window, so we only need to access window.$ and can then work with jQuery as we would in a browser. The document does not yet have a heading, so we read the title and put it into the empty h1 tag. Finally, we log the transformed HTML to the console. You can download the project jsdom_demo to try it out; run transform.js on the shell, either directly or via Node.js. The input is:

<!doctype html>
<html><head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <title>My document</title>
</head>
<body><h1></h1></body></html>

The output is:

<!doctype html>
<html><head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <title>My document</title>
</head>
<body><h1>My document</h1></body>
<script src="jquery-1.7.1.min.js"></script></html>


Keeping the structure of the source code. jsdom changes the original source code in several ways: closing tags are added (e.g. to close a <p> tag), and loading jQuery adds a script tag (see the output above). A possible workaround for transforming HTML (as opposed to extracting data) is to not work with a complete document. Instead, one can use $() to create an HTML fragment that is separate from the document:

var fragment = $("<ul><li>item</li></ul>");

Seeing thrown exceptions. jsdom catches all exceptions. Unfortunately, that catching extends to its callbacks. For example, the following is the function call_jsdom that we called previously.

function call_jsdom(source, callback) {
    jsdom.env(source,
        [ 'jquery-1.7.1.min.js' ],  // (*)
        function(errors, window) {  // (**)
            process.nextTick(function () {
                if (errors) {
                    throw new Error("There were errors: "+errors);
                }
                callback(window);
            });
        });
}

jsdom swallows all exceptions thrown inside the callback at (**), including in any functions that it calls. To escape that effect, you can use process.nextTick() to add a function to the event loop queue. It will be executed after the current code is finished.
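The effect can be reproduced without jsdom. The following is a minimal sketch, not jsdom's actual code: swallowingLibrary is a hypothetical stand-in for any library that wraps its callbacks in try-catch, and the uncaughtException handler exists only so that the demo can report what escaped instead of terminating.

```javascript
// Hypothetical stand-in for a library that, like jsdom in this post,
// catches every exception thrown by the callback it invokes.
function swallowingLibrary(callback) {
    try {
        callback();
    } catch (e) {
        // The exception is dropped without any report
    }
}

// Record exceptions that reach the top level instead of ending the demo
var escaped = [];
process.on("uncaughtException", function (e) {
    escaped.push(e.message);
    console.log("Escaped:", e.message);
});

// Thrown directly inside the callback: caught by the library's try-catch
swallowingLibrary(function () {
    throw new Error("thrown directly");
});

// Thrown from a function queued via process.nextTick(): by the time it
// runs, the library's try-catch has already been left
swallowingLibrary(function () {
    process.nextTick(function () {
        throw new Error("thrown via nextTick");
    });
});
```

Only the second exception reaches the handler. Without such a handler, Node.js would print the error and exit, which is usually what you want while debugging.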

Loading jQuery from a file. The examples in the jsdom readme load jQuery from a URL, causing internet traffic each time the code is run. A solution is to put a copy of jQuery next to the script and specify a file path instead of a URL, as seen above at (*).

Using jQuery multiple times. Do you have to invoke call_jsdom (or jsdom.env) every time you want to use jQuery? No, you can store window somewhere and use it again later. The initial startup is only callback-based to accommodate asynchronous script loading.
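One way to store the window is a small caching wrapper. The sketch below is only an illustration: cachedWindow, load and withWindow are hypothetical names, and the loader is a stand-in that produces a fake window so that the code runs without jsdom installed.

```javascript
// load(callback) is any asynchronous function that eventually calls
// callback(window), for example a wrapper around call_jsdom.
function cachedWindow(load) {
    var cached = null;   // the window, once it has been created
    var waiting = null;  // callbacks that arrived while loading

    return function (callback) {
        if (cached) {
            callback(cached);        // already loaded: reuse it
            return;
        }
        if (waiting) {
            waiting.push(callback);  // load in progress: queue up
            return;
        }
        waiting = [callback];
        load(function (window) {
            cached = window;
            var queue = waiting;
            waiting = null;
            queue.forEach(function (cb) {
                cb(window);
            });
        });
    };
}

// Stand-in loader so the sketch runs without jsdom installed
var loadCount = 0;
var withWindow = cachedWindow(function (callback) {
    loadCount++;
    setImmediate(function () {
        callback({ name: "fake window" });
    });
});

withWindow(function (window) {
    console.log("first use:", window.name);
    withWindow(function (again) {
        console.log("same object:", again === window, "loads:", loadCount);
    });
});
```

With the real thing, the loader passed to cachedWindow would be something like function (cb) { call_jsdom(htmlSource, cb); }. Note that once the window is cached, the callback is invoked synchronously.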

Conclusion: What is this good for?

When you have to parse or transform HTML, you realize just how good a transformation language jQuery is, all the more so because its documentation is well done and well suited to casual users. The solution described above is ideal for extracting information from HTML. Changing existing HTML requires more care.

Source: http://www.2ality.com/2012/02/jsdom.html
