Over a million developers have joined DZone.

Parsing JavaScript with JavaScript

· Web Dev Zone

Start coding today to experience the powerful engine that drives data application’s development, brought to you in partnership with Qlik.

Over the weekend I started working on llamaduck- a simple tool that aims to figure out whether your code will run on the newly released node 0.6.0. Eventually it might be able to perform other compatibility assessment tasks as well, but I’m focusing on simple stuff first.

Or at least I thought it was simple.

The list of API changes since 0.4.x doesn’t seem that long and it should be easy enough to digest. But as it turns out, I spent almost all of Sunday just figuring out how to turn javascript into a beautiful analyzable AST.

If you don’t know what an AST is – it’s a so called abstract syntax tree, which means it should look identical regardless of what the actual syntax is. Although it will differ for actually different languages. So a coffeescript AST should look the same as JavaScript, but Python’s will differ.

My research came up with three options:

  1. Take a parser generator and a JavaScript grammar, hope for the best
  2. JSLint has a parser … somewhere around line 2000
  3. Uglify-JS supposedly has a parser too

The only viable option was uglify-js. It’s a neatly packaged node.js module that does a bit more than I need, but at least it’s got an easy to use parser with an exposed api interface.

Score!

Here’s an example of a file that outputs its own AST to give you a feel for what I’m talking about:

var parser = require('uglify-js').parser;
var util = require('util');
 
(function get_ast (path, callback) {
    require('fs').readFile(path, 'utf-8', function (err, data) {
        if (err) throw err;
 
        callback(parser.parse(data));
    });
})('./example.js', function (data) {
    console.log(util.inspect(data, true, null));
});

The file parses itself and outputs a tree encoded as a javascript array (scroll past the insanity, there’s a bit more text there):

[ 'toplevel',
  [ [ 'var',
      [ [ 'parser',
          [ 'dot',
            [ 'call',
              [ 'name', 'require', [length]: 2 ],
              [ [ 'string', 'uglify-js', [length]: 2 ],
                [length]: 1 ],
              [length]: 3 ],
            'parser',
            [length]: 3 ],
          [length]: 2 ],
        [length]: 1 ],
      [length]: 2 ],
    [ 'var',
      [ [ 'util',
          [ 'call',
            [ 'name', 'require', [length]: 2 ],
            [ [ 'string', 'util', [length]: 2 ], [length]: 1 ],
            [length]: 3 ],
          [length]: 2 ],
        [length]: 1 ],
      [length]: 2 ],
    [ 'stat',
      [ 'call',
        [ 'function',
          'get_ast',
          [ 'path', 'callback', [length]: 2 ],
          [ [ 'stat',
              [ 'call',
                [ 'dot',
                  [ 'call',
                    [ 'name', 'require', [length]: 2 ],
                    [ [ 'string', 'fs', [length]: 2 ], [length]: 1 ],
                    [length]: 3 ],
                  'readFile',
                  [length]: 3 ],
                [ [ 'name', 'path', [length]: 2 ],
                  [ 'string', 'utf-8', [length]: 2 ],
                  [ 'function',
                    null,
                    [ 'err', 'data', [length]: 2 ],
                    [ [ 'if',
                        [ 'name', 'err', [length]: 2 ],
                        [ 'throw',
                          [ 'name', 'err', [length]: 2 ],
                          [length]: 2 ],
                        undefined,
                        [length]: 4 ],
                      [ 'stat',
                        [ 'call',
                          [ 'name', 'callback', [length]: 2 ],
                          [ [ 'call',
                              [ 'dot',
                                [ 'name', 'parser', [length]: 2 ],
                                'parse',
                                [length]: 3 ],
                              [ [ 'name', 'data', [length]: 2 ], [length]: 1 ],
                              [length]: 3 ],
                            [length]: 1 ],
                          [length]: 3 ],
                        [length]: 2 ],
                      [length]: 2 ],
                    [length]: 4 ],
                  [length]: 3 ],
                [length]: 3 ],
              [length]: 2 ],
            [length]: 1 ],
          [length]: 4 ],
        [ [ 'string', './example.js', [length]: 2 ],
          [ 'function',
            null,
            [ 'data', [length]: 1 ],
            [ [ 'stat',
                [ 'call',
                  [ 'dot',
                    [ 'name', 'console', [length]: 2 ],
                    'log',
                    [length]: 3 ],
                  [ [ 'call',
                      [ 'dot',
                        [ 'name', 'util', [length]: 2 ],
                        'inspect',
                        [length]: 3 ],
                      [ [ 'name', 'data', [length]: 2 ],
                        [ 'name', 'true', [length]: 2 ],
                        [ 'name', 'null', [length]: 2 ],
                        [length]: 3 ],
                      [length]: 3 ],
                    [length]: 1 ],
                  [length]: 3 ],
                [length]: 2 ],
              [length]: 1 ],
            [length]: 4 ],
          [length]: 2 ],
        [length]: 3 ],
      [length]: 2 ],
    [length]: 3 ],
  [length]: 2 ]

Conclusion

Now we have a simple tree we can recursively analyze and look for incompatibilities. But before anything really practical can be done I need to figure out how to track variable scope. That’s really the hard bit because the code needs to check when variables become a critical section and then confirm that they do in fact eventually get used in a critical way.

But once that nut is cracked llamaduck will be a neat little tool useful for many things.

If you’ve got some coding inclination, I’d love a helping hand over at the llamaduck github repo.

 

From http://swizec.com/blog/parsing-javascript-with-javascript/swizec/2909

Create data driven applications in Qlik’s free and easy to use coding environment, brought to you in partnership with Qlik.

Topics:

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}