Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Optimizing JavaScript and Solving the Halting Problem: Part I

DZone's Guide to

Optimizing JavaScript and Solving the Halting Problem: Part I

If you're calling JavaScript a lot to do something like, say, parsing documents, then making sure it performs well and performs well consistently is of crucial importance. In this post, we take a look at how RavenDB tackles this issue.

· Performance Zone
Free Resource

RavenDB is a JSON document database, and the natural way to process such documents is with JavaScript. Indeed, there is quite a lot of usage of JS scripts inside RavenDB. They are used for ETL, Subscription filtering, patching, resolving conflicts, managing administrative tasks, and probably other stuff that I’m forgetting.

The key problem that we have here is that some of the places where we need to call out to JavaScript are called a lot. A good example would be a request to patch a request on a query. The result can be a single document modified 50 million times. If this is the case, given our current performance profile, it turns out that the cost of evaluating JavaScript is killing our performance.

We are using Jint as the environment that runs our scripts. It works, it is easy to understand and debug and it is an interpreter. That means that it is more focused on correctness than performance. Over the years, we were able to extend it a bit to do all sorts of evil things to our scripts, but the most important thing is that it isn’t actually executing machine code directly, it is always Jint code running and handling everything.

Why is that important? Well, these scripts that we are talking about can be quite evil. I have seen anything from 300 KB of script (yes, that is 300 KB to modify a document that was considerably smaller) to just plain O(N^6) algorithms (document has 6 collections iterate on each of them while iterating on each of the others). These are just the complex ones. The evil ones do things like this:

// let see how fast we can spin

while (true){}

// how deep can we go?

(function blow() { blow(); })();

// all you mem are belong to us

var s = ' ';
while (true) { s+= ' '; }

We have extended Jint to count the number of operations that abort after a certain limit has passed as well as prevent stack overflow attacks. This means that it is much easier to deal with such things. They're just going to run for a while and then abort.

Of course, there is the performance issue of running an interpreter. We turned our eyes to Jurrasic, which took a very different approach. It generates IL on the fly and then executes it. That means that as far as we are concerned, most operations are going to end up running as fast as the rest of our code. Indeed, benchmarks show that Jurrasic is significantly faster (as in, an order of magnitude or more). In our own scenario, we saw anything from simply doubling the speed to orders of magnitude performance improvement.

However, that doesn’t help very much. The way Jurrasic works, code like the one above is going to hang or die. In fact, Jurassic documentation calls this out explicitly as an issue and recommends dedicating a thread or a thread pool for this and calling Thread.Abort if needed. That is not acceptable for us. Unfortunately, trying to fix this in a smart way takes us to the halting problem, and I think we already solved that mess.

This limiting issue was the reason why we kept using Jint for a long time. Today we finally had a breakthrough and were able to fix this issue. Here is the PR, but it tells only part of the story, the rest I’ll leave for tomorrow’s post.

Topics:
javascript ,performance ,halting ,database performance

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}