Using Web Workers to Improve Performance of Image Manipulation
Today I would like to talk about picture manipulation. Not the Direct2D way I used in my previous article, but the pure JavaScript way.
The test case
The test application is simple: on the left is the picture to manipulate, and on the right is the updated result (with a sepia tone effect applied):
The page itself is simple and is described as follows:
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <title>PictureWorker</title>
    <link href="default.css" rel="stylesheet" />
</head>
<body id="root">
    <div id="sourceDiv">
        <img id="source" src="mop.jpg" />
    </div>
    <div id="targetDiv">
        <canvas id="target"></canvas>
    </div>
    <div id="log"></div>
</body>
</html>
The overall process to apply a sepia tone effect requires you to compute a new value for every pixel of the picture:
finalRed   = (red * 0.393) + (green * 0.769) + (blue * 0.189);
finalGreen = (red * 0.349) + (green * 0.686) + (blue * 0.168);
finalBlue  = (red * 0.272) + (green * 0.534) + (blue * 0.131);
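As a worked instance of these formulas, the sketch below applies them to a single pixel. The function name sepiaPixel and the explicit rounding and clamping to 255 are assumptions added for illustration; when values are written to canvas data, they are clamped to the 0 to 255 range automatically.

```javascript
// Hypothetical helper (not part of the original code): applies the sepia
// formulas to one pixel and returns the new components.
function sepiaPixel(red, green, blue) {
    // The sums can exceed 255 for bright pixels, so clamp after rounding.
    var clamp = function (value) {
        return Math.min(255, Math.round(value));
    };
    return {
        r: clamp((red * 0.393) + (green * 0.769) + (blue * 0.189)),
        g: clamp((red * 0.349) + (green * 0.686) + (blue * 0.168)),
        b: clamp((red * 0.272) + (green * 0.534) + (blue * 0.131))
    };
}

var orange = sepiaPixel(200, 100, 50);  // { r: 165, g: 147, b: 114 }
var white = sepiaPixel(255, 255, 255);  // { r: 255, g: 255, b: 239 }
```

The white pixel shows why clamping matters: the red and green weights add up to more than 1, so both components overflow and are capped at 255.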
To make it more realistic, I added a bit of randomness to the formula, so the final JavaScript code to apply to every pixel is:
function noise() {
    return Math.random() * 0.5 + 0.5;
}

function colorDistance(scale, dest, src) {
    return (scale * dest + (1 - scale) * src);
}

var processSepia = function (pixel) {
    // Read the original components first, so that every formula uses the
    // source values and not the ones already overwritten
    var r = pixel.r;
    var g = pixel.g;
    var b = pixel.b;

    pixel.r = colorDistance(noise(), (r * 0.393) + (g * 0.769) + (b * 0.189), r);
    pixel.g = colorDistance(noise(), (r * 0.349) + (g * 0.686) + (b * 0.168), g);
    pixel.b = colorDistance(noise(), (r * 0.272) + (g * 0.534) + (b * 0.131), b);
};
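To see what the random factor does, note that colorDistance is a linear blend between the original component (src) and its sepia value (dest), and that noise() returns a weight between 0.5 and 1. The sketch below repeats the two helpers so that it runs on its own; the sample values 200 and 100 are arbitrary.

```javascript
function noise() {
    return Math.random() * 0.5 + 0.5; // weight in [0.5, 1)
}

function colorDistance(scale, dest, src) {
    return (scale * dest + (1 - scale) * src);
}

// With a fixed weight, the blend is easy to check by hand:
var full = colorDistance(1, 200, 100);   // 200: the pure sepia value
var half = colorDistance(0.5, 200, 100); // 150: halfway between both values

// With noise(), the result always lies between those two bounds.
var noisy = colorDistance(noise(), 200, 100);
```

In other words, every pixel keeps between 0% and 50% of its original color, which gives the result its slightly grainy look.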
Brute force
The most obvious solution is a brute-force approach: a function that applies the previous code to every pixel.
To get access to the pixels, you can use the canvas context with the following code:
var source = document.getElementById("source");

source.onload = function () {
    var canvas = document.getElementById("target");
    canvas.width = source.clientWidth;
    canvas.height = source.clientHeight;

    // The 2D context gives access to the drawing and pixel APIs
    var tempContext = canvas.getContext("2d");

    // Draw the source image into the canvas, then read its pixels back
    tempContext.drawImage(source, 0, 0, canvas.width, canvas.height);

    var canvasData = tempContext.getImageData(0, 0, canvas.width, canvas.height);
    var binaryData = canvasData.data;
};
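The binaryData object is a flat array holding the color components of every pixel (four consecutive values per pixel: red, green, blue and alpha). It can be used to quickly read or write data, which is then copied back to the canvas.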
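The layout of binaryData can be sketched as follows: four consecutive entries per pixel (red, green, blue, alpha), stored row by row. The helper name pixelOffset and the tiny 4x2 buffer are assumptions for illustration; a Uint8ClampedArray stands in for canvasData.data, which is the type current browsers use.

```javascript
// Hypothetical helper: offset of the red component of the pixel at (x, y).
function pixelOffset(x, y, width) {
    return (y * width + x) * 4;
}

var width = 4;
var height = 2;
var binaryData = new Uint8ClampedArray(width * height * 4);

// Paint the pixel at (2, 1) in opaque red.
var offset = pixelOffset(2, 1, width); // (1 * 4 + 2) * 4 = 24
binaryData[offset] = 300;              // out-of-range values are clamped to 255
binaryData[offset + 1] = 0;            // green
binaryData[offset + 2] = 0;            // blue
binaryData[offset + 3] = 255;          // alpha (fully opaque)
```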
So with this in mind, we can apply the whole effect with the following code:
var source = document.getElementById("source");

source.onload = function () {
    var start = new Date();

    var canvas = document.getElementById("target");
    canvas.width = source.clientWidth;
    canvas.height = source.clientHeight;

    if (!canvas.getContext) {
        log.innerText = "Canvas not supported. Please install a HTML5 compatible browser.";
        return;
    }

    var tempContext = canvas.getContext("2d");
    var len = canvas.width * canvas.height * 4;

    tempContext.drawImage(source, 0, 0, canvas.width, canvas.height);

    var canvasData = tempContext.getImageData(0, 0, canvas.width, canvas.height);
    var binaryData = canvasData.data;

    processSepia(binaryData, len);

    tempContext.putImageData(canvasData, 0, 0);

    var diff = new Date() - start;
    log.innerText = "Process done in " + diff + " ms (no web workers)";
};
The processSepia function is just a variation of the previous one:
var processSepia = function (binaryData, l) {
    for (var i = 0; i < l; i += 4) {
        var r = binaryData[i];
        var g = binaryData[i + 1];
        var b = binaryData[i + 2];

        binaryData[i] = colorDistance(noise(), (r * 0.393) + (g * 0.769) + (b * 0.189), r);
        binaryData[i + 1] = colorDistance(noise(), (r * 0.349) + (g * 0.686) + (b * 0.168), g);
        binaryData[i + 2] = colorDistance(noise(), (r * 0.272) + (g * 0.534) + (b * 0.131), b);
    }
};
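A quick way to check this loop without a canvas is to run it on a small hand-made buffer. The sketch below repeats the functions so that it is self-contained; the two-pixel Uint8ClampedArray is an assumption standing in for canvasData.data. The loop steps by 4 and never writes to index i + 3, so the alpha channel is left unchanged.

```javascript
function noise() {
    return Math.random() * 0.5 + 0.5;
}

function colorDistance(scale, dest, src) {
    return (scale * dest + (1 - scale) * src);
}

var processSepia = function (binaryData, l) {
    for (var i = 0; i < l; i += 4) {
        var r = binaryData[i];
        var g = binaryData[i + 1];
        var b = binaryData[i + 2];

        binaryData[i] = colorDistance(noise(), (r * 0.393) + (g * 0.769) + (b * 0.189), r);
        binaryData[i + 1] = colorDistance(noise(), (r * 0.349) + (g * 0.686) + (b * 0.168), g);
        binaryData[i + 2] = colorDistance(noise(), (r * 0.272) + (g * 0.534) + (b * 0.131), b);
    }
};

// Two pixels: an opaque blue one, then a half-transparent grey one.
var pixels = new Uint8ClampedArray([0, 0, 255, 255, 128, 128, 128, 127]);
processSepia(pixels, pixels.length);

// The color components changed, the alpha values did not.
var alphaValues = [pixels[3], pixels[7]]; // [255, 127]
```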
With this solution, on my Intel Extreme processor (12 cores), the processing takes 150ms and obviously uses only one core:
Adding web workers
When the same operation has to be applied to a large amount of independent data (the idea behind SIMD, single instruction multiple data), the best thing you can do is to parallelize the work. This is especially true when you target low-end hardware (such as phones) with limited resources.
With JavaScript, to enjoy the power of parallelization, you have to use Web Workers (my friend David Rousset wrote an excellent article on this subject: http://blogs.msdn.com/b/davrous/archive/2011/07/15/introduction-to-the-html5-web-workers-the-javascript-multithreading-approach.aspx).
Picture processing is a really good candidate for parallelization because (in the case of sepia tone) every pixel is processed independently of the others, so the following approach is possible:
To do so, first of all you have to create a tools.js file to be used as a reference by other scripts:
function noise() {
    return Math.random() * 0.5 + 0.5;
}

function colorDistance(scale, dest, src) {
    return (scale * dest + (1 - scale) * src);
}

var processSepia = function (binaryData, l) {
    for (var i = 0; i < l; i += 4) {
        var r = binaryData[i];
        var g = binaryData[i + 1];
        var b = binaryData[i + 2];

        binaryData[i] = colorDistance(noise(), (r * 0.393) + (g * 0.769) + (b * 0.189), r);
        binaryData[i + 1] = colorDistance(noise(), (r * 0.349) + (g * 0.686) + (b * 0.168), g);
        binaryData[i + 2] = colorDistance(noise(), (r * 0.272) + (g * 0.534) + (b * 0.131), b);
    }
};
The processSepia function will be applied to each chunk of the picture by a dedicated worker. The code of each worker is included in a pictureprocessor.js file:
importScripts("tools.js");

self.onmessage = function (e) {
    var canvasData = e.data.data;
    var binaryData = canvasData.data;

    var l = e.data.length;
    var index = e.data.index;

    processSepia(binaryData, l);

    self.postMessage({ result: canvasData, index: index });
};
The main point here is that the canvas data (actually only the part of it that corresponds to the current block) is cloned by JavaScript and passed to the worker. The worker does not work on the initial source but on a copy of it (made with a well-defined algorithm: the structured clone algorithm). The copy itself is really quick and limited to a specific part of the picture.
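These copy semantics can be observed outside of a worker. The sketch below uses the global structuredClone function, which applies the same structured clone algorithm that postMessage uses for its argument; it assumes a runtime that provides it (current browsers and recent Node.js versions). The message shape mirrors the one sent to the workers, with a plain object holding a typed array standing in for the real canvas data.

```javascript
// Stand-in for the message posted to a worker (assumed sample values).
var message = {
    data: { data: new Uint8ClampedArray([10, 20, 30, 255, 40, 50, 60, 255]) },
    index: 0,
    length: 8
};

// What the worker receives is a deep copy, not a reference.
var received = structuredClone(message);
received.data.data[0] = 99;

// The original buffer on the sending side is not affected.
var originalUntouched = message.data.data[0] === 10; // true
```

This is why the worker has to post its result back: changes made to the copy never reach the main thread on their own, and each chunk is therefore copied twice, once in each direction.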
The main client page (default.js) has to create 4 workers and give each of them the right part of the picture. Every worker then calls back a function in the main thread using the messaging API (postMessage / onmessage) to return its result:
var source = document.getElementById("source");

source.onload = function () {
    var start = new Date();

    var canvas = document.getElementById("target");
    canvas.width = source.clientWidth;
    canvas.height = source.clientHeight;

    // Testing canvas support
    if (!canvas.getContext) {
        log.innerText = "Canvas not supported. Please install a HTML5 compatible browser.";
        return;
    }

    var tempContext = canvas.getContext("2d");
    var len = canvas.width * canvas.height * 4;

    // Drawing the source image into the target canvas
    tempContext.drawImage(source, 0, 0, canvas.width, canvas.height);

    // If workers are not supported
    if (!window.Worker) {
        // Getting all the canvas data
        var canvasData = tempContext.getImageData(0, 0, canvas.width, canvas.height);
        var binaryData = canvasData.data;

        // Processing all the pixels with the main thread
        processSepia(binaryData, len);

        // Copying back canvas data to canvas
        tempContext.putImageData(canvasData, 0, 0);

        var diff = new Date() - start;
        log.innerText = "Process done in " + diff + " ms (no web workers)";

        return;
    }

    // Let's say we want to use 4 workers
    var workersCount = 4;
    var finished = 0;
    var segmentLength = len / workersCount; // Length of the array sent to each worker
    var blockSize = canvas.height / workersCount; // Height of the picture chunk for every worker

    // Function called when a job is finished
    var onWorkEnded = function (e) {
        // Data is retrieved using a memory clone operation
        var canvasData = e.data.result;
        var index = e.data.index;

        // Copying back canvas data to canvas
        tempContext.putImageData(canvasData, 0, blockSize * index);

        finished++;

        if (finished == workersCount) {
            var diff = new Date() - start;
            log.innerText = "Process done in " + diff + " ms";
        }
    };

    // Launching every worker
    for (var index = 0; index < workersCount; index++) {
        var worker = new Worker("pictureProcessor.js");
        worker.onmessage = onWorkEnded;

        // Getting the picture chunk for this worker
        var canvasData = tempContext.getImageData(0, blockSize * index, canvas.width, blockSize);

        // Sending canvas data to the worker using a copy memory operation
        worker.postMessage({ data: canvasData, index: index, length: segmentLength });
    }
};
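The code above divides canvas.height by workersCount directly, which yields whole numbers only when the height is a multiple of the worker count. One possible way to handle any height is to compute integer row ranges up front, as in this sketch. The function name splitRows and its return shape are assumptions, not part of the original download.

```javascript
// Hypothetical helper: splits `height` rows into `count` contiguous blocks
// whose sizes differ by at most one row.
function splitRows(height, count) {
    var blocks = [];
    var base = Math.floor(height / count);
    var extra = height % count; // the first `extra` blocks get one more row
    var y = 0;

    for (var index = 0; index < count; index++) {
        var rows = base + (index < extra ? 1 : 0);
        blocks.push({ index: index, y: y, rows: rows });
        y += rows;
    }

    return blocks;
}

var blocks = splitRows(10, 4);
// rows: 3, 3, 2, 2 starting at y = 0, 3, 6, 8
```

Each block could then be read with getImageData(0, block.y, canvas.width, block.rows), its array length computed as canvas.width * block.rows * 4, and the result written back with putImageData at block.y.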
Using this technique, the complete process takes only 80ms (down from 150ms) on my computer and obviously uses 4 cores:
On my low-end hardware (a dual-core system), the processing time falls to 500ms (from 900ms).
The final code is available here: http://www.catuhe.com/msdn/pictureworkers.zip
And the live version is right there: http://www.catuhe.com/msdn/workers/default.html
(For comparison, the no web workers version: http://www.catuhe.com/msdn/workers/defaultnoworker.html)
An important point to note is that on recent computers the difference can be small, or even in favor of the code without workers. The overhead of the memory copy only pays off when the work done by the workers is complex enough, and the sepia tone effect alone may not be enough in some cases.
However, the web workers will really be useful on low-end hardware.
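One way to act on this trade-off is to decide at run time how many workers are worth starting. The sketch below is only a heuristic: the function name chooseWorkerCount and the minimum number of pixels per worker are assumptions to be tuned by measurement, not values from this article. In a browser, the core count could come from navigator.hardwareConcurrency.

```javascript
// Hypothetical heuristic: use at most one worker per core, and only as many
// workers as there are "large enough" chunks of work.
function chooseWorkerCount(pixelCount, cores, minPixelsPerWorker) {
    var byWork = Math.floor(pixelCount / minPixelsPerWorker);
    return Math.max(1, Math.min(cores, byWork));
}

// The threshold of 50000 pixels per worker is an arbitrary example value.
var small = chooseWorkerCount(100 * 100, 4, 50000);   // 1
var large = chooseWorkerCount(1920 * 1080, 4, 50000); // 4
```

A result of 1 can be read as "process everything on the main thread", which avoids the copy overhead entirely for small pictures.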
Porting to Windows 8
Finally, I could not resist the pleasure of porting my JavaScript code to a Windows 8 application. It took me about 10 minutes to create a blank JavaScript project and copy/paste the JavaScript code inside (feel the power of native JavaScript code for Windows 8!).
So feel free to grab the Windows 8 app code here: http://www.catuhe.com/msdn/Win8PictureWorkers.zip
Published at DZone with permission of David Catuhe, DZone MVB. See the original article here.