Understanding execFile, spawn, exec, and fork in Node.js
In this article, author Can Ho breaks down four different methods for executing external applications in Node.js, providing examples along the way.
In Node, the child_process module provides four different methods for executing external applications:
1. execFile
2. spawn
3. exec
4. fork
All of these are asynchronous: each call returns immediately with an object that is an instance of the ChildProcess class.
The right method depends on what we need. Let's take a detailed look at these.
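As a quick illustration of the point above, here is a minimal sketch that starts the same program with three of the methods and checks what each call returns. It assumes we use process.execPath (the path of the running node binary) as the program; fork is left out because it needs a module file, and it is covered in its own section.

```javascript
const { execFile, spawn, exec, ChildProcess } = require('child_process');

// process.execPath is the absolute path of the running node binary,
// so this sketch does not depend on PATH.
const node = process.execPath;

const children = {
  execFile: execFile(node, ['--version'], () => {}),
  spawn: spawn(node, ['--version']),
  exec: exec(`"${node}" --version`, () => {}),
};

for (const [name, child] of Object.entries(children)) {
  // Each call has already returned; the child itself keeps running in the background.
  console.log(name, 'returned a ChildProcess:', child instanceof ChildProcess);
}
```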
1. execFile
What?
Executes an external application with optional arguments, then invokes the callback with the buffered output after the application exits.
child_process.execFile(file[, args][, options][, callback])
- file <String> The name or path of the executable file to run
- args <Array> List of string arguments
- options <Object>
- cwd <String> Current working directory of the child process
- env <Object> Environment key-value pairs
- encoding <String> (Default: ‘utf8’)
- timeout <Number> (Default: 0)
- maxBuffer <Number> largest amount of data (in bytes) allowed on stdout or stderr – if exceeded child process is killed (Default: 200*1024)
- killSignal <String> (Default: ‘SIGTERM’)
- uid <Number> Sets the user identity of the process. (See setuid(2).)
- gid <Number> Sets the group identity of the process. (See setgid(2).)
- callback <Function> called with the output when process terminates
- Return: <ChildProcess>
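To make the timeout and killSignal options more concrete, here is a small sketch. It assumes a hypothetical helper, runWithTimeout, and uses a node one-liner that would otherwise idle for ten seconds as the external program.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: run a child that idles for 10 seconds,
// but give it only `timeoutMs` milliseconds before it is killed.
function runWithTimeout(timeoutMs) {
  return new Promise((resolve) => {
    execFile(
      process.execPath,
      ['-e', 'setTimeout(() => {}, 10000)'],
      { timeout: timeoutMs, killSignal: 'SIGTERM' },
      (error, stdout, stderr) => resolve({ error, stdout, stderr })
    );
  });
}

runWithTimeout(500).then(({ error }) => {
  // When the timeout fires, the child is killed and the callback receives an error.
  console.log('killed:', Boolean(error && error.killed));
});
```

With the default timeout of 0, no limit is applied and the child runs until it exits on its own.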
How?
In the example below, the node program is executed with the argument “--version”. When the external application exits, the callback function is called with the stdout and stderr output of the child process, which Node buffers internally while the application runs.
Running the code below prints the current node version.
const execFile = require('child_process').execFile;
const child = execFile('node', ['--version'], (error, stdout, stderr) => {
  if (error) {
    console.error('stderr', stderr);
    throw error;
  }
  console.log('stdout', stdout);
});
How does node know where to find the external application?
It uses the PATH environment variable, which lists the directories where executable programs are located. If the external application lives in one of those directories, it can be found by name alone, without an absolute or relative path.
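The lookup can be observed from the error we get when it fails. The sketch below assumes a made-up command name that should not exist on any PATH, and a hypothetical helper, tryRun, that reports the error code instead of throwing.

```javascript
const { execFile } = require('child_process');

// Made-up command name, assumed not to exist in any PATH directory.
const missing = 'no-such-command-for-this-demo';

// Hypothetical helper: resolves with the outcome instead of throwing.
function tryRun(file, args) {
  return new Promise((resolve) => {
    execFile(file, args, (error, stdout) => {
      resolve(error ? { ok: false, code: error.code } : { ok: true, stdout });
    });
  });
}

// A bare name is searched for on PATH; when nothing matches, error.code is 'ENOENT'.
tryRun(missing, []).then((result) => console.log('bare name:', result));

// An absolute path skips the PATH lookup entirely.
tryRun(process.execPath, ['--version']).then((result) => console.log('absolute path:', result));
```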
When?
execFile is used when we just need to execute an application and get the output. For example, we can use execFile to run an image-processing application like ImageMagick to convert an image from PNG to JPG format when we only care whether it succeeds. execFile should not be used when the external application produces a large amount of data that we need to consume in real time.
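The "we only care whether it succeeds" case can be sketched with util.promisify, which turns execFile into a promise-returning function that rejects on a non-zero exit code. The succeeds helper is hypothetical, and node one-liners stand in for a real tool so the sketch runs without ImageMagick installed.

```javascript
const util = require('util');
const execFileAsync = util.promisify(require('child_process').execFile);

// Hypothetical helper: true when the program exits with code 0, false otherwise.
async function succeeds(file, args) {
  try {
    await execFileAsync(file, args);
    return true;
  } catch (err) {
    return false;
  }
}

// Stand-ins for a real tool: node one-liners that exit with a chosen code.
succeeds(process.execPath, ['-e', 'process.exit(0)'])
  .then((ok) => console.log('exit code 0 ->', ok));
succeeds(process.execPath, ['-e', 'process.exit(3)'])
  .then((ok) => console.log('exit code 3 ->', ok));
```

With ImageMagick installed, the call might look like succeeds('convert', ['input.png', 'output.jpg']), where both file names are placeholders.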
2. spawn
What?
The spawn method spawns an external application in a new process and returns a streaming interface for I/O.
child_process.spawn(command[, args][, options])
- command <String> The command to run
- args <Array> List of string arguments
- options <Object>
- cwd <String> Current working directory of the child process
- env <Object> Environment key-value pairs
- stdio <Array> | <String> Child’s stdio configuration. (See options.stdio)
- detached <Boolean> Prepare child to run independently of its parent process. Specific behavior depends on the platform (see options.detached)
- uid <Number> Sets the user identity of the process. (See setuid(2).)
- gid <Number> Sets the group identity of the process. (See setgid(2).)
- shell <Boolean> | <String> If true, runs command inside of a shell. Uses ‘/bin/sh’ on UNIX, and ‘cmd.exe’ on Windows. A different shell can be specified as a string. The shell should understand the -c switch on UNIX, or /s /c on Windows. Defaults to false (no shell).
- Return: <ChildProcess>
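As a rough sketch of how some of these options fit together, the snippet below sets cwd, env, and stdio for a child. The GREETING variable and the spawnWithOptions helper are made up for the illustration, and the child is again a node one-liner.

```javascript
const { spawn } = require('child_process');
const os = require('os');

// Hypothetical helper: spawn a child with a custom working directory,
// one extra environment variable, and an explicit stdio layout.
function spawnWithOptions() {
  return new Promise((resolve, reject) => {
    const child = spawn(
      process.execPath,
      ['-e', 'console.log(process.env.GREETING)'],
      {
        cwd: os.tmpdir(),
        env: { ...process.env, GREETING: 'hello from parent' },
        // stdin ignored, stdout piped to us, stderr shared with the parent
        stdio: ['ignore', 'pipe', 'inherit'],
      }
    );
    let out = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', (code) => resolve({ code, out: out.trim() }));
  });
}

spawnWithOptions().then((result) => console.log(result));
```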
How?
const spawn = require('child_process').spawn;
const fs = require('fs');
function resize(req, resp) {
  const args = [
    "-", // use stdin
    "-resize", "640x", // resize width to 640
    "-resize", "x360<", // resize height if it's smaller than 360
    "-gravity", "center", // sets the offset to the center
    "-crop", "640x360+0+0", // crop
    "-" // output to stdout
  ];
  const streamIn = fs.createReadStream('./path/to/an/image');
  const proc = spawn('convert', args);
  streamIn.pipe(proc.stdin);
  proc.stdout.pipe(resp);
}
In the Node.js function above (an express.js controller function), we read an image file as a stream. We then use the spawn method to start the convert program (see imagemagick.org) and pipe the image stream into the child's stdin. As the child produces data, it is written to resp (a Writable stream), so users start receiving the image without having to wait for the whole conversion to finish.
When?
As spawn returns a stream-based object, it’s great for handling applications that produce large amounts of data, or for working with data as it is read in. Because it is stream based, the usual stream benefits apply as well:
- Low memory footprint
- Automatically handle back-pressure
- Lazily produce or consume data in buffered chunks.
- Evented and non-blocking
- Buffers allow you to work around the V8 heap memory limit
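To see the streaming behaviour without ImageMagick, here is a small sketch in which the child is a node script that prints three lines at 100 ms intervals. The script and the streamChunks helper are made up for the demo; the parent handles output as it arrives instead of waiting for the child to exit.

```javascript
const { spawn } = require('child_process');

// Child script (made up for this demo): prints one line every 100 ms, three times.
const script = `
  let n = 0;
  const timer = setInterval(() => {
    console.log('chunk ' + (++n));
    if (n === 3) clearInterval(timer);
  }, 100);
`;

function streamChunks() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    const received = [];
    child.stdout.setEncoding('utf8');
    // 'data' fires while the child is still running.
    child.stdout.on('data', (chunk) => {
      received.push(chunk);
      console.log('parent received:', JSON.stringify(chunk));
    });
    child.on('error', reject);
    child.on('close', (code) => resolve({ code, text: received.join('') }));
  });
}

streamChunks();
```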
3. exec
What?
This method spawns a subshell, executes the command in that shell, and buffers the generated output. When the child process completes, the callback function is called with:
- buffered data when the command executes successfully
- error (which is an instance of Error) when the command fails
child_process.exec(command[, options][, callback])
- command <String> The command to run, with space-separated arguments
- options <Object>
- cwd <String> Current working directory of the child process
- env <Object> Environment key-value pairs
- encoding <String> (Default: ‘utf8’)
- shell <String> Shell to execute the command with (Default: ‘/bin/sh’ on UNIX, ‘cmd.exe’ on Windows). The shell should understand the -c switch on UNIX or /s /c on Windows. On Windows, command line parsing should be compatible with cmd.exe.
- timeout <Number> (Default: 0)
- maxBuffer <Number> largest amount of data (in bytes) allowed on stdout or stderr – if exceeded child process is killed (Default: 200*1024)
- killSignal <String> (Default: ‘SIGTERM’)
- uid <Number> Sets the user identity of the process. (See setuid(2).)
- gid <Number> Sets the group identity of the process. (See setgid(2).)
- callback <Function> called with the output when process terminates
- Return: <ChildProcess>
Compared to execFile and spawn, exec doesn’t have an args argument, because the command string is handed to a shell as a whole and may contain more than one command. When using exec, any arguments must therefore be part of the command string.
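The difference in how arguments are passed can be sketched side by side. In this illustration, the child is a node one-liner that prints the arguments it received as JSON; viaExecFile and viaExec are hypothetical helper names.

```javascript
const { exec, execFile } = require('child_process');

const node = process.execPath;
// The child prints the arguments it received as a JSON array.
const printArgs = 'console.log(JSON.stringify(process.argv.slice(1)))';

function viaExecFile() {
  return new Promise((resolve, reject) => {
    // Arguments are passed as an array; no quoting is needed for 'b c'.
    execFile(node, ['-e', printArgs, 'a', 'b c'], (err, stdout) =>
      err ? reject(err) : resolve(stdout.trim()));
  });
}

function viaExec() {
  return new Promise((resolve, reject) => {
    // Arguments are part of one command string; the shell splits it,
    // so 'b c' has to be quoted to stay a single argument.
    exec(`"${node}" -e "${printArgs}" a "b c"`, (err, stdout) =>
      err ? reject(err) : resolve(stdout.trim()));
  });
}

viaExecFile().then((out) => console.log('execFile:', out));
viaExec().then((out) => console.log('exec:    ', out));
```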
How?
The following code snippet recursively prints all items under the current folder:
const exec = require('child_process').exec;
exec('for i in $( ls -LR ); do echo item: $i; done', (e, stdout, stderr) => {
  if (e instanceof Error) {
    console.error(e);
    throw e;
  }
  console.log('stdout ', stdout);
  console.log('stderr ', stderr);
});
When running a command in a shell, we have access to all functionality supported by that shell, such as pipes and redirects. The following example uses Windows commands:
const exec = require('child_process').exec;
exec('netstat -aon | find "9000"', (e, stdout, stderr) => {
  if (e instanceof Error) {
    console.error(e);
    throw e;
  }
  console.log('stdout ', stdout);
  console.log('stderr ', stderr);
});
In the above example, Node spawns a subshell and executes the command netstat -aon | find "9000" in that subshell. The command string includes two commands:
- netstat -aon: netstat command with argument -aon
- find “9000”: find command with argument 9000
The first command displays all active TCP connections and listening ports (-a), the id of the owning process (-o), and addresses and ports in numeric form (-n). Its output feeds into the second command, which keeps only the lines containing 9000. On success, a line like the following is printed:
TCP 0.0.0.0:9000 0.0.0.0:0 LISTENING 11180
When?
exec should be used when we need to utilize shell functionality such as pipe, redirects, backgrounding…
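As a portable variation on the pipe example, the sketch below pipes the output of echo into a node one-liner that upper-cases whatever it reads. The one-liner and the pipeThroughShell helper are made up for the illustration; the pipe itself is handled by the shell that exec starts.

```javascript
const { exec } = require('child_process');

// Node one-liner that upper-cases everything it reads from stdin.
const toUpper = "process.stdin.on('data', (d) => process.stdout.write(String(d).toUpperCase()))";

// The '|' is interpreted by the shell, not by Node.
const command = `echo hello | "${process.execPath}" -e "${toUpper}"`;

function pipeThroughShell() {
  return new Promise((resolve, reject) => {
    exec(command, (err, stdout) => (err ? reject(err) : resolve(stdout.trim())));
  });
}

pipeThroughShell().then((out) => console.log('result:', out));
```

The same command passed to execFile would fail, because without a shell the pipe character is just another argument.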
Notes
- exec runs the command in a shell, which defaults to /bin/sh on Linux and cmd.exe on Windows.
- Running a command in a shell is convenient, but exec should be used with caution because the command string is open to shell injection if it contains unsanitized input. Whenever possible, prefer execFile: it does not spawn a shell by default, so its arguments are passed to the program as-is rather than interpreted by a shell.
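The injection risk can be demonstrated with a harmless payload. The sketch below assumes a Unix-like shell, where ';' separates commands; unsafeEcho and saferEcho are hypothetical helper names, and the "attack" only echoes an extra word.

```javascript
const { exec, execFile } = require('child_process');

// Simulated untrusted input; everything after ';' is the attacker's extra command.
const userInput = 'hello; echo INJECTED';

function unsafeEcho(input) {
  return new Promise((resolve, reject) => {
    // The shell parses the whole string, so on a Unix-like shell ';' starts a second command.
    exec(`echo ${input}`, (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

function saferEcho(input) {
  return new Promise((resolve, reject) => {
    // No shell is involved: the input reaches the child as one literal argument.
    execFile(
      process.execPath,
      ['-e', 'console.log(process.argv[1])', input],
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
  });
}

unsafeEcho(userInput).then((out) => console.log('exec output:', JSON.stringify(out)));
saferEcho(userInput).then((out) => console.log('execFile output:', JSON.stringify(out)));
```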
4. fork
What?
The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. Like child_process.spawn(), a ChildProcess object is returned. The returned ChildProcess will have an additional communication channel built-in that allows messages to be passed back and forth between the parent and child.
The fork method will open an IPC channel allowing message passing between Node processes:
- On the child process, process.on(‘message’) and process.send(‘message to parent’) can be used to receive and send data
- On the parent process, child.on(‘message’) and child.send(‘message to child’) are used
Each process has its own memory and its own V8 instance. As a rough guide, assume at least 30 ms of startup time and 10 MB of memory per process.
child_process.fork(modulePath[, args][, options])
- modulePath <String> The module to run in the child
- args <Array> List of string arguments
- options <Object>
- cwd <String> Current working directory of the child process
- env <Object> Environment key-value pairs
- execPath <String> Executable used to create the child process
- execArgv <Array> List of string arguments passed to the executable (Default: process.execArgv)
- silent <Boolean> If true, stdin, stdout, and stderr of the child will be piped to the parent, otherwise they will be inherited from the parent. See the ‘pipe’ and ‘inherit’ options for child_process.spawn()’s stdio for more details (Default: false)
- uid <Number> Sets the user identity of the process. (See setuid(2).)
- gid <Number> Sets the group identity of the process. (See setgid(2).)
- Return: <ChildProcess>
How?
// parent.js
const cp = require('child_process');
const n = cp.fork(`${__dirname}/sub.js`);
n.on('message', (m) => {
  console.log('PARENT got message:', m);
});
n.send({ hello: 'world' });

// sub.js
process.on('message', (m) => {
  console.log('CHILD got message:', m);
});
process.send({ foo: 'bar' });
When?
Since Node's main process is single threaded, long-running tasks like computation will tie up the main process. As a result, incoming requests can’t be serviced and the application becomes unresponsive. Off-loading long-running tasks from the main process by forking a new Node process allows the application to serve incoming requests and stay responsive.
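A minimal sketch of this off-loading pattern follows. To keep it in one file, the worker source is written to a temporary file before being forked, which is purely a demo convenience; in a real project the worker would be its own module. The summing loop is a stand-in for heavy computation, and sumInChild is a hypothetical helper name.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker source, written to a temporary file only to keep this sketch self-contained.
const workerSource = `
  process.on('message', ({ n }) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i; // stand-in for heavy computation
    process.send({ sum });
  });
`;

function sumInChild(n) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
  const workerPath = path.join(dir, 'worker.js');
  fs.writeFileSync(workerPath, workerSource);

  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    child.once('message', (msg) => {
      child.disconnect(); // close the IPC channel so the worker can exit
      resolve(msg.sum);
    });
    child.once('error', reject);
    // Remove the temporary directory once the worker has exited.
    child.once('exit', () => fs.rmSync(dir, { recursive: true, force: true }));
    child.send({ n });
  });
}

// The parent's event loop stays free while the child does the work.
sumInChild(1e6).then((sum) => console.log('sum from child:', sum));
```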
Published at DZone with permission of Can Ho, DZone MVB. See the original article here.