Using Linux Perf Tools
The Performance Analysis Tool for Linux (perf) is a powerful tool for profiling applications. It uses a mix of hardware counters, which are fast, and software counters, all provided by the Linux Performance Counter (LPC) subsystem, which takes on the complex task of wrapping the CPU counters for the different types of CPUs. This gives you a very efficient way to get information about running processes, either through the subsystem's C API or through a convenient command, in this case perf.
This command gives you access to a great variety of system-level and process-level events, but in this entry I will use it to investigate CPU-bound issues.
First, you need to install perf tools using your favorite package manager:
# Install perf on Arch Linux
sudo pacman -Sy perf
# Install perf on Fedora
sudo dnf install perf
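Once perf is installed, a short sampling session is a reasonable way to check that it can attach to a process and read the counters. The following is a minimal sketch, not a command sequence taken from this tutorial: the helper name `sample_cpu`, the 30-second default, and the `perf.data` output file are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: sample_cpu is a hypothetical helper, not part of perf.

sample_cpu() {
  local pid="$1"
  local seconds="${2:-30}"   # assumed default sampling window

  if [ -z "$pid" ]; then
    echo "usage: sample_cpu PID [SECONDS]" >&2
    return 2
  fi
  if ! command -v perf >/dev/null 2>&1; then
    echo "perf not found; install it first" >&2
    return 127
  fi

  # -F 99: sample at 99 Hz, -g: capture call graphs,
  # -p: attach to an already running process.
  # The trailing "sleep" only bounds how long perf keeps sampling.
  perf record -F 99 -g -p "$pid" -o perf.data -- sleep "$seconds"

  # Print a text summary of where the CPU time went.
  perf report -i perf.data --stdio | head -n 40
}
```

A call could look like `sample_cpu "$(pgrep -n node)" 10`. Depending on the distribution, attaching to a process may require root or a relaxed `kernel.perf_event_paranoid` setting, and starting Node.js with `--perf-basic-prof` lets perf resolve JavaScript function names.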
As part of this tutorial, we are going to use a sample Node.js server application that I wrote and pushed to GitHub. The directory for this example is /cpu_bound, where you can find the following files:
- A basic Express.js web service with two endpoints, /fib and /fast (more on those later).
- A bash script that runs some rudimentary concurrent network requests to generate traffic for our service.
If you want to execute these scripts, you need Linux 4.0+ and Node.js 4.3+.
Once you have these files, install the dependencies with npm install. After that, we can run test.sh, which executes the following steps:
- Run the Node application.js server.
- Execute 150 concurrent curl requests against the two services (you can decrease this number if it's too slow).
- Wait for the requests to complete, and then exit the node process "gracefully."
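The steps above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual test.sh from the repository: the port (3000), the helper names, the even split between /fib and /fast, and the use of SIGINT to stop the server are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three steps; not the author's test.sh.
PORT="${PORT:-3000}"          # assumed port
REQUESTS="${REQUESTS:-150}"   # lower this if the run is too slow

# Alternate between the two services: odd requests hit /fib, even hit /fast.
endpoint_for() {
  if [ $(( $1 % 2 )) -eq 1 ]; then echo "fib"; else echo "fast"; fi
}

run_load_test() {
  local server_pid curl_pids="" i

  # 1. Start the server in the background and remember its PID.
  node application.js &
  server_pid=$!
  sleep 1   # crude wait for the server to start listening

  # 2. Fire the concurrent curl requests.
  for i in $(seq 1 "$REQUESTS"); do
    curl -s -o /dev/null "http://localhost:${PORT}/$(endpoint_for "$i")" &
    curl_pids="$curl_pids $!"
  done

  # 3. Wait for every request, then ask node to stop
  #    (assumes the app exits cleanly on SIGINT).
  wait $curl_pids
  kill -INT "$server_pid"
  wait "$server_pid"
}
```

Calling `run_load_test` from the example directory would start the run; setting `REQUESTS=50` beforehand reduces the load on slower machines.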
It will look something like this:
In our example, it was easy to spot the performance problem. The real value shows when you need to profile huge projects: you can create an understandable CPU utilization map that simplifies the process of finding the inefficient spots. Another advantage is that perf works with other programming languages, so you get the same kind of information for each of them. And, as I mentioned earlier, thanks to the architecture of LPC the performance impact is minimal, because some of the work is done at the hardware level.
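One common way to build such a CPU utilization map is a flame graph generated from perf samples. The sketch below assumes that a perf.data file was already recorded with call graphs, and that the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) are checked out locally. The helper name `make_flamegraph`, the `FLAMEGRAPH_DIR` variable, and its default location are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: make_flamegraph and FLAMEGRAPH_DIR are illustrative names.

make_flamegraph() {
  local data="${1:-perf.data}"
  local out="${2:-flame.svg}"
  local fg_dir="${FLAMEGRAPH_DIR:-$HOME/FlameGraph}"   # assumed checkout location

  if [ ! -f "$data" ]; then
    echo "no perf data at $data" >&2
    return 1
  fi

  # perf script dumps the raw samples, stackcollapse-perf.pl folds each
  # stack into a single line, and flamegraph.pl renders the SVG.
  perf script -i "$data" \
    | "$fg_dir/stackcollapse-perf.pl" \
    | "$fg_dir/flamegraph.pl" > "$out"
}
```

Running `make_flamegraph perf.data flame.svg` and opening the SVG in a browser shows the widest frames, which are the functions that consumed the most CPU samples.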
Here are some useful links:
- Perf Command Documentation: how to use it from the command line.
- Perf Technical Documentation: great documentation explaining the inner workings.
- Flame Graphs: very useful for analyzing the run-time behavior of an application.
- V8 Compiler: a nice talk by Franziska Hinkelmann about the V8 compiler's optimization techniques.