Evaluation of PMP Profiling Tools

Take a look at how tools including GDB, eu-stack, and Quickstack stack up against one another when it comes to stack tracing and PMP profiling before debugging.

In this blog post, we’ll look at some of the available PMP profiling tools.

While debugging or analyzing issues with Percona Server for MySQL, we often need a quick understanding of what’s happening on the server. Percona experts frequently use the pt-pmp tool from Percona Toolkit (inspired by the Poor Man's Profiler).

The pt-pmp tool collects application stack traces using GDB and then post-processes them. From this, you get a condensed list of the distinct stack traces, ordered by how often each one occurs. The list helps you understand where the application spent most of its time, either running something or waiting for something.
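The post-processing step can be sketched in a few lines of shell and awk. This is a minimal illustration in the spirit of the original Poor Man's Profiler one-liner, not the actual pt-pmp code. The function name `aggregate_stacks` is hypothetical, and the sketch assumes GDB's usual `thread apply all bt` layout, where each thread starts with a `Thread ...` line and each frame line starts with `#N`.

```shell
#!/bin/sh
# Collapse gdb "thread apply all bt" output (read from stdin) into one line
# per distinct stack, prefixed with the number of threads sharing that stack.
# aggregate_stacks is a hypothetical helper name, not part of pt-pmp.
aggregate_stacks() {
  awk '
    # A "Thread ..." header starts a new stack: flush the previous one.
    /^Thread/ { if (s != "") print s; s = "" }
    /^#/ {
      # Frames look like "#1  0x... in func (...)" or "#0  func (...)",
      # so the function name is field 4 or field 2 respectively.
      fn = ($2 ~ /^0x/) ? $4 : $2
      s = (s == "") ? fn : s "," fn
    }
    END { if (s != "") print s }
  ' | sort | uniq -c | sort -rn
}
```

A typical use would be to pipe the GDB command shown later in this post into `aggregate_stacks`. Function names that contain spaces (some C++ templates, for example) would be cut short by this simple field splitting.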

Getting a profile with pt-pmp is handy, but it has a cost: it is quite intrusive. To get stack traces, GDB has to attach to each thread of your application, which interrupts it. Under high load, these stops can be quite significant (15, 30 or even 60 seconds). This means that the pt-pmp approach is not really usable in production.

Below, I’ll describe how to reduce GDB overhead, and also what other tools can be used instead of GDB to get stack traces.


GDB

By default, the symbol resolution process in GDB is very slow. As a result, getting stack traces with GDB is quite intrusive (especially under high loads).

There are two options available that can help notably reduce GDB tracing overhead.

  1. Use the readnever patch. RHEL and the distributions based on it ship GDB with the readnever patch applied. The patch adds a --readnever option that skips unnecessary symbol resolution. As a result, getting stack traces is up to 10 times faster.

  2. Use gdb_index. This feature addresses the slow symbol resolution by creating a special index and embedding it into the binary. The index is quite compact: I created and embedded a gdb_index for the Percona Server binary, and it increased the size by around 7-8 MB. With the gdb_index in place, obtaining stack traces and resolving symbols is two to three times faster.

# to check if the index already exists:
  readelf -S mysqld | grep gdb_index
# to generate the index (written to /tmp/mysqld.gdb-index):
  gdb -batch mysqld -ex "save gdb-index /tmp" -ex "quit"
# to embed the index:
  objcopy --add-section .gdb_index=/tmp/mysqld.gdb-index --set-section-flags .gdb_index=readonly mysqld mysqld
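The three commands above can be wrapped so that the index is only generated when it is missing and the original binary is left untouched. This is a sketch under assumptions: the helper names `has_gdb_index` and `add_gdb_index` are hypothetical, and the script relies on `save gdb-index` writing a file named after the binary with a `.gdb-index` suffix into the given directory, as in the commands above.

```shell
#!/bin/sh
# Succeeds if the section listing on stdin (readelf -S output) mentions .gdb_index.
has_gdb_index() {
  grep -q '\.gdb_index'
}

# Hypothetical helper: write an indexed copy of BINARY to OUT.
# If BINARY already carries an index, nothing is written and 0 is returned.
add_gdb_index() {
  bin=$1
  out=$2
  if readelf -S "$bin" | has_gdb_index; then
    echo "$bin already has a .gdb_index section" >&2
    return 0
  fi
  workdir=$(mktemp -d) || return 1
  # Generate the index into a scratch directory, then embed it into a copy.
  gdb -batch "$bin" -ex "save gdb-index $workdir" &&
    objcopy --add-section .gdb_index="$workdir/$(basename "$bin").gdb-index" \
            --set-section-flags .gdb_index=readonly "$bin" "$out"
  rc=$?
  rm -rf "$workdir"
  return $rc
}
```

For example, `add_gdb_index ./mysqld ./mysqld.indexed` would leave `./mysqld` as it is and produce a second binary to test with.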

eu-stack (elfutils)

The eu-stack tool from the elfutils package prints the stack for each thread in a process or core file. Symbol resolution is not well optimized in eu-stack either: by default, if you run it under load, it takes even more time than GDB. However, eu-stack lets you skip symbol resolution completely, so it can collect the raw stack frames quickly and you can resolve them later without any impact on the workload.
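One way to picture the "collect now, resolve later" split is to save the raw output and extract the distinct frame addresses afterwards. The sketch below is an illustration, not the method used by any particular tool. The helper name `unique_frame_addrs` is hypothetical, and it assumes that unresolved frames are printed as `#N  0x<address>` lines.

```shell
#!/bin/sh
# Print each distinct frame address once from raw, unresolved eu-stack
# output on stdin. Assumes frame lines of the form "#N  0x<hex address>".
# unique_frame_addrs is a hypothetical helper name.
unique_frame_addrs() {
  awk '/^#[0-9]+/ && $2 ~ /^0x[0-9a-fA-F]+$/ { print $2 }' | sort -u
}
```

The capture step would then be `eu-stack -q -p <pid> > raw.txt` on the loaded server, and the address list from `unique_frame_addrs < raw.txt` could be fed to a resolver such as `eu-addr2line -f -e mysqld` later or on another machine. That last step only works directly for addresses inside a non-relocated main executable; frames in shared libraries or position-independent binaries also need the load addresses of their modules.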


Quickstack

Quickstack is a tool from Facebook that gets stack traces with minimal overhead.

Now, let’s compare all the above profilers. We will measure the amount of time each one needs to take all the stack traces from Percona Server for MySQL under a high load (sysbench OLTP_RW with 512 threads).

The results show that eu-stack (without resolving) got all stack traces in less than a second, and that Quickstack and GDB (with the readnever patch) came very close to that. For the other profilers, the time was around two to five times higher, which is unacceptable for profiling (especially in production).

There is one more note regarding the pt-pmp tool. The current version only supports GDB as the profiler. However, there is a development version of this tool that supports GDB, Quickstack, eu-stack, and eu-stack with offline symbol resolving. It also allows you to look at stack traces for specific threads (tids). For instance, in the case of Percona Server for MySQL, we can analyze just the purge, cleaner or IO threads.
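To illustrate what per-thread analysis can look like, the sketch below filters already captured GDB output down to selected thread ids. It is not how the development version of pt-pmp implements the feature, and it does not reduce the cost of taking the traces. The helper name `filter_threads_by_lwp` is hypothetical, and it assumes GDB's `Thread N (... (LWP <tid>))` header lines.

```shell
#!/bin/sh
# Keep only the stacks of the given threads from gdb "thread apply all bt"
# output on stdin. $1 is a comma-separated list of thread ids (LWP numbers).
# filter_threads_by_lwp is a hypothetical helper name.
filter_threads_by_lwp() {
  awk -v want="$1" '
    BEGIN { n = split(want, ids, ","); for (i = 1; i <= n; i++) keep[ids[i]] = 1 }
    /^Thread / {
      # Decide per thread header whether the following frames are printed.
      show = 0
      if (match($0, /LWP [0-9]+/)) {
        lwp = substr($0, RSTART + 4, RLENGTH - 4)
        show = (lwp in keep)
      }
    }
    show { print }
  '
}
```

The thread ids themselves can be listed with, for example, `ps -L -p <pid>`, and the filtered output can then be aggregated like any other set of stack traces.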

Below are the command lines used in testing:

# gdb & gdb+gdb_index
  time gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p `pidof mysqld` > /dev/null
# gdb+readnever
  time gdb --readnever -ex "set pagination 0" -ex "thread apply all bt" -batch -p `pidof mysqld` > /dev/null
# eu-stack
  time eu-stack -s -m -p `pidof mysqld` > /dev/null
# eu-stack without resolving
  time eu-stack -q -p `pidof mysqld` > /dev/null
# quickstack - 1 sample
  time quickstack -c 1 -p `pidof mysqld` > /dev/null
# quickstack - 1000 samples
  time quickstack -c 1000 -p `pidof mysqld` > /dev/null

Published at DZone with permission of Alexey Stroganov, DZone MVB. See the original article here.
