DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
Building Scalable Real-Time Apps with AstraDB and Vaadin
Register Now

Trending

  • Auto-Scaling Kinesis Data Streams Applications on Kubernetes
  • Competing Consumers With Spring Boot and Hazelcast
  • What Is React? A Complete Guide
  • Tomorrow’s Cloud Today: Unpacking the Future of Cloud Computing

Trending

  • Auto-Scaling Kinesis Data Streams Applications on Kubernetes
  • Competing Consumers With Spring Boot and Hazelcast
  • What Is React? A Complete Guide
  • Tomorrow’s Cloud Today: Unpacking the Future of Cloud Computing
  1. DZone
  2. Data Engineering
  3. Databases
  4. Do not trust vmstat IOwait numbers

Do not trust vmstat IOwait numbers

Peter Zaitsev user avatar by
Peter Zaitsev
·
Jun. 04, 14 · Interview
Like (0)
Save
Tweet
Share
5.28K Views

Join the DZone community and get the full member experience.

Join For Free

i’ve been running a benchmark today on my old test box with conventional hard drives (no raid with bbu) and noticed something unusual in the cpu utilization statistics being reported.

the benchmark was run like this:

sysbench --num-threads=64 --max-requests=0 --max-time=600000 --report-interval=10 --test=oltp --db-driver=mysql --oltp-dist-type=special  --oltp-table-size=1000000   --mysql-user=root --mysql-password=password  run

which means: create 64 threads and hammer the database with queries as quickly as possible. as the test was run on the localhost i would expect the benchmark to completely saturate the system – being either using cpu or being blocked on io nature of this benchmark so it does not spend a lot on database locks, especially as this system has just 2 cores.

looking at vmstat however i noticed this:

[root@smt1 mysql]# vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
13  0      0 880156  40060 2140824    0    0    12   750  895 1045 32  8 54  6  0
 0  0      0 877116  40060 2141312    0    0     0  1783 2185 23112 44 10 41  5  0
15  0      0 872648  40068 2141960    0    0     0  1747 2204 25743 41 11 46  2  0
 0  0      0 868056  40068 2142604    0    0     0  1803 2164 26224 40 11 44  5  0
17  1      0 863216  40068 2143160    0    0     0  1875 1948 23020 36  9 50  5  0
 0  0      0 858384  40168 2143656    0    0     0  1063 1855 21116 32  9 45 14  0
23  0      0 855848  40176 2144232    0    0     0  1755 2036 23181 36 10 48  6  0
49  0      0 851248  40184 2144648    0    0     0  1679 2313 22832 45 10 40  5  0
10  0      0 846292  40192 2145248    0    0     0  1911 1980 23185 36  9 50  4  0
 0  0      0 844260  40196 2145868    0    0     0  1757 2152 26387 39 11 45  5  0
 0  3      0 839296  40196 2146520    0    0     0  1439 2104 25096 38 10 50  1  0

looking at last few columns we have io wait typically being in single digit percents where the “idle” cpu floats in 40-50% range. you can also see it visually in the graph from percona cloud tools
cpu usage during the test

so the system should be completely saturated with the load but instead we have 50% of cpu idle and iowait is typically in single digits?

running show processlist can give us some good idea in this case:

mysql> show processlist;
+-----+---------------+-----------------+--------+---------+------+----------------+--------------------------------------------------------------------+-----------+---------------+-----------+
| id  | user          | host            | db     | command | time | state          | info                                                               | rows_sent | rows_examined | rows_read |
+-----+---------------+-----------------+--------+---------+------+----------------+--------------------------------------------------------------------+-----------+---------------+-----------+
|   1 | root          | localhost:39624 | null   | sleep   |    0 |                | null                                                               |         0 |             0 |         0 |
| ...
|  32 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  33 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  34 | root          | localhost       | sbtest | execute |    0 | sorting result | select c from sbtest where id between 365260 and 365359 order by c |         0 |             0 |         0 |
|  35 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  36 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  37 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  38 | root          | localhost       | sbtest | execute |    0 | writing to net | delete from sbtest where id=496460                                 |         0 |             1 |         1 |
|  39 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
...
|  89 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  90 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
|  91 | root          | localhost       | sbtest | execute |    0 | null           | commit                                                             |         0 |             0 |         0 |
| 268 | root          | localhost       | null   | query   |    0 | null           | show processlist                                                   |         0 |             0 |         0 |
+-----+---------------+-----------------+--------+---------+------+----------------+--------------------------------------------------------------------+-----------+---------------+-----------+
68 rows in set (0.00 sec)

so most of the queries are stalling in commit stage. checking setting we can see:

mysql> select @@innodb_flush_log_at_trx_commit;
+----------------------------------+
| @@innodb_flush_log_at_trx_commit |
+----------------------------------+
|                                1 |
+----------------------------------+
1 row in set (0.00 sec)

which means percona server is running in the fully acid mode, which with slow hard drives without bbu means commit is causing syncing data to the disk which is very expensive.

so we figured out what is really happening… yet if we’re really waiting on disk a lot why iowait does not show that?
to answer that lets see how iowait is really computed.

iowait is type of idle time (and in earlier linux versions it wasincluded in idle) – when we have the process/thread which would be runable if it would not be waiting on io we consider this iowait, if it is not runable because of something else (waiting on mutex, network etc) it is considered idle.

writes to innodb log file are serialized – so only one process will be doing log write (and sync) while all others waiting to be able to commit will be waiting on some form of synchronization mechanism to be notified when the commit is complete as part of group commit.

what is the take away? take iowait% numbers from vmstat/iostat with grain of salt. it is possible there is much more to gain than faster io subsystem (or proper configuration) than iowait numbers show. consider the iowait%+idle% of the cpu as “cpu available” which could be potentially put to the good use with improving io subsystem and minimizing contention

to illustrate that lets take a look at the cpu graph again – now as i’ve set innodb_flush_log_at_trx_commit=2

cpu usage graph after optimization

as we can see from the graph we have reclaimed much more than 6-7% of idle time we had – the user cpu usage increased to 70% and idle+iowait is now 10-15%; the throughput of the system has increased about 60% – much more than “eliminating iowait” would allow.



Database Trust (business)

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • Auto-Scaling Kinesis Data Streams Applications on Kubernetes
  • Competing Consumers With Spring Boot and Hazelcast
  • What Is React? A Complete Guide
  • Tomorrow’s Cloud Today: Unpacking the Future of Cloud Computing

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: