I/O Waiting CPU Time – ‘wa’ in Top
The wait is over...
CPU consumption in Unix/Linux operating systems is broken down into 8 different metrics: User CPU time, System CPU time, nice CPU time, Idle CPU time, Waiting CPU time, Hardware Interrupt CPU time, Software Interrupt CPU time, and Stolen CPU time. In this article, let us study ‘waiting CPU time’.
What Is ‘Waiting’ CPU Time?
Waiting CPU time indicates the amount of time the CPU sits idle while it waits for outstanding I/O operations, mainly disk I/O (including I/O to network-backed storage), to complete. A high waiting time indicates that the CPU is ‘stranded’ because of the I/O operations on that device. For optimal performance, one should aim to keep the I/O waiting CPU time as low as possible. If waiting time stays above 10%, it is worth investigating.
You can visualize I/O waiting time through this analogy: Say there are hundreds of cars/bikes that are waiting on a busy road for the traffic light to switch from ‘red’ to ‘green’. But due to some technical glitch, it takes a long time for the traffic light to switch from ‘red’ to ‘green’ – then those hundreds of cars/bikes would get stranded unnecessarily. It will result in several undesirable side effects: passengers will reach their destination late, drivers can get frustrated and start to honk the horn (noise pollution), and fuel will be wasted (air pollution).
How to Find ‘Waiting’ CPU Time?
Waiting CPU time can be found from the following sources:
a. You can use web-based root cause analysis tools to report ‘waiting’ CPU time. Such tools can also generate alerts if ‘waiting’ CPU time goes beyond a threshold.
b. ‘Waiting’ CPU time is also reported by the Unix/Linux command-line tool ‘top’ in the ‘wa’ field, as highlighted in the image below.
Fig: ‘wa’ time in top
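On Linux, a figure comparable to ‘wa’ can also be derived from the aggregate "cpu" line of /proc/stat by sampling it twice and comparing the counters. The following is a minimal sketch of that calculation, not the way ‘top’ itself is implemented. The class name IoWaitSketch, the one-second sampling interval, and the use of the article's 10% guideline as an alert threshold are assumptions made for illustration. It also assumes a kernel that exposes at least eight counters on the "cpu" line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class IoWaitSketch {

    // Parses the aggregate "cpu" line of /proc/stat into its first 8 counters:
    // user, nice, system, idle, iowait, irq, softirq, steal (in clock ticks).
    static long[] parseCpuLine(String line) {
        String[] parts = line.trim().split("\\s+");
        long[] ticks = new long[8];
        for (int i = 0; i < ticks.length; i++) {
            ticks[i] = Long.parseLong(parts[i + 1]); // parts[0] is the "cpu" label
        }
        return ticks;
    }

    // Share of the CPU time elapsed between two samples that was spent in iowait.
    static double ioWaitPercent(long[] before, long[] after) {
        long total = 0;
        for (int i = 0; i < before.length; i++) {
            total += after[i] - before[i];
        }
        long ioWait = after[4] - before[4]; // index 4 is the iowait counter
        return total == 0 ? 0.0 : 100.0 * ioWait / total;
    }

    static String readCpuLine() throws IOException {
        // The first line of /proc/stat aggregates all CPUs.
        return Files.readAllLines(Paths.get("/proc/stat")).get(0);
    }

    public static void main(String[] args) throws Exception {
        long[] before = parseCpuLine(readCpuLine());
        Thread.sleep(1000); // assumed sampling interval
        long[] after = parseCpuLine(readCpuLine());

        double wa = ioWaitPercent(before, after);
        System.out.printf("wa: %.1f%%%n", wa);
        if (wa > 10.0) { // the 10% guideline mentioned earlier
            System.out.println("Waiting CPU time is above 10%; worth investigating.");
        }
    }
}
```

A single one-second sample can be noisy, so in practice it is more useful to watch the value over a longer period before drawing conclusions.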
How to Simulate a High ‘Waiting’ CPU Time?
To simulate high ‘waiting’ CPU reporting, let’s use BuggyApp. BuggyApp is an open-source Java project that can simulate various performance problems. When you launch BuggyApp with the following arguments, it will cause the ‘waiting’ CPU consumption to spike on the host.
Fig: You can see the waiting CPU time spike up to 75.9%
Now let’s see the source code in BuggyApp that is causing the ‘waiting’ CPU time to spike.
Here you can see that BuggyApp launches 5 ‘IOThread’ instances in the ‘IODemo’ class. Each ‘IOThread’ runs an infinite while loop. In that loop, it writes content to a file and then reads the same content back from the file, repeating these two steps over and over. Since writing content to and reading content from the disk are I/O-intensive operations, ‘waiting’ CPU time spikes up to 75.9%.
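The behavior described above can be sketched roughly as follows. This is an illustrative reconstruction based only on the description, not BuggyApp's actual source code. The class name IODemoSketch, the 1 MB payload, the temporary file names, and the iteration limit (added so the loop can be bounded for experiments) are all assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative reconstruction; NOT the real BuggyApp code.
class IODemoSketch {

    static final int CONTENT_SIZE = 1 << 20; // assumed payload: 1 MB per write
    static final AtomicLong bytesRead = new AtomicLong();

    // Each thread repeatedly writes content to its own file and reads it back.
    static class IOThread extends Thread {
        private final Path file;
        private final long iterations;

        IOThread(Path file, long iterations) {
            this.file = file;
            this.iterations = iterations;
        }

        @Override
        public void run() {
            byte[] content = new byte[CONTENT_SIZE];
            Arrays.fill(content, (byte) 'x');
            try {
                for (long i = 0; i < iterations; i++) {
                    Files.write(file, content);                 // write the content
                    byte[] readBack = Files.readAllBytes(file); // read the same content
                    bytesRead.addAndGet(readBack.length);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Starts the given number of I/O threads, waits for them to finish,
    // and returns the total number of bytes read back.
    static long runDemo(int threadCount, long iterations) {
        bytesRead.set(0);
        List<Thread> threads = new ArrayList<>();
        try {
            for (int i = 0; i < threadCount; i++) {
                Path file = Files.createTempFile("io-demo-", ".tmp");
                file.toFile().deleteOnExit();
                Thread t = new IOThread(file, iterations);
                threads.add(t);
                t.start();
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for I/O threads", e);
        }
        return bytesRead.get();
    }

    public static void main(String[] args) {
        // 5 threads, as in the article; Long.MAX_VALUE iterations is effectively endless.
        runDemo(5, Long.MAX_VALUE);
    }
}
```

How high ‘wa’ climbs with a sketch like this depends on the storage device and on how much of the traffic the OS page cache absorbs, so the 75.9% figure should not be expected to reproduce exactly.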
How to Resolve High ‘Waiting’ Times?
If your device is suffering from a high I/O waiting time, then you can follow the steps outlined below to reduce it:
- You can use root cause analysis tools, which will point out the lines of code in the application that are causing the high I/O waiting time.
- You can optimize the application’s waiting time by doing the following:
  - Reduce the number of database calls.
  - Optimize the database queries such that less data is returned from the DB to the app.
  - Reduce the number of network calls that are made to external applications.
  - Try to minimize the amount of payload sent between external applications and your application.
  - Try to reduce the number of files written to the disk.
  - Try to reduce the amount of data read from the disk.
  - Make sure only the essential log statements are written to the disk.
- Make sure your OS is running on the latest version with all patches installed. This is not only good from a security perspective, but it can also improve performance.
- Make sure sufficient free memory is available on the device. Lack of free memory has two detrimental effects:
  - If there is a lack of free memory, then processes will be swapped in and out of memory. Pages will frequently be written to and read from the disk, which increases disk I/O operations.
  - If there is little free memory, then the OS won’t be able to cache frequently used disk blocks in memory. When frequently used disk blocks are cached, I/O waiting time goes down.
- Keep your filesystem disk usage below 80% to avoid excessive fragmentation. When there is excessive disk fragmentation, I/O time will increase.
- If all of the above steps fail, you may also consider upgrading your storage for better performance, for example by switching to faster SSD, NVMe, or SAN storage.
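As one example of automating a step from the list above, the 80% disk usage guideline can be checked with the standard java.nio.file.FileStore API. The following is a minimal sketch. The class name DiskUsageSketch and the default path "/" are assumptions, and the percentage is an approximation that may differ slightly from what tools such as ‘df’ report, because of blocks reserved by the filesystem.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class DiskUsageSketch {

    static final double USAGE_THRESHOLD_PERCENT = 80.0; // guideline from the list above

    // Percentage of the filesystem that is not usable by this JVM.
    static double usedPercent(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0.0;
        }
        return 100.0 * (totalBytes - usableBytes) / totalBytes;
    }

    static boolean exceedsThreshold(double usedPercent) {
        return usedPercent >= USAGE_THRESHOLD_PERCENT;
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "/"; // assumed default mount point
        FileStore store = Files.getFileStore(Paths.get(path));

        double used = usedPercent(store.getTotalSpace(), store.getUsableSpace());
        System.out.printf("%s is %.1f%% full%n", path, used);
        if (exceedsThreshold(used)) {
            System.out.println("Disk usage is at or above 80%; consider freeing up space.");
        }
    }
}
```

Passing a different mount point as the first argument checks that filesystem instead of the root one.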