
Tuning Linux I/O Scheduler for SSDs

By Seth Proctor · Sep. 12, 13 · 21.2K Views

This post comes from Michael Rice at the NuoDB Techblog.


Hello Techblog readers!

I'm going to talk about tuning the Linux I/O scheduler to increase throughput and decrease latency on an SSD. I'll also cover tuning the performance of NuoDB by choosing the storage for the archive and journal directories.

Tuning the Linux I/O Scheduler

Linux gives you the option to select the I/O scheduler, and the scheduler can be changed without rebooting. You may be asking at this point, "Why would I ever want to change the I/O scheduler?" Changing it makes sense when the overhead of optimizing I/O (re-ordering I/O requests) is unnecessary and expensive. This setting should be tuned per storage device: the best setting for an SSD will not be a good setting for an HDD.

The current I/O scheduler can be viewed by typing the following command:

mrice@host:~$ cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

The current I/O scheduler (in brackets) for /dev/sda on this machine is CFQ, Completely Fair Queuing. This setting became the default in kernel 2.6.18 and it works well for HDDs. However, SSDs have no rotating platters or magnetic heads, so the I/O optimization algorithms in CFQ don't apply to them. For an SSD, the NOOP I/O scheduler can reduce I/O latency and increase throughput, and it eliminates the CPU time spent re-ordering I/O requests. This scheduler typically works well on SANs, SSDs, virtual machines, and even fancy Fusion-io cards. At this point you're probably thinking, "OK, I'm sold! How do I change the scheduler already?" You can use the echo command as shown below:

mrice@host:~$ echo noop | sudo tee /sys/block/sda/queue/scheduler

To see the change, just cat the scheduler again.

mrice@host:~$ cat /sys/block/sda/queue/scheduler
[noop] anticipatory deadline cfq 

Notice that noop is now shown in brackets. This change is only temporary: the scheduler will reset to the default (CFQ in this case) when the machine reboots. To keep the setting permanently, you need to edit the GRUB configuration. However, that changes the I/O scheduler for all block devices, and NOOP is not a good choice for HDDs. I'd only make the setting permanent this way if the machine has only SSDs.
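On a mixed machine, you can instead apply noop at runtime only to the flash devices by checking each device's rotational flag in sysfs. A minimal sketch (the function names are my own, not part of any standard tool; apply_schedulers must be run as root and the effect lasts only until reboot):

```shell
# choose_scheduler: pick a scheduler from a device's rotational flag
# (0 = SSD/flash, 1 = spinning disk).
choose_scheduler() {
    if [ "$1" = "0" ]; then
        echo noop   # skip request re-ordering for flash
    else
        echo cfq    # keep the HDD-friendly default
    fi
}

# apply_schedulers: walk every sd* device and set its scheduler
# according to its rotational flag. Run as root.
apply_schedulers() {
    for q in /sys/block/sd*/queue; do
        [ -r "$q/rotational" ] || continue
        choose_scheduler "$(cat "$q/rotational")" > "$q/scheduler"
    done
}
```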

On GRUB 2:

  • Edit /etc/default/grub
  • Add "elevator=noop" to the GRUB_CMDLINE_LINUX_DEFAULT line.
  • Run sudo update-grub
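If only some of the machine's disks are SSDs, a common per-device alternative to the kernel-wide elevator parameter is a udev rule keyed on the device's rotational attribute. A sketch (the rule file name is my own choice; path conventions vary by distribution):

```
# /etc/udev/rules.d/60-ssd-scheduler.rules (hypothetical file name)
# Set noop on non-rotational sd* devices at boot; leave HDDs alone.
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", \
    ATTR{queue/scheduler}="noop"
```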

At this point, you've changed the I/O scheduler to NOOP. How do you know if it made a difference? You could run a benchmark and compare the numbers (just remember to flush the file-system cache between runs). The other way is to look at the output of iostat: I/O requests spend less time in the queue with the NOOP scheduler, which shows up in the "await" field. Here's an example of a large write operation with NOOP.

iostat -x 1 /dev/sda

Device:  rrqm/s    wrqm/s  r/s   w/s      rkB/s  wkB/s      avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda      0.00   143819.00  6.00  2881.00  24.00  586800.00  406.53    0.94      0.33   3.33     0.32     0.11   31.00
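For the benchmark route, the cache flush mentioned above can be done with the kernel's drop_caches knob. A sketch (the function name is my own; it needs root to write to /proc):

```shell
# flush_fs_cache: drop the clean page cache, dentries, and inodes so a
# benchmark's reads actually hit the device instead of memory.
flush_fs_cache() {
    sync                               # write out dirty pages first
    echo 3 > /proc/sys/vm/drop_caches  # 1=pagecache, 2=dentries+inodes, 3=both
}
```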

Tuning NuoDB Performance

Now that you've learned about the NOOP I/O scheduler, I'll talk about tuning NuoDB with an SSD. If you've read the tech blogs, you'll know that there are two building blocks for a NuoDB database: the Transaction Engine (TE for short) and the Storage Manager (SM). The TE is an in-memory-only copy of the database (actually a portion of the database). As a result, an SSD won't help the performance of a TE because it doesn't store atoms to disk. The SM contains two modules that write to disk: the archive and the journal. The archive stores atoms to disk when the archive configuration parameter points to a file system (versus HDFS or S3). The journal, on the other hand, synchronously writes messages to disk. If you read the blog post on durability, you may remember that the "Remote Commit with Journaling" setting provides the highest level of durability, but at the cost of slower speed. Using an SSD in this situation can drastically improve performance.

To tune this setting, we'll need to make a nuodb directory on the SSD:

mkdir -p /ssd/nuodb

The SSD in this example is mounted at /ssd on this machine. Easy, right? I'm assuming you've already set the Linux I/O scheduler for the SSD to NOOP. The next step is to configure NuoDB to use this path when creating the journal directory. The journal has a direct impact on transaction throughput because the journal's disk write must finish before the SM sends the ACK for the transaction commit back to the TE. What about the archive? The archive is decoupled from the transaction commit: atoms remain in memory and gradually make their way to disk, so the archive has very little effect on the TPS of the database. As a result, the archive directory can be placed on a regular HDD. The quick guide for tuning performance: put the journal on an SSD and the archive on an HDD.

Here are the commands:

nuodbmgr --broker localhost --password bird
nuodb [domain] > start process sm
Database: hockey
Host: s1
Process command-line options: --journal enable --journal-dir /ssd/nuodb/demo-journal
Archive directory: /var/opt/nuodb/demo-archives
Initialize archive: true
Started: [SM] s1/127.0.0.1:37880 [ pid = 25036 ] ACTIVE

nuodb [domain/hockey] > start process te
Host: t1
Process command-line options: --dba-user dba --dba-password goalie
Started: [TE] t1/127.0.0.1:48315 [ pid = 25052 ] ACTIVE
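A quick sanity check after setup is to confirm which device backs the journal path and that that device's scheduler is noop. A sketch, assuming simple sdXN partition naming (the helper name is my own; dm-* and NVMe devices would need different handling):

```shell
# base_device: strip the directory prefix and trailing partition number
# from a device name so its scheduler file can be found under /sys/block.
base_device() {
    echo "$1" | sed 's|.*/||; s/[0-9]*$//'
}

# Usage on the real machine:
#   df /ssd/nuodb                  # shows the backing device, e.g. /dev/sdb1
#   cat /sys/block/$(base_device /dev/sdb1)/queue/scheduler
```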

The important point with this tuned configuration is that only the journal is on the SSD, so the SSD doesn't have to be one of those massive terabyte drives. SSD prices have dropped significantly, and a single 128 GB or 256 GB SSD would be adequate for the journal data. This configuration should rival local-commit performance, which is wicked awesome considering it is the highest durability level! I encourage you to try it out and ask some questions.




Published at DZone with permission of Seth Proctor, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
