DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • High-Performance Java Serialization to Different Formats
  • Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?
  • Looking for the Best Java Data Computation Layer Tool
  • The Magic of Apache Spark in Java

Trending

  • Caching 101: Theory, Algorithms, Tools, and Best Practices
  • Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 2: Understanding Neo4j
  • IoT and Cybersecurity: Addressing Data Privacy and Security Challenges
  • Prioritizing Cloud Security Risks: A Developer's Guide to Tackling Security Debt
  1. DZone
  2. Coding
  3. Java
  4. Java Sequential IO Performance

Java Sequential IO Performance

By 
Martin Thompson user avatar
Martin Thompson
·
Jan. 09, 12 · Interview
Likes (2)
Comment
Save
Tweet
Share
18.7K Views

Join the DZone community and get the full member experience.

Join For Free

Many applications record a series of events to file-based storage for later use.  This can be anything from logging and auditing, through to keeping a transaction redo log in an event sourced design or its close relative CQRS. 

Java has a number of means by which a file can be sequentially written to, or read back again.  This article explores some of these mechanisms to understand their performance characteristics.  For the scope of this article I will be using pre-allocated files because I want to focus on performance.  Constantly extending a file imposes a significant performance overhead and adds jitter to an application resulting in highly variable latency.  "Why is a pre-allocated file better performance?", I hear you ask.  Well, on disk a file is made up from a series of blocks/pages containing the data.  Firstly, it is important that these blocks are contiguous to provide fast sequential access.   Secondly, meta-data must be allocated to describe this file on disk and saved within the file-system.  A typical large file will have a number of "indirect" blocks allocated to describe the chain of data-blocks containing the file contents that make up part of this meta-data.   I'll leave it as an exercise for the reader, or maybe a later article, to explore the performance impact of not preallocating the data files.  If you have used a database you may have noticed that it preallocates the files it will require.

The Test

I want to experiment with 2 file sizes.  One that is sufficiently large to test sequential access, but can easily fit in the file-system cache, and another that is much larger so that the cache subsystem is forced to retire pages so that new ones can be loaded.  For these two cases I'll use 400MB and 8GB respectively.  I'll also loop over the files a number of times to show the pre and post warm-up characteristics.

I'll test 4 means of writing and reading back files sequentially:

  1. RandomAccessFile using a vanilla byte[] of page size.
  2. Buffered FileInputStream and FileOutputStream.
  3. NIO FileChannel with ByteBuffer of page size.
  4. Memory mapping a file using NIO and direct MappedByteBuffer.
The tests are run on a 2.0Ghz Sandybridge CPU with 8GB RAM, an Intel 230 SSD on Fedora Core 15 64-bit Linux with an ext4 file system, and Oracle JDK 1.6.0_30.

The Code

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

import static java.lang.Integer.MAX_VALUE;
import static java.lang.System.out;
import static java.nio.channels.FileChannel.MapMode.READ_ONLY;
import static java.nio.channels.FileChannel.MapMode.READ_WRITE;

public final class TestSequentialIoPerf
{
    public static final int PAGE_SIZE = 1024 * 4;
    public static final long FILE_SIZE = PAGE_SIZE * 2000L * 1000L;
    public static final String FILE_NAME = "test.dat";
    public static final byte[] BLANK_PAGE = new byte[PAGE_SIZE];

    public static void main(final String[] arg) throws Exception
    {
        preallocateTestFile(FILE_NAME);

        for (final PerfTestCase testCase : testCases)
        {
            for (int i = 0; i < 5; i++)
            {
                System.gc();
                long writeDurationMs = testCase.test(PerfTestCase.Type.WRITE,
                                                     FILE_NAME);

                System.gc();
                long readDurationMs = testCase.test(PerfTestCase.Type.READ,
                                                    FILE_NAME);

                long bytesReadPerSec = (FILE_SIZE * 1000L) / readDurationMs;
                long bytesWrittenPerSec = (FILE_SIZE * 1000L) / writeDurationMs;

                out.format("%s\twrite=%,d\tread=%,d bytes/sec\n",
                           testCase.getName(),
                           bytesWrittenPerSec, bytesReadPerSec);
            }
        }

        deleteFile(FILE_NAME);
    }

    private static void preallocateTestFile(final String fileName)
        throws Exception
    {
        RandomAccessFile file = new RandomAccessFile(fileName, "rw");

        for (long i = 0; i < FILE_SIZE; i += PAGE_SIZE)
        {
            file.write(BLANK_PAGE, 0, PAGE_SIZE);
        }

        file.close();
    }

    private static void deleteFile(final String testFileName) throws Exception
    {
        File file = new File(testFileName);
        if (!file.delete())
        {
            out.println("Failed to delete test file=" + testFileName);
            out.println("Windows does not allow mapped files to be deleted.");
        }
    }

    public abstract static class PerfTestCase
    {
        public enum Type { READ, WRITE }

        private final String name;
        private int checkSum;

        public PerfTestCase(final String name)
        {
            this.name = name;
        }

        public String getName()
        {
            return name;
        }

        public long test(final Type type, final String fileName)
        {
            long start = System.currentTimeMillis();

            try
            {
                switch (type)
                {
                    case WRITE:
                    {
                        checkSum = testWrite(fileName);
                        break;
                    }

                    case READ:
                    {
                        final int checkSum = testRead(fileName);
                        if (checkSum != this.checkSum)
                        {
                            final String msg = getName() +
                                " expected=" + this.checkSum +
                                " got=" + checkSum;
                            throw new IllegalStateException(msg);
                        }
                        break;
                    }
                }
            }
            catch (Exception ex)
            {
                ex.printStackTrace();
            }

            return System.currentTimeMillis() - start;
        }

        public abstract int testWrite(final String fileName) throws Exception;
        public abstract int testRead(final String fileName) throws Exception;
    }

    private static PerfTestCase[] testCases =
    {
        new PerfTestCase("RandomAccessFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                RandomAccessFile file = new RandomAccessFile(fileName, "rw");
                final byte[] buffer = new byte[PAGE_SIZE];
                int pos = 0;
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;

                    buffer[pos++] = b;
                    if (PAGE_SIZE == pos)
                    {
                        file.write(buffer, 0, PAGE_SIZE);
                        pos = 0;
                    }
                }

                file.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                RandomAccessFile file = new RandomAccessFile(fileName, "r");
                final byte[] buffer = new byte[PAGE_SIZE];
                int checkSum = 0;
                int bytesRead;

                while (-1 != (bytesRead = file.read(buffer)))
                {
                    for (int i = 0; i < bytesRead; i++)
                    {
                        checkSum += buffer[i];
                    }
                }

                file.close();

                return checkSum;
            }
        },

        new PerfTestCase("BufferedStreamFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                int checkSum = 0;
                OutputStream out =
                    new BufferedOutputStream(new FileOutputStream(fileName));

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;
                    out.write(b);
                }

                out.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                int checkSum = 0;
                InputStream in =
                    new BufferedInputStream(new FileInputStream(fileName));

                int b;
                while (-1 != (b = in.read()))
                {
                    checkSum += (byte)b;
                }

                in.close();

                return checkSum;
            }
        },


        new PerfTestCase("BufferedChannelFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                FileChannel channel =
                    new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);

                    if (!buffer.hasRemaining())
                    {
                        channel.write(buffer);
                        buffer.clear();
                    }
                }

                channel.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                FileChannel channel =
                    new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                while (-1 != (channel.read(buffer)))
                {
                    buffer.flip();

                    while (buffer.hasRemaining())
                    {
                        checkSum += buffer.get();
                    }

                    buffer.clear();
                }

                return checkSum;
            }
        },

        new PerfTestCase("MemoryMappedFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                FileChannel channel =
                    new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer =
                    channel.map(READ_WRITE, 0,
                                Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    if (!buffer.hasRemaining())
                    {
                        buffer =
                            channel.map(READ_WRITE, i,
                                        Math.min(channel.size() - i , MAX_VALUE));
                    }

                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);
                }

                channel.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                FileChannel channel =
                    new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer =
                    channel.map(READ_ONLY, 0,
                                Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    if (!buffer.hasRemaining())
                    {
                        buffer =
                            channel.map(READ_WRITE, i,
                                        Math.min(channel.size() - i , MAX_VALUE));
                    }

                    checkSum += buffer.get();
                }

                channel.close();

                return checkSum;
            }
        },
    };
}


Results


400MB file
===========
RandomAccessFile    write=379,610,750   read=1,452,482,269 bytes/sec
RandomAccessFile    write=294,041,636   read=1,494,890,510 bytes/sec
RandomAccessFile    write=250,980,392   read=1,422,222,222 bytes/sec
RandomAccessFile    write=250,366,748   read=1,388,474,576 bytes/sec
RandomAccessFile    write=260,394,151   read=1,422,222,222 bytes/sec

BufferedStreamFile  write=98,178,331    read=286,433,566 bytes/sec
BufferedStreamFile  write=100,244,738   read=288,857,545 bytes/sec
BufferedStreamFile  write=82,948,562    read=154,100,827 bytes/sec
BufferedStreamFile  write=108,503,311   read=153,869,271 bytes/sec
BufferedStreamFile  write=113,055,478   read=152,608,047 bytes/sec

BufferedChannelFile write=388,246,445   read=358,041,958 bytes/sec
BufferedChannelFile write=390,467,111   read=375,091,575 bytes/sec
BufferedChannelFile write=321,759,622   read=1,539,849,624 bytes/sec
BufferedChannelFile write=318,259,518   read=1,539,849,624 bytes/sec
BufferedChannelFile write=322,265,932   read=1,534,082,397 bytes/sec

MemoryMappedFile    write=300,955,180   read=305,899,925 bytes/sec
MemoryMappedFile    write=313,149,847   read=310,538,286 bytes/sec
MemoryMappedFile    write=326,374,501   read=303,857,566 bytes/sec
MemoryMappedFile    write=327,680,000   read=304,535,315 bytes/sec
MemoryMappedFile    write=326,895,450   read=303,632,320 bytes/sec

8GB File
============
RandomAccessFile    write=167,402,321   read=251,922,012 bytes/sec
RandomAccessFile    write=193,934,802   read=257,052,307 bytes/sec
RandomAccessFile    write=192,948,159   read=248,460,768 bytes/sec
RandomAccessFile    write=191,814,180   read=245,225,408 bytes/sec
RandomAccessFile    write=190,635,762   read=275,315,073 bytes/sec

BufferedStreamFile  write=154,823,102   read=248,355,313 bytes/sec
BufferedStreamFile  write=152,083,913   read=253,418,301 bytes/sec
BufferedStreamFile  write=133,099,369   read=146,056,197 bytes/sec
BufferedStreamFile  write=131,065,708   read=146,217,827 bytes/sec
BufferedStreamFile  write=132,694,052   read=148,116,004 bytes/sec

BufferedChannelFile write=406,147,744    read=304,693,892 bytes/sec
BufferedChannelFile write=397,457,668    read=298,183,671 bytes/sec
BufferedChannelFile write=364,672,364    read=414,281,379 bytes/sec
BufferedChannelFile write=371,266,711    read=404,343,534 bytes/sec
BufferedChannelFile write=373,705,579    read=406,934,578 bytes/sec

MemoryMappedFile    write=123,023,322    read=231,530,156 bytes/sec
MemoryMappedFile    write=121,961,023    read=230,403,600 bytes/sec
MemoryMappedFile    write=123,317,778    read=229,899,250 bytes/sec
MemoryMappedFile    write=121,472,738    read=231,739,745 bytes/sec
MemoryMappedFile    write=120,362,615    read=231,190,382 bytes/sec

Analysis

For years I was a big fan of using RandomAccessFile directly because of the control it gives and the predictable execution.  I never found using buffered streams to be useful from a performance perspective and this still seems to be the case.

In more recent testing I've found that using NIO FileChannel and ByteBuffer are the clear winners from a performance perspective. With Java 7 the flexibility of this programming approach has been improved for random access with SeekableByteChannel.

I've seen these results vary greatly depending on platform.  File system, OS, storage devices, and available memory all have a significant impact.  In a few cases I've seen memory-mapped files perform significantly better than the others but this needs to be tested on your platform because your mileage may vary...

A special note should be made for the use of memory-mapped large files when pushing for maximum throughput.  I've often found the OS can become unresponsive due the the pressure put on the virtual memory sub-system.

Conclusion

There is a significant difference in performance for the different means of doing sequential file IO from Java.  Not all methods are even remotely equal.  For most IO I've found the use of ByteBuffers and Channels to be the best optimised parts of the IO libraries.  If buffered streams are your IO libraries of choice, then it is worth branching out and and getting familiar with the sub-classes of Channel and Buffer.

 

From http://mechanical-sympathy.blogspot.com/2011/12/java-sequential-io-performance.html

file IO Java (programming language)

Opinions expressed by DZone contributors are their own.

Related

  • High-Performance Java Serialization to Different Formats
  • Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?
  • Looking for the Best Java Data Computation Layer Tool
  • The Magic of Apache Spark in Java

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!