I had played with JMH before; however, this is the first time I have used JMH to solve a production problem. I had some idea of how to optimise the code involved, but trying different combinations with JMH led to a significant improvement.
In this test I am encoding a String as UTF-8 to direct memory so it can be written to a TCP socket. I also need to take data written to direct memory via an NIO SocketChannel.read and decode it into a StringBuilder (which can be reused).
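To make the encoding half concrete, here is a minimal sketch of a plain char-by-char loop that writes UTF-8 into a direct ByteBuffer. It is not the benchmarked code: the class name Utf8EncodeSketch and the method encode are made up for illustration, it uses ByteBuffer rather than Unsafe, and it assumes the buffer is large enough for the output.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the code measured in this post.
final class Utf8EncodeSketch {

    /** Writes s as UTF-8 into out, one char at a time. Assumes out has enough room. */
    static void encode(CharSequence s, ByteBuffer out) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                // ASCII: one byte
                out.put((byte) c);
            } else if (c < 0x800) {
                // two bytes
                out.put((byte) (0xC0 | (c >> 6)));
                out.put((byte) (0x80 | (c & 0x3F)));
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // surrogate pair: one code point, four bytes
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out.put((byte) (0xF0 | (cp >> 18)));
                out.put((byte) (0x80 | ((cp >> 12) & 0x3F)));
                out.put((byte) (0x80 | ((cp >> 6) & 0x3F)));
                out.put((byte) (0x80 | (cp & 0x3F)));
            } else {
                // three bytes (an unpaired surrogate also lands here, unvalidated)
                out.put((byte) (0xE0 | (c >> 12)));
                out.put((byte) (0x80 | ((c >> 6) & 0x3F)));
                out.put((byte) (0x80 | (c & 0x3F)));
            }
        }
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        String text = "h\u00e9llo \u20ac";
        encode(text, direct);
        direct.flip(); // ready to be handed to SocketChannel.write
        byte[] written = new byte[direct.remaining()];
        direct.get(written);
        // Compare against the JDK's own encoder
        System.out.println(Arrays.equals(written, text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

After the flip() call the buffer can be passed to SocketChannel.write. The faster variants in the results replace the charAt and put calls with more direct access to the underlying memory.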
These tests involved a combination of using reflection to obtain the underlying data structures of String and StringBuilder, and using Unsafe to access the native memory.
To my surprise, accessing String via reflection appeared to be no faster, and possibly slower. For shorter strings it was worse, suggesting that the overhead of using reflection was larger than any benefit.
Another surprise was that accessing StringBuilder via reflection did make a difference.
See decode_usingSimpleLoop and decode_usingCharArray: accessing the underlying char[] was over 4x faster. Note: this is clearly an optimisation issue, and future versions of Java might not have this problem.
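For context, a simple decoding loop of the kind being compared might look like the sketch below. This is an assumption about the general shape rather than the benchmarked code: the names Utf8DecodeSketch and decode are hypothetical, it reads through ByteBuffer.get and appends through the public StringBuilder API, and it does not validate malformed input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the code measured in this post.
final class Utf8DecodeSketch {

    /** Decodes the remaining bytes of in as UTF-8 into sb, which is cleared first so it can be reused. */
    static void decode(ByteBuffer in, StringBuilder sb) {
        sb.setLength(0); // reuse the builder instead of allocating a new one
        while (in.hasRemaining()) {
            int b = in.get();
            if (b >= 0) {
                // ASCII: one byte
                sb.append((char) b);
            } else if ((b & 0xE0) == 0xC0) {
                // two-byte sequence
                sb.append((char) (((b & 0x1F) << 6) | (in.get() & 0x3F)));
            } else if ((b & 0xF0) == 0xE0) {
                // three-byte sequence
                int b2 = in.get() & 0x3F;
                int b3 = in.get() & 0x3F;
                sb.append((char) (((b & 0x0F) << 12) | (b2 << 6) | b3));
            } else {
                // four-byte sequence: becomes a surrogate pair in the builder
                int b2 = in.get() & 0x3F;
                int b3 = in.get() & 0x3F;
                int b4 = in.get() & 0x3F;
                sb.appendCodePoint(((b & 0x07) << 18) | (b2 << 12) | (b3 << 6) | b4);
            }
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo \u20ac";
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        direct.put(text.getBytes(StandardCharsets.UTF_8));
        direct.flip(); // as if the bytes had just arrived via SocketChannel.read
        StringBuilder sb = new StringBuilder();
        decode(direct, sb);
        System.out.println(text.contentEquals(sb));
    }
}
```

Each append call here goes through the StringBuilder's own bounds and capacity checks, which is the per-character cost that writing into the underlying char[] avoids.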
All the results
Benchmark                                    Mode  Cnt         Score        Error  Units
DecodeMain.decode_fromUTF8                  thrpt   20   5395187.171 ±  17525.486  ops/s
DecodeMain.decode_usingCharArray            thrpt   20   7967263.552 ± 259148.327  ops/s
DecodeMain.decode_usingCharArrayAndAddress  thrpt   20  11644515.566 ±  52786.179  ops/s
DecodeMain.decode_usingSimpleLoop           thrpt   20   1884355.264 ±   3442.892  ops/s
EncodeMain.encode_simpleToUTF8              thrpt   20   5050422.611 ±  31681.322  ops/s
EncodeMain.encode_unsafeLoopCharArray       thrpt   20  16837387.866 ± 814047.308  ops/s
EncodeMain.encode_unsafeLoopCharAt          thrpt   20  18225151.521 ± 132811.688  ops/s
EncodeMain.encode_unsafeLoopCharAtUnrolled  thrpt   20  13848365.955 ± 102407.681  ops/s
EncodeMain.encode_usingSimpleLoop           thrpt   20   8868356.295 ± 368546.131  ops/s
EncodeMain.encode_usingSimpleLoopUnrolled   thrpt   20   7077634.663 ±  30359.636  ops/s
In future this functionality might be built into the JVM. However, there is likely to be other functionality, not built into the JVM, that causes a performance issue and for which an alternative is needed.