Last weekend was LJC Open Conference #4, and like many people I got a lot out of it.
My talk was up first which meant I could relax for the rest of the day.
Here are the slides
Note: the message size was 16 longs or 128 bytes. This makes a difference at higher throughputs.
In answer to @aix's questions
On slide 12; what are we measuring here, elapsed time from the message hitting the buffer to the "echo" reply showing up on the client socket?In each message I add a timestamp as it is written. When each message is read, I compare the timestamp with the current time. I sort the results and take the middle / 50%tile as typical timings and the worst 0.01% as the 99.99%tile.
What's causing the large difference between "nanoTime(), Normal" and "RDTSC, Normal" in the bottom half of the slide (2M/s)?The reason for taking the RDTSC directly (9 ns) is that System.nanoTime() is quite slow on Centos 5.7 (180 ns) and the later is a system calls which may disturb the cache. At modest message rate 200K/s (one message every 5000 ns) the difference is minor however as the message rates increase 2M/s (one message every 500 ns) the addition to latency is significant. Its not possible to send messages over 5 M/s if I was using System.nanoTime() where as with RDTSC I got up to 12 M/s. Without timing each message, I got a throughput of 17 M/s.