Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Making Code Faster: Streamlining the Output

DZone's Guide to

Making Code Faster: Streamlining the Output

Ayende Rahien further continues his series of teaching you how to make your code faster.

· Performance Zone
Free Resource

Evolve your approach to Application Performance Monitoring by adopting five best practices that are outlined and explored in this e-book, brought to you in partnership with BMC.

After looking at the profiler results, I realized that we are actually spending a considerable amount of time just writing the output to a file. That didn’t really matter when our code ran in 30+ seconds, spending another 100 – 200 ms to write the results was just noise, but when our code is doing that in under a second, that a considerable cost.

I’m running this code on a different machine, so we can’t directly compare. The performance of the initial version is 38,478 ms and allocated 7,612,741 kb with peak working set of 874,660 kb.

And the speed of the latest version is 842 ms and allocated 208,435 kb with peak working set of 375,452 kb.

So, we are 45 times faster than the initial version.

The problem is that doing this in parallel takes quite a lot and mask some inefficiencies, so I decided to change it back to using a single threaded approach. Which gives 1,498 ms and allocated 123,787 kb with peak working set of 319,436 kb.

Merely 25 times faster than the original version.

And now let's focus on the output.

private static void WriteOutput(Dictionary<int, FastRecord> stats)
{
using (var output = File.CreateText("summary.txt"))
{
foreach (var entry in stats)
{
output.WriteLine($"{entry.Value.Id:D10} {TimeSpan.FromTicks(entry.Value.DurationInTicks):c}");
}
}
}

This is pretty simple code, but it hides a lot of inefficiencies, in particular, it is doing a lot of allocations as it format the string. We can do much better.

Merely changing the WriteLine to:

output.WriteLine($"{entry.Value.Id} {entry.Value.DurationInTicks}");

Saved us close to 200 ms (!), so there is a lot of space to improve here. Again, this is mostly an issue of creating highly specific code to solve this exact scenario. Here is what I did:

private static void WriteOutput(Dictionary<int, FastRecord> stats)
{
byte[] tempBuffer = new byte[21];
tempBuffer[10] = (byte)' ';
tempBuffer[13] = (byte)':';
tempBuffer[16] = (byte)':';
tempBuffer[19] = (byte)'\r';
tempBuffer[20] = (byte)'\n';
using (var output = File.Create("summary.txt"))
{
foreach (var entry in stats)
{
WriteFormattedInt(entry.Value.Id, tempBuffer, 0, 10);
var ts = TimeSpan.FromTicks(entry.Value.DurationInTicks);
WriteFormattedInt(ts.Hours, tempBuffer, 11, 2);
WriteFormattedInt(ts.Minutes, tempBuffer, 14, 2);
WriteFormattedInt(ts.Seconds, tempBuffer, 17, 2);
output.Write(tempBuffer, 0, tempBuffer.Length);
}
}
}
​
private static void WriteFormattedInt(long id, byte[] temp, int pos, int numberOfDigits)
{
int i = pos + numberOfDigits - 1;
do
{
temp[i--] = (byte)((id % 10) + '0');
} while ((id /= 10) > 0);
while (i >= pos)
{
temp[i--] = (byte)'0';
}
}
​
private static void WriteFormattedInt(int id, byte[] temp, int pos, int numberOfDigits)
{
int i = pos + numberOfDigits-1;
do
{
temp[i--] = (byte)((id % 10) + '0');
} while ((id /= 10) > 0);
while (i >= pos)
{
temp[i--] = (byte)'0';
}
}

I wrote a simple function to format the number into a buffer, then change the summary line to write a single line into a prepared buffer (and skip all the static stuff), and write the to the file file in one shot.

private static void WriteOutput(Dictionary<int, FastRecord> stats)
{
byte[] tempBuffer = new byte[21];
tempBuffer[10] = (byte)' ';
tempBuffer[13] = (byte)':';
tempBuffer[16] = (byte)':';
tempBuffer[19] = (byte)'\r';
tempBuffer[20] = (byte)'\n';
using (var output = File.Create("summary.txt"))
{
foreach (var entry in stats)
{
WriteFormattedInt(entry.Value.Id, tempBuffer, 0, 10);
var ts = TimeSpan.FromTicks(entry.Value.DurationInTicks);
WriteFormattedInt(ts.Hours, tempBuffer, 11, 2);
WriteFormattedInt(ts.Minutes, tempBuffer, 14, 2);
WriteFormattedInt(ts.Seconds, tempBuffer, 17, 2);
output.Write(tempBuffer, 0, tempBuffer.Length);
}
}
}
​
private static void WriteFormattedInt(long id, byte[] temp, int pos, int numberOfDigits)
{
int i = pos + numberOfDigits - 1;
do
{
temp[i--] = (byte)((id % 10) + '0');
} while ((id /= 10) > 0);
while (i >= pos)
{
temp[i--] = (byte)'0';
}
}
​
private static void WriteFormattedInt(int id, byte[] temp, int pos, int numberOfDigits)
{
int i = pos + numberOfDigits-1;
do
{
temp[i--] = (byte)((id % 10) + '0');
} while ((id /= 10) > 0);
while (i >= pos)
{
temp[i--] = (byte)'0';
}
}

And the results are 1,191 ms and allocated 16,942 kb with peak working set of 311,432 kb.

You might have noticed that I have two copies of the WriteFormattedInt, this is to skip the implicit cast to long, and yes, it matters, by about 50 ms in my tests. But this version also reduces the number of allocations we have by over 100 MB! So this is great.

And here are the profiler results on analyzing this method:

image

This function is now almost seven times faster! That is pretty awesome, and even talking about single threaded performance, we are looking at 32 times better than the original version.

Trying the parallel version give me 731 ms and allocated 101,565 kb with peak working set of 381,224 kb

And a total improvement of 52 times! But we can do even more…I’ll talk about that in the next post.

Learn tips and best practices for optimizing your capacity management strategy with the Market Guide for Capacity Management, brought to you in partnership with BMC.

Topics:
streamlining ,performance ,speed ,coding

Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}