
Making Code Faster: Specialization

It may seem like we're pushing it at this point, but can the code get any faster? Probably. Read on to learn how.


Okay, at this point we are really pushing it, but I do wonder: can we get it faster still?

[Profiler output: the bulk of the time is spent in the ParseTime call]

So, we spent a lot of time in the ParseTime call, parsing two dates and then subtracting them. I wonder if we really need to do that?

I wrote two optimizations: one to compare only the time part if the dates are the same, and the other to do the date diff in seconds instead of ticks. Here is what this looks like:

private static int DiffTimesInSecond(byte* start, byte* end)
{
    if (*(long*)start == *(long*)end &&
        *(int*)(start + sizeof(long)) == *(int*)(end + sizeof(long)))
    {
        // same day, just compare the times
        var startTime = ParseInt(start + 11, 2) * 3600 + ParseInt(start + 14, 2) * 60 + ParseInt(start + 17, 2);
        var endTime = ParseInt(end + 11, 2) * 3600 + ParseInt(end + 14, 2) * 60 + ParseInt(end + 17, 2);
        return endTime - startTime;
    }

    return UnlikelyFullDateDiff(start, end);
}

private static int UnlikelyFullDateDiff(byte* start, byte* end)
{
    return ParseDateInSeconds(end) - ParseDateInSeconds(start);
}

Note that we compare the first 12 bytes using just two instructions (by comparing long and int values) since we don’t care what they are, only that they are equal. The result:
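For readers who prefer to avoid unsafe code, the same two-comparison trick can be expressed with spans. This is a sketch using `BinaryPrimitives`; note that endianness does not matter here, because we only test the integers for equality, never for ordering:

```csharp
using System;
using System.Buffers.Binary;

static class PrefixCompare
{
    // Safe-span equivalent of the pointer trick above: read the first
    // 8 bytes as a ulong and the next 4 bytes as a uint, so the whole
    // 12-byte prefix ("yyyy-MM-ddT" plus the tens digit of the hour)
    // is checked with two integer comparisons instead of a byte loop.
    public static bool SamePrefix12(ReadOnlySpan<byte> a, ReadOnlySpan<byte> b)
    {
        return BinaryPrimitives.ReadUInt64LittleEndian(a) ==
               BinaryPrimitives.ReadUInt64LittleEndian(b)
            && BinaryPrimitives.ReadUInt32LittleEndian(a.Slice(8)) ==
               BinaryPrimitives.ReadUInt32LittleEndian(b.Slice(8));
    }
}
```

On modern runtimes the JIT compiles this down to essentially the same two compares as the pointer version, without the `unsafe` context.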

283 ms and allocated 1,296 kb with peak working set of 295,200 kb

So, we are now 135 times faster than the original version.

Here is the profiler output:

[Profiler output after the optimization]

At this point, I think that we are pretty much completely done. We can parse a line in under 75 nanoseconds, and we can process about 1 GB a second on this machine (my year-plus-old laptop).
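The per-line and throughput figures above come from the profiler, but a rough sanity check is easy to do with a `Stopwatch` loop. This is only a sketch (a proper harness like BenchmarkDotNet would give more reliable numbers, and the line content here is illustrative):

```csharp
using System;
using System.Diagnostics;

static class ThroughputCheck
{
    // Back-of-the-envelope throughput: run the per-line work `iterations`
    // times and divide the total bytes processed by the elapsed seconds.
    public static double BytesPerSecond(byte[] line, int iterations, Action<byte[]> parseLine)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            parseLine(line);
        sw.Stop();
        return (double)line.Length * iterations / sw.Elapsed.TotalSeconds;
    }
}
```

At roughly 75 bytes per log line, 75 nanoseconds per line works out to about 1 GB a second, which is consistent with the measured numbers.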

We can see that skipping the full date parse in favor of the time-only compare pays off in about 65% of the cases, so that is probably a nice boost right there. I really can't think of anything else that we can do here that would improve matters in any meaningful way.

For comparison purposes:

Original version:

  • 38,478 ms
  • 7,612,741 kb allocated
  • 874,660 kb peak working set
  • 50 lines of code
  • Extremely readable
  • Easy to change

Final version:

  • 283 ms
  • 1,296 kb allocated
  • 295,200 kb peak working set
  • 180 lines of code
  • Highly specific and requires specialized knowledge
  • Hard to change

So yes, that is 135 times faster, but the first version took about ten minutes to write, plus another half hour of fiddling to remove the obvious inefficiencies. The final version took several days of careful, thorough analysis of the data and painstaking optimization.


Topics: performance, code speed, specialization

Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.
