Making Code Faster: Specialization

It may seem like we're pushing it at this point, but can the code get any faster? Probably. Read on to learn how.

Okay, at this point we are really pushing it, but I do wonder: can we get it faster still?

image

So, we spent a lot of time in the ParseTime call, parsing two dates and then subtracting them. I wonder if we really need to do that?

I wrote two optimizations: one to compare only the time part when the dates are the same, and a second to do the date comparison in seconds instead of ticks. Here is what this looks like:

    private static int DiffTimesInSecond(byte* start, byte* end)
    {
        if (*(long*)start == *(long*)end &&
            *(int*)(start + sizeof(long)) == *(int*)(end + sizeof(long)))
        {
            // same day, just compare the times
            var startTime = ParseInt(start + 11, 2) * 3600 + ParseInt(start + 14, 2) * 60 + ParseInt(start + 17, 2);
            var endTime = ParseInt(end + 11, 2) * 3600 + ParseInt(end + 14, 2) * 60 + ParseInt(end + 17, 2);
            return endTime - startTime;
        }

        return UnlikelyFullDateDiff(start, end);
    }

    private static int UnlikelyFullDateDiff(byte* start, byte* end)
    {
        return ParseDateInSeconds(end) - ParseDateInSeconds(start);
    }
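The snippet above relies on ParseInt and ParseDateInSeconds, which are not shown in this post. As a rough sketch only, here is a safe-code equivalent that uses ReadOnlySpan<byte> instead of raw pointers, so it can be compiled without unsafe blocks. The helper bodies, the yyyy-MM-ddTHH:mm:ss layout (inferred from the offsets 11, 14 and 17), the 2000-01-01 epoch, and the DateDiffSketch class name are all assumptions made for illustration, not the original implementation.

```csharp
using System;
using System.Buffers.Binary;

public static class DateDiffSketch
{
    // Arbitrary epoch (an assumption) that keeps the second count inside an int.
    private static readonly DateTime Epoch = new DateTime(2000, 1, 1);

    // Hypothetical helper: parses `length` ASCII digits starting at `offset`.
    // Assumes well-formed input; no validation is performed.
    public static int ParseInt(ReadOnlySpan<byte> buffer, int offset, int length)
    {
        int value = 0;
        for (int i = 0; i < length; i++)
            value = value * 10 + (buffer[offset + i] - '0');
        return value;
    }

    // Hypothetical helper: seconds since Epoch for a "yyyy-MM-ddTHH:mm:ss" value.
    public static int ParseDateInSeconds(ReadOnlySpan<byte> date)
    {
        var value = new DateTime(
            ParseInt(date, 0, 4), ParseInt(date, 5, 2), ParseInt(date, 8, 2),
            ParseInt(date, 11, 2), ParseInt(date, 14, 2), ParseInt(date, 17, 2));
        return (int)(value - Epoch).TotalSeconds;
    }

    public static int DiffTimesInSeconds(ReadOnlySpan<byte> start, ReadOnlySpan<byte> end)
    {
        // One 8-byte read plus one 4-byte read covers the first 12 bytes:
        // the date, the 'T' separator and the first digit of the hour.
        bool samePrefix =
            BinaryPrimitives.ReadInt64LittleEndian(start) == BinaryPrimitives.ReadInt64LittleEndian(end) &&
            BinaryPrimitives.ReadInt32LittleEndian(start.Slice(8)) == BinaryPrimitives.ReadInt32LittleEndian(end.Slice(8));

        if (samePrefix)
        {
            // Same date: only the time of day needs to be parsed.
            int startTime = ParseInt(start, 11, 2) * 3600 + ParseInt(start, 14, 2) * 60 + ParseInt(start, 17, 2);
            int endTime = ParseInt(end, 11, 2) * 3600 + ParseInt(end, 14, 2) * 60 + ParseInt(end, 17, 2);
            return endTime - startTime;
        }

        // Unlikely path: parse both values in full.
        return ParseDateInSeconds(end) - ParseDateInSeconds(start);
    }
}
```

With the ASCII bytes of 2015-01-01T16:44:31 and 2015-01-01T19:09:07 as input, DiffTimesInSeconds takes the fast path and returns 8676. A pair that crosses midnight, such as 2015-01-01T23:59:50 and 2015-01-02T00:00:10, falls through to the full parse and returns 20.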

Note that we compare the first 12 bytes using just two instructions (by comparing long and int values) since we don’t care what they are, only that they are equal. The result:

283 ms and allocated 1,296 kb with peak working set of 295,200 kb

So, we are now 135 times faster than the original version.

Here is the profiler output:

image

At this point, I think that we are pretty much completely done. We can parse a line in under 75 nanoseconds, and we can process about 1 GB a second on this machine (my laptop, which is more than a year old).

We can see that skipping the full date compare in favor of a time-only compare pays off in about 65% of the cases, so that is probably a nice boost right there. I really can’t think of anything else that we can do here that would improve matters in any meaningful way.

For comparison purposes:

Original version:

  • 38,478 ms
  • 7,612,741 kb allocated
  • 874,660 kb peak working set
  • 50 lines of code
  • Extremely readable
  • Easy to change

Final version:

  • 283 ms
  • 1,296 kb allocated
  • 295,200 kb peak working set
  • 180 lines of code
  • Highly specific and requires specialized knowledge
  • Hard to change

So yes, that is 135 times faster, but the first version took about 10 minutes to write, then another half an hour of fiddling to make it not obviously inefficient. The final version took several days of careful, thorough analysis of the data and careful optimization.
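To put the comparison in numbers, the ratios can be worked out directly from the figures listed above. This is only a small illustrative sketch; the ComparisonSketch class and its member names are made up for this example.

```csharp
using System;

public static class ComparisonSketch
{
    // Figures copied from the comparison lists above.
    public const double OriginalMs = 38478, FinalMs = 283;
    public const double OriginalAllocatedKb = 7612741, FinalAllocatedKb = 1296;
    public const double OriginalPeakKb = 874660, FinalPeakKb = 295200;

    // How many times smaller the final figure is than the original one.
    public static double Ratio(double original, double optimized) => original / optimized;

    public static double Speedup => Ratio(OriginalMs, FinalMs);
    public static double AllocationReduction => Ratio(OriginalAllocatedKb, FinalAllocatedKb);
    public static double PeakWorkingSetReduction => Ratio(OriginalPeakKb, FinalPeakKb);
}
```

The run time ratio comes out just under 136 (the 135 quoted above, rounded down), allocations drop by a factor of roughly 5,874, and the peak working set shrinks by a little under 3 times.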

Topics:
performance, code speed, specialization

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.