Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Excerpts from the RavenDB Performance Team Report: Optimizers, Assemble!

DZone's Guide to

Excerpts from the RavenDB Performance Team Report: Optimizers, Assemble!

· Performance Zone
Free Resource

Download our Introduction to API Performance Testing and learn why testing your API is just as important as testing your website, and how to start today.

[This article was written by Federico Lois]

In the previous post on the topic after inspecting the decompiled source using ILSpy  we were able to uncover potential things we could do.

Do you remember the loop from a couple of posts behind?

image

This loop was the responsible to check for the last bytes; either because the words compare found a difference or we are at the end of the memory block and the last block is not 8 bytes wide. You may remember it easier because of the awful code it generated.

image

After optimization work that loop looked sensibly different:

image

And we can double-check that the optimization did create a packed MSIL.

image

Now let’s look at the different operations involved. In red, blue and orange we have unavoidable instructions, as we need to do the comparison to be able to return the result.

13 instructions to do the work and 14 for the rest. Half of the operations are housekeeping in order to prepare for the next iteration.

The experienced developer would notice that if we would have done this on the JIT output each one of the increments and decrements can be implemented with a single assembler operation using specific addressing mode. However, we shouldn’t underestimate the impact of memory loads and stores.

How would the following loop look in pseudo-assembler?

:START
Load address of lhs into register
Load address of R into raddr-inregister
Move value of lhs into R
Load byte into a 32 bits register (lhs-inregister)
Substract rhs-inregister, [raddr-inregister],
Load int32 from R into r-inregister
Jump :WEAREDONE if non zero r-inregister,
Load address of lhs into lhsaddr-register
Add 4 into [lhsaddr-register] (immediate-mode)
Load address of rhs into rhsaddr-register
Add 4 into [rhsaddr-register]   (immediate-mode)
Load address of n into naddr-register
Increment [naddr-register]

Load content of n into n-register
Jump :START if bigger than zero n-register
Push 0
Return

:WEAREDONE
Push r-inregister
Return

As you can see the housekeeping keeps being a lot of operations. J

Armed with this knowledge. How would you optimize this loop?

Find scaling and performance issues before your customers do with our Introduction to High-Capacity Load Testing guide.

Topics:
java ,high-perf ,database ,tools ,tutorial ,optimization ,performance ,tips and tricks ,ravendb ,ravendb performance

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}