Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

The Cost of Managed Memory

DZone's Guide to

The Cost of Managed Memory

Sometimes doing things the right way, well, isn't. Here, learn how a simple document can consume orders of magnitude more memory that you'd think it needs.

· Performance Zone
Free Resource

Download our Introduction to API Performance Testing and learn why testing your API is just as important as testing your website, and how to start today.

A customer reported that under memory constrained system, a certain operation is taking all the memory and swapping hard. On a machine with just a bit more memory, the operation completed very quickly. It didn’t take long to figure out what was going on, we were reading too much, and we started swapping, and everything broke after that.

The problem is that we have code that is there specifically to prevent that, it is there to check that the size that we load from the disk isn’t too big, and that we aren’t doing something foolish. But something broke here.

The problem happens when we convert it to a.NET object:

image

Yep, when we de-serialized it, it takes close to 13 times more space than the text format.

For fun, let us take the following JSON:

image

This generates a string whose size is less than 1KB.

But when parsing it:

image

The reason, by the way? It is the structure of the document.

image

So each two bytes for object creation in JSON ( the {} ) are holding, we are allocating 116 bytes. No wonder this blows up so quickly.

This behavior is utterly dependent on the structure of the document and is very hard to protect against because you don’t really have a way of seeing how much you allocated.

We resolved it by not only watching the size of the documents that we are reading, but the amount of free memory available on the machine (aborting if it gets too low), but that is a really awkward way of doing that. I’m pretty sure that this is also something that you can use to attack a server, forcing it to allocate a lot of memory by sending very little data to it.

I opened an issue on the CoreCLR about this, and we’ll see if there is something that can be done.

In RavenDB 4.0, we resolved that entirely by using the blittable format, and we have one-to-one mapping between the size of the document on disk and the allocated size (actually, since we map, there is not even allocation of the data, we just access it directly).

Find scaling and performance issues before your customers do with our Introduction to High-Capacity Load Testing guide.

Topics:
memory ,profiling

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}