Reviewing LevelDB, Part IV: On std::string, buffers and memory management in C++

This is a bit of a side track. One of the things that is quite clear to me when I am reading the leveldb code is that I was never really any good at C++. I was a C/C++ developer, and that is a pretty derogatory term. C and C++ share a lot of the same syntax and underlying assumptions, but the moment you want to start writing nontrivial stuff, they are quite different. And no, I am not talking about OO or templates.

I am talking about things that came out of that. In particular, throughout the leveldb codebase, they very rarely, if at all, allocate memory directly. Pretty much the whole codebase relies on std::string to handle buffer allocation and management. This makes sense, since RAII is still the watchword for good C++ code. Using std::string for memory management also means that the memory is properly released when the string goes out of scope, without having to deal with it explicitly.
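To make the RAII point concrete, here is a minimal sketch of a buffer built in a std::string. It is not taken from leveldb; `MakeRecord` and `RecordSize` are hypothetical names invented for illustration. The point is only that no `new` or `delete` appears, and the buffer is freed when the owning string goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not leveldb code): builds "<key>=<value>\n".
// The std::string owns the heap allocation for the whole buffer.
std::string MakeRecord(const std::string& key, const std::string& value) {
  assert(!key.empty());  // illustrative precondition only
  std::string buf;
  buf.reserve(key.size() + value.size() + 2);  // size the buffer up front
  buf.append(key);
  buf.push_back('=');
  buf.append(value);
  buf.push_back('\n');
  return buf;  // ownership of the buffer moves to the caller
}

// Hypothetical helper: uses a temporary buffer and returns its length.
std::size_t RecordSize(const std::string& key, const std::string& value) {
  std::string record = MakeRecord(key, value);
  return record.size();
}  // record's destructor releases the buffer here, even on an exception
```

Calling `RecordSize("key", "value")` allocates and frees the buffer without the caller ever seeing the allocation.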

More interestingly, the leveldb codebase is also using std::string as a general buffer. I wonder why it is std::string rather than std::vector<char>, which would be more reasonable, but I guess that this is because most of the time users will want to pass strings as keys, and that is likely easier to manage given the operations available on std::string (such as append).
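As a rough illustration of std::string used as a byte buffer rather than as text, the sketch below appends a length-prefixed payload using append. The helper names (`AppendFixed32`, `AppendLengthPrefixed`, `ReadFixed32`) and the little-endian layout are assumptions made for this example, not a copy of leveldb's encoding routines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: appends a 32-bit value as 4 little-endian bytes.
void AppendFixed32(std::string* dst, std::uint32_t value) {
  char bytes[4];
  bytes[0] = static_cast<char>(value & 0xff);
  bytes[1] = static_cast<char>((value >> 8) & 0xff);
  bytes[2] = static_cast<char>((value >> 16) & 0xff);
  bytes[3] = static_cast<char>((value >> 24) & 0xff);
  // append(const char*, size_t) copies raw bytes, embedded '\0' included.
  dst->append(bytes, sizeof(bytes));
}

// Hypothetical helper: writes the payload length, then the payload itself.
void AppendLengthPrefixed(std::string* dst, const std::string& payload) {
  AppendFixed32(dst, static_cast<std::uint32_t>(payload.size()));
  dst->append(payload);
}

// Hypothetical helper: reads back 4 little-endian bytes starting at offset.
std::uint32_t ReadFixed32(const std::string& src, std::size_t offset) {
  assert(offset + 4 <= src.size());
  const unsigned char* p =
      reinterpret_cast<const unsigned char*>(src.data()) + offset;
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}
```

The same code would work with std::vector<char>, but the append overloads on std::string make it shorter, which fits the guess above about why std::string was chosen.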

It is actually quite fun to go over the codebase and discover those sorts of things. Especially if I can figure them out on my own.

This is quite interesting because, from my point of view, buffers are a whole different set of problems. We don’t have to worry about the memory just going away in .NET (although we do have to worry about someone changing the buffer behind our backs), but we have to worry a lot about buffer size. This is because at some point (85,000 bytes), buffers go to the large object heap, and stay there. That means, in turn, that every time you want to deal with buffers you have to take that into account, usually with a buffer pool.

Another interesting aspect of memory usage is the explicit handling of copying. There are various places in the code where the copy constructor was made private to prevent copying, or where a comment notes that a type is intentionally copyable. I get the reason why, because copying is a common failure point in C++, but I have forgotten (although I am pretty sure that I used to know) the actual semantics of when and how you want to do that in all cases.
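For reference, a minimal sketch of the two usual ways to forbid copying, next to a type that is deliberately left copyable. All class names here (`LegacyWriter`, `Writer`, `WriterOptions`) are hypothetical and not taken from leveldb. The first form is the older private-copy-constructor idiom described above; the second is its C++11 equivalent using `= delete`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Older idiom: copy operations are declared private and never defined.
class LegacyWriter {
 public:
  LegacyWriter() {}
  void Add(const std::string& s) { buf_.append(s); }
  std::size_t size() const { return buf_.size(); }

 private:
  LegacyWriter(const LegacyWriter&);             // no copying allowed
  LegacyWriter& operator=(const LegacyWriter&);  // no assignment allowed
  std::string buf_;
};

// C++11 and later: state the same intent directly with = delete.
class Writer {
 public:
  Writer() = default;
  Writer(const Writer&) = delete;
  Writer& operator=(const Writer&) = delete;
  void Add(const std::string& s) { buf_.append(s); }
  std::size_t size() const { return buf_.size(); }

 private:
  std::string buf_;
};

// Intentionally copyable: a small value type with no owned resources.
struct WriterOptions {
  std::size_t block_size = 4096;
  bool verify = false;
};

// The compiler enforces the intent, so accidental copies fail to build.
static_assert(!std::is_copy_constructible<LegacyWriter>::value, "no copy");
static_assert(!std::is_copy_constructible<Writer>::value, "no copy");
static_assert(std::is_copy_constructible<WriterOptions>::value, "copyable");

// Non-copyable objects are passed by pointer or reference instead.
void Fill(Writer* w) {
  assert(w != nullptr);
  w->Add("abc");
}
```

A rough rule of thumb, offered as an assumption rather than as leveldb's policy: forbid copying for types that own a resource or have identity, and allow it for plain value types.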

 

Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.
