Over a million developers have joined DZone.

Reviewing LevelDB: Part XIV, The Mem Table and the Immutable Memtable

· Database Zone

To stay on top of the changing nature of the data connectivity world and to help enterprises navigate these changes, download this whitepaper from Progress Data Direct that explores the results of the 2016 Data Connectivity Outlook survey.

In leveldb we have the memtable, into which all of the current writes are going and the immutable memtable. So far, I haven't really checked what it actually means.

It looks like this happens on the write, by default, a mem table is limited to about 4 MB of space. We write to the memtable (backed by the tx log) until we get to the point where we go beyond the 4MB limit (note that again, if you have large values, you might go much higher than 4MB and then we switch to another memtable, change the existing memtable to be the immutable one, and move on.

Something that might trip you is that if you write 2 big values one after the other, in separate batches, the second one might need to wait for the compaction to complete.

Here is the test code:

   std::string big(1024 * 1024 * 5, 'a');
     
     for(int i=0; i < 10; i++ ){
        
         std::clock_t start = std::clock();
       
         std::ostringstream o;
         o << "test" << i;
         std::string key =  o.str();
         db->Put(writeOptions, key, big);
    
         std::clock_t end = std::clock();
   
         std::cout << i << ": " << 1000.0 *(end - start) / CLOCKS_PER_SEC << "ms " << std::endl;
   
     }

And the output is:

0: 20ms

1: 30ms

2: 50ms

3: 50ms

4: 50ms

5: 50ms

6: 40ms

7: 60ms

8: 50ms

9: 50ms

Not really that interesting, I'll admit. But when I tried it with 50 mb for each value, it felt like the entire machine grind to a halt. Considering the amount of times memory is copied around, and that it needs to be saved to at least two locations (log file & sst), that makes a lot of sense.

For references ,those were my timing, but I am not sure that I trust them.

0: 170ms

1: 310ms

2: 350ms

3: 460ms

4: 340ms

5: 340ms

6: 280ms

7: 530ms

8: 400ms

9: 1200ms

Turn Data Into a Powerful Asset, Not an Obstacle with Democratize Your Data, a Progress Data Direct whitepaper that explains how to provide data access for your users anywhere, anytime and from any source.

Topics:

Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}