DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • How To Recover SQL Server FILESTREAM Enabled Database
  • Building a High-Throughput Distributed Sequence Generator Using the Hi-Lo Algorithm
  • When Snowflake Lies to You: Understanding False Failures in dbt Pipelines
  • Master-Class: Understanding Database Replication (Single, Multi, and Leaderless)

Trending

  • AWS Kiro: The Agentic IDE That Makes Specs the Unit of Work
  • Integrating AI-Driven Decision-Making in Agile Frameworks: A Deep Dive into Real-World Applications and Challenges
  • Architecting an Embedded Efficiency Layer: A Platform Deep Dive into Day-Two Operational Tuning
  • AI-Driven Integration in Large-Scale Agile Environments
  1. DZone
  2. Data Engineering
  3. Databases
  4. Stepping through LMDB: Making Everything Easier

Stepping through LMDB: Making Everything Easier

By 
Oren Eini user avatar
Oren Eini
·
Jul. 27, 13 · Interview
Likes (0)
Comment
Save
Tweet
Share
9.5K Views

Join the DZone community and get the full member experience.

Join For Free

okay, i know that i have been critical about the lmdb codebase so far. but one thing that i really want to point out for it is that it was pretty easy to actually get things working on windows. it wasn’t smooth, in the sense that i had to muck around with the source a bit (hack endianess, remove a bunch of unix specific header files, etc). but that took less than an hour, and it was pretty much it. since i am by no means an experienced c developer, i consider this a major win. compare that to leveldb, which flat out won’t run on windows no matter how much time i spent trying, and it is a pleasure .

also, stepping through the code i am starting to get a sense of how it works that is much different than the one i had when i just read the code. it is like one of those 3d images, you suddenly see something.

the first thing that became obvious is that i totally missed the significance of the lock file. lmdb actually create two files:

  • lock.mdb
  • data.mdb

lock.mdb is used to synchronized data between different readers. it seems to mostly be there if you want to have multiple writers using different processes. that is a very interesting model for an embedded database, i’ve to admit. not something that i think other embedded databases are offering. in order to do that, it create two named mutexes (one for read and one for write).

a side note on windows support:

lmdb supports windows, but it is very much a 2nd class citizen. you can see it in things like path not found error turning into a no such process error (because it try to use getlasterror() codes as c codes), or when it doesn’t create a directory even though not creating it would fail.

i am currently debugging through the code and fixing such issues as i go along (but no, i am doing heavy handed magic fixes, just to get past this stage to the next one, otherwise i would have sent a pull request).

here is one such example. here is the original code:

image

but readfile in win32 will return false if the file is empty, so you actually need to write something like this to make the code work:

image

past that hurdle, i think that i get a lot more about what is going on with the way lmdb works than before.

let us start with the way data.mdb works. it is important to note that for pretty much everything in lmdb we use the system page size. by default, that is 4kb.

the data file starts with 2 pages allocated. those page contain the following information:

image

looking back at how couchdb did things, i am pretty sure that those two pages are going to be pretty important . i am guess that they would always contain the root of the data in the file. there is also the last transaction on them, which is what i imagine determine how something gets committed. i don’t know yet, as i said, guessing based on how couchdb works.

i’ll continue this review in another time. next time, transactions…

Database Data file

Published at DZone with permission of Oren Eini. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • How To Recover SQL Server FILESTREAM Enabled Database
  • Building a High-Throughput Distributed Sequence Generator Using the Hi-Lo Algorithm
  • When Snowflake Lies to You: Understanding False Failures in dbt Pipelines
  • Master-Class: Understanding Database Replication (Single, Multi, and Leaderless)

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook