Degenerate Performance Scenario for LMDB

By Oren Eini · Sep. 12, 13

I was working peacefully through some stuff with Voron when I ran into some really bad performance in a pretty important scenario. I decided that this was probably something I was doing wrong, and set out to reproduce the same scenario in LMDB to figure out my mistake.

Here is the code:

#include <stdio.h>
#include <time.h>
#include <rpc.h>      /* UuidCreate; link with rpcrt4.lib */
#include "lmdb.h"

int main(int argc, char *argv[])
{
    int i, j, rc;
    UUID id;
    MDB_env *env;
    MDB_val key, data;
    char sval[32];
    clock_t start = clock(), end;

    srand((unsigned)time(NULL));

    rc = mdb_env_create(&env);
    rc = mdb_env_set_mapsize(env, 1024 * 1024 * 768);
    rc = mdb_env_set_flags(env, MDB_WRITEMAP, 1);
    rc = mdb_env_open(env, "e:\\data\\lmdb2", 0, 0664);

    key.mv_size = sizeof(UUID);
    data.mv_size = sizeof(sval);
    data.mv_data = sval;

    for (i = 0; i < 100; i++) {
        MDB_txn *txn;
        MDB_dbi dbi;

        rc = mdb_txn_begin(env, NULL, 0, &txn);
        rc = mdb_open(txn, NULL, 0, &dbi);

        for (j = 0; j < 100; j++)
        {
            UuidCreate(&id);

            key.mv_data = &id;
            rc = mdb_put(txn, dbi, &key, &data, 0);
        }

        rc = mdb_txn_commit(txn);
        mdb_close(env, dbi);
    }

    end = clock();

    printf("%ld\n", (long)(end - start));
    fgetc(stdin);
    mdb_env_close(env);

    return 0;
}

As you can see, we are inserting 10,000 items (100 transactions of 100 items each). Each item has a key of 16 bytes and a value of 32 bytes. Now, you might note that this is probably the worst-case scenario for a B+tree: UUIDs are unordered, and this generates a lot of fragmentation in the tree. A bad scenario, yes, but also a relatively common one, and one that needs to be handled properly for our needs. It took me a long time to narrow down what was actually going on. At first I was convinced that the problem was with Windows' implementation of memory-mapped files, but eventually I realized that "select" isn't broken.

Here is Process Monitor's trace of the first few transactions. The highlighted calls are to FlushBuffersFile (which is how FlushFileBuffers shows up in the trace), indicating a commit.

[Image: Process Monitor trace of the first few transactions]

As I said, those are the first few transactions. You can see that the OS is doing a pretty good job of merging writes and avoiding seeks. However, as time goes by…

[Image: Process Monitor trace of later transactions]

And as more time goes by, we get to this (there are more rows at the top that I had to remove so I could take the snapshot)…

[Image: Process Monitor trace of much later transactions]

In fact, I put the data in Excel and got:

[Image: Excel chart of seeks and bytes written per commit]

And those are some really bad numbers, I think you'll agree. After a very short period of time, the cost of committing a transaction grows to over 50 seeks and 350 KB of writes.

Let us compare that to what happens when we are actually writing sequential data (using UuidCreateSequential instead of UuidCreate). The test completes in half the time, and the actual statistics are also very interesting.

[Image: Process Monitor statistics for the sequential write scenario]

We are writing a lot less, but much more importantly, we are doing far fewer seeks. For reference, here is the final transaction in the sequential write scenario:

[Image: Process Monitor trace of the final sequential-write transaction]

I ran those tests on an HDD, so seek times matter a lot, but I have gotten similar results when running on an SSD.

Now, the interesting question here is: what exactly is going on? I think a large part of this is the way LMDB allocates pages. It uses a copy-on-write method, which means that after a transaction is done, the previously used pages are free. That means they can be reclaimed, and the next transaction will do just that. The problem with this approach is that it tends to spread writes all over the file. This saves disk space, but requires a lot more seeks when you need to commit a transaction.

That means that, in order to optimize this particular scenario, I probably need to do some thinking about the free-page allocation behavior. We want to be a lot more conservative about when and how we give out those pages, to make sure that we aren't generating a really bad situation for the database when commit time comes.




Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
