DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • SQL Interview Preparation Series: Mastering Questions and Answers Quickly
  • Java and MongoDB Integration: A CRUD Tutorial [Video Tutorial]
  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • The Curious Case of Dead-End-Lock(Deadlock) | MySql

Trending

  • How to Convert XLS to XLSX in Java
  • Dropwizard vs. Micronaut: Unpacking the Best Framework for Microservices
  • Mastering Advanced Traffic Management in Multi-Cloud Kubernetes: Scaling With Multiple Istio Ingress Gateways
  • Software Delivery at Scale: Centralized Jenkins Pipeline for Optimal Efficiency
  1. DZone
  2. Data Engineering
  3. Databases
  4. MongoDB and its locks

MongoDB and its locks

By 
Giorgio Sironi user avatar
Giorgio Sironi
·
Dec. 06, 13 · Interview
Likes (0)
Comment
Save
Tweet
Share
27.2K Views

Join the DZone community and get the full member experience.

Join For Free

Sometimes, you need your jobs to be persisted to a database. Existing solutions such as Gearman only used relational or file-based persistence, so they were a no-go for us and we went with MongoDB.

Fast-forward a few months, and we have some problems with the database load. However, it's not that workers are pestering it too much: the problem was related to locks.

MongoDB locking model

As of 2.4, MongoDB holds write locks on an entire database for each write operation. Since atomicity is guaranteed only on a single document, this isn't usually a problem because even if you are inserting thousands of documents you are doing so in thousands of different operations that can be interleaved with queries and other inserts with a fair policy.

This sometimes results in count() queries being inconsistent as documents are moved and indexes are asynchronously updated. However, write corruption is inexistent as documents are a very cohesive entity.

However, atomic operations over a single document still lock the whole database, as in the case of findAndModify(), which looks for a document matching a certain query and updates it with a $set operation before returning it; all in a single shot and with the guarantee no other process will be able to perform the same operation of reading and writing at the same time.

You can see this operation is ideal for implementing workers based on a pull model, each asking the database for a new job to do and locking it with '$set: {locked: true}}'. However, after the number of workers increases a little bit, locks become a problem.

Lock duration

We cleaned up the working space collection of our MongoDB database by keeping in it only the unfinished jobs, and moving all the rest (completed or failed) to a different collection for archival.
As the load increases due to new contracts, we saw the locking time increase as well: the application and the workers were insisting on the same database.

The first of the problems was that after reducing the specs of our primary server, we started seeing timeouts of unrelated code even if the CPU and IO usage were low. The locks taken by workers to pick jobs were starting to take seconds or tens of seconds.

Moreover, the MongoDB server started filling the logs with:

Fri Dec  6 00:01:07 [conn280998] warning: ClientCursor::yield can't unlock b/c of recursive lock...

I'm a user, not MongoDB guru but that seems not very good, especially given hundreds of these messages were written every day (although the queues continued to work correctly.) We did not find any explanation for these messages in the documentation, but I suppose they mean some operations are taking so long that they have to yield to make room for others, but in the case of atomic operations they can't to preserve consistency.

An easy solution

Since MongoDB does not have collection-wide locks yet, we decided to move the job pool and the completed job collections to a different database. In this way, we had a main database with the usual collections and one containing just these two, named with a '_queue' suffix.

Note that we're still writing to the same database server: there is still the same number of connections being created by each process. This solution preallocates more space given two databases are involved, but as you know space is cheap nowadays.

Both insertion of jobs and worker reads must take place on the same database. Here is where we discovered cohesion pays: if you have this information in a single place it is very easy to change configuration. If you have a singleton database, because "we should only have one database in this application, it will never change" this feature would cost you a lot. Fortunately, in our case it was about 10 lines of code, including the refactoring on the Factory Methods that created MongoDB database objects.

Long term

This solution is not for the long term, as we know the numbers of machines and their workers pool will increase in the future; a sufficiently high number of workers will saturate the connections available on the MongoDB server and lock the common collection until a pick of a job takes dozens of seconds.

The design towards which we are moving includes one "foreman" to each machine, and many workers under his control; only the foreman polls the database and may lock the common collection.

Distributing the job pool is not what we want for ease of retrieval of a job in case something goes bad (ever done a query on multiple databases?). Also, we don't want a push solution as it will involve the registration of workers or foremen to a central point of failure that assignes them their jobs. Since most of our servers are shutdown and rebooted according to the user load, we prefer a dynamic solution where a server can start picking jobs whenever it wants and stop without notifying remote machines.

Database Threading MongoDB Relational database career

Opinions expressed by DZone contributors are their own.

Related

  • SQL Interview Preparation Series: Mastering Questions and Answers Quickly
  • Java and MongoDB Integration: A CRUD Tutorial [Video Tutorial]
  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • The Curious Case of Dead-End-Lock(Deadlock) | MySql

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!