DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
Securing Your Software Supply Chain with JFrog and Azure
Register Today

Trending

  • Strategies for Reducing Total Cost of Ownership (TCO) For Integration Solutions
  • What ChatGPT Needs Is Context
  • Integrating AWS With Salesforce Using Terraform
  • Tactics and Strategies on Software Development: How To Reach Successful Software [Video]

Trending

  • Strategies for Reducing Total Cost of Ownership (TCO) For Integration Solutions
  • What ChatGPT Needs Is Context
  • Integrating AWS With Salesforce Using Terraform
  • Tactics and Strategies on Software Development: How To Reach Successful Software [Video]
  1. DZone
  2. Data Engineering
  3. Databases
  4. Using a Relational DBMS as a Multi Server Concurrency Control

Using a Relational DBMS as a Multi Server Concurrency Control

Ricardo Zuasti user avatar by
Ricardo Zuasti
·
Mar. 29, 12 · Interview
Like (0)
Save
Tweet
Share
5.86K Views

Join the DZone community and get the full member experience.

Join For Free

Sometimes you just don’t want or need the complexity of a fancy distributed transaction manager or complicated RPCs to coordinate a concurrent job amongst a farm of servers.

Here is a very simple and efficient way to achieve a lock/release type of concurrency control in a farm-like architecture using nothing but a few SQL sentences and your relational database.

Imagine the following scenario: A web app that runs on multiple servers/instances where users activity is balanced. All servers are clones, and you want to keep it that way so adding/removing servers from the farm remains easy and safe; and you need to run some background scheduled job, not triggered by user activities.

To keep things simple (operations wise), you want to keep those batch jobs inside your app and you don’t want/can’t afford to integrate a fancy multi-server transaction manager.

For example, you want to implement a background job that periodically checks the price of stocks and sends an email to all subscribed users. Since all the servers in your farm are clones, this job will be running on all of them, and that’s a good thing, because if one fails any of the other can pick it up. But on the other hand you don’t want to send 25 emails to each user every time, if you have 25 servers in your farm.

So our goal is to ensure only one server performs the job, but if it doesn’t, any other can realize it and do it himself.

The solution is really simple, we define a table in our database with the following structure:

CREATE TABLE job_status (
    job_id  varchar(254) NOT NULL,
    running_at  varchar(254) NULL,
    last_run_started datetime NULL,
    next_run datetime NULL,
    PRIMARY KEY(job_id)
)

And pre-insert one row for each job (only one in our example, but it’s nice to have support for several from scratch, right?) with only the job_id field set (all other fields can start as NULL).

Assuming we code our batch job as an endless loop with a pause between runs, when each cycle starts we should execute the following statement:

UPDATE job_status
SET running_at=<NODE_IP_OR_ADDRESS>,
    last_run=<CURRENT_DATETIME>,
    next_run=<CALCULATED_DATETIME_NEXT_RUN>
WHERE job_id=<JOB_IDENTIFICATION> AND
      (
        (running_at IS NULL AND next_run <= <CURRENT_DATETTIME>)
        OR
        (running_at IS NOT NULL AND last_run_started <= <CALCULATED_TIMEOUT>)
      )

Where:

  • <NODE_IP_OR_ADDRESS>: Identification of the node in the farm where the job is running
  • <CURRENT_DATETIME>: Current system date and time
  • <CALCULATED_DATETIME_NEXT_RUN>: The date and time when the job should run again assuming this run finishes correctly.
  • <JOB_IDENTIFICATION>: A string that uniquely indentifies the job
  • <CALCULATED_TIMEOUT>: A calculated date and time (usually the current system time minus a pre-defined value) upon which the job is considered “dead” and should be picked up by another node.

In most relational DBMS you get the number of updated rows for each UPDATE command as a result of the UPDATE itself (or you can embed a second query in the same database transaction that returns it). If the number of updated rows was 1, the executing node should go ahead and execute the job now, if the number was 0 it should go back to sleep until the next iteration.

The UPDATE “locks” the job (setting the running_at field to the executing node address) only if the job was due to be executed and no other node had started it, or if the job was in fact being executed by another node, but so long ago it was considered timed out. Combine this with the ACID guarantees of a RDBMS and we got ourselves a multi-node semaphore.

To wrap things up, if the current node actually got to execute the job, it should finish the iteration by releasing the lock with:

UPDATE job_status SET running_at=NULL
WHERE job_id=<JOB_IDENTIFICATION>

Nice, isn’t it? No messy RPC, no fancy transaction manager and our app has scheduled jobs with farm support.

 

 

 

 

Relational database career Dbm Database

Published at DZone with permission of Ricardo Zuasti, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • Strategies for Reducing Total Cost of Ownership (TCO) For Integration Solutions
  • What ChatGPT Needs Is Context
  • Integrating AWS With Salesforce Using Terraform
  • Tactics and Strategies on Software Development: How To Reach Successful Software [Video]

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: