Why FoundationDB Might Be All It's Cracked Up To Be

Doug Turnbull
·
Mar. 08, 13 · Interview

When I first heard about FoundationDB, I couldn’t imagine how it could be anything but vaporware. It seemed like unicorns crapping happy rainbows to solve all your problems. As I learn more about it, though, I realize it could actually be something groundbreaking.

NoSQL: Let's Review…

So, I need to step back and explain one reason NoSQL databases have been revolutionary. In the days of yore, we normalized all our data across multiple tables in a single database living on a single machine. Unfortunately, Moore’s law eventually crapped out and, maybe more importantly, hard drive capacity stopped growing by leaps and bounds. Our data and the demands on it only kept growing, so we had to start distributing our databases across multiple machines.

Turns out, it’s hard to maintain transactionality in a distributed, heavily normalized SQL database. As such, a lot of NoSQL systems have emerged with simpler features, many promoting a model based around some kind of single row/document/value that can be looked up or inserted with a key. Transactionality in these systems is limited to a single key-value entry (a “row” in Cassandra/HBase or a “document” in Mongo/Couch; we’ll just call them rows here). A row is easily stored on a single node, although it can be replicated to multiple nodes. Even with replication, working transactionally with a single row in distributed NoSQL is easier than guaranteeing the transactionality of an SQL query that may visit many tables.

There are deep design ramifications and limitations that follow from the transactional nature of rows. First, you always try to cram as much data related to the row’s key as possible into a single row, ending up with massive rows of hierarchical or flat data. This lets you cover as much data as possible under the row-based transactionality guarantee. Second, as the system gives you only a single key, you must choose that key very wisely. You need to think hard about how your data will be looked up through its whole life, because it can be hard to go back. Additionally, if you need to look up on a secondary value, you’d better hope your database is friendly enough to have a secondary key feature; otherwise you’ll need to maintain a secondary row that stores the relationship. Then you have the problem of working across two rows, which doesn’t fit in the transactionality guarantee. Third, you might lose the ability to perform a join across multiple rows. In most NoSQL data stores, joining is discouraged and denormalization into large rows is the encouraged best practice.
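To make the secondary-row problem concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for a store that only guarantees atomic writes to one row at a time. The `RowStore` class, the `save_user` helper, the `crash_after_primary` flag, and the key names are all hypothetical and exist only for illustration; this is not the API of any real database.

```python
class RowStore:
    """Toy store: each put() of a single row is atomic, nothing larger is."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        return self.rows.get(key)


def save_user(store, user_id, email, crash_after_primary=False):
    # Write 1: the primary row, keyed by user id.
    store.put(('user', user_id), {'email': email})
    if crash_after_primary:
        # Simulate the client dying between the two single-row writes.
        raise RuntimeError('client crashed before writing the index row')
    # Write 2: the hand-maintained secondary row, keyed by email.
    store.put(('user_by_email', email), user_id)


store = RowStore()
save_user(store, 1, 'ann@example.com')

try:
    save_user(store, 2, 'bob@example.com', crash_after_primary=True)
except RuntimeError:
    pass

# The primary row for user 2 exists, but the lookup-by-email row does not.
primary_exists = store.get(('user', 2)) is not None
index_exists = store.get(('user_by_email', 'bob@example.com')) is not None
print(primary_exists, index_exists)
```

Each write on its own succeeds or fails cleanly, yet the pair can still end up half done, which is exactly the gap a row-level guarantee leaves open.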

FoundationDB Is Different

FoundationDB is a distributed, sorted key-value store with support for arbitrary transactions across multiple key-values — multiple “rows” — in the database.

To understand the distinction, let me pilfer an example from their tutorial. Their tutorial models a university class signup system. You know, the same system every CS major has had to implement in their programming 101 class. Anyway, to demonstrate the potential power here, I just want to share a single function with you, the class signup function:

import fdb

# Note: fdb.api_version(...) must be called once, with a version your
# installed client supports, before any other use of the fdb module.

def attendsKey(s, c):
    """Key for student (s) attending class (c)."""
    return fdb.tuple.pack(('attends', s, c))

def attendsKeys(s):
    """Key range covering every 'attends' record for student (s)."""
    return fdb.tuple.range(('attends', s))

def classKey(c):
    """Key for the number of available seats in class (c)."""
    return fdb.tuple.pack(('class', c))

@fdb.transactional
def signup(tr, s, c):
    rec = attendsKey(s, c)  # key recording whether student s attends class c
    if tr[rec].present(): return  # already signed up

    seatsLeft = int(tr[classKey(c)])  # number of seats left in the class
    if not seatsLeft: raise Exception('no remaining seats')

    classes = tr[attendsKeys(s)]  # range read of this student's "attends" records
    if len(list(classes)) >= 5: raise Exception('too many classes')

    tr[classKey(c)] = str(seatsLeft - 1).encode()  # decrement the available seats (values are bytes)
    tr[rec] = b''  # mark that this student attends this class
Okay, more than one function, but the other functions are just helpers to show you how keys are getting generated.
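Why do tuple-shaped keys matter? Because the store is sorted, every key that starts with the same prefix sits next to its siblings, so "all the attends records for one student" is a single contiguous range. Here is a rough sketch of that idea in plain Python. The `SortedKV` class and its `get_range` method are hypothetical stand-ins, and the keys are ordinary Python tuples rather than the packed byte strings FoundationDB actually stores.

```python
import bisect


class SortedKV:
    """Toy in-memory sorted key-value map; keys are tuples of strings."""

    def __init__(self):
        self._keys = []    # kept in sorted order
        self._values = {}

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_range(self, prefix):
        """Return all (key, value) pairs whose key starts with prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break  # sorted order means no later key can match
            result.append((key, self._values[key]))
        return result


kv = SortedKV()
kv.set(('attends', 'alice', 'math101'), '')
kv.set(('class', 'math101'), '29')
kv.set(('attends', 'bob', 'math101'), '')
kv.set(('attends', 'alice', 'cs101'), '')

# Every class alice attends, found with one range scan over a shared prefix.
alice_classes = [key[2] for key, _ in kv.get_range(('attends', 'alice'))]
print(alice_classes)  # ['cs101', 'math101']
```

Counting a student's classes then amounts to counting the entries in that range, which is what the "too many classes" check in signup relies on.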

What matters here is that all work goes through signup’s first argument, tr, the transaction object. First we check for the existence of a special key that indicates whether student s is attending class c. Then, in the same transaction, we work on a completely different “row”: the number of seats left in the class. If we are able to, we update that count and then create a row recording that the student attends the class. More important than what is actually happening here, FoundationDB is able to attempt this transaction atomically across the entire cluster.
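If "atomic across several keys" feels abstract, the following toy model may help. It is only a sketch: `ToyTransaction`, the `transactional` decorator, and `signup_then_fail` are hypothetical names, the store is a plain dict on one machine, and the class limit check is left out. It ignores everything that makes the real problem hard (concurrency, conflicts, distribution) and shows just the all-or-nothing behavior.

```python
class ToyTransaction:
    """Buffers writes; nothing reaches the store unless commit() is called."""

    def __init__(self, store):
        self._store = store
        self._writes = {}

    def get(self, key):
        # Reads see this transaction's own pending writes first.
        if key in self._writes:
            return self._writes[key]
        return self._store.get(key)

    def set(self, key, value):
        self._writes[key] = value

    def commit(self):
        self._store.update(self._writes)


def transactional(func):
    """Run func in a fresh ToyTransaction; commit only if it returns normally."""
    def wrapper(store, *args):
        tr = ToyTransaction(store)
        result = func(tr, *args)  # an exception here skips the commit
        tr.commit()
        return result
    return wrapper


@transactional
def signup(tr, student, cls):
    rec = ('attends', student, cls)
    if tr.get(rec) is not None:
        return  # already signed up
    seats_left = tr.get(('class', cls))
    if not seats_left:
        raise Exception('no remaining seats')
    tr.set(('class', cls), seats_left - 1)  # touches the class "row"
    tr.set(rec, True)                       # touches the attends "row"


@transactional
def signup_then_fail(tr, student, cls):
    tr.set(('attends', student, cls), True)  # first write is buffered...
    raise RuntimeError('simulated failure after the first write')


store = {('class', 'math101'): 2}
signup(store, 'alice', 'math101')

try:
    signup_then_fail(store, 'bob', 'math101')
except RuntimeError:
    pass

print(store)  # alice's two writes landed together; nothing of bob's did
```

Both of alice's writes become visible together, while the failed call leaves no trace, even though it had already "written" one key.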

If this were a more traditional NoSQL store, we would have to take a more awkward tack to do this atomically. We’d have to choose either the class or the student as the row that we can work with atomically. Implicitly, our key would become either a lookup for a class or a lookup for a student. For the sake of discussion, let’s say we made our rows classes and simply stored the ids of all the students attending each class in its row. It’s trivial to work on classes to add or remove students: we simply look up a class and append the student id to sign them up.

Conceptually this model is pretty simple, but it’s lacking if we suddenly want to look up students in the database. What would that query look like? Can you do it atomically? You’ll need another type of row for students. Then you have two entities to work across, outside of the transactionality guarantees.
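A quick sketch of that class-per-row model, again with a plain dict standing in for the store. The `class_rows` data and the `enroll` and `classes_for` helpers are made up for illustration.

```python
# Each row is keyed by class id and holds the ids of enrolled students.
class_rows = {
    'math101': ['alice', 'bob'],
    'cs101': ['alice'],
    'art200': ['carol'],
}


def enroll(rows, cls, student):
    # A single-row update: the kind of operation such stores make atomic.
    rows.setdefault(cls, []).append(student)


def classes_for(rows, student):
    # No student row exists, so every class row must be read.
    return sorted(cls for cls, students in rows.items() if student in students)


enroll(class_rows, 'art200', 'alice')
alice_classes = classes_for(class_rows, 'alice')
print(alice_classes)  # ['art200', 'cs101', 'math101']
```

Signing up is one cheap row update, but answering "which classes is alice in?" means scanning every class row, and nothing ties those reads together into one consistent view.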

FoundationDB == Unopinionated Transactions

A big reason that many NoSQL stores were simplified to the atomic row architecture is to get away from the forced large-scale transactionality (and performance hit) of SQL transactions. The solution was to go back to making everything a map and to make accesses to each entry/row/document a transaction. So we all bought into that and began working our schemas into that model.

However, at the end of the day, SQL and traditional NoSQL are both very opinionated about what a transaction should be. Despite the transaction manifesto, Foundation is completely unopinionated when it comes to how you define transactions. The same signup code above could easily be implemented as two or three transactions if that was truly what was called for.

This power is expressed in how you access Foundation. Foundation is exposed more as a library for defining transactions on an arbitrary key-value store. This narrower aim lets you write code in your own language, without being constrained to a second query language or awkwardly fitting your code to an ORM. Instead, you write natural code expressing the transactions that you want to perform over the key-value store. Pretty exciting stuff.

Whoa Whoa Whoa, Slow Your Roll Sparky, Looks Cool And All, But Can You Prove This Isn’t A Giant Boondoggle?

Okay: Foundation is new and unproven. There are plenty of unanswered questions about it. How does it perform vs. {HBase/Cassandra/Mongo/Couch/…}? What is the cost of this transactionality? At what point does its transactional architecture stop scaling? What are the trade-offs? Etc., etc.

Yeah, yeah, so don’t start rewriting all your database code to use Foundation; that would be pretty crazy. Nevertheless, the unopinionated, highly client-controlled notion of transactionality is ground-breaking, obviously useful, and I’m hopeful it can be successful.

Published at DZone with permission of Doug Turnbull, DZone MVB. See the original article here.
