Document Database Granularity Level for the Relational Minded in 2 Minutes
Schemaless is not about having no schema. It's about having a flexible, inferred schema.
When we explain what NoSQL is and how Couchbase fits in that picture, we usually get questions about Relational Databases, and we can't avoid comparing the two. While their architectures are quite different, we can find some similarities in concepts. This is a two-minute read to understand those similarities.
Table/Column vs. Document
Let's start by taking a developer perspective. In her code, the developer often manipulates domain objects. Some of these objects are read from a database or persisted in a database. In a relational DB, an object is represented by one or more Rows from Tables made of Columns. A column has a name and a type. In a document DB, a document is traditionally a key/value pair where the value is a JSON document (a JSON string). We use the term Document DB when the key/value DB gives the developer a way to query the document based on the value.
A JSON document can have multiple fields, and each field has a name and a type. So if you push the analogy further, JSON allows you to have a flexible set of columns on a per-document basis. It means that you don't have to define every table and column once and for all, up front. In a Document DB, you have the flexibility to modify them as you see fit. This is really what being schemaless is about. It's not about having no schema; it's about being able to change it easily. The typed fields of the JSON document make the schema: a flexible, inferred schema.
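To make the "inferred schema" idea concrete, here is a minimal sketch in plain Python. It does not use Couchbase at all: the dictionary of keys to JSON strings, the key names, and the infer_schema helper are all made up for illustration. It only shows that each document carries its own field names and types, so two documents stored side by side can have different "columns".

```python
import json

# Two documents as they might sit in a key/value store: key -> JSON string.
# The keys and fields are invented for this example.
raw_docs = {
    "user::1": '{"name": "Ada", "age": 36}',
    "user::2": '{"name": "Linus", "age": 46, "email": "linus@example.com"}',
}

# Map Python types produced by json.loads to JSON type names.
TYPE_NAMES = {
    str: "string", int: "number", float: "number", bool: "boolean",
    list: "array", dict: "object", type(None): "null",
}

def infer_schema(json_text):
    """Return a {field name: JSON type name} mapping for one document."""
    doc = json.loads(json_text)
    return {field: TYPE_NAMES[type(value)] for field, value in doc.items()}

# The "schema" is read off each document; nothing was declared up front.
schemas = {key: infer_schema(text) for key, text in raw_docs.items()}
for key, schema in schemas.items():
    print(key, schema)
```

The second document simply has one more field than the first. No ALTER TABLE was needed to get there.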
I cannot bring the granularity topic to the table and not talk about our new Sub Document API, which is coming up in Couchbase 4.5. In a relational database, you can modify or read specific fields of your object because you can modify or read a particular column of a particular row. The equivalent with a JSON document would be to modify or read one particular field of your JSON document. And this is exactly what the Sub Document API allows you to do.
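The idea of addressing a single field inside a document can be sketched in a few lines of plain Python. This is not the Sub Document API itself: get_path and set_path are hypothetical helpers working on an in-memory dictionary, and the dotted path notation is an assumption made for the example. It only illustrates what "read or modify one field" means for a nested JSON document.

```python
import json

def get_path(doc, path):
    """Read one field addressed by a dotted path such as 'address.city'."""
    node = doc
    for part in path.split("."):
        node = node[part]
    return node

def set_path(doc, path, value):
    """Write one field addressed by a dotted path; other fields are untouched."""
    *parents, leaf = path.split(".")
    node = doc
    for part in parents:
        node = node[part]
    node[leaf] = value

# A made-up document with one nested object.
doc = json.loads('{"name": "Ada", "address": {"city": "London", "zip": "N1"}}')

city_before = get_path(doc, "address.city")  # read a single field
set_path(doc, "address.city", "Paris")       # modify a single field
print(city_before, "->", get_path(doc, "address.city"))
```

This plays the same role as reading or updating one column of one row: the rest of the document is left alone.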
When you run a SQL query, the FROM clause refers to tables. Couchbase also has a declarative query language called N1QL. It's a superset of SQL and as such allows you to do JOINs and other cool stuff. And since there are no tables in Couchbase, the FROM clause is applied to a Bucket. The JOIN queries are also made between Buckets. And, this brings us to the next topic.
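Here is a rough sketch of what a JOIN between two Buckets amounts to, written in plain Python with each Bucket modeled as a dictionary of key to document. The bucket names, field names, and the join_on_key helper are all invented for illustration, and this is not how the query engine is implemented. In N1QL the same intent would be written as something like SELECT o.total, c.name FROM orders o JOIN customers c ON KEYS o.customer_id, with both names after FROM and JOIN being Buckets.

```python
# Two hypothetical Buckets, modeled as dicts of key -> document.
customers = {
    "customer::1": {"name": "Ada"},
    "customer::2": {"name": "Linus"},
}
orders = {
    "order::100": {"customer_id": "customer::1", "total": 42},
    "order::101": {"customer_id": "customer::2", "total": 7},
    "order::102": {"customer_id": "customer::1", "total": 13},
}

def join_on_key(left_bucket, right_bucket, key_field):
    """Inner join: for each left document, fetch the right document whose
    key is stored in the left document's key_field."""
    rows = []
    for left_key in sorted(left_bucket):
        left_doc = left_bucket[left_key]
        right_doc = right_bucket.get(left_doc[key_field])
        if right_doc is not None:  # skip orders pointing at a missing customer
            rows.append({
                "order": left_key,
                "total": left_doc["total"],
                "name": right_doc["name"],
            })
    return rows

rows = join_on_key(orders, customers, "customer_id")
for row in rows:
    print(row)
```

The Bucket plays the part a table plays in SQL, and the document key plays the part of the primary key.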
Schema vs. Bucket
Tables and columns are usually stored in what is called a Schema or Database (with other things like stored procedures, materialized views, counters and more). With Couchbase, documents are stored in a Bucket. The set of options and capabilities is quite different. Take security, for example: in a relational database, you can go as far as specifying permissions on a particular column of a table. A Bucket only stores k/v pairs, so permissions apply to the Bucket as a whole: you can only grant read or read/write permission on a Bucket for a particular user. If you want to enforce anything finer-grained, it has to be done at the application level. This is one of the trade-offs you have to accept right now when choosing a NoSQL DB: some of the logic that lived in the Relational DB shifts to the application layer. We can argue about whether this is a good thing or not.
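As an illustration of what "done at the application level" can look like, here is a minimal sketch of field-level read permissions enforced in code. Everything in it is hypothetical: the roles, the READABLE_FIELDS table, the read_document helper, and the dictionary standing in for a Bucket. It is one possible approach under those assumptions, not a Couchbase feature.

```python
# Hypothetical per-role field permissions, kept in application code.
READABLE_FIELDS = {
    "admin": {"name", "email", "salary"},
    "support": {"name", "email"},
}

def read_document(bucket, key, role):
    """Return only the fields of a document that the given role may read.
    Unknown roles get an empty view."""
    allowed = READABLE_FIELDS.get(role, set())
    doc = bucket[key]
    return {field: value for field, value in doc.items() if field in allowed}

# A dict standing in for a Bucket: the store itself returns whole documents.
bucket = {
    "user::1": {"name": "Ada", "email": "ada@example.com", "salary": 100000},
}

support_view = read_document(bucket, "user::1", "support")
admin_view = read_document(bucket, "user::1", "admin")
guest_view = read_document(bucket, "user::1", "guest")
print(support_view)
```

The database hands back the full document to anyone with read access on the Bucket, and the application decides which fields each caller actually sees.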
As a developer, I prefer to have logic in code, but this is just my opinion. We can start a discussion about this in the comments below :) Let us know what you think! We could also take the topic beyond this scope and talk about clusters, replication, and sharding, but it would not be a 2-minute read anymore!