Unless you’ve been hiding under a rock for a few years, you probably know that JSON is quickly gaining support in major database servers. Due to its use in the web front-end, JSON has overtaken XML in APIs, and it’s spreading through all the layers in the stack one step at a time.
Most major databases have supported XML in some fashion for a while, too, but developer uptake wasn’t universal. JSON adoption amongst developers is nearly universal today, however. (The king is dead, long live the king!) But how good is JSON support in the databases we know and love? We’ll do a comparison in this blog post.
What is JSON Support?
First – what does it even mean for a database to support JSON?
You could easily argue that JSON support has been in databases for a long time. After all, all you have to do is store a blob of text that is correctly formatted in the database, and applications can store, retrieve, and manipulated as usual. But the level of JSON support that we are seeing in databases today exceeds that. JSON is becoming natively supported as a datatype, with functions and operators and all of the decoration that goes along with it.
For the purposes of this article, we don’t insist upon JSON being a natively supported data type that is distinct from other datatypes. It is enough for JSON to be stored as a blob, and manipulated with functions. However, as you will see, most databases go well beyond that.
JSON Support in MySQL
MySQL is late to the party. For a while now, developers, including VividCortex, have been storing JSON in blobs in MySQL, and either manipulating it in the application, or using see compiled functions or stored procedures to manipulate it.
This has always been sub optimal. MySQL, and particularly InnoDB, are not great at storing and manipulating blob data. It can be stored inefficiently, with a lot of overhead. Native compression can be tricky to get right, so JSON blobs can take up a lot of space. And a lot of the querying and string manipulation functions in MySQL don’t do very well with UTF-8 text in some cases.
In the upcoming MySQL 5.7 version, however, this all changes. In MySQL 5.7, JSON is a natively supported data type. Internally, JSON will actually be stored as BSON. This is a compact binary format that can be translated to and from the JSON text format efficiently and easily.
In addition, MySQL 5.7 will have a set of JSON functions for manipulating JSON columns directly. One of the weirdnesses of this is that the functions are named with a
JSN_prefix. This seems a little odd, but as Morgan Tocker, community relations manager for MySQL, explained to me in an email, it is to avoid any namespace conflicts if JSON functions become an official part of the SQL standard.
Finally, MySQL 5.7 will support indexing values buried inside JSON documents by using virtual columns, another new feature included in the 5.7 release candidate.
In older versions of MySQL, a set of UDF functions has been available for quite a while for the community to install into their databases and manipulate JSON. There are also community UDFs in the MySQL UDF repository.
JSON Support in PostgreSQL
JSON has been supported for a while longer in Postgres. I have not used it personally, but friends of mine have been talking to me about it at least since the 9.2 release. In this release, JSON was a natively supported datatype, and there were a couple of functions available for querying and manipulating it.
In the 9.3 release, support was greatly extended. In addition to the functions, there are a half a dozen operators. Personally, I find these operators hard to read, and I don’t think they add much to SQL. I think they would be better off as functions. But that’s just my personal preference.
In addition to the operators, there are 10 or so new functions too.
In version 9.4, the JSON datatype is supplemented by a JSONB datatype. Another half a dozen operators come along with this, further making the SQL look like Perl. There are also a couple of dozen JSONB functions, one for each corresponding JSON function.
Perhaps one of the biggest advantages of JSONB over plaintext JSON is that it supports indexing. You can create functional indexes in Postgres, of course, so you can still index plaintext JSON; but it’s better and more efficient to index JSONB than JSON.
JSON Support in MongoDB
It may seem a little silly to include this section in this blog post, especially if you are familiar with MongoDB. However, even if it’s a little obvious, we don’t want to leave it unstated. And there are non-obvious things, too.
However, until recent versions of MongoDB, the data was stored with no compression. The recommended way of mitigating this problem was to use short field names, which wasn’t a very good alternative, frankly. In MongoDB version 3.0, the new storage engine includes compression by default, making the situation much improved. You can also use Percona’s TokuMX, which includes a storage engine with high-performance, transactions, and native compression.
Indexing is natively supported as well, naturally. This has been included since the first MongoDB releases. However, indexing has improved a lot over time, including the addition of sparse indexes (since not all documents will contain all fields you are indexing).
JSON Support in SQL Server
JSON support in Microsoft SQL server will be coming in the 2016 release of the product. However, in contrast to the other databases we have discussed, there will be no native datatype. Instead, the functionality will be quite similar to the native XML functionality that has existed in SQL Server for a long time. If you are familiar with SQL Server, you will immediately recognize the
OPENJSON()function. It operates on
Indexing will also be rather limited. No native JSON indexes; just fulltext indexing.
You can read more details on the MSDN blog.
One of the big reasons that people are interested in JSON support in databases is that they want to use fewer types of databases. The modern technology stack is beginning to introduce significant sprawl as people use different databases in particular areas, taking advantage of their strengths to gain efficiency. However, this polyglot persistence increases the technology surface area enough that it can become quite difficult to monitor, manage, develop, and operate such a diverse set of databases.
One potential answer to this problem is to continue using work horses such as MySQL and Postgres, replacing specialized JSON native databases with the new JSON functionality. For the record, MongoDB is not the only JSON native database. There are many databases, including RethinkDB, that deal with JSON natively.
A big difference between the JSON support in MySQL, Postgres, and MongoDB is that in MongoDB, this is the native transport across the wire as well. JSON is native end to end in this database, whereas in other databases it is typically shoehorned into row and column storage in the client and on the wire, as well as the programming API.
Still, keeping technology diversity down can be a big enough reason to continue using the reliable and trusted databases of yore.