{{announcement.body}}
{{announcement.title}}

After 50 Years of SQL, Can You Imagine a Truly NoSQL Approach?

DZone 's Guide to

After 50 Years of SQL, Can You Imagine a Truly NoSQL Approach?

After using SQL for so long, is it possible to imagine a world in which a truly NoSQL database API can trump SQL?

· Database Zone ·
Free Resource

Imagine that you could use object chainable operants and operations without the mismatch between the DBMS and OOP; imagine a loose connection between the DBMS and API; think of an API database client that's more like a bridge to fetch or transfer data. Can you imagine a truly NoSQL database API?

Introduction

In a decade of investigating NoSQL systems, I noticed a huge effort from many vendors to create SQL-compatible APIs. Yes, there is a pretty good reason behind this tendency to toe the industry line. Everybody has learned to program in SQL, and everybody knows how to access a DBMS using SQL.

Well, I guess after fifty years of SQL-based DBMS there has to be a better approach to win a programmer's acceptance but it seems nobody has managed to achieve such an ambitious goal in scale. Nobody dares to propose something different to be adopted as a better solution and even those few that tried something different (the Apache Tinkpop-Gremlin language, GraphQL, and Cypher come to mind), have not managed to convince the majority of IT users to switch gears to their proposed query environment. Others were more radical in their approach and even imagined a connected semantic web, but when they designed their query language, SPARQL, it looked very similar to SQL, and even its acronym reminds us SQL.

I can start writing many reasons for why all these approaches have not reached the consensus for the majority of IT technical users. I could also state a number of facts for why the industry and several consortiums keep using SQL as their favorite DBMS query language. But with this article I would like you to focus on a very specific point, and with that perhaps I can share a bit of my imagination on how it can be possible to access a DBMS differently; with more flexibility than SQL; without losing the comfort of your programming language or the hurdles in SQL; and most importantly, while earning the interest and respect of developers.

You may also enjoy: NoSQL vs. SQL: Differences Explained

Two Programming Paradigm Approaches for A NoSQL API

What is SQL? It's a domain-specific language designed specifically to manage data in a relational DBMS, although many proponents of relational theory correctly argue that these DBMS do not follow Codd's relational principles. But that is another important story to tell some other time.

Recently, in the last decade, graph databases started becoming very popular, but interestingly, many important vendors stretched SQL language to cover graph traversal and other related operations. Columnar databases another very successful NoSQL type also followed the same tactic regarding to the query language. Therefore, although the data model and the physical layer implementation can vary a lot, SQL attempts to create an artificial unity.

But again SQL is domain-specific, started as data model specific and remains a declarative query language. The main question is what happens at the API side, i.e. at the side of the programming language ,and that is exactly where all interesting, peculiar things happen. How exactly do you connect to the DBMS server, and what is the protocol DBMS driver uses to transfer data? How well do objects match with entities and attributes in your DBMS, schema vs object model changes, nullable and partial loading of fields-atrributes, persistence, state, concurrency and caching of objects? For OOP aficionados and others, what I briefly described is known as the object-relational impedance mismatch problem, the Vietnam of Computer Science. And make no mistake: it is still "Vietnam" out there.

Chainable Operators and Operands

OOP is perhaps the most popular programming paradigm. yet because of the SQL magic spell everything had to be squeezed on a tabular form using one ORM or the other. But during all these years of my personal research and development, I noticed one specific characteristic of these ORM APIs that is related to the data retrieval approach in use. There are three major trends here, Query-By-Example (QBE) the precursor of GraphQL, Query-By-API (QBA) the precursor of Gremlin and Query-By-Language (QBL) where all the SQL like query languages are categorized, GQL is not an exception. Developers' practical experience showed that certain complex queries, particularly those with joins, were more difficult to represent in QBA and QBE, and once more SQL won again this battle. But it is not a surprise that proponents of graph database technology have only partially touched the object-relational impedance mismatch. It requires a deeper architectural design and a multi-perspective attitude than only comparing how well nodes of the graph match objects of a class.

So, was that a fair battle between QBA and QBL? The answer is no because of the SQL existing dominance in DBMS and the mimicking of SQL join operations. That is where imagination enters the game. Imagine that you could use object chainable operations using the fluent interface design method, a.k.a QBA, without the mismatch between the DBMS and OOP, i.e. imagine a loose connection between the DBMS and API, think of API database client more like a bridge to fetch or transfer data. The method of chaining query operators is very promising indeed, it can win a second round of boxing with SQL. I will briefly explain why.

OOP is based on the concept of objects that can exchange messages and modify their internal state. Therefore that kind of fluent interface is very natural in OOP languages. On the other hand, most complex queries can be visualized and processed as a data pipeline. My question to those that design GQL, the next international query language standard, is why do you keep following a dead-end declarative approach? Wouldn't it be more natural and appealing to the users to follow a data pipeline methodology by standardizing operands and operations? There is plenty of hard evidence to verify that this is not an assumption. Recall the TinkerPop-Gremlin reference above and in Python, there is Pandas, a very popular data analysis library (23621 starts at GitHub).

Functional Operations

Speaking of programming languages, what is the other very competitive paradigm in software development? Arguably it is the functional one and yet the functional requirements for such an API are more obscure. But such an API exists in one of the most powerful functional programming languages of IT industry, Mathematica. See how carefully they designed Wolfram language functions on Database-Like Operations on Datasets and Computation with Structured Datasets and how perfectly these fit with many other functions of the same language.

A Glimpse from The Future

Enough said; software engineering is not a theoretical thing, it is based on practice. I will give you an example. One of the drawbacks of SQL is that you don't have a direct access to manage data dictionary information, as this is handled more or less automatically by the DBMS. Gartner recently reported that data quality and data integration tools are incorporating data catalogs. This is certainly a step towards the right direction, and you need a query API that is capable to manage both data and metadata using a unified approach.

Consider the following code snippet in Python:

The first command demonstrates fetching metadata using OOP chainable Python methods:

Python
 







The second command demonstrates fetching metadata using a functional approach:

Python
 




xxxxxxxxxx
1
17


 
1
mis.get(nid=221, 
2
        what='fields', 
3
        select='nid, dim4, dim3, dim2, cname, alias, ntype, ctype, counter',
4
        index='dim4, dim3, dim2', out='dataframe')
5
 
          
6
Out[19]: 
7
                nid                 cname alias ntype ctype  counter
8
dim4 dim3 dim2                                                      
9
1    605  7     227              Duration    NA   FLD   CSV        0
10
          8     228            Start date    NA   FLD   CSV        0
11
          9     229              End date    NA   FLD   CSV        0
12
          10    230  Start station number    NA   FLD   CSV        0
13
          11    231         Start station    NA   FLD   CSV        0
14
          12    232    End station number    NA   FLD   CSV        0
15
          13    233           End station    NA   FLD   CSV        0
16
          14    234           Bike number    NA   FLD   CSV        0
17
          15    235           Member type    NA   FLD   CSV        0



And the third command demonstrates fetching data using again a functional approach

Python
 




xxxxxxxxxx
1


 
1
mis.get(221, what='data', out='dataframe', limit=5, offset=1000)
2
 
          
3
Out[20]: 
4
    534  2012-04-01 10:29:06  2012-04-01 10:38:00  31113  Columbia Rd & Belmont St NW  31201                                15th & P St NW  W00663  Member
5
0  1937  2012-04-01 10:29:07  2012-04-01 11:01:25  31202               14th & R St NW  31621              4th & D St NW / Judiciary Square  W00692  Casual
6
1   470  2012-04-01 10:29:17  2012-04-01 10:37:08  31104  Adams Mill & Columbia Rd NW  31200          Massachusetts Ave & Dupont Circle NW  W00020  Member
7
2   727  2012-04-01 10:29:30  2012-04-01 10:41:38  31103         16th & Harvard St NW  31200          Massachusetts Ave & Dupont Circle NW  W00880  Member
8
3  1144  2012-04-01 10:29:59  2012-04-01 10:49:03  31110     20th St & Florida Ave NW  31236        37th & O St NW / Georgetown University  W00681  Casual
9
4  1698  2012-04-01 10:30:09  2012-04-01 10:58:27  31107      Lamont & Mt Pleasant NW  31610  Eastern Market / 7th & North Carolina Ave SE  W01154  Member


Epilogue

I am sure it is not difficult for those few out there to imagine and even implement a better solution than SQL. In fact, as I mentioned in this article, that partly exists. But it is certainly far more challenging and difficult to imagine how it can be more productive, efficient, and innovative to work with a truly NoSQL API. Keep an eye for the next release of the TRIADB project and who knows, you might be convinced that this is truly a unique and valuable tool and technology to use.

Further Reading

NoSQL in Plain English

The Biggest Challenges of Moving to NoSQL

Topics:
nosql ,api ,dbms ,graph database ,sql ,oop ,database

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}