Relational NoSQL: Yes, that is an Option
Recently, I had a conversation with someone who knows databases and database management systems very well. I had asked him for his technical opinion about FaunaDB and his comments provoked a very lively discussion. His main reservation about FaunaDB was that it is a NoSQL database and he prefers SQL DBMSs because he likes to work with data relationally.
We went on to discuss what it means for a database to be relationally correct and also have the flexibility of NoSQL. I told my database friend that FaunaDB is a real example of a “Relational NoSQL” database and that it puts the focus back on data integrity through a modern database approach unlike that of other databases. I used the picture above to help illustrate where FaunaDB fits in a data management spectrum.
As my friend was intrigued by the picture and the possibility of “Relational NoSQL”, I wanted to explore the idea of Relational NoSQL in this blog.
Why Do People Think Relational = SQL?
Tradition can be very powerful and there is a strong sense of tradition in the field of data management. For quite a few years, I was one of the people telling others that they needed to use a relational database management system that supported SQL. SQL was familiar -- people in the data management field learned SQL early in their careers and could always count on it for accessing data. They used SQL for their applications and for their ad hoc queries. Many business intelligence tools used SQL to gather data out of databases for their pretty reports and graphs. And SQL gives many novices the ability to do powerful complex analytics on huge datasets without the need for deep programming skills.
SQL Can Be Good or Bad
But there can be downsides to SQL for some use cases, too. SQL can actually be a hindrance to consistently high application performance (and low latency). It is a declarative language so in using it, you relinquish some of the control over the execution flow to the interpreter or compiler. There are some applications where you want to know exactly how your code will behave and SQL doesn’t enable that.
Talk to the systems engineers of any relational database management system company and they will tell you about that one person in the inner bowels of the company who is their SQL god or goddess, the go-to person who can look at any problem customer or demo SQL code and tell you what the execution plan will be, exactly what the code will do, and why it isn’t doing what you want it to do. It is that uncertainty and unpredictability that causes grief for application developers looking for certain and predictable high performance.
Which is Better? SQL or NoSQL? Both and Neither
I’m a believer in the “horses for courses” approach to databases. There are some use cases where a SQL-based RDBMS will work perfectly just like there are other use cases which are picture-perfect for a NoSQL DBMS. Likewise, the different flavors of NoSQL (document stores, key-value stores, and graph databases) each have their great fitting use cases as well.
Of course, there can be too much of a good thing. If you take fit-for-purpose too far, you end up in the land of polyglot persistence where you may have a different database for every application or even a bunch of different databases for a single application. That itself creates all sorts of other problems like data silos, spaghetti system architectures, and ETL complications and it tends to make both application development and database operations very non-agile.
The interesting thing to me is that people seem to conflate SQL and Relational. A similar association happened with Hadoop and Big Data – these have become almost synonymous so that anyone who thinks they have a Big Data problem naturally first thinks they need to use Hadoop to solve it. (No…Hadoop might help or, as many companies have discovered when they’ve tried that, it might not). And when SQL became synonymous with relational, NoSQL by default became non-relational.
Relational is a type of data model. It is a way of organizing data and looking at data -- seeing how it relates to other data.
SQL (as a query language) is a type of interface – it is a way of interacting with a data model, typically a relational data model.
I wonder whether some (or most) people who talk highly of SQL are actually more fans of the relational data model underneath SQL.
I think it is possible that many of these people just like working with their data relationally. They like normalizing their data – it feels “clean”. They like being able to relate different groups of data through mechanisms like joins. Relational gives you that ability to keep data separate but operate on it together with other data. It gives you the ability to eat your cake and have it, too.
How did SQL come to mean relational, and NoSQL to be dismissed as non-relational? I have seen a pattern in recent database blogs, videos, and other media posts that encourages that kind of thinking: the debate about SQL vs NoSQL often includes the word “Relational” in the discussions. It always seems to come down to “you have a choice: you can have relational (and SQL) or you can have NoSQL.”
When relational databases were first created, SQL became the first, most popular language for interfacing with them. There are people out there who think SQL was actually a poor choice for the dominant language for the then new relational database management systems. There were alternatives, like ISBL (Information Systems Base Language), and some people believe the inferior option (SQL) became the winner in that contest.
It seems like the popular thinking is that tables are the main manifestation of relations so anything that stores data in tables must be relational and if data isn’t stored in tables, it can’t be relational. I think that is way too restrictive of a use of the term relational.
I think, too, that some of the early NoSQL platform providers have been at least partially responsible for the conflating of the terms and concepts of SQL and relational. Like the Cassandra community. Cassandra can’t do relational. Cassandra doesn’t support SQL (it has its own version of a query language, CQL). You read through Cassandra documentation and you see the two concepts intertwined. Prominently shown on one of DataStax’s web pages: “Relational databases (RDBMS) have struggled to keep up with the wave of modernization, leading to the rise of NoSQL as the most viable database option for online Web and mobile applications”. DataStax goes on to offer several pieces of collateral that will instruct you on how to “Implement a Migration Strategy from Relational to NoSQL”.
Why does that have to be a choice? Why do you need to choose between NoSQL and Relational? It doesn’t. You don’t.
A NoSQL database can be relational or, perhaps more precisely, a NoSQL DBMS can support a relational view of data and relational access of data. Some multi-model database management systems, such as FaunaDB, do exactly that. On the other hand, many NoSQL databases today support SQL-like declarative languages but they aren’t relational -- supporting SQL won't make a database relational.
FaunaDB is a document store, but it also supports using graph data modeling or taking a temporal view or, as we have said, treating data relationally. FaunaDB doesn’t store data in tables but it still provides you with the ability to create relationships within your data.
Here is a Fauna multi-model code snippet from our architecture web page which illustrates graph queries, multi-level joins, indexes, and temporality – all within a single query:
Try that with a polyglot persistence architecture.
Is there a need for Relational NoSQL? We believe there is -- we see it in our own customers and we hear it in our daily conversation with prospects. We have just as many of both who have tested and rejected Cassandra as who have tested and rejected Postgres. They wanted the performance and scalability of Cassandra but also needed the relational data interface of Postgres. In FaunaDB, they found a happy combination of performance, scalability and a relational data interface.
Relational NoSQL is not only a real thing, it is something you can have today in your database. The underlying storage format of the data isn’t what matters, it is how you can access it. Multi-model databases like FaunaDB provide different interfaces into the same data, letting application developers mix and match those interfaces, even within the same query. We will show you details of FaunaDB’s Relational NoSQL capability, including how it compares to Relational SQL, in upcoming papers and blogs.
If you enjoyed this topic and want to work on systems and challenges just like this, Fauna is hiring!