I was chatting with Brian Blignaut last week after the Equal Experts NoSQL event and he made an interesting observation that in this age of Polyglot Persistence we often rule out the relational database.
I think it’s definitely better that we now have many different options for where we store our data – be it as key/value pairs, documents or as a network/graph.
Having these options forces us to think more about how we’re going to read/write data in our application whereas previously our effort was focused around which tables we were going to pull out.
Having said that, I realised I’d fallen into the trap that Brian was referring to when thinking through how we could model the energy plans that users were selecting from a results table.
We wanted to run aggregate queries over the data to work out the most popular plans and suppliers on different days and then across different customer segments.
We represented each user selection as an event, stored as a document in MongoDB which worked fine to start with but queries started to become much slower as the number of documents approached 500,000 or so.
I started thinking about whether we’d be better off storing the data in a different store which was better optimised for the types of queries that we wanted to run.
My initial thought was to use one of the flashier ‘NoSQL’ databases but Ashok pointed out that the use case – slicing/dicing data at a scale (< 1 m rows) where everything would fit on disc - was perfect for something likePostgreSQL.
I realised that he was absolutely right and I think it’s important to remember that in future – when we talk about choosing the appropriate data store for our problem that doesn’t mean we have to rule out the relational database even though it’s probably more fun to use another store.