Glipho: Under the hood of the social blogging startup
Glipho is a social blogging startup, founded in 2012. As an unrestricted platform, Glipho has a mix of registered users and non-registered users, many of whom regularly return without signing up. In October, the site had close to 70,000 unique visitors to the web version of the site alone.
I talked to Glipho CTO James Toyer about the technology driving the social blogging platform.
Under the hood of Glipho is the document database MongoDB. When asked why Glipho rejected a more traditional solution, Toyer says part of it comes from experience. At a previous company, he says, they had always used a SQL solution of some kind and, while MySQL solved some of the problems they were having with SQL Server, “it still didn't get away from the fact that ultimately creating and delivering content is document based”.
Every post on Glipho has a set of fields that are specific to displaying that piece. Toyer says he knows from past experience that with a relational solution you end up writing mammoth SQL queries to load a simple article. If you needed to add a field with MySQL, he says, you would have to introduce a column in a table -- requiring either downtime or planning far in advance.
For Glipho, it just made sense to go with something that had a flexible schema and was a better solution for their needs.
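The flexible-schema point can be illustrated with a minimal sketch. It uses plain Python dictionaries as stand-ins for documents; the field names (`title`, `body`, `tags`, `cover_image`) are hypothetical and are not Glipho's actual schema.

```python
# Hypothetical post documents; the field names are illustrative only.
posts = [
    {"_id": 1, "title": "Hello world", "body": "First post", "tags": ["intro"]},
    # A later post carries a new field. Earlier documents are left untouched,
    # so no table-wide migration is needed.
    {"_id": 2, "title": "Second post", "body": "More text", "tags": [],
     "cover_image": "cover.png"},
]


def render_summary(post):
    """Build a one-line summary, tolerating documents that lack newer fields."""
    cover = post.get("cover_image", "no cover")
    return f"{post['title']} ({cover})"


summaries = [render_summary(post) for post in posts]
```

The trade-off is that the application code, rather than the database, has to cope with documents that do not have the newer field, which is what the `post.get(...)` default does here.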
But why MongoDB over the alternatives? All application code for Glipho is written in C#, so why didn’t they go with RavenDB?
When people learn that Glipho’s application code is written in C#, there are inevitably questions like "Shouldn't you be using SQL Server or RavenDB?" But Toyer says he doesn’t see why it should surprise people. MongoDB is one of the most used NoSQL databases, with db-engines.com showing a continued upward trend in popularity. And its popularity is one of the reasons Glipho chose to use MongoDB over other solutions.
The decision for the database behind Glipho was between MongoDB, RavenDB and CouchDB. Toyer says they discounted the leading graph database Neo4j early on because he felt they needed a document store rather than a graph store -- though he feels there is some “cool stuff” Glipho could do with graphing their data in the future.
Toyer also says Cassandra was discounted early on for being too big and too distributed for what Glipho needed, at least at the beginning.
While Toyer says he had heard good things about the scalability of databases like MongoDB, he wasn’t sold on any particular one of them -- so it came down to a three-way contest between MongoDB, RavenDB and CouchDB.
For Glipho, the three databases were judged on several factors including popularity, documentation, .Net support and community. Since Glipho would be using Amazon Web Services (AWS), Toyer was also looking for examples of each database being run on AWS.
While Toyer admits that “most popular” does not necessarily mean the same thing as “best”, he considers the ease of use of MongoDB to be a main driver behind its popularity. Toyer also says that because MongoDB is popular it has an active and supportive community behind it.
Another area where MongoDB won out over RavenDB and CouchDB was the high-profile examples of people using it on AWS, such as Foursquare (though Foursquare have since moved nearly all of their data from AWS to their own datacenters).
Not that it has been all plain sailing for Glipho and MongoDB, with Toyer admitting there have been issues with performance that he says “came down to a combination of my own naivety and running on AWS”.
By default, MongoDB lets you run queries that have no supporting index. While Toyer admits this is fine when you run it locally on an SSD, or when you have very little data, it becomes a problem when you've got more than a few thousand documents in a collection.
Toyer says: “Every time you query it has to scan the disk, which is quite inefficient. This is made worse by the EBS (Elastic Block Store) used on AWS instances. This effectively runs over a network which has variable latency. This can mean it becomes very slow to run queries, which ultimately slows down your site.”
Glipho have since spent a lot of time going over queries to make them more efficient, and ensuring there are indexes on the queries they run. For more background on the issue and improvements made, you can see a presentation James Toyer made to the London MongoDB User Group on the importance of indexes here: http://glipho.com/james/the-importances-of-indexes-in-mongodb.
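To make the cost of an unindexed query concrete, here is a minimal sketch in plain Python rather than against a real MongoDB server. The collection, the `author` field and the helper names are all hypothetical, and a dictionary stands in for the index structure a database would actually maintain. It counts how many documents each approach has to examine.

```python
from collections import defaultdict

# Hypothetical collection: 10,000 posts spread evenly over 100 authors.
collection = [
    {"_id": i, "author": f"user{i % 100}", "title": f"Post {i}"}
    for i in range(10_000)
]


def find_by_scan(docs, author):
    """Unindexed query: every document is examined."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc["author"] == author:
            matches.append(doc)
    return matches, examined


def build_index(docs, field):
    """Map each value of `field` to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc[field]].append(position)
    return index


def find_by_index(docs, index, value):
    """Indexed query: only the matching documents are touched."""
    positions = index.get(value, [])
    return [docs[p] for p in positions], len(positions)


author_index = build_index(collection, "author")
scan_matches, scan_examined = find_by_scan(collection, "user7")
index_matches, index_examined = find_by_index(collection, author_index, "user7")
```

Both calls return the same 100 posts, but the scan examines all 10,000 documents while the indexed lookup touches only the 100 that match. On network-backed storage with variable latency, each extra document read is where the slowdown Toyer describes would come from.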
Glipho use Redis and DynamoDB to sync data across web servers, a choice dictated by the technologies used to write their application code as well as by their use of AWS.
Web frameworks, like ASP.Net, store data related to your session on the server in order to optimise your browsing experience. ASP.Net's session store is written in such a way that you can change the "provider" or store type in one place without having to change your code. The default providers are "in-memory" and SQL Server, but neither was a suitable option for Glipho. However, AWS offer a provider for DynamoDB -- which costs dramatically less for a startup like Glipho than running an instance of SQL Server.
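The provider idea can be sketched in a few lines. This is not ASP.Net's API or the AWS DynamoDB provider; the class names (`InMemoryStore`, `SharedStore`, `WebServer`) are hypothetical, and a shared dictionary stands in for an external store. The sketch shows one plausible reason an in-memory store is a poor fit once a site runs on more than one web server.

```python
class InMemoryStore:
    """Session data held inside a single web server's own memory."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id)

    def set(self, session_id, data):
        self._data[session_id] = data


class SharedStore:
    """Stand-in for an external store that every web server can reach."""

    def __init__(self, backing):
        self._backing = backing  # one dict shared by all servers

    def get(self, session_id):
        return self._backing.get(session_id)

    def set(self, session_id, data):
        self._backing[session_id] = data


class WebServer:
    """Application code depends only on get/set, not on the store type."""

    def __init__(self, store):
        self.store = store

    def login(self, session_id, user):
        self.store.set(session_id, {"user": user})

    def current_user(self, session_id):
        session = self.store.get(session_id)
        return session["user"] if session else None


# In-memory provider: each server has its own private store.
a, b = WebServer(InMemoryStore()), WebServer(InMemoryStore())
a.login("s1", "james")
seen_in_memory = b.current_user("s1")

# Shared provider: swap the store in one place, WebServer is unchanged.
backing = {}
c, d = WebServer(SharedStore(backing)), WebServer(SharedStore(backing))
c.login("s1", "james")
seen_shared = d.current_user("s1")
```

With the in-memory provider the second server knows nothing about the session; with the shared provider it does, and the `WebServer` code did not have to change.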
When it comes to Redis, Glipho use SignalR (Socket.IO is a popular Node.js equivalent) to push notifications to users instantly. As Glipho have multiple servers for the site, a user could open two tabs but be connected to different servers. To keep all tabs in sync, the servers need to be kept in sync with each other. Toyer explains that SignalR has a library you can drop in that takes advantage of the "Pub/Sub" feature of Redis. In the future, Toyer hopes to go one step further in improving performance by moving everything from DynamoDB onto Redis, ensuring everything is on the one server.
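The two-tabs scenario can be sketched with an in-process publish/subscribe broker. This is a simplified model, not SignalR or the Redis client API: `Broker` is a hypothetical stand-in for a Redis Pub/Sub channel, and `NotificationServer` is a hypothetical stand-in for one web server holding open connections.

```python
class Broker:
    """In-process stand-in for a Pub/Sub channel shared by all servers."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, channel, callback):
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        for callback in self._subscribers.get(channel, []):
            callback(message)


class NotificationServer:
    """One web server holding the browser tabs connected to it."""

    def __init__(self, broker):
        self.broker = broker
        self.tabs = {}  # tab id -> {"user": ..., "inbox": [...]}
        broker.subscribe("notifications", self._deliver)

    def connect_tab(self, tab_id, user):
        self.tabs[tab_id] = {"user": user, "inbox": []}

    def notify(self, user, text):
        # Publish instead of delivering locally, so every server hears it.
        self.broker.publish("notifications", {"user": user, "text": text})

    def _deliver(self, message):
        # Each server forwards the message to its own matching tabs.
        for tab in self.tabs.values():
            if tab["user"] == message["user"]:
                tab["inbox"].append(message["text"])


broker = Broker()
server_a = NotificationServer(broker)
server_b = NotificationServer(broker)

# The same user has one tab on each server.
server_a.connect_tab("tab-1", "james")
server_b.connect_tab("tab-2", "james")

# A notification raised on server A reaches the tab on server B as well.
server_a.notify("james", "New comment on your post")
```

Because every server subscribes to the same channel, it does not matter which server raises the notification: both tabs receive it.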
Handlebars was chosen as the templating library for HTML in Glipho’s client-side code. Backbone, through its dependency Underscore.js, has templating built in, but Toyer says that because it uses a lot of angle brackets and percentage signs it reminds him of “the pre MVC days in ASP.Net” and has given him “an irrational hatred of any templating language that uses them”. Toyer describes Handlebars.js as being “a slightly more fully featured version of Mustache.js” and values the fact that it can be precompiled to reduce the payload going to the client.
Toyer says: “I feel like the world of client side web applications is changing so fast at the moment it's almost mindblowing. One day you think you've got your head around a direction you want to use, the next it's all changed and that's before you even looked outside the frameworks and libraries you're using.”
Since the decision to use Backbone.js and Handlebars.js was made, Toyer says he feels that Angular.js has taken off as a more fully-featured framework, and says he looks forward to replacing Glipho’s Backbone and Handlebars code with Angular.js.