Don't Throw Away Your Old Java Web Framework: The SPI History of Twitter
Learn how Twitter's conservative architectural revolution of 2012 transformed the site from a client-centric SPI into a server-centric, more SEO-compatible architecture.
Twitter.com is one of the most popular websites in the world, but few people know that it is also one of the few Single Page Interface, stateless, SEO-compatible websites in the world.
It is "stateless" in the sense that Twitter's servers hold no information about the state of the user's page being loaded (i.e., no web session data). This allows requests to arrive at any node in a server cluster without session sharing or server affinity. Looking at the AJAX requests, Twitter sends an id representing the temporary state of the user's page, effectively saying "the previous items are already loaded; give me the new ones."
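The stateless cursor idea can be sketched as follows. This is a minimal illustration, not Twitter's actual API: the endpoint and the `since_id` parameter name are assumptions made for the example.

```javascript
// Hypothetical sketch of a stateless "give me what's newer" request.
// The endpoint and parameter name are illustrative, not Twitter's real API.
function buildTimelineUrl(baseUrl, sinceId) {
  // The client sends the id of the newest item it has already rendered;
  // any server node can answer without consulting session state.
  const url = new URL(baseUrl);
  if (sinceId) url.searchParams.set("since_id", sinceId);
  return url.toString();
}

// The page itself remembers the last id it rendered and passes it along:
const next = buildTimelineUrl("https://example.com/timeline.json", "1234");
// next === "https://example.com/timeline.json?since_id=1234"
```

Because the cursor travels with every request, the server cluster needs no sticky sessions: the response depends only on the request itself.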
As we will see later, this SPI approach is server-centric, or hybrid (even though it involves plenty of client-side programming), but Twitter did not arrive at the current implementation on its first attempt. There was previously a client-centric SPI implementation.
First version: client-centric
We all know Twitter's REST API, which returns user activity data in JSON format. This API was very popular among alternative Twitter clients until the company introduced limitations that harmed the popularity of these readers. By then the Twitter website itself was a consumer of its own REST API, so the browser was effectively another Twitter client for logged-in users...
The label "client-centric" means that HTML is rendered from data on the client. Where and when HTML is rendered from server data is the central architectural decision of a web application; in this case, it happens in the client.
Hashbangs are also SEO compatible because Google has been supporting them for many years.
When Google sees: http://twitter.com/#!jmarranz
Google will load: http://twitter.com/?_escaped_fragment_=jmarranz
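The rewriting rule can be sketched as a pure function. This reflects Google's old AJAX-crawling scheme (since deprecated, shown here for historical illustration); the function itself is a simplified sketch, not production code.

```javascript
// Rewrite a hashbang URL into the "_escaped_fragment_" form that
// Google's (now deprecated) AJAX-crawling scheme would fetch instead.
function toEscapedFragment(url) {
  const [base, fragment] = url.split("#!");
  if (fragment === undefined) return url; // no hashbang, nothing to rewrite
  const sep = base.includes("?") ? "&" : "?";
  return base + sep + "_escaped_fragment_=" + encodeURIComponent(fragment);
}

toEscapedFragment("http://twitter.com/#!jmarranz");
// → "http://twitter.com/?_escaped_fragment_=jmarranz"
```

The server was expected to answer the `_escaped_fragment_` URL with a static HTML snapshot of what the JavaScript would have rendered — the "dual model" this article mentions later.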
Second (and current) version: server-centric (or hybrid)
In early 2012 a change occurred in Twitter's web engineering that could be described as a conservative revolution, an apparent return to pages, to the first pre-SPI Twitter website. A retro-revolution led by Dan Webb, principal engineer at Twitter.
At the same time one of the key developers of the client-centric SPI at Twitter left the company for a startup.
Dan Webb seemed to be an avowed enemy of the Single Page Interface, judging by an article on his blog against hashbangs, a cornerstone technique for providing SPI, bookmarking, and SEO compatibility in any browser:
At the end of the Twitter blog entry, there seems to be light and hope for SPI:
"What’s next? We’re currently rolling out this new architecture across the site. Once our pages are running on this new foundation, we will do more to further improve performance. For example, we will implement the History API to allow partial page reloads in browsers that support it, and begin to overhaul the server side of the application."
The key words are: "History API".
I myself was alarmed to read that hashbangs were under attack from the principal engineer of Twitter, one of the major drivers of the SPI approach, and I tried to "dissuade" Dan:
I was crazy enough to make this proposal:
"In JSON and AJAX requests, avoid your own REST API in server, render your page chunks, and inject the markup into the page with inner HTML as much as possible"
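A minimal sketch of that proposal, assuming a hypothetical `/fragments/timeline` endpoint that returns server-rendered markup; the endpoint path, header, and element id are illustrative, not Twitter's actual implementation:

```javascript
// Sketch of the proposal: fetch a server-rendered HTML fragment and inject
// it with innerHTML, instead of fetching JSON and templating on the client.
// The endpoint path and header are hypothetical.
async function loadFragment(container, path) {
  const response = await fetch(path, {
    headers: { "X-Requested-With": "XMLHttpRequest" } // mark as a partial request
  });
  container.innerHTML = await response.text(); // markup already rendered server-side
}

// Usage in a browser:
// loadFragment(document.getElementById("timeline"), "/fragments/timeline");
```

The point of the design is that the server's template processor stays the single place where HTML is produced, for both full pages and partial updates.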
Dan Webb's response suggests he is open to some partial updates via AJAX, but not to hashbangs; the History API is proposed instead (not available in IE 6-8).
The last tweet says:
"we made our perf decisions based on data. It's not about liking or not liking a technique. It's about what we prove is fastest."
The current server-centric (or hybrid) SPI approach of Twitter.com
The main motivation behind the new hybrid architecture was performance:
"That architecture broke new ground by offering a number of advantages over a more traditional approach, but it lacked support for various optimizations available only on the server"
- Any publicly loaded page is initially the same for all users, logged in or not (including bots). This puts an end to the dual-site model for SEO support.
- It follows a Single Page Interface approach but avoids hashbangs; the History API is used instead. The History API is not available in older AJAX-capable browsers such as IE 6-8. In these minority browsers, it is accepted that navigation degrades to conventional full page loads.
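The History-API-with-fallback behavior described above can be sketched like this. It is a simplified illustration: `win` and `loadPartial` are passed in explicitly so the logic stays testable; in a real page you would pass the global `window` and your own partial-update routine.

```javascript
// Sketch: History-API navigation with a graceful fallback for browsers
// without pushState (e.g. IE 6-8), which simply do a full page load.
function navigate(win, url, loadPartial) {
  if (win.history && typeof win.history.pushState === "function") {
    win.history.pushState({}, "", url); // real URL in the address bar, no hashbang
    loadPartial(url);                   // fetch and inject only the changed region
  } else {
    win.location.href = url;            // older browsers: plain full-page navigation
  }
}
```

Because the URL in the address bar is a real server URL either way, the same address works for SPI navigation, bookmarking, and crawlers alike.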
What is the point of this article for a Java (or general backend) developer?
Well, quite a lot, considering the current trend toward 100% client-centric applications that access the server through REST APIs returning JSON data.
Therefore, don't throw away your old web framework, especially your template processor; you may need it again :)