Performance is the elusive butterfly of API development. Everybody is intrigued with its beauty, yet few know how to capture it.
In the old days, the approach of many shops to ensure a performant API was to create some code and then pass it over to the wall to QA to do load testing. Later some integration testing took place. As long as the API worked and it was met some marginal performance benchmarks, things were good.
This worked well when a public, HTTP-based API, consumed by a wide variety of distributed devices was more the exception than the rule. However, today APIs are a big deal and they are everywhere, so much so that companies are posting very big infographics prominently on the front page of the New York Times to create even more awareness about the technology to the general public.
This is good news.
The rapid growth and increasing popularity of API use are causing a lot of companies to look inward, to take new views on API performance. Code, load test, and publish won’t do any longer. Companies are doing more. They are looking beyond the HTTP entry points.
Today the whole technical stack upon which an API sits is grist for the performance mill.
Look to the Data
One of the most interesting discoveries I’ve made when talking to people that publish large-scale APIs is how critical underlying data structures and data architecture is to the overall picture. Diamond DevOps is a company that does a lot of work on both sides of the API fence, consuming APIs and publishing them. I talked to one of the key technical people, Diego Woitasen (@DiegoWoitasen) co-found and tech-lead, about what he looks for when considering API performance. He came back with a two words, database indexes.
Diego’s take is that many times less experienced database developers will throw indexes on a database intended to speed reads without giving consideration to the impact on writes. To quote Diego:
We took an app from a client that we were to refactor, but in the meantime, we needed to keep the old app running. We discovered that there were 10 to 15 tables and more than 100 indexes. Indexes affect write performance and in this case, the app was used to collect data mostly. Using so many indexes was a really bad choice. You can add indexes for apps that have more read operations than write operations.
Separating read functionality from write functionality at the database level can be a critical design decision when it comes to API performance.
Using denormalization in order to separate read from write functionality proved to be a big win in terms of API performance for Dmytro Seredenko (@dseredenko) Senior Director of North American Business at EPAM Systems. According to Dmytro:
We had a requirement to expose aggregated data on visitors through the API, sliced in multiple dimensions. The underlying system was a reporting component (RDBMS) that was fed by the data from a Map-Reduce job. … it worked pretty slowly….
So we had to denormalize aggregated data stored in the Reporting RDBMS so the data could be queried quickly without complex joins. It (denormalizing) did increase the performance significantly. Since our API was read-only, we horizontally scaled RDMS through adding read-only nodes.
You can have lightning fast web servers in play up at the endpoints, but if you’re not getting the data you need, when you need it, your performance will suffer. Data architecture really does matter. However, data design is not the only consideration. Workflow process comes into play often.
It’s the Use Case
A common scenario in API usage is what I call, “a lot of state definition in, a lot of data back”.
In this type of situation, you have an API that requires you to submit a lot of information about the use case at hand. The API will do a boatload of processing on that information and return a lot of data back. I’ve experienced cases in the casting industry in which an agent will have to submit hundreds of actors for a given role and the API will have to process all of that information. Once processed, a lot of information about that submission is returned. The submission data is large, the processing is laborious, and the data return can be big too.
How to address this issue? To quote Dmytro Seredenko again, “It’s important to keep the dialog.”
Dmytro and others propose that in certain cases, it’s useful to segment processing via a number of API endpoints and to provide callback information when certain background processes complete.
Those of us that have posted a video for processing on the Internet are familiar with the pattern. You submit your video and then, once the upload is complete, the site will send you an email indicating your video is ready for viewing. Granted email notification is a pretty primitive way to transmit state information via a callback. But, it is consistent with the conversation pattern.
Typically as a site improves processing speed, email callback gets eliminated. But, getting an email is a far sight better than having a user sitting in front of a screen watching a spinning dial for tens of minutes on end.
Understanding the services your API is to deliver and figuring out how to design an architecture that segments processing into a series of dialog-like API calls will improve the overall performance of the API experience.
Still, what do you do in situations where you keep finding yourself submitting a lot of information to an API in order to get work done? This is where the notion of state caching can come into play.
Online shopping sites are essentially one big state machine. You have a lot of data in play — customers, inventory, shipments, payments, etc — all in various states of flux. Also, there are algorithms reacting to any and all state change. Online shopping can be an API performance nightmare, API all upon API call needed to select items to buy, make payment and then shipment.
The online retailer Nordstromrack.com | HauteLook is confronted with this state problem all the time. The way the company has dealt with the problem is to create a core design sensibility which all developers are to follow. Raj Murali (@rex_thuh_king ) Senior Manager of ERP Engineering at Nordstromrack.com | HauteLook, states this principle simply:
“The fastest API is one that has to do NOTHING.”
Raj and his team have devised a way in which a significant load of API work is done by background processes that store information in a distributed cache. In many cases, the work the API does is nothing more than checking the cache to determine the state of the given process. Also, their code takes full advantage of the HTTP response code standard. When a process is started via an API call, a 202 Accepted response code is returned. Later on, when an API call needs to know if a process is complete, a 200 OK response code is delivered.
Creating an API endpoint that has essentially one piece of fast, finite work goes a long way to improving API performance. Yes, there is a lot of management to be done on the backend. However, making your API endpoint essential allows you the flexibility to seek performance gains down in the stack. The more work your API has to do, and the more state it has to hold on the web server, will make it more brittle. A brittle API may be fast today and slow a week from now.
Putting it All Together
As I mentioned at the beginning of this article, there is a whole lot more to creating high-performance APIs than coding and load testing. Comprehensive design and analysis all the way through the stack, from the database to workflow process design, all the way up to HTTP access point, is critical. It’s a different way of thinking, a different perspective. There are the three fundamental takeaways to remember as we move forward.
First, give a lot of attention to how your API is writing and reading data. Be relentless in squeezing every bit of unnecessary work out of your data infrastructure. As we read above, be very careful about how you use indexes. Separate read databases from write database and synchronize data accordingly. Denormalize whenever possible. Make each of these things more efficient can add up to enough improvements in performance.
The second is to understand the use of your API as an aggregate of endpoints. Can you define relationships among your API endpoints that have a common semantic meaning? If so, can you make it so that your API endpoints can participate effectively and efficiently in a structured, self-enforcing conversation? Sometimes a lot of back and forth transmission between a publisher and a consumer can be more effective than one big, data-heavy interaction with a lot of processing burden.
The third is have your API get as close to doing nothing as is possible. If your application accesses a lot of global state information that is slow moving, can you make it so your API avoids the costly CPU utilization that comes with in-process calculation? Can you use background processes? Can you use a distributed cache to hold slow moving data that is global to all endpoints? Can you just make a simple call to another endpoint to get the information? Again, you want your API calls to be fast, without having to bear the burden of a lot of real time processing.
Consumers want information and services that are accurate and they want them fast. Thus, just to be in the game, your API needs to a level of performance that is very high.
Moving beyond the old school paradigm of code, load test, publish will open new doors in which performance is seen as an important feature of your API and not some after the fact consideration. Take a new perspective on API performance. Move beyond the endpoint perspective to one in which your entire system is really the API.
You’ll be happy you did. Your customers will be even happier.