Ah, the world of APIs! You sprinkle some magical powder on your application, made from the mashed bones of curl_*() functions, and it magically integrates with external systems, giving you access to all the interesting data in the world. During development you sometimes have to curse 400 Bad Request responses showing HTML instead of the requested Content-Type, but that's due to the code missing some parameter. However, once you deploy the code that performs an HTTP request, you're fine, right? Not really.
Once you're calling an external API from your production servers, your application becomes part of a distributed system. Unfortunately, this system has parts outside of your administrative and geographic control. In fact, applications have always run on a set of multiple machines, such as a web server and a database; however, the effects of that kind of integration are mitigated by:
- the machines being in the same LAN and datacenter. As such, the latency of TCP connections between them is orders of magnitude lower than that of connections to external systems.
- The software running on some of these machines being dependable and stable, not requiring updates more often than about once a year. Relational (and other kinds of) databases follow these principles, as do most infrastructure services (caches, monitoring).
- All the machines are under your control, so you can intervene on them when a fault comes up to restore service.
When integrating with an external service, the scenario is different:
- the actual nodes you're talking to may be in another continent.
- They may experience availability issues at any time (no one has 100% availability, and the number of 9s is low for many web services not owned by Google or Facebook).
- These issues are not under your control: you can't roll back external nodes, nor does their state depend on your deployments.
But are these dependencies a problem? Isn't the Facebook API supposed to be always available?
Kalzumeus is radical in suggesting that you should never, ever call an external system in the same process that is serving an HTTP request from the client. The reason is to avoid by design any latency and availability issues coming from external services. Suppose you're integrating with N APIs, each with an availability of 1-P: the probability of any single one of them being down or not responding at a given instant is P. Assuming the APIs fail independently, the probability that none of them is down at a given instant is (1-P)^N: since 1-P is less than 1, this probability shrinks rapidly as N increases, even for small values of P (the probability of failure).
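To get a feel for the numbers, here is a quick sketch of the formula above; the availability figure is illustrative, not taken from any real service.

```python
# Probability that at least one of N independent external APIs is down,
# given each has availability 1 - P. Illustrates the (1-P)^N formula
# from the text; the numbers are made up for the example.

def probability_any_down(p_failure: float, n_apis: int) -> float:
    """Return 1 - (1 - P)^N: the chance that at least one API is down."""
    return 1 - (1 - p_failure) ** n_apis

# Even with 99.9% availability per API (P = 0.001), integrating many of
# them makes a failure somewhere increasingly likely:
for n in (1, 10, 50):
    print(n, round(probability_any_down(0.001, n), 4))
```

With P = 0.001, the chance of at least one dependency being down grows from 0.1% for a single API to roughly 5% for fifty of them.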
So we have to accept that when we integrate with a large number of external services, some of them will always be down. But we can decide how to deal with it.
First of all, you have to distinguish between synchronous interactions and asynchronous ones. Synchronous interactions require a response to be provided to the client, while asynchronous ones are merely accepted by the system, to be processed later.
Which strategy to choose depends on the scenario - data retrieval from a store is usually synchronous, while notifications, mails to be sent, and updates to other eventually consistent stores may be asynchronous.
Requests that fit the asynchronous model should always be offloaded to a queue (ActiveMQ) or a job system (Gearman) that executes them in another process: since you don't need a response, you can perform them without hanging the HTTP client. The synchronization of the user with these jobs can be dealt with via a pull model (polling the system) or a push one (notification of completion).
The important advantage you gain from treating asynchronous requests like this is that their failure modes do not affect your application: a client never sees a blank screen because a mail could not be sent.
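The offloading described above can be sketched as follows; in production the queue would be ActiveMQ or Gearman and the worker a separate process, but here a thread and an in-memory queue stand in for them, and send_mail() is a hypothetical side effect.

```python
# Minimal sketch of offloading an asynchronous job to a queue consumed by
# a worker. queue.Queue and a thread stand in for a real broker/worker.
import queue
import threading

jobs: "queue.Queue" = queue.Queue()
sent = []

def send_mail(job: dict) -> None:
    # Stand-in for the real side effect (e.g. SMTP delivery).
    sent.append(job["to"])

def worker() -> None:
    while True:
        job = jobs.get()
        if job is None:        # sentinel: shut the worker down
            break
        send_mail(job)         # a failure here never reaches the HTTP client
        jobs.task_done()

def handle_http_request(recipient: str) -> str:
    # The web process only enqueues the job and returns immediately.
    jobs.put({"to": recipient})
    return "202 Accepted"

t = threading.Thread(target=worker)
t.start()
status = handle_http_request("user@example.com")
jobs.join()      # wait for the worker only to make the example deterministic
jobs.put(None)
t.join()
print(status, sent)
```

Note that the HTTP handler's response does not depend on whether send_mail() ever succeeds; that is exactly the decoupling being argued for.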
Moreover, the latency of the initial interactions is decreased as you can return immediately after a job has been created. This is especially important as HTTP requests get nested into each other:
[A] -> [B] -> [C]
[A] <- [B] <- [C]
  200 OK   200 OK
[A] -> [B]
[A] <- [B]
  202 Accepted
        [B] -> [C]
        [B] <- [C]
          202 Accepted
This store-and-forward model is typical of messaging systems, but that's not to say it can't be implemented with HTTP requests returning 202 Accepted: a pattern is not its implementation. A retry mechanism has to be set up both on the initiating side (in case the receiving side is unreachable) and on the receiving one (to always accept incoming requests and guarantee availability, closing the connection immediately while remaining able to deal with the requests later).
Finally, a hybrid model for asynchronous requests I've seen used in production systems is to perform the first request in-process, but set up a retry mechanism in case it fails. This choice lets you write synchronous end-to-end tests without many waiting conditions; at the same time, it preserves the capability of repeating requests until completion, guaranteeing that external systems that go up and down over time will eventually be reached.
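The hybrid model can be sketched as follows; call_api and the retry queue are hypothetical stand-ins for a real API client and a real job system.

```python
# Sketch of the hybrid model: attempt the external call in-process first,
# and fall back to a retry queue only when it fails.
retry_queue = []

def call_with_fallback(call_api, payload):
    """Return True if the call succeeded synchronously, False if it was
    deferred to the retry queue for a background worker to repeat."""
    try:
        call_api(payload)
        return True                  # synchronous path: easy to test end2end
    except ConnectionError:
        retry_queue.append(payload)  # a background worker retries it later
        return False

def healthy_api(payload):
    pass

def down_api(payload):
    raise ConnectionError("external system is down")

print(call_with_fallback(healthy_api, {"id": 1}))  # succeeds in-process
print(call_with_fallback(down_api, {"id": 2}))     # deferred for retry
```

In the happy path the test sees the effect immediately, with no polling; only the failure path needs the eventual-delivery machinery.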
No one has perfect availability, and the more external systems are involved in a use case of your application, the more their failures are prone to propagate to it. Decoupling the application from external systems is often possible, with the help of queues, workers running in separate processes, and a store-and-forward approach to nested interactions.