Aside from APIs, there are a good number of scenarios in which all you want to do is make a routing decision based on an HTTP header, and nothing else. There's the "are you a bot, bro?" scenario, in which you're trying to weed out bot-driven requests, which requires inspection of the HTTP User-Agent header. There's the "I am sitting in front of three hosts, which one do you want?" (a.k.a. virtual hosting) scenario, in which you want to direct ingress traffic to a specific host by inspecting, well, the HTTP Host header. And there's the "which version of this API did you want?" scenario, based on the request URI.


In these scenarios, you really only need to inspect a few HTTP headers (L7) – and standard ones at that. The concept of ingress controllers in Kubernetes is largely based on this notion, in which HTTP header values (URI or host) are used to make sure requests are routed to the appropriate service. The assumption is you don’t need to do much else with the HTTP payload at the proxy. You might need to insert X-Forwarded-For or some other custom HTTP header, but you don’t need to alter existing headers (like rewrite the URI) or inspect anything in the actual payload.
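The ingress-style routing described above can be sketched in a few lines. This is a minimal illustration, not any particular controller's implementation; the hostnames, path prefixes, and service names are all hypothetical.

```python
# Hypothetical routing table: (Host header, URI prefix) -> backend service.
ROUTES = {
    ("api.example.com", "/v1"): "api-v1-service",
    ("api.example.com", "/v2"): "api-v2-service",
    ("www.example.com", "/"):   "web-service",
}

def route(host: str, path: str) -> str:
    """Return the backend for the longest matching (host, prefix) rule."""
    best = None
    for (rule_host, prefix), backend in ROUTES.items():
        if host == rule_host and path.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, backend)
    return best[1] if best else "default-service"
```

Note that the decision touches only two values — Host and the URI — which is exactly why this class of routing can stay fast.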

And if it’s an API you’re routing – whether on URI or host – you probably don’t want to slow it down by even the nominal sub-second latency that L7 adds. Because yes, no matter how slight, L7 inspection introduces latency. HTTP is text-based, which means it has to be parsed and then examined, and that takes CPU cycles. Moore’s Law has made this fast, but constraining inspection to a known set of strings is still going to be faster.

The premise is this: only inspect a subset of headers (the standard ones). This means you get application routing capabilities with really fast performance because the system knows exactly what it’s looking for. 
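As a rough sketch of that premise, the proxy can scan the raw request for a fixed set of standard header names and ignore everything else. The header set below matches the ones named in this article; the parsing here is deliberately simplified.

```python
# Only the standard headers we route on; everything else is skipped.
WANTED = {b"host", b"user-agent"}

def extract_wanted(raw: bytes) -> dict:
    """Pull only the routing-relevant headers from a raw HTTP request."""
    headers = {}
    # The header block ends at the first blank CRLF line.
    head = raw.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line itself
        name, _, value = line.partition(b":")
        name = name.strip().lower()
        if name in WANTED:
            headers[name.decode()] = value.strip().decode()
    return headers
```

Because the system knows exactly which strings it is looking for, it never has to fully parse or normalize the rest of the payload.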

The "fast" comes from the ability to process requests on a per-packet basis. This is why layer 4 (TCP) load balancing is incredibly fast and scalable: it doesn’t have to buffer packets in order to reconstruct the entire HTTP request before it can inspect it and make a decision. When layer 7 (HTTP) is in play – such as for app routing, or situations in which the request (or response) may need modification (data scrubbing, for example) – you need a full proxy capable of buffering packets until a full HTTP request (or response) is received.
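To make the buffering cost concrete, here is a toy sketch of what a full proxy has to do before it can even look at the headers: accumulate TCP segments until the blank line that ends the HTTP header block shows up. (Real proxies handle this with far more care; this only illustrates the "wait for the whole thing" step.)

```python
def buffer_until_complete(fragments) -> bytes:
    """Concatenate arriving fragments until the header block is complete."""
    buf = b""
    for frag in fragments:
        buf += frag
        if b"\r\n\r\n" in buf:  # blank CRLF line = end of HTTP headers
            return buf
    raise ValueError("connection closed before headers completed")
```

An L4 device never pays this cost, because it forwards each packet as it arrives.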

Now, you can do a lot of interesting, creative things with that, and a lot of folks do. It’s incredibly powerful to be able to intercept HTTP requests and responses in flight and execute security or performance-enhancing functions on it. But sometimes you just want to route requests based on one of a few, standard HTTP headers like User-Agent or Host or the URI.

Assuming these requests are contained in a single packet (less than 1500 bytes), a “fast” version of app routing with a proxy can be applied. This mode of a proxy combines the speed and scale of L4 load balancing with the intelligence of L7 to give you speed, scale, and smarts. That means quick parsing and decisions that consume less time and resources on the proxy, which translates to better performance of the request and response.
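The single-packet assumption gives a simple eligibility check for the fast path: if the first segment already contains the end of the header block, the proxy can route immediately without buffering anything further. The payload-size constant below is an assumption based on a typical 1500-byte Ethernet MTU minus TCP/IP overhead.

```python
MTU_PAYLOAD = 1460  # rough TCP payload of a 1500-byte frame (assumption)

def fast_path_eligible(first_segment: bytes) -> bool:
    """True if the whole request arrived in one segment and can be
    routed per-packet, without further buffering."""
    return (b"\r\n\r\n" in first_segment
            and len(first_segment) <= MTU_PAYLOAD)
```

Requests that fail the check simply fall back to the full-proxy path described earlier.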

This use of app routing + load balancing is most useful in environments where you don’t have a lot of public IP space to waste and want to use it judiciously. This architectural approach means you can use just one public IP address for many different hosts. You can extend this with another tier of app routing, say an ingress controller in a Kubernetes cluster, or use another app-aware proxy to further weed out traffic (do some sharding, or perhaps some app security inspection). It’s also very helpful in routing APIs, where, again, you’re sorting out versions or service invocations based on a URI and forwarding them to another tier of services that performs deeper inspection or more environment-specific scale.

Now, just about every proxy can perform app routing, particularly based on HTTP headers. But not every proxy can do it fast. If speed and scale are important, make sure to ask if your proxy of choice can operate in a “fast” HTTP mode.