Improving Web Performance With Varnish

Table of Contents

Introduction to Varnish - Performance matters
Getting started with Varnish
HTTP and built-in VCL: How cache control boosts performance
Varnish Configuration Language (VCL): Control and flexibility for customization
Cache invalidation: Granular control with purging and banning
Manage your backends: Health, directors, and grace
Cache hit or miss? Improve your hit rate to boost performance
Monitor with logging, measuring, and debugging
Better performance — Better security

Section 1

Introduction to Varnish - Performance matters

Web performance is globally cited by companies, across industries, as one of the biggest challenges in delivering content. For an optimal user experience, your websites and apps have to work fast, at any scale and with resilience. This is where caching technology comes into play.

Varnish Cache, at its most basic, is a reverse caching proxy, but is much more than that. It’s a piece of software that stands in front of your web server(s) to reduce the load times of your website, application or API by caching the server’s output. It also protects your origin server from being overloaded by too many requests, ensuring uptime and stability. We’ll be talking about the building blocks of web performance and how to ensure that performance is both speedy (how long does it take to load the page?) and stable (is performance stable when load increases?).

Without a cache, the origin server generates the response from scratch for every client request, wasting time and resources re-computing identical data. Caching is not a trick to compensate for poorly performing systems; caching is an architectural decision that, properly implemented, will increase efficiency and reduce infrastructure cost.

Section 2

Getting started with Varnish

Now that you know what Varnish can do to help combat web performance problems, how do you get started with a basic Varnish installation?

Install and verify Varnish

Most real-world installations will be on a Linux system. Supported Linux distributions include Ubuntu, Debian, Red Hat, and CentOS. Varnish can be installed easily from the package manager of your operating system, but you can also compile Varnish from source if you wish. Varnish is also available from the AWS Marketplace.

There will be slight differences in your installation depending on your OS and distribution*. Once you have finished installing Varnish, verify your Varnish version by running varnishd -V.

Configure Varnish

Next you will configure your settings so you can really start using Varnish. These startup options are located in the configuration file and are assigned to the varnishd program.

Common Startup Configurations

Getting started with configuring Varnish involves only a few important options (more advanced ones that help boost performance and cache control come later):

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,3g"

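These options set:

-a: the network binding, the address and port where Varnish accepts incoming HTTP requests
-T: the CLI address binding, the address and port where the Varnish CLI runs
-f: the VCL file location, where caching policies live
-S: the security option, here the file holding the shared secret that authenticates CLI access
-s: the storage option, here 3 GB of memory-backed (malloc) storage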

Add Backends

Varnish needs to know which backends it can talk to. Open the default.vcl file, which comes with your Varnish installation, and define the backend(s) you want Varnish to cache from.

*Very thorough, step-by-step instructions on distributions, installations and configurations are detailed in both the Varnish Book and the Getting started with Varnish Cache book from O’Reilly.

Section 3

HTTP and built-in VCL: How cache control boosts performance

Varnish does HTTP - and only HTTP. For that reason, it is important that you be familiar with HTTP and how it behaves. Varnish is an HTTP accelerator and uses HTTP best practices to decide what gets cached, how it gets cached, and for how long it remains cached. With built-in VCL and caching rules, you see performance gains because you get more control over your cache. That control improves not only speed but also other aspects of performance, such as cache freshness.

When you install Varnish, it comes with two VCL files:

builtin.vcl: the default behavior of Varnish and a basic configuration for caching. If you don’t apply any changes to it, Varnish will cache your website using the rules set in that VCL file. These rules are a set of best practices for HTTP caching, for example, do not cache if the request has an Authorization or a Cookie header:
if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
}
default.vcl: the VCL file you want to modify to implement the preferred Varnish behavior. Open your favorite editor and edit /etc/varnish/default.vcl.

Worth noting: the built-in VCL is, by default, appended to your default.vcl file, so each built-in subroutine runs after your own subroutine of the same name. You can avoid this by issuing a return(action) in your own subroutine.
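
As a minimal sketch of that behavior, the following default.vcl returns from vcl_recv for one class of requests, so the built-in vcl_recv (including its cookie check) never runs for them. The /static/ path prefix and the backend address are assumptions made for illustration.

```vcl
vcl 4.0;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Hypothetical rule: treat GET/HEAD requests under /static/ as cookie-independent.
    if (req.url ~ "^/static/" && (req.method == "GET" || req.method == "HEAD")) {
        unset req.http.Cookie;
        # Explicit return: the built-in vcl_recv is skipped for this request.
        return (hash);
    }
    # No return here: execution falls through to the built-in vcl_recv.
}
```

Every request that does not match the condition still gets the built-in safety checks, which is usually what you want.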

Once installed and configured, Varnish can already do a lot for you because of the aforementioned default behavior expressed by the built-in Varnish Configuration Language (VCL) that dictates the set of rules that Varnish follows. These rules cover what is cacheable, when to bypass the cache, how to identify an object (and cache it or not), what to do when an object is not stored in cache, and TTL considerations.

Five ways Varnish respects HTTP best practices:

Idempotence: By default, Varnish only caches resources that are requested through an idempotent HTTP verb, that is, an HTTP verb that does not change the state of the resource (GET and HEAD requests).
State: State is user-specific data carried in the request, such as authorization headers and cookies. Because responses to such requests may differ per user, Varnish does not cache them by default.
Expiration: Expiration is all about setting a time-to-live (TTL), as cached objects cannot live forever, for both storage limitation and “freshness” reasons, i.e. keeping the cache up-to-date. HTTP has two different kinds of response headers that handle expiration: Expires and Cache-Control. Note: Other headers, such as Age, can provide some indication of the freshness of an object as well.
Conditional requests: After cached items expire, the headers and the payload are transmitted again and stored in cache, which could be resource intensive, especially if the requested data has not changed. HTTP allows you to keep track of the validity of a resource beyond relying on TTL limits and have more control over what is kept in cache. Conditional requests rely on the ETag and Last-Modified response headers, whose values are sent back in the If-None-Match and If-Modified-Since request headers.
Cache variations: HTTP uses the Vary header to perform cache variations. A very common example is language detection based on the Accept-Language request header. The cache needs this level of instruction to understand differences in requests. The cache stores the object based on the first request, so if the first request was made in one language (e.g. French), all other users thereafter being served content from cache would receive the content in French for the duration of the cache lifetime. The Vary header instructs the cache to keep a separate version of the cached object based on the Accept-Language value of the request.
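
Cache variations work best when the varied request header has few distinct values. The following is a hedged sketch, assuming the backend responds with Vary: Accept-Language and the site serves only French and English (both assumptions, as is the backend address). Normalizing the header in vcl_recv keeps the number of stored variants at two.

```vcl
vcl 4.0;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Collapse the many possible Accept-Language values into two,
    # so the cache stores at most two variants per URL.
    if (req.http.Accept-Language ~ "^fr") {
        set req.http.Accept-Language = "fr";
    } else {
        # Hypothetical fallback language.
        set req.http.Accept-Language = "en";
    }
}
```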

Section 4

Varnish Configuration Language (VCL): Control and flexibility for customization

While many reverse caching proxies exist, Varnish Cache is known for its unusual flexibility, made possible by the Varnish Configuration Language (VCL), a domain-specific language that lets the user control Varnish's behavior. VCL will allow you to hook into the finite state machine of Varnish to programmatically extend its behavior with various subroutines, objects, and variables.

But VCL is more than expressing and controlling the behavior of the software. Because VCL exposes a rich API through its objects, you can tune Varnish to a very fine level of detail.

Very simple VCL example

With a few lines of VCL you can add an X-Cache response header whose value says whether a piece of content was served from cache or not.

sub vcl_deliver {
        if (obj.hits > 0) {
                set resp.http.X-Cache = "HIT";
        } else {
                set resp.http.X-Cache = "MISS";
        }
}

vmod-http example

Using a VMOD, a Varnish module, the behavior and set of features of Varnish can be extended.

In this example we send a request to another service and copy the cookie from its response into the response sent to the client. This is the starting point for implementing your authentication and authorization mechanism within Varnish using pure VCL.

vcl 4.0;

import http;

sub vcl_recv
{
    http.init(0);

    http.req_copy_headers(0);
    http.req_set_url(0, "https://example.com/api/authorize");

    // Send the request, we will read the response later
    http.req_send(0);
}

sub vcl_deliver
{
    // Block for the response
    http.resp_wait(0);

    if (http.resp_get_status(0) == 200) {
        set resp.http.Set-Cookie = http.resp_get_header(0, "Set-Cookie", "session=ERROR");
    } else {
        return(synth(401, "UNAUTHORIZED"));
    }
}

Section 5

Cache invalidation: Granular control with purging and banning

You now know that you can cache just about everything. With that granular control over your cache, though, you should put solid cache invalidation strategies in place to ensure that objects do not live in your cache for too long. Why does this matter so much? This is the point at which you start to think about the big picture for the end-user: keeping data in cache for too long means that you are not always offering your end-user the most up-to-date data available. When you run a news or e-commerce website, on-the-fly updates are crucial. Performance from the end-user’s viewpoint is about receiving the freshest information as well as the speed at which they can access it.

Your first step is to become conversant with the Cache-Control, Expires and other cache-related headers. Your next step is to acknowledge that sometimes objects are out of date even before their TTL expires, meaning that you need to take different actions. You can’t necessarily set your TTL lower without jeopardizing the health and responsiveness of your backend, but you still don’t want to keep outdated objects in cache. Low TTLs also mean that objects get evicted often, which requires resources that can slow things down.

Varnish allows for evicting objects from cache regardless of TTL: with VCL, you can use code to actively invalidate objects. One way is with purging and the other is banning.

Purging

Purging is the easiest way to invalidate the cache. In the following example, you can see that in VCL you can perform a return (purge) from within the vcl_recv subroutine. This will explicitly evict the object from cache. The object will be identified by the criteria set in vcl_hash, so by default that is the hostname and the URL. Memory is freed up immediately and cache variations are also evicted.

acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    # allow PURGE from localhost and 192.168.55…

    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(403, "Not allowed."));
        }
        return (purge);
    }
}

Purging is easy: it uses the object’s hash; it evicts that one object, and it can be executed with a simple return (purge).

Banning

Sometimes, though, you may have a large number of purges to perform or you may not be sure which resources are stale, meaning that exact URL invalidations might be too limiting for your needs. URL pattern-based invalidation solves this problem, which is what banning is. Bans use a regular expression match to mark objects that should be removed from cache, which are then added to the ban list. Banning does not remove items from cache immediately (it only makes them unusable until their TTL ends or until the ban_lurker thread wakes up and cleans them) and hence does not free up any memory directly. Purging, on the other hand, evicts an object immediately from cache.

Here’s a basic BAN example:

acl ban {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ ban) {
            return (synth(403, "Not allowed."));
        }
        ban("req.http.host == " + req.http.host +
            " && req.url ~ " + req.url);
        return (synth(200, "Ban added"));
    }
}
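
The ban lurker thread mentioned earlier can only evaluate ban expressions that refer to the cached object (obj.*), not to the incoming request (req.*). A common workaround, sketched here under assumptions, is to copy the URL and host onto the object at fetch time and ban against those copies. The header names x-url and x-host are a convention rather than anything Varnish mandates, and the backend address is illustrative.

```vcl
vcl 4.0;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl ban {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_backend_response {
    # Store the request URL and host on the object so bans can match on obj.*
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
    # Keep the helper headers internal to the cache.
    unset resp.http.x-url;
    unset resp.http.x-host;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ ban) {
            return (synth(403, "Not allowed."));
        }
        # Only obj.* is referenced, so the ban lurker can process this ban.
        ban("obj.http.x-host == " + req.http.host +
            " && obj.http.x-url ~ " + req.url);
        return (synth(200, "Ban added"));
    }
}
```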

Section 6

Manage your backends: Health, directors, and grace

Setting up your caching strategy and tactics is done in part to avoid having to use your backends too much. You need a backend to cache objects in the first place. Using Varnish to manage your backends and configure access to them is another way that you can take control over performance-related matters.

First and foremost, you will want to introduce the backend to Varnish. There are two ways to do this: by adding a backend to your VCL file or by omitting VCL completely and using the -b flag at startup time.

A VCL backend definition is easy; all you need are the host address and the port the backend is listening on:

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

However, there are often multiple backends, and you want to be able to control which request goes to which backend. You can define multiple backends and use req.backend_hint to assign a backend other than the default one. By incorporating the req.backend_hint variable in our VCL logic, we can perform content-aware load balancing so that each backend can be tuned to its specific task.
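
A minimal sketch of such content-aware routing might look like the following. The backend names public and admin, their addresses, and the /admin URL prefix are assumptions chosen for illustration.

```vcl
vcl 4.0;

# Hypothetical backends; the first one declared acts as the default.
backend public {
    .host = "127.0.0.1";
    .port = "8080";
}

backend admin {
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_recv {
    # Route administrative URLs to the dedicated backend,
    # everything else to the public one.
    if (req.url ~ "^/admin") {
        set req.backend_hint = admin;
    } else {
        set req.backend_hint = public;
    }
}
```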

Backend health

A basic backend health check can easily be undertaken.

You can view the health of your backends by executing the following command: varnishadm backend.list

This could be the output of that command:

Backend name	Admin	Probe
boot.public	probe	Healthy 10/10
boot.admin	probe	Healthy 10/10

Both backends are listed and their health is automatically checked by a probe (you can read up on how to analyze health probes in the Varnish Book). In the example above, both backends are considered healthy because 10 out of 10 checks were successful. You can define what you consider a healthy backend for your Varnish setup as the health probe is configurable.
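
As a hedged illustration of a configurable probe, the backend definition below polls a URL at a fixed interval and treats the backend as healthy when enough recent checks succeed. The probed URL, the timings, and the 8-out-of-10 threshold are assumptions, not recommended values.

```vcl
vcl 4.0;

backend public {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";          # Path requested by each health check (assumed)
        .timeout = 1s;       # A check fails if no response arrives in time
        .interval = 5s;      # Time between checks
        .window = 10;        # Number of recent checks considered
        .threshold = 8;      # Checks within the window that must succeed
    }
}
```

With a window of 10, the "Healthy 10/10" column in the backend.list output corresponds to all ten recent checks having succeeded.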

Directors

A director is a built-in VMOD (Varnish module) that groups several backends and presents them as one. Directors offer different decision-making strategies to decide which request is handled by which backend. The goal of directors is to avoid backend latency by distributing requests across multiple backends: by balancing out backend requests, each backend server experiences less load, which reduces latency and lets you scale horizontally.

Besides pure load balancing, directors also make sure that unhealthy backend nodes are not used, routing requests to another, healthy node instead, which makes directors part of a high availability strategy.

In order to use directors, you first need to import the directors VMOD and initialize it in vcl_init.

There are different director distribution strategies, such as round-robin, which distributes loads equally, and each backend takes its turn sequentially.

Here’s an example of a round-robin director declaration that uses three backends:

import directors;

sub vcl_init {
    new loadbalancing = directors.round_robin();
    loadbalancing.add_backend(backend1);
    loadbalancing.add_backend(backend2);
    loadbalancing.add_backend(backend3);
}

After we declare the director, it needs to be assigned in vcl_recv:

sub vcl_recv {
    set req.backend_hint = loadbalancing.backend();
}

Round-robin is a good approach, but it is not suitable for every situation. For example, in cases in which your backend servers don’t have the same server resources, you would be forcing the server with the least amount of memory or CPU to share an equal load with better-resourced servers.

There are several other useful directors in Varnish, including the random director, the hash director, and the fallback director, which you can learn about in the O'Reilly book "Getting started with Varnish Cache".
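
For the unequal-resources case described above, the random director accepts a weight per backend. The following is only a sketch: the backend names, addresses, and the 1-to-3 weighting are assumptions.

```vcl
vcl 4.0;

import directors;

# Hypothetical backends with different capacities.
backend small {
    .host = "192.168.55.10";
    .port = "8080";
}

backend large {
    .host = "192.168.55.11";
    .port = "8080";
}

sub vcl_init {
    new weighted = directors.random();
    # The larger server receives roughly three times as many requests.
    weighted.add_backend(small, 1.0);
    weighted.add_backend(large, 3.0);
}

sub vcl_recv {
    set req.backend_hint = weighted.backend();
}
```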

Clearly you will want to set your directors up for maximum stability and resilience, but what happens when no backend is available? You either deliver nothing (which is a no-go situation for most businesses) or deliver the last-known-good version. This is where Grace mode comes in. In Varnish you can assign a certain period of grace time, letting Varnish serve objects in cache beyond their TTL. These objects will continue to be served for as long as no updated object is available, up to the duration defined by the grace time. In many cases, Grace mode has ensured that end-users barely notice, if they notice at all, any “hiccups” in content being served.
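
Grace time can be assigned per object when it is fetched from the backend. A minimal sketch, assuming a six-hour grace period is acceptable for your content (the value and the backend address are illustrative):

```vcl
vcl 4.0;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Keep each object for up to 6 hours past its TTL so that
    # a stale copy can be served when no fresh one is available.
    set beresp.grace = 6h;
}
```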

Section 7

Cache hit or miss? Improve your hit rate to boost performance

So far, we have assumed that you are implementing Varnish in ideal conditions. In practice, when you get started with Varnish you might already be in a less-than-ideal situation and need Varnish to make performance improvements. So what kinds of common mistakes can you prepare for to ensure that you achieve a high cache hit rate and high-performance content delivery?

Some (but certainly not all) common mistakes when getting started with Varnish:

Caching too aggressively (and caching items that should not be cached).
Very low hit rate due to lack of awareness of built-in VCL.
Not understanding hit-for-pass: Varnish creates a hit-for-pass object when fetched objects cannot be cached. Varnish will cache the decision not to cache for 120 seconds. Objects in the hit-for-pass cache will not be queued like regular misses and will directly make a backend connection.
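Forgetting about the consequences of return statements. Most people know that Varnish doesn’t cache POST requests by default, and they assume that the Varnish engine will always deal with that. That behavior comes from the built-in VCL, though, so an early return in your own VCL skips it.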
Purging without purge logic: Don’t forget that purge support needs to be implemented in your VCL file.
No-purge ACL: Remember to protect access to your purge logic via an ACL.
Accidentally caching your 404 responses.
Not being aware of cache variations, such as the Accept-Language header example described earlier. Know how and when to use the Vary header.
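
One hedged way to address the 404 mistake in the list above is to mark such responses as uncacheable when they come back from the backend. The 30-second lifetime and the backend address are assumptions.

```vcl
vcl 4.0;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    if (beresp.status == 404) {
        # Do not serve this response from cache; Varnish only remembers
        # the decision not to cache it, for the duration of the TTL.
        set beresp.uncacheable = true;
        set beresp.ttl = 30s;
        return (deliver);
    }
}
```

Whether 404 responses should be kept out of the cache entirely or cached briefly depends on how expensive they are for your backend to generate.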

There are many other common mistakes you could examine to try to avoid them yourself, but there are also bigger questions you should pose when setting up your caching strategy for maximum performance:

Should you really cache static assets, such as never-changing images, PDFs, or videos? There is no right or wrong answer necessarily, but it is a question you should ask yourself. If images and other static assets are filling up your cache, you can take steps to keep them out of cache or bypass the cache.
Should you create URL blacklists and whitelists to control what gets cached and what does not?
How will you manage cookies and cookie variations?
Can you optimize your URLs to avoid unnecessary cache misses, e.g. by removing the port, with query string sorting, by removing Google Analytics URL parameters, removing the URL hash, etc.?
Can you set a hit/miss marker to easily see if the page you’re seeing is the result of a cache hit or a cache miss?
How can you manage content blocks, which will not be cached if your header section is not cacheable, e.g. with AJAX or Edge Side Includes (ESI)?
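
Several of the URL optimizations mentioned above can be sketched in a few lines of vcl_recv. This is an illustration under assumptions, not a complete normalization policy: the list of tracking parameters (utm_* and gclid) is an assumed sample, the regular expressions are deliberately simple, and the backend address is illustrative.

```vcl
vcl 4.0;

import std;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Remove the port from the Host header.
    set req.http.Host = regsub(req.http.Host, ":[0-9]+$", "");

    # Remove the URL hash (fragment), if a client sent one.
    set req.url = regsub(req.url, "#.*$", "");

    # Remove Google Analytics style parameters, then any dangling "?" or "&".
    if (req.url ~ "(\?|&)(utm_[a-z]+|gclid)=") {
        set req.url = regsuball(req.url, "(utm_[a-z]+|gclid)=[^&]*&?", "");
        set req.url = regsub(req.url, "[?&]+$", "");
    }

    # Sort the remaining query string parameters so that
    # /page?a=1&b=2 and /page?b=2&a=1 share one cache entry.
    set req.url = std.querysort(req.url);
}
```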

By asking these kinds of questions and finding out that Varnish has ways for you to build a high-performance, highly cacheable application, you can see that flexibility is key to letting you control your content delivery and to ensure stable performance.

Section 8

Monitor with logging, measuring, and debugging

Getting Varnish running is one thing, but it’s important to monitor just how effective it is in meeting your needs and in ensuring continued high performance. With logging, measuring, and debugging, you can get ahead of potential problems as well as easily troubleshoot the problems you don’t see coming. With granular monitoring, you also have a powerful tool you can use to gain insight into performance issues. Monitoring can be seen as a kind of “nerve center” for tuning and tightening performance, seeing “blind spots,” and fixing them.

Varnish offers ways to debug and measure the HTTP requests, the HTTP responses, and the cache:

varnishstat: displays statistics of a running Varnish instance. General statistics about connections, sessions, backends, storage, hit rate, and much more. This is a good dashboard for sysadmins.
varnishlog: reads the shared memory logs and displays this information in real time. The log is displayed as a list of tags and values, which is unfiltered and extremely verbose. Once you are able to interpret this data and filter it properly, though, you have highly detailed information about the state of a request. varnishlog is essential for debugging.
varnishtop: Like varnishlog, varnishtop reads the shared memory log and uses the same concept of tags and values. Its output, however, is different, presenting a continuously updated list of the most commonly occurring log entries.

All of these use the shared memory log.
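
You can also write your own entries to the shared memory log from VCL, which can make debugging easier. A minimal sketch using std.log from the std VMOD; the message format and the backend address are assumptions. Entries written this way appear under the VCL_Log tag.

```vcl
vcl 4.0;

import std;

# Assumed backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Add a custom entry to the shared memory log for each request.
    std.log("client=" + client.ip + " url=" + req.url);
}
```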

Section 9

Better performance — Better security

Securing performance with caching and smart caching policies is one thing. But caching can also protect against DDoS and similar attacks. Security in Varnish involves client- and backend-side SSL/TLS integration. Open-source Hitch can be leveraged for the client-side SSL/TLS implementation, while the backend-side SSL/TLS option is really a switch-on, switch-off function, available in the commercial version of Varnish. The option for total encryption of the cache, which guards against vulnerabilities such as CloudBleed and Meltdown using dual-key AES-256 encryption, is also available from Varnish Software.