
Systematic Content Validation with Varnish


Sometimes, doing the unthinkable is necessary. Varnish indeed can solve some problems in twisted (yet still effective) ways.


"Can Varnish go to the backend and check the content freshness for every single request?"

Sometimes, when someone wants to do something unusual with Varnish, I tend to reply like a true developer with, "Let me explain to you why you don't need it." The problem with this is that we don't operate in an ideal world, and most of the time, architecture isn't dictated by what should be done, but by what must work NOW! (Here's a depressing read, if you need one.)

I'm telling you this because in this post, we are going to solve a problem in a twisted way — but, given your specific requirements, it may be the best way. Along the way, I obviously won't resist the urge to explain why it's twisted and what the other solutions are, but feel free to skip those parts.

The Issue (and the Issue With the Issue)

The Etag header is an opaque string returned by the server that should uniquely identify a version of an HTTP object. It's used for content validation, allowing the client to say to the server, "I have this version of the object; please either tell me it's still valid or send me the new object."
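In practice, the validation round-trip looks like this (the URL and Etag value here are hypothetical). The client revalidates:

GET /article/42 HTTP/1.1
Host: example.com
If-None-Match: "abc123"

If nothing changed, the server answers with an empty-bodied 304 instead of resending the full object:

HTTP/1.1 304 Not Modified
Etag: "abc123"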

Systematic, synchronous revalidation (flowchart)

What we would like is for Varnish to leverage this Etag header and systematically ask the backend if the object we are about to deliver is fresh — something along those lines:

Systematic, synchronous revalidation (sequence)

I have a conceptual problem with this. It goes against the cache's goal of shielding your server: with the proposed setup, we have to bother the backend for every single request, even though 304 responses, carrying no body, will largely reduce the load.

There's also a functional problem: it's synchronous, so if your backend fails or slows down, your users will suffer the delays, which is, again, at odds with the idea of a caching layer.

The Purge — Actually a Good Solution (Unlike the Movie)

We need to backtrack a bit here and think about what our issue really is. It's not that we want to go to the backend all the time, but rather that we don't want to deliver outdated content — a commendable goal. The backend knows about content freshness, so it's definitely the right source of information, but what if, instead of asking it, we could make it tell us directly when something changes?

Using purges, bans, or xkey, it's possible to remove content from the cache based on URL, regex, or semantic tags. If your backend can trigger an HTTP request when its content changes, you're good to go (a minimal VCL sketch follows the diagram below)! Many CMSs can, including eZ Publish, WordPress, and Magento.

Backend-driven purge
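On the Varnish side, accepting those purge requests only takes a few lines of VCL. A minimal sketch, where the ACL entries are placeholders for your actual backend addresses:

acl purgers {
    "localhost";
    "192.168.0.0"/24;  /* hypothetical backend network */
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            /* refuse purges from anyone but the backend */
            return (synth(405, "Purging not allowed"));
        }
        /* drop the cached object matching this URL */
        return (purge);
    }
}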

With this, you can set your TTLs to weeks or months to ensure super high hit ratios, almost never fetching from the backend, and still deliver up-to-date content. This seems to be the perfect solution. So, why isn't everybody using it?

The first thing is that the backend needs to be aware of all the Varnish caches to purge, obviously. This generally means that adding a new Varnish server triggers a reconfiguration of the backend. Using the Varnish Administration Console (and the bundled Super Fast Purger), this becomes a non-issue because new Varnish servers register with it when created, allowing the VAC to know about all the caches:

New node registering to the VAC

Then, instead of having the backend purge the Varnish boxes directly, it sends a single purge request to the VAC, which broadcasts it:

VAC-broadcasted purge

If your backend supports it, this is definitely the best option.

Saving Grace

However, your backend may not be able to trigger those HTTP requests, or you may not have access to said backend. In this case, the previous solution is out of your reach, sadly.

Fret not, though. I do have another choice to offer you, and it doesn't involve cache invalidation. It's actually super simple:

sub vcl_backend_response {
    set beresp.ttl = 0s;    /* set TTL to 0 seconds */
    set beresp.grace = 1d;  /* set grace to 1 day */
    set beresp.keep = 1d;   /* keep the content for 1 extra day,
                               allowing content revalidation */
}

Grace is Varnish's implementation of HTTP's stale-while-revalidate mechanism. Very simply, it's the period of time during which Varnish may keep an expired object and still serve it if it has nothing better to offer. Since Varnish 4.0, grace is fully asynchronous: the backend fetch happens in the background, taking advantage of the If-None-Match header to minimize bandwidth.
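In less drastic setups, a common pattern from the Varnish documentation is to let backend health decide how much staleness is acceptable. A sketch, assuming Varnish 4.1+ and a health probe configured on the backend (the 30-second window is an arbitrary choice):

import std;

sub vcl_hit {
    if (obj.ttl >= 0s) {
        /* object still fresh: business as usual */
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        /* backend is up: accept only a little staleness */
        if (obj.ttl + 30s > 0s) {
            return (deliver);
        }
    } else if (obj.ttl + obj.grace > 0s) {
        /* backend is down: serve anything still in grace */
        return (deliver);
    }
    return (miss);
}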

In the (extreme) ttl/grace/keep example above, the object gets a TTL of zero seconds, meaning it's expired on arrival. Varnish will fetch a new version as soon as the object is requested, but in the background. At this point, you may wonder how that could possibly be awesome: by removing the synchronous check, haven't we just robbed ourselves of on-time updates? That's correct, but we gain something that may be more important: request coalescing.

For a given URL, Varnish will only trigger one background fetch at a time, allowing your backend to survive sudden surges of traffic, notably when requests take a long time.


However, you could object that if your object changes and you can't afford to deliver even a single out-of-date response, this isn't a viable solution. Again, you'd be right. I would, however, ask, "Can you really ever ensure this?"

The object may very well change on the backend right after the response is sent and before it is received. This is part of the weakness of HTTP, and we have to be aware of it.

With that said...

A Developer's Gotta Do What a Developer's Gotta Do

At the risk of being horrible, I need to hammer this point home: Make sure this is your last resort because you won't be taking the easy path.

We now arrive where Varnish shines: uncharted territory. Varnish is great because it's fast and efficient, but above all, it refuses to define policies. This is the whole idea behind the VCL: the user can have almost full control over what's going on instead of being limited to the path chosen by the tool.

That allows us to abuse the system and twist its arm to do our bidding. We can do it with a pure VCL solution! However, before we do that, let's have a brief reminder of what the VCL is and does (skip that one if you already know).

The Varnish Configuration Language basically maps to a state machine that processes requests from the moment they are received to the moment the response is delivered. At each step, the VCL is called and asked what should be done (tweak headers, remove query strings, etc.) and what the next step is. A simple representation of the state machine could be as shown in this flowchart:

VCL state machine
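Each state has a matching VCL subroutine where you get to act and pick the next transition. For example, a hypothetical vcl_recv that tweaks headers, strips the query string, and then moves on to the lookup state:

sub vcl_recv {
    /* tweak headers: ignore cookies for caching purposes */
    unset req.http.cookie;
    /* remove the query string from the URL */
    set req.url = regsub(req.url, "\?.*$", "");
    /* next state: cache lookup */
    return (hash);
}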

The problem is that this doesn't really map to the flowchart we came up with the first time:

Systematic, synchronous revalidation (flowchart), yes, again

No worries. We'll make it work. To do so, we'll use restarts extensively, allowing us to go back to vcl_recv and start processing the request again without resetting it. We basically have two paths to cover:

MISS — object is not in cache:

  1. Proceed as usual: fetch the object, put it in the cache, and deliver it. It'll obviously be fresh. Yes, that was the easy one.

HIT — object is in cache:

  1. Save the Etag.
  2. Restart because we need to go to the backend.
  3. Pass because we don't necessarily want to put the object in the cache.
  4. Set the method to HEAD because we don't care about the body — only the Etag header.
  5. If the backend replies with an Etag differing from the one we have, kill the object.
  6. Restart.
  7. Once there, just proceed as normal.

The HIT path will look like this when mapped onto the state machine:

The HIT path mapped onto the VCL state machine

Note: We could use If-None-Match and test the status code (200 or 304) instead of comparing Etags, but we'd still want to use HEAD to make sure only headers are sent (a sketch of that variant follows the full VCL below).

This is where things get a bit ugly. Since we'll be going through some steps multiple times, we have to keep track of what state we're in and route accordingly. Now, here's what you have been waiting for this whole post, the VCL:

sub vcl_recv {
    if (req.restarts == 0) {
        set req.http.x-state = "cache_check";
        return (hash);
    } else if (req.http.x-state == "backend_check") {
        return (pass);
    } else {
        return (hash);
    }
}

sub vcl_hit {
    if (req.http.x-state == "cache_check") {
        set req.http.x-state = "backend_check";
        set req.http.etag = obj.http.etag;
        return (restart);
    } else {
        return (deliver);
    }
}

sub vcl_backend_fetch {
    if (bereq.http.x-state == "backend_check") {
        set bereq.method = "HEAD";
        set bereq.http.method = "HEAD";
    }
}

sub vcl_backend_response {
    if (bereq.http.x-state == "backend_check") {
        if (bereq.http.etag != beresp.http.etag) {
            ban("obj.http.etag == " + bereq.http.etag);
        }
    }
}

sub vcl_deliver {
    if (req.http.x-state == "backend_check") {
        set req.http.x-state = "valid";
        return (restart);
    }
}

This is obviously a minimal version, and you will have to work a bit to adapt it to your own setup, as is always the case with code using restarts. I also took a few shortcuts for the sake of clarity, such as banning only on the Etag and systematically returning, hence bypassing the built-in VCL.
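For what it's worth, the If-None-Match variant from the earlier note would only change the two backend-side subroutines. A sketch, under the same x-state convention:

sub vcl_backend_fetch {
    if (bereq.http.x-state == "backend_check") {
        set bereq.method = "HEAD";
        /* let the backend do the comparison for us */
        set bereq.http.If-None-Match = bereq.http.etag;
    }
}

sub vcl_backend_response {
    if (bereq.http.x-state == "backend_check" && beresp.status != 304) {
        /* anything but a 304 means our cached copy is stale */
        ban("obj.http.etag == " + bereq.http.etag);
    }
}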

Faced with such a piece of code, your first instinct should be skepticism: does that even work? It does, and I have the test case to prove it! I actually built the VCL directly in the VTC, making development super easy.
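If you want to experiment the same way, a varnishtest skeleton is all you need. This is a minimal sketch (not the actual test case) with a mock backend and a hypothetical Etag value:

varnishtest "VCL development skeleton"

server s1 {
    rxreq
    txresp -hdr {Etag: "v1"} -body "one"
} -start

varnish v1 -vcl+backend {
    # the VCL under test goes here
} -start

client c1 {
    txreq -url "/"
    rxresp
    expect resp.status == 200
    expect resp.http.etag == {"v1"}
} -run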

A Matter of Choice

Part of the appeal I find in Varnish is explained in this blog. Every time a strange use case appears, my reptilian brain screams, "Nah, you shouldn't do that! That is not the way!" Then, the dev neurons fire up, asking, "Yeah, but if I had to, could I do it?" Pretty much invariably, the answer is, "Sure, just put some elbow grease into it, and that'll work."

Varnish is not a complex tool; rather, it's composed of a lot of simple cogs. Once you know how they fit together, the weird requests become fun challenges.

Through this long post, we've seen how to answer the original question, but, I hope, we also expanded our horizons a bit by touching on quite an array of subjects such as:

  • VCL.
  • Varnish Administration Console.
  • Purging.
  • Objects in grace.


Published at DZone with permission of Guillaume Quintard, DZone MVB. See the original article here.
