Web Caching Strategy

A discussion of how organizations can use several different web caching strategies, such as cache-control and caching dynamic datasets, in their applications.

I am currently designing a high-performance IoT solution for one of my clients. The system will also expose its internal APIs to their clients so that they can build their own systems and apps on top of it.

There will be web applications, which means they will consume near-static and dynamic endpoints to meet their data requirements. To reduce latency and improve scalability, HTTP caching is one of the techniques I am proposing. In this article, I go through some of the common terminology that comes up when designing such systems.

I have replaced the client's system name in this document, for obvious reasons, with 'Beehive.'

HTTP Cache

HTTP caching allows us to cache the full output of a request and bypass the application for subsequent requests. However, caching entire responses isn’t always possible, even for near-static datasets.

This means that not every response from Beehive's API endpoints is cacheable, and the API specification document indicates which ones are. Beehive employs an “Entity Tags” based caching approach when there is no shared agreement on time.

Cache-Control

Beehive API responses use the following four caching controls: three Cache-Control directives and the Last-Modified header.

“no-store”

The response must not be stored by the browser or by any intermediate cache.

“no-cache”

“no-cache” indicates that the browser may store the response but must revalidate it with the server before each reuse.

“max-age”

“max-age” sets the time to live (TTL) of the response, in seconds, counted from the time of the request. For example, "max-age=60" indicates that the response can be cached and reused for the next 60 seconds.

“last-modified”

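“last-modified” specifies the time at which the server believes the resource was last modified. Strictly speaking, Last-Modified is a separate response header rather than a Cache-Control directive.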
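To make the effect of the three Cache-Control directives concrete, here is a minimal Python sketch of how a client-side cache might interpret them. The function names `parse_cache_control` and `cache_decision` and the three result strings are my own illustrative choices, not part of Beehive or of any library, and the logic deliberately ignores other directives such as "s-maxage" or "must-revalidate".

```python
def parse_cache_control(header_value: str) -> dict:
    """Parse a Cache-Control header value into a {directive: value} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. no-store) map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def cache_decision(header_value: str, age_seconds: float) -> str:
    """Return 'do-not-store', 'revalidate', or 'reuse' for a response.

    age_seconds is how long ago the response was received (hypothetical input).
    """
    directives = parse_cache_control(header_value)
    if "no-store" in directives:
        return "do-not-store"   # never keep a copy
    if "no-cache" in directives:
        return "revalidate"     # keep a copy, but ask the server before reuse
    max_age = directives.get("max-age")
    if max_age is not None and age_seconds < int(max_age):
        return "reuse"          # still within its TTL
    return "revalidate"         # stale or no freshness information
```

For example, `cache_decision("public, max-age=60", 30)` yields "reuse", while the same header with an age of 61 seconds yields "revalidate".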

Conditional Requests

In a conditional request, the browser asks the server whether the response has been updated. The browser sends some information about the cached resource, and the server determines whether the updated content should be returned or whether the browser’s copy is the most recent. In the latter case, the server returns HTTP status code 304 (Not Modified).

Beehive uses both time- and content-based conditional requests. For near-static datasets, time-based conditional requests are used, whereas for dynamic datasets, content-based conditional requests are used.

Near-Static Datasets: Time-Based Requests

For near-static datasets, which live for sufficiently long periods of time, Beehive uses time-based conditional requests. The APIs use the Last-Modified response header. If the cached copy is the latest copy of the data, the server returns the 304 status code.

To use conditional requests, the server specifies the last modified time of a resource using the Last-Modified response header.

Cache-Control: max-age=31536000
Last-Modified: Thu, 24 May 2018 12:45:57 GMT

On the next request, the browser sends the Last-Modified value as an If-Modified-Since request header.

If-Modified-Since: Thu, 24 May 2018 12:45:57 GMT

If the resource has not been modified since Thu, 24 May 2018 12:45:57 GMT, then the server returns an empty body with the 304 response code.
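The time-based exchange above can be sketched on the server side as follows. This is only an illustration under my own assumptions: `conditional_get` is a hypothetical helper rather than Beehive's actual code, it expects `last_modified` as a timezone-aware UTC datetime, and it returns a plain (status, headers, body) tuple instead of using any particular web framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def conditional_get(last_modified: datetime, body: bytes,
                    if_modified_since: Optional[str]) -> Tuple[int, dict, bytes]:
    """Answer a GET using Last-Modified / If-Modified-Since validation."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers, b""  # client copy is current
    return 200, headers, body
```

With `last_modified` set to 24 May 2018 12:45:57 UTC, a request carrying the If-Modified-Since value shown above gets a 304 with an empty body, and a request without the header gets the full 200 response.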

Caching Dynamic Datasets: Content-Based Requests

Each cacheable API endpoint uses one of two Cache-Control directives, “no-cache” or “max-age=?”, alongside an entity tag (ETag) header. Entity tag headers allow the server to determine whether the cached contents of the resource are up-to-date.

“Entity Tags”: Validation Token

Entity tags are used to communicate a validation token to the HTTP client. The server generates a validation token, which is a hash of the response payload, and sends it as an ETag response header. The client sends the validation token back in the If-None-Match HTTP request header.

Let’s assume the response payload is “{"payload": "empty-payload"}”; the MD5 hash of the payload will then be 7b982db9faa6611afd5d375619fee5a0.

The server will send the hash with the response as an ETag header.

Cache-Control: public, max-age=60
ETag: "7b982db9faa6611afd5d375619fee5a0"

On subsequent requests, the browser will send an If-None-Match request header with the ETag value.

If-None-Match: "7b982db9faa6611afd5d375619fee5a0"

If the latest resource has the same ETag, the server will return an empty payload with a 304 HTTP status code.
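The content-based exchange can be sketched in the same way. Again, this is a hedged illustration rather than Beehive's implementation: `make_etag` and `respond_with_etag` are hypothetical names, the Cache-Control value is copied from the example above, and the exact hash depends on the precise bytes of the serialized payload.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(payload: bytes) -> str:
    """Return a quoted ETag: the MD5 hex digest of the payload bytes."""
    return '"' + hashlib.md5(payload).hexdigest() + '"'


def respond_with_etag(payload: bytes,
                      if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Answer a GET using ETag / If-None-Match validation."""
    etag = make_etag(payload)
    headers = {"Cache-Control": "public, max-age=60", "ETag": etag}
    if if_none_match is not None:
        # The header may carry several comma-separated tags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore a leading W/ prefix.
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy matches the current payload
    return 200, headers, payload
```

A first request (no If-None-Match) returns 200 with the ETag header; replaying that ETag value returns 304 as long as the payload bytes have not changed.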

Negative Caching

Beehive caches these 3xx, 4xx, and 5xx responses:

300 - Multiple Choices
301 - Moved Permanently
400 - Bad Request
401 - Unauthorized 
403 - Forbidden
404 - Not Found 
500 - Internal Server Error
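As a rough illustration of negative caching, the sketch below keeps a short-lived in-memory record of non-success responses for the status codes listed above. The class name `NegativeCache`, the 30-second default TTL, and the injectable clock are all my assumptions for the example; the article does not say how long Beehive keeps these entries or where they are stored.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Status codes taken from the list above.
NEGATIVE_CACHEABLE = {300, 301, 400, 401, 403, 404, 500}


class NegativeCache:
    """Tiny in-memory cache of non-success status codes, keyed by URL."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds          # hypothetical TTL, not from the article
        self._clock = clock              # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, int]] = {}

    def store(self, url: str, status: int) -> bool:
        """Remember the status for url; return False if it is not cacheable."""
        if status not in NEGATIVE_CACHEABLE:
            return False
        self._entries[url] = (self._clock() + self._ttl, status)
        return True

    def lookup(self, url: str) -> Optional[int]:
        """Return the cached status, or None if absent or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, status = entry
        if self._clock() >= expires_at:
            del self._entries[url]       # expired: forget it
            return None
        return status
```

A caller would check `lookup(url)` before forwarding a request to the application, and call `store(url, status)` after receiving one of the listed responses.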
