Web Resource Caching: Client-Side

Learn more about client-side web resource caching.

By Nicolas Fränkel · Nov. 30, 22 · Analysis

The subject of Web resource caching is as old as the World Wide Web itself. Still, I'd like to offer an as-exhaustive-as-possible catalogue of how one can improve performance by caching. Web resource caching can happen in two different places: client-side, in the browser, and server-side. This post is dedicated to the former; the next post will focus on the latter.

Caching 101

The idea behind caching is simple: if a resource is time- or resource-consuming to compute, do it once and store the result. When somebody requests the resource afterwards, return the stored result instead of computing it a second time. It looks simple, and it is, but the devil is in the details, as they say.
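To make the idea concrete, here's a minimal sketch of "compute once, store, and return the stored result". The `cached` helper and the `renderPage` function are hypothetical names made up for this illustration; they don't come from any library.

```typescript
// Hypothetical helper: wraps an expensive function so that each key is computed only once.
function cached<T>(compute: (key: string) => T): (key: string) => T {
  const store: { [key: string]: T } = {};
  return function (key: string): T {
    if (!Object.prototype.hasOwnProperty.call(store, key)) {
      store[key] = compute(key); // first request: do the work and keep the result
    }
    return store[key]; // later requests: return the stored result
  };
}

// Usage: count how often the "expensive" computation actually runs.
let computations = 0;
const renderPage = cached(function (path: string): string {
  computations += 1;
  return "<h1>" + path + "</h1>";
});

const firstResult = renderPage("/weather");
const secondResult = renderPage("/weather"); // served from the store, no recomputation
```

Note that this naive version never expires anything, which is exactly the problem the rest of this post deals with.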

The problem is that a "computation" is not a mathematical one. In mathematics, the result of a computation is constant over time. On the Web, the resource you requested yesterday may be different if you request it today. Think about the weather forecast, for example. It all boils down to two related concepts: freshness and staleness.

A fresh response is one whose age has not yet exceeded its freshness lifetime. Conversely, a stale response is one where it has.

A response's freshness lifetime is the length of time between its generation by the origin server and its expiration time. An explicit expiration time is the time at which the origin server intends that a stored response can no longer be used by a Cache without further validation, whereas a heuristic expiration time is assigned by a Cache when no explicit expiration time is available. A response's age is the time that has passed since it was generated by, or successfully validated with, the origin server.

When a response is "fresh" in the cache, it can be used to satisfy subsequent requests without contacting the origin server, thereby improving efficiency.

-- RFC 7234 - 4.2. Freshness
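The definitions above can be turned into a small calculation. The sketch below is a simplification: it assumes the response age is just "now minus generation time" (the RFC's age calculation also accounts for the Age header and network delays), and the 10% heuristic is only the typical fraction mentioned in the RFC, not a requirement. The `StoredResponse` shape and function names are hypothetical.

```typescript
// All timestamps are in milliseconds since the Unix epoch.
interface StoredResponse {
  generatedAt: number;     // when the origin server generated (or last validated) the response
  expiresAt?: number;      // explicit expiration time, if the server provided one
  lastModifiedAt?: number; // only used for the heuristic fallback
}

// Freshness lifetime: explicit when available, heuristic otherwise.
function freshnessLifetime(r: StoredResponse): number {
  if (r.expiresAt !== undefined) {
    return r.expiresAt - r.generatedAt;
  }
  if (r.lastModifiedAt !== undefined) {
    // Heuristic (assumption): 10% of the time elapsed since the last modification.
    return Math.max(0, (r.generatedAt - r.lastModifiedAt) / 10);
  }
  return 0; // nothing to go on: treat the response as immediately stale
}

// A response is fresh as long as its age has not reached its freshness lifetime.
function isFresh(r: StoredResponse, now: number): boolean {
  const age = now - r.generatedAt;
  return age < freshnessLifetime(r);
}
```

For instance, a response generated at time 0 with an explicit expiration at 10,000 ms is fresh at 5,000 ms and stale from 10,000 ms onwards.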

Early Web Resource Caching

Remember that the WWW was relatively simple at its beginning compared to nowadays. The client would send a request, and the server would return the requested resource. When the resource was a page, whether it was a static page or a server-rendered page was unimportant. Hence, early client-side caching was pretty "rustic".

HTTP/1.1 caching is specified in RFC 7234, published in 2014, which itself replaced the caching sections of the earlier RFC 2616. Note that RFC 7234 has been superseded by RFC 9111 since 2022.

I won't talk here about the Pragma HTTP header since it's deprecated. The most straightforward cache management is through the Expires response header. When the server returns the resource, it specifies the timestamp after which the cached copy is stale. The browser has two options when a cached resource is requested:

  • Either the current time is before the expiry timestamp: the resource is considered fresh, and the browser serves it from the local cache
  • Or it's after: the resource is considered stale, and the browser requests the resource from the server as if it had not been cached

The benefit of Expires is that the decision is purely local. The browser doesn't need to send a request to the server. However, it has two main issues:

  • The decision to use the locally cached resource (or not) is based on a guess. The resource may have changed server-side despite the Expires value being in the future, so the browser serves an out-of-date resource. Conversely, the browser may send a request because the time has expired, even though the resource hasn't changed.
  • Moreover, Expires is pretty basic. A resource is either fresh or stale: either return it from the cache or send the request again. We may want to have more control.

Cache-Control to the Rescue

The Cache-Control header aims to address the following requirements:

  • Never cache a resource at all
  • Validate if a resource should be served from the cache before serving it
  • Decide whether intermediate caches (proxies) may cache the resource

Cache-Control is an HTTP header used on both the request and the response. The header can contain different directives separated by commas. Exact directives vary depending on whether they're part of the request or the response.
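As an illustration, here's a rough sketch of how a response's Cache-Control value could be split into directives and mapped to the three requirements above, using the standard `no-store`, `no-cache`, and `private` directives. The function names are hypothetical, and the parser is deliberately naive: it assumes no commas inside quoted values.

```typescript
// Parses a Cache-Control header value into a directive map.
// Directives without a value (e.g. "no-store") map to an empty string.
function parseCacheControl(headerValue: string): { [directive: string]: string } {
  const directives: { [directive: string]: string } = {};
  const parts = headerValue.split(",");
  for (let i = 0; i < parts.length; i++) {
    const part = parts[i].trim();
    if (part === "") continue;
    const eq = part.indexOf("=");
    const key = (eq === -1 ? part : part.slice(0, eq)).trim().toLowerCase();
    const rawValue = eq === -1 ? "" : part.slice(eq + 1).trim();
    directives[key] = rawValue.replace(/^"(.*)"$/, "$1"); // drop optional surrounding quotes
  }
  return directives;
}

function hasDirective(d: { [directive: string]: string }, directive: string): boolean {
  return Object.prototype.hasOwnProperty.call(d, directive);
}

interface CachingPolicy {
  mayStore: boolean;            // false when "no-store" is present: never cache at all
  mustValidate: boolean;        // true when "no-cache" is present: validate before serving
  sharedCachesAllowed: boolean; // false when "private" is present: proxies must not cache
}

function cachingPolicy(d: { [directive: string]: string }): CachingPolicy {
  const mayStore = !hasDirective(d, "no-store");
  return {
    mayStore: mayStore,
    mustValidate: hasDirective(d, "no-cache"),
    sharedCachesAllowed: mayStore && !hasDirective(d, "private"),
  };
}
```

With `"private, max-age=10"`, for example, the browser may store the response, doesn't have to validate it while it's fresh, and proxies must not cache it.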

All in all, Cache-Control is quite complex. It might well be the subject of a dedicated post; I won't paraphrase the specification.

However, here's a visual aid on how to configure Cache-Control response headers.

[Figure: how to configure Cache-Control response headers]


The Cache-Control page of Mozilla Developer Network has some significant use cases of Cache-Control, complete with configuration.

Like Expires, Cache-Control is also local: the browser serves the resource from its cache, when applicable, without any request to the server.

Last-Modified and ETag

To avoid the risk of serving an out-of-date resource, the browser must send a request to the server. Enter the Last-Modified response header. Last-Modified works in conjunction with the If-Modified-Since request header:

The If-Modified-Since request HTTP header makes the request conditional: the server sends back the requested resource, with a 200 status, only if it has been last modified after the given date. If the resource has not been modified since, the response is a 304 without any body; the Last-Modified response header of a previous request contains the date of last modification. Unlike If-Unmodified-Since, If-Modified-Since can only be used with a GET or HEAD.

-- If-Modified-Since

Let's use a diagram to make clear how they interact:

[Figure: interaction between Last-Modified and If-Modified-Since]


Note: the If-Unmodified-Since header has the opposite function for POST and other non-idempotent methods. The server returns a 412 Precondition Failed HTTP error to avoid overwriting resources that have changed.
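Here's a rough sketch of the server-side decision for these time-based conditional headers. It's an illustration under simplifying assumptions, not how any particular server implements it: `conditionalStatus` is a hypothetical function, header names are expected in lowercase, and 200 simply stands for "process the request normally".

```typescript
// Returns the status code a server could answer with for a time-based conditional request.
// lastModifiedMs is the resource's last modification time, in ms since the Unix epoch.
function conditionalStatus(
  method: string,
  requestHeaders: { [lowercaseName: string]: string },
  lastModifiedMs: number
): number {
  // HTTP dates have a one-second resolution, so drop the milliseconds before comparing.
  const lastModified = Math.floor(lastModifiedMs / 1000) * 1000;

  const ifUnmodifiedSince = requestHeaders["if-unmodified-since"];
  if (ifUnmodifiedSince !== undefined) {
    const limit = Date.parse(ifUnmodifiedSince);
    if (!isNaN(limit) && lastModified > limit) {
      return 412; // Precondition Failed: the resource changed, don't overwrite it
    }
  }

  const ifModifiedSince = requestHeaders["if-modified-since"];
  if (ifModifiedSince !== undefined && (method === "GET" || method === "HEAD")) {
    const since = Date.parse(ifModifiedSince);
    if (!isNaN(since) && lastModified <= since) {
      return 304; // Not Modified: the client can keep using its cached copy
    }
  }

  return 200; // process the request normally and send the full response
}
```

An unparseable date is ignored, so the server falls back to the regular, unconditional behavior.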

The problem with timestamps in distributed systems is that it's impossible to guarantee that all clocks in the system have the same time. Clocks drift at different paces and need to synchronize to the same time at regular intervals. Hence, if the server that generated the Last-Modified header and the one that receives the If-Modified-Since header are different, the results could be unexpected depending on their drift. Note that it also applies to the Expires header.

ETags are an alternative to timestamps to avoid the above issue. The server computes the hash of the served resource and sends the ETag header containing the value along with the resource. When a new request comes in with an If-None-Match header containing the hash value, the server compares it with the current hash. If they match, it returns a 304 as above.

It has the slight overhead of computing the hash versus just handling a timestamp, but it's nowadays considered a good practice.
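The ETag exchange can be sketched along the same lines. Everything here is an assumption made for illustration: `simpleHash` is a toy non-cryptographic hash standing in for whatever the server really uses (an ETag is opaque, and servers are free to derive it differently), the content length is appended to the tag, and `If-None-Match` is assumed to carry a single strong tag rather than a list or `*`.

```typescript
// Toy string hash (djb2-style), used only as a stand-in for a real content hash.
function simpleHash(content: string): string {
  let hash = 5381;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash * 33) ^ content.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash.toString(16);
}

// ETag values are quoted strings; here: "<hash>-<length in hex>".
function makeETag(content: string): string {
  return '"' + simpleHash(content) + "-" + content.length.toString(16) + '"';
}

interface ETagResult {
  statusCode: number;
  etag: string;
  body: string | null; // null for 304: no body is sent
}

function respondWithETag(content: string, ifNoneMatch?: string): ETagResult {
  const etag = makeETag(content);
  if (ifNoneMatch !== undefined && ifNoneMatch === etag) {
    return { statusCode: 304, etag: etag, body: null }; // client copy is still current
  }
  return { statusCode: 200, etag: etag, body: content }; // send the full resource
}
```

The client stores the `etag` from the first response and sends it back in `If-None-Match` on the next request; as long as the content doesn't change, it only gets a body-less 304 in return.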

The Cache API

The most recent way to cache on the client side is via the Cache API. It offers a general cache interface: you can think of it as a local key-value store provided by the browser.

Here are the provided methods:


  • Cache.match(request, options): Returns a Promise that resolves to the response associated with the first matching request in the Cache object.
  • Cache.matchAll(request, options): Returns a Promise that resolves to an array of all matching responses in the Cache object.
  • Cache.add(request): Takes a URL, retrieves it and adds the resulting response object to the given cache. This is functionally equivalent to calling fetch(), then using put() to add the results to the cache.
  • Cache.addAll(requests): Takes an array of URLs, retrieves them, and adds the resulting response objects to the given cache.
  • Cache.put(request, response): Takes both a request and its response and adds it to the given cache.
  • Cache.delete(request, options): Finds the Cache entry whose key is the request, returning a Promise that resolves to true if a matching Cache entry is found and deleted. If no Cache entry is found, the Promise resolves to false.
  • Cache.keys(request, options): Returns a Promise that resolves to an array of Cache keys.


The Cache API works in conjunction with Service Workers. The flow is simple:

  1. You register a service worker on a URL
  2. The browser calls the worker before the URL fetch call
  3. From the worker, you can return resources from the cache and avoid any request to the server

It allows us to put resources in the cache after the initial load so that the client can work offline, depending on the use case.
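The flow above boils down to a "cache first, then network" strategy. The sketch below models it outside the browser so that it can run anywhere: `ResponseStore` and `inMemoryStore` are hypothetical stand-ins that only mirror the shape of `Cache.match()` and `Cache.put()`, and responses are reduced to plain strings. In a real service worker, the same logic would live in the `fetch` event handler, with the result passed to `event.respondWith()`.

```typescript
// Hypothetical stand-in for the two Cache API methods used here.
interface ResponseStore {
  match(url: string): Promise<string | undefined>;
  put(url: string, body: string): Promise<void>;
}

type NetworkFetcher = (url: string) => Promise<string>;

// Cache-first: return the stored response if there is one; otherwise go to the
// network, store the result for next time, and return it.
function cacheFirst(url: string, store: ResponseStore, fetchFromNetwork: NetworkFetcher): Promise<string> {
  return store.match(url).then(function (hit: string | undefined): string | Promise<string> {
    if (hit !== undefined) {
      return hit; // no request to the server at all
    }
    return fetchFromNetwork(url).then(function (body: string): Promise<string> {
      return store.put(url, body).then(function (): string {
        return body;
      });
    });
  });
}

// In-memory store for demonstration purposes only.
function inMemoryStore(): ResponseStore {
  const entries: { [url: string]: string } = {};
  return {
    match: function (url: string): Promise<string | undefined> {
      return Promise.resolve(entries[url]);
    },
    put: function (url: string, body: string): Promise<void> {
      entries[url] = body;
      return Promise.resolve();
    },
  };
}
```

Cache-first is only one possible strategy; for resources that change often, you may prefer to hit the network first and fall back to the cache when offline.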

Summary

Here's a summary of the above alternatives to cache resources client-side.

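| Order | Alternative                | Managed by | Local | Pros                 | Cons                                                          |
|-------|----------------------------|------------|-------|----------------------|---------------------------------------------------------------|
| 1     | Service worker + Cache API | You        | Yes   | Flexible             | Requires JavaScript coding skills; coding and maintenance time |
| 2     | Expires                    | Browser    | Yes   | Easy configuration   | Guess-based; simplistic                                       |
| 2     | Cache-Control              | Browser    | Yes   | Fine-grained control | Guess-based; complex configuration                            |
| 3     | Last-Modified              | Browser    | No    | Just works           | Sensitive to clock drift                                      |
| 3     | ETag                       | Browser    | No    | Just works           | Slightly more resource-intensive, as the hash must be computed |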


Note that those alternatives aren't exclusive. You may have a short Expires header and rely on ETag. You should probably use both a level 2 alternative and a level 3 one.
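To show how a level 2 and a level 3 alternative fit together, here's a simplified model of the decision a cache makes before each request. It is a sketch, not a description of any browser's actual implementation; `CachedEntry`, `NextStep`, and `nextStep` are hypothetical names, and only `max-age` and `ETag` are considered.

```typescript
interface CachedEntry {
  storedAt: number;      // ms since the Unix epoch when the response was stored or last revalidated
  maxAgeSeconds: number; // taken from Cache-Control: max-age
  etag?: string;         // taken from the ETag response header, if any
}

interface NextStep {
  action: "serve-from-cache" | "revalidate" | "fetch";
  requestHeaders: { [headerName: string]: string };
}

function nextStep(entry: CachedEntry | undefined, now: number): NextStep {
  if (entry === undefined) {
    return { action: "fetch", requestHeaders: {} }; // nothing cached yet
  }
  const ageSeconds = (now - entry.storedAt) / 1000;
  if (ageSeconds < entry.maxAgeSeconds) {
    // Level 2: still fresh, purely local decision, no request at all.
    return { action: "serve-from-cache", requestHeaders: {} };
  }
  if (entry.etag !== undefined) {
    // Level 3: stale, so ask the server; a 304 means the cached copy can be reused.
    return { action: "revalidate", requestHeaders: { "If-None-Match": entry.etag } };
  }
  return { action: "fetch", requestHeaders: {} }; // stale and no validator: plain request
}
```

The first level saves the round trip entirely while the resource is fresh; the second one saves the payload once it's stale.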

A Bit of Practice

Let's put the theory that we have seen above into practice. I'll set up a two-tiered HTTP cache:

  • The first tier caches resources locally for 10 seconds using Cache-Control
  • The second tier uses ETag to avoid transferring unchanged data over the network

I'll use Apache APISIX. APISIX stands on the shoulders of giants, namely NGINX. NGINX adds ETag response headers by default.

We only need to add the Cache-Control response header. We achieve it with the response-rewrite plugin:

YAML
 
upstreams:
  - id: 1
    type: roundrobin
    nodes:
      "content:8080": 1
routes:
  - uri: /*
    upstream_id: 1
    plugins:
      response-rewrite:
        headers:
          set:
            Cache-Control: "max-age=10"


Let's do it without a browser first.

Shell
 
curl -v localhost:9080

Plain Text
 
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 147
Connection: keep-alive
Date: Thu, 24 Nov 2022 08:21:36 GMT
Accept-Ranges: bytes
Last-Modified: Wed, 23 Nov 2022 13:58:55 GMT
ETag: "637e271f-93"
Server: APISIX/3.0.0
Cache-Control: max-age=10


To prevent the server from sending the same resource, we can use the ETag value in an If-None-Match request header:

Shell
 
curl -H 'If-None-Match: "637e271f-93"' -v localhost:9080


The result is a 304 Not Modified as expected:

Plain Text
 
HTTP/1.1 304 Not Modified
Content-Type: text/html; charset=utf-8
Content-Length: 147
Connection: keep-alive
Date: Thu, 24 Nov 2022 08:26:17 GMT
Accept-Ranges: bytes
Last-Modified: Wed, 23 Nov 2022 13:58:55 GMT
ETag: "637e271f-93"
Server: APISIX/3.0.0
Cache-Control: max-age=10


Now, we can do the same inside a browser. If we use the resend feature a second time before 10 seconds have passed, the browser returns the resource from the cache without sending the request to the server.

Conclusion

In this post, I described several alternatives to cache web resources: Expires and Cache-Control, Last-Modified and ETag, and the Cache API and service workers.

You can easily set the HTTP response headers via a reverse proxy or an API Gateway. With Apache APISIX, ETags are enabled by default, and other headers are easily set up.

In the next post, I will describe caching server-side.

You can find the source code for this post on GitHub.

To go further:

  • RFC 7234: HTTP/1.1: Caching (obsolete)
  • RFC 9111: HTTP Caching
  • HTTP caching
  • Cache-Control
  • Cache API

Published at DZone with permission of Nicolas Fränkel, DZone MVB. See the original article here.
