Frontcache: Cache for Dynamic Pages
Learn how to speed up your site for users by putting a cache in front of it that differentiates between real users and bots.
Many applications use a Content Delivery Network (CDN) for static content (images, CSS, JS, etc.). It makes the site faster and significantly reduces the load on your servers.
A similar approach can work for dynamic pages as well: the static parts of a page are cached, so only the dynamic parts hit the servers. This can speed up your website considerably and reduce backend load. This is what Frontcache is designed for.
For example, on a ‘Product details’ page only the cart and recommendation parts are dynamic, so the rest of the page can be cached.
Frontcache differentiates between HTTP requests from users and those from bots (Googlebot, Bingbot, Baiduspider, etc.). So the same ‘Product details’ page can be served entirely from the cache for bots while still including dynamic data for customers.
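To make the user-versus-bot distinction concrete, here is a minimal Python sketch of classifying a request by its User-Agent header. It is not Frontcache's implementation: the pattern list and the function names `client_type` and `resolve_dynamic_parts` are assumptions made up for illustration.

```python
import re

# Hypothetical bot signatures for this sketch; a real deployment would keep its own list.
BOT_PATTERN = re.compile(r"googlebot|bingbot|baiduspider", re.IGNORECASE)


def client_type(user_agent):
    """Return "bot" for known crawler User-Agent strings, otherwise "browser"."""
    return "bot" if BOT_PATTERN.search(user_agent or "") else "browser"


def resolve_dynamic_parts(user_agent):
    """Bots get the fully cached page; real users also get the dynamic fragments."""
    return client_type(user_agent) == "browser"
```

User-Agent strings are easy to spoof, so a check like this is only suitable for deciding what to cache, not for anything security related.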
And it’s really easy to integrate, regardless of the programming language you use (Java, PHP, .NET, etc.). It can be used as a standalone edge in a corporate network, as a servlet filter (for Java-based websites), or as a distributed cluster of edges similar to a CDN.
Some Use Cases
Scale legacy systems for larger traffic loads (including load from bots). Legacy systems are very sensitive to the amount of concurrent traffic. A slightly higher traffic rate during peak hours can slow the whole system down considerably, or even take it out of service. With Frontcache, legacy systems can handle about a dozen times more traffic with minimal code changes.
Cache ‘heavy’ backend requests (e.g. report generation). For example, the system generates documents in PDF format, and each request takes a couple of seconds to complete. Impatient operators who want the page faster refresh it a few more times, which slows the whole system down considerably, or can even take it out of service. Frontcache is designed to shield such sensitive components with fault tolerance tuning and short-term caching.
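The short-term caching idea from the second use case can be sketched in a few lines of Python. This is an illustration under assumptions, not Frontcache code: the class `ShortTtlCache`, its `get_or_compute` method, and the injectable `clock` parameter are all hypothetical names.

```python
import time


class ShortTtlCache:
    """Caches results for a few seconds so repeated refreshes do not re-run heavy work."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: the backend is not touched
        value = compute()        # missing or expired: run the heavy request once
        self._entries[key] = (now, value)
        return value
```

With a TTL of a few seconds, an operator who refreshes the report page several times in a row triggers only one PDF generation. The sketch is not thread-safe: concurrent requests arriving during a miss would each run `compute`, so a production version would need locking or request coalescing.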
How it Works
General request processing overview:
1. The request is looked up in the cache. On a cache miss, it is forwarded to the origin.
2. The response is checked for SSI (server-side includes). If the response contains includes, they are resolved from the cache or the origin.
3. The completed response is sent back to the client.
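The processing steps above can be sketched as a small Python function. This is a simplified illustration, not Frontcache's code: the `<include url="..."/>` tag is a placeholder syntax invented for this sketch (it is not Frontcache's actual markup), and `handle`, `origin`, and `is_cacheable` are hypothetical names.

```python
import re

# Placeholder include tag for this sketch only; not Frontcache's real markup.
INCLUDE_TAG = re.compile(r'<include url="([^"]+)"\s*/>')


def handle(url, cache, origin, is_cacheable):
    """Serve `url` from the cache or the origin, then resolve its includes."""
    # Step 1: look in the cache; on a miss, hit the origin.
    body = cache.get(url)
    if body is None:
        body = origin(url)
        if is_cacheable(url):
            cache[url] = body  # store the unresolved template, not the final page

    # Step 2: resolve each include through the same cache-or-origin path.
    def resolve(match):
        return handle(match.group(1), cache, origin, is_cacheable)

    # Step 3: the completed response goes back to the client.
    return INCLUDE_TAG.sub(resolve, body)
```

Because the cached entry is the page template with its include tags still in place, a non-cacheable fragment such as the cart is fetched from the origin on every request, while the surrounding page is not. The sketch does not guard against include cycles, which a real implementation would have to handle.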
Pages for the example above have the following markup:
www.coinshome.net uses Frontcache and posts real-time statistics online.
It's fun to create load and check how it's handled in real time:
1. Download the sitemap from https://www.coinshome.net/export/sitemap.txt
2. Create a crawler - as an example, I've included the following bash script (crawler.sh):
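#!/bin/bash
# Usage: ./crawler.sh sitemap.txt
# Fetches every URL listed in the sitemap file, identifying itself as Googlebot.
while IFS='' read -r line || [[ -n "$line" ]]; do
    echo "crawling $line"
    curl -H "Accept: text/html" \
         -H "Accept-Encoding: gzip, deflate" \
         -H "User-Agent: Googlebot" \
         -o output.log "$line"
done < "$1"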
3. Run the crawler.
4. Check how it works online.
- Pages have a 'recent updates' section with frequently changed content.
- The 'logs to headers' feature is enabled, so response headers carry a trace of how the page was assembled.
Cheat Sheet With Features/Key Points
Page fragment cache.
User agent specific caching (user vs bot).
Written in Java, but works with sites built in any language (Java, PHP, .NET, etc.).
Fault tolerance management (based on Netflix Hystrix).
Advanced error handling with fallback configurations for URL patterns.
Advanced web-based console for configs and real-time monitoring.
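As a rough illustration of the "fallback configurations for URL patterns" item, the Python sketch below returns a pre-configured body when the origin fails. It covers only the fallback lookup, not the timeouts or circuit breaking that Hystrix provides, and the `FALLBACKS` table and `fetch_with_fallback` function are hypothetical names, not part of Frontcache.

```python
import re

# Hypothetical fallback table: (URL pattern, body served when the origin fails).
FALLBACKS = [
    (re.compile(r"^/recommendations/"), ""),
    (re.compile(r"^/cart"), "<p>Cart is temporarily unavailable.</p>"),
]


def fetch_with_fallback(url, origin):
    """Call the origin; if it raises, serve the first fallback whose pattern matches."""
    try:
        return origin(url)
    except Exception:
        for pattern, body in FALLBACKS:
            if pattern.search(url):
                return body
        raise  # no fallback configured for this URL: propagate the failure
```

An empty fallback for the recommendations fragment means the page still renders without that block when the recommendation backend is down.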
Published at DZone with permission of Serhiy Pavlikovskiy.