Develop a Reverse Proxy With Caching in Go
Learn to build a caching reverse proxy in Go with the standard library, featuring HTTP forwarding, in-memory caching with TTL, and compression handling.
Reverse proxies act as a crucial intermediary layer in modern web infrastructure, sitting between clients and servers and offering additional functionality such as load balancing, SSL termination, and caching. In this article, we are going to construct a reverse proxy with HTTP response caching using Go's standard library.
The Basic Structure
As a first step, we will declare our core data structures. We need:
- A cache to store responses
- A proxy server to forward requests
- Logic to determine what and when to cache
Here's our starting point:
package main
import (
"bytes"
"crypto/md5"
"encoding/hex"
"flag"
"fmt"
"io"
"log"
"net/http"
"net/http/httputil"
"net/url"
"sync"
"time"
)
// CacheEntry represents a cached HTTP response
type CacheEntry struct {
Response []byte
ContentType string
ContentEncoding string
StatusCode int
Timestamp time.Time
Expiry time.Time
}
// Cache is a simple in-memory cache for HTTP responses
type Cache struct {
entries map[string]CacheEntry
mutex sync.RWMutex
}
// NewCache creates a new cache
func NewCache() *Cache {
return &Cache{
entries: make(map[string]CacheEntry),
}
}
// Get retrieves a cached response
func (c *Cache) Get(key string) (CacheEntry, bool) {
c.mutex.RLock()
defer c.mutex.RUnlock()
entry, found := c.entries[key]
if !found {
return CacheEntry{}, false
}
// Check if entry has expired
if time.Now().After(entry.Expiry) {
return CacheEntry{}, false
}
return entry, true
}
// Set adds a response to the cache
func (c *Cache) Set(key string, entry CacheEntry) {
c.mutex.Lock()
defer c.mutex.Unlock()
c.entries[key] = entry
}
// ReverseProxy represents our reverse proxy with caching capabilities
type ReverseProxy struct {
target *url.URL
proxy *httputil.ReverseProxy
cache *Cache
cacheTTL time.Duration
cacheableStatus map[int]bool
}
Our core structures include:
- CacheEntry – the struct that holds a cached HTTP response body and its metadata
- Cache – a simple, thread-safe in-memory cache with Get and Set methods
- ReverseProxy – our primary struct, which pairs the underlying reverse proxy with the caching functionality
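Before wiring the cache into the proxy, here is a quick illustration of how the Cache type behaves on its own; the function name, key, body, and TTL values below are arbitrary and chosen only for demonstration:
func exampleCacheUsage() {
	cache := NewCache()
	// Store a small response that expires in 30 seconds
	cache.Set("example-key", CacheEntry{
		Response:    []byte("hello, world"),
		ContentType: "text/plain",
		StatusCode:  http.StatusOK,
		Timestamp:   time.Now(),
		Expiry:      time.Now().Add(30 * time.Second),
	})
	// Within the TTL the entry is returned; once Expiry passes, Get reports a miss
	if entry, ok := cache.Get("example-key"); ok {
		fmt.Printf("cached %d bytes with status %d\n", len(entry.Response), entry.StatusCode)
	}
}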
Creating a Cache Key
To cache responses, we need a way to uniquely identify each request. Let's define a function that builds a cache key from the request method, URL, and the headers most likely to affect the response:
// generateCacheKey creates a unique key for a request
func generateCacheKey(r *http.Request) string {
// Start with method and URL
key := r.Method + r.URL.String()
// Add relevant headers that might affect response content
// For example, Accept-Encoding, Accept-Language
if acceptEncoding := r.Header.Get("Accept-Encoding"); acceptEncoding != "" {
key += "Accept-Encoding:" + acceptEncoding
}
if acceptLanguage := r.Header.Get("Accept-Language"); acceptLanguage != "" {
key += "Accept-Language:" + acceptLanguage
}
// Create MD5 hash of the key
hasher := md5.New()
hasher.Write([]byte(key))
return hex.EncodeToString(hasher.Sum(nil))
}
We hash the request's key attributes with MD5 to produce a compact, fixed-length identifier for each cacheable request. The hash is only a cache fingerprint, not a security mechanism, so MD5 is acceptable here.
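To see the key generation in action, you can feed the function a synthetic request. The snippet below is purely illustrative and assumes "net/http/httptest" has been added to the imports:
func exampleCacheKey() {
	req := httptest.NewRequest(http.MethodGet, "http://example.com/articles?page=2", nil)
	req.Header.Set("Accept-Encoding", "gzip")
	req.Header.Set("Accept-Language", "en-US")
	// Prints a 32-character hex digest; the exact value depends on the inputs above
	fmt.Println(generateCacheKey(req))
}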
Building the Proxy Handler
Now let's implement the HTTP handler for our reverse proxy:
// NewReverseProxy creates a new reverse proxy with caching
func NewReverseProxy(targetURL string, cacheTTL time.Duration) (*ReverseProxy, error) {
url, err := url.Parse(targetURL)
if err != nil {
return nil, err
}
proxy := httputil.NewSingleHostReverseProxy(url)
// Initialize cacheable status codes (200, 301, 302, etc.)
cacheableStatus := map[int]bool{
http.StatusOK: true,
http.StatusMovedPermanently: true,
http.StatusFound: true,
http.StatusNotModified: true,
}
return &ReverseProxy{
target: url,
proxy: proxy,
cache: NewCache(),
cacheTTL: cacheTTL,
cacheableStatus: cacheableStatus,
}, nil
}
// ServeHTTP handles HTTP requests
func (p *ReverseProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// Only cache GET and HEAD requests
if r.Method != "GET" && r.Method != "HEAD" {
p.proxy.ServeHTTP(w, r)
return
}
// Generate cache key
key := generateCacheKey(r)
// Check if response is in cache
if entry, found := p.cache.Get(key); found {
// Serve from cache
log.Printf("Cache hit: %s %s", r.Method, r.URL.Path)
// Set response headers
w.Header().Set("Content-Type", entry.ContentType)
w.Header().Set("X-Cache", "HIT")
w.Header().Set("X-Cache-Age", time.Since(entry.Timestamp).String())
// Set status code and write response
w.WriteHeader(entry.StatusCode)
w.Write(entry.Response)
return
}
log.Printf("Cache miss: %s %s", r.Method, r.URL.Path)
// Create a custom response writer to capture the response
responseBuffer := &bytes.Buffer{}
responseWriter := &ResponseCapturer{
ResponseWriter: w,
Buffer: responseBuffer,
}
// Mark the response as a miss up front: headers must be set before the
// proxied response is written, so setting X-Cache afterwards would be ignored
w.Header().Set("X-Cache", "MISS")
// Serve the request with our capturing writer
p.proxy.ServeHTTP(responseWriter, r)
// If status is cacheable, store in cache
if p.cacheableStatus[responseWriter.StatusCode] {
p.cache.Set(key, CacheEntry{
Response: responseBuffer.Bytes(),
ContentType: responseWriter.Header().Get("Content-Type"),
ContentEncoding: responseWriter.Header().Get("Content-Encoding"),
StatusCode: responseWriter.StatusCode,
Timestamp: time.Now(),
Expiry: time.Now().Add(p.cacheTTL),
})
}
}
// ResponseCapturer captures response data for caching
type ResponseCapturer struct {
http.ResponseWriter
Buffer *bytes.Buffer
StatusCode int
}
// WriteHeader captures status code before writing it
func (r *ResponseCapturer) WriteHeader(statusCode int) {
r.StatusCode = statusCode
r.ResponseWriter.WriteHeader(statusCode)
}
// Write captures response data before writing it
func (r *ResponseCapturer) Write(b []byte) (int, error) {
// Write to both the original writer and our buffer
r.Buffer.Write(b)
return r.ResponseWriter.Write(b)
}
Here's what's happening:
- We create a new reverse proxy directed at a specific target URL.
- The ServeHTTP method handles incoming requests:
  - For non-GET/HEAD requests, it simply forwards them.
  - For GET/HEAD requests, it checks the cache first.
  - If the response is cached, it serves directly from the cache.
  - If not, it forwards the request and captures the response.
  - If the response is cacheable, it stores it in the cache.
- The ResponseCapturer is a custom http.ResponseWriter that records the response while passing it through (one subtlety about it is noted after this list).
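The subtlety: httputil.ReverseProxy always calls WriteHeader explicitly, so the capturer works as written here. If you ever reuse it behind a handler that writes a body without setting a status, StatusCode would keep its zero value and the response would never be treated as cacheable. A defensive variant of Write (this default is an addition beyond the original code) covers that case:
// Write captures response data before writing it; if WriteHeader was never
// called, net/http implies 200 OK, so record that before the first write.
func (r *ResponseCapturer) Write(b []byte) (int, error) {
	if r.StatusCode == 0 {
		r.StatusCode = http.StatusOK
	}
	r.Buffer.Write(b)
	return r.ResponseWriter.Write(b)
}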
Putting It All Together
Finally, let's implement the main function to start our proxy:
func main() {
// Parse command line flags
port := flag.Int("port", 8080, "Port to serve on")
target := flag.String("target", "http://example.com", "Target URL to proxy")
cacheTTL := flag.Duration("cache-ttl", 5*time.Minute, "Cache TTL (e.g., 5m, 1h)")
flag.Parse()
// Create the reverse proxy
proxy, err := NewReverseProxy(*target, *cacheTTL)
if err != nil {
log.Fatal(err)
}
// Start server
server := http.Server{
Addr: fmt.Sprintf(":%d", *port),
Handler: proxy,
}
log.Printf("Reverse proxy started at :%d -> %s", *port, *target)
log.Printf("Cache TTL: %s", *cacheTTL)
if err := server.ListenAndServe(); err != nil {
log.Fatal(err)
}
}
In our main function, we:
- Parse command line flags to configure the port, target URL, and cache TTL.
- Create a new reverse proxy with caching.
- Start an HTTP server with our proxy as the handler.
Cache Management and Enhancements
Our current implementation is a good start, but a production-ready proxy needs more capabilities. Let's add cache management and some enhancements:
// Add to ReverseProxy struct
type ReverseProxy struct {
// ... existing fields
maxCacheSize int64
currentCacheSize int64
}
// Add to NewReverseProxy function
func NewReverseProxy(targetURL string, cacheTTL time.Duration, maxCacheSize int64) (*ReverseProxy, error) {
// ... existing code
return &ReverseProxy{
// ... existing fields
maxCacheSize: maxCacheSize,
currentCacheSize: 0,
}, nil
}
// Modify the Set method in Cache (the eviction logic below uses sort.Slice, so "sort" must be added to the import block)
func (c *Cache) Set(key string, entry CacheEntry, proxy *ReverseProxy) bool {
c.mutex.Lock()
defer c.mutex.Unlock()
// Check if we would exceed the max cache size
newSize := proxy.currentCacheSize + int64(len(entry.Response))
if proxy.maxCacheSize > 0 && newSize > proxy.maxCacheSize {
// Simple eviction policy: remove oldest entries
var keysToRemove []string
var sizeToFree int64
// Calculate how much we need to free up
sizeToFree = newSize - proxy.maxCacheSize + 1024*1024 // Free an extra MB for headroom
// Find oldest entries to remove
var entries []struct {
key string
timestamp time.Time
size int64
}
for k, v := range c.entries {
entries = append(entries, struct {
key string
timestamp time.Time
size int64
}{k, v.Timestamp, int64(len(v.Response))})
}
// Sort by timestamp (oldest first)
sort.Slice(entries, func(i, j int) bool {
return entries[i].timestamp.Before(entries[j].timestamp)
})
// Remove oldest entries until we have enough space
var freedSize int64
for _, entry := range entries {
if freedSize >= sizeToFree {
break
}
keysToRemove = append(keysToRemove, entry.key)
freedSize += entry.size
}
// Remove entries
for _, k := range keysToRemove {
oldSize := int64(len(c.entries[k].Response))
delete(c.entries, k)
proxy.currentCacheSize -= oldSize
}
log.Printf("Cache eviction: removed %d entries, freed %d bytes", len(keysToRemove), freedSize)
}
// Add the new entry
c.entries[key] = entry
proxy.currentCacheSize += int64(len(entry.Response))
return true
}
// Add a method to the Cache for cleaning expired entries
func (c *Cache) CleanExpired(proxy *ReverseProxy) {
c.mutex.Lock()
defer c.mutex.Unlock()
now := time.Now()
var freedSize int64
count := 0
for k, v := range c.entries {
if now.After(v.Expiry) {
freedSize += int64(len(v.Response))
delete(c.entries, k)
count++
}
}
proxy.currentCacheSize -= freedSize
if count > 0 {
log.Printf("Cache cleanup: removed %d expired entries, freed %d bytes", count, freedSize)
}
}
// Add a periodic cleanup in main
func main() {
// ... existing code
// Start a goroutine for periodic cache cleanup
go func() {
ticker := time.NewTicker(1 * time.Minute)
for {
select {
case <-ticker.C:
proxy.cache.CleanExpired(proxy)
}
}
}()
// ... server startup
}
These enhancements add:
- A maximum cache size limit
- A simple cache eviction policy (remove oldest entries first)
- Periodic cleanup of expired cache entries
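One detail the snippets above leave implicit: because Set's signature changed, the call in ServeHTTP has to be updated to match. A minimal sketch of that call site:
// Inside ServeHTTP, after the response has been captured:
if p.cacheableStatus[responseWriter.StatusCode] {
	p.cache.Set(key, CacheEntry{
		Response:        responseBuffer.Bytes(),
		ContentType:     responseWriter.Header().Get("Content-Type"),
		ContentEncoding: responseWriter.Header().Get("Content-Encoding"),
		StatusCode:      responseWriter.StatusCode,
		Timestamp:       time.Now(),
		Expiry:          time.Now().Add(p.cacheTTL),
	}, p) // pass the proxy so Set can track and enforce the size limit
}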
Testing the Proxy
You can test your proxy with the following commands:
# Start the proxy targeting a public website
go build -o caching-proxy main.go
./caching-proxy -target https://news.ycombinator.com -port 8080 -cache-ttl 1m
# Make requests to the proxy
curl -v http://localhost:8080/
# Make the same request again to see cache headers
curl -v http://localhost:8080/
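On the second request, the response headers should show that the entry was served from memory. Abridged curl output looks roughly like this (the exact values depend on the target site and timing):
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=utf-8
< X-Cache: HIT
< X-Cache-Age: 8.34251s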
While this is a good starting point, a production-ready proxy would need additional features like:
- More sophisticated cache control based on HTTP headers (Cache-Control, ETag, etc.); a small sketch of this follows below
- Memory usage monitoring
- TLS support
- More advanced cache eviction strategies
- Request coalescing for simultaneous identical requests
- Persistent cache storage
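As a first step toward the header-driven cache control mentioned above, the proxy could decline to store responses that opt out of shared caching. The sketch below checks only a couple of Cache-Control directives (it is far from a complete parser) and assumes "strings" is added to the imports:
// shouldCache reports whether a captured response may be stored, based on a
// minimal reading of its Cache-Control header.
func shouldCache(h http.Header) bool {
	cc := strings.ToLower(h.Get("Cache-Control"))
	if strings.Contains(cc, "no-store") || strings.Contains(cc, "private") {
		return false
	}
	return true
}
In ServeHTTP, the condition for storing an entry would then become something like if p.cacheableStatus[responseWriter.StatusCode] && shouldCache(responseWriter.Header()).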
Go's standard library provides powerful networking capabilities that make building such tools relatively straightforward. The net/http package in particular offers a great foundation for HTTP-based network applications like proxies, load balancers, and API gateways.
Source Code
You can find the source code here.