DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Using Python Libraries in Java
  • Scaling DevOps With NGINX Caching: Reducing Latency and Backend Load
  • Start Coding With Google Cloud Workstations
  • Beyond ChatGPT, AI Reasoning 2.0: Engineering AI Models With Human-Like Reasoning

Trending

  • Immutable Secrets Management: A Zero-Trust Approach to Sensitive Data in Containers
  • MySQL to PostgreSQL Database Migration: A Practical Case Study
  • Agentic AI for Automated Application Security and Vulnerability Management
  • Medallion Architecture: Why You Need It and How To Implement It With ClickHouse
  1. DZone
  2. Data Engineering
  3. Data
  4. Python: Intro to Caching

Python: Intro to Caching

In this article, we’ll look at a simple example that uses a dictionary for our cache. Then, we’ll move on to using the Python standard library’s functools module to create a cache. Read on for more info.

By 
Mike Driscoll user avatar
Mike Driscoll
·
Mar. 01, 16 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
12.3K Views

Join the DZone community and get the full member experience.

Join For Free

A cache is a way to store a limited amount of data such that future requests for said data can be retrieved faster. In this article, we’ll look at a simple example that uses a dictionary for our cache. Then, we’ll move on to using the Python standard library’s functools module to create a cache. Let’s start by creating a class that will construct our cache dictionary and then we’ll extend it as necessary. Here’s the code:

########################################################################
class MyCache:
    """"""

    #----------------------------------------------------------------------
    def __init__(self):
        """Constructor"""
        self.cache = {}
        self.max_cache_size = 10

There’s nothing in particular that’s special in this class example. We just create a simple class and set up two class variables or properties, cache and max_cache_size. The cache is just an empty dictionary while the other is self-explanatory. Let’s flesh this code out and make it actually do something:

import datetime
import random

########################################################################
class MyCache:
    """"""

    #----------------------------------------------------------------------
    def __init__(self):
        """Constructor"""
        self.cache = {}
        self.max_cache_size = 10

    #----------------------------------------------------------------------
    def __contains__(self, key):
        """
        Returns True or False depending on whether or not the key is in the 
        cache
        """
        return key in self.cache

    #----------------------------------------------------------------------
    def update(self, key, value):
        """
        Update the cache dictionary and optionally remove the oldest item
        """
        if key not in self.cache and len(self.cache) >= self.max_cache_size:
            self.remove_oldest()

        self.cache[key] = {'date_accessed': datetime.datetime.now(),
                           'value': value}

    #----------------------------------------------------------------------
    def remove_oldest(self):
        """
        Remove the entry that has the oldest accessed date
        """
        oldest_entry = None
        for key in self.cache:
            if oldest_entry == None:
                oldest_entry = key
            elif self.cache[key]['date_accessed'] < self.cache[oldest_entry][
                'date_accessed']:
                oldest_entry = key
        self.cache.pop(oldest_entry)

    #----------------------------------------------------------------------
    @property
    def size(self):
        """
        Return the size of the cache
        """
        return len(self.cache)

Here we import the datetime and random modules, and then we see the class we created earlier. This time around, we add a few methods. One of the methods is a magic method called __contains__. I’m abusing it a little, but the basic idea is that it will allow us to check the class instance to see if it contains the key we’re looking for. The update method will update our cache dictionary with the new key / value pair. It will also remove the oldest entry if the maximum cache value is reached or exceeded. The remove_oldest method actually does the removing of the oldest entry in the dictionary, which in this case means the item that has the oldest access date. Finally, we have a property called size which returns the size of our cache.

If you add the following code, we can test that the cache works as expected:

if __name__ == '__main__':
    # Test the cache
    keys = ['test', 'red', 'fox', 'fence', 'junk',
            'other', 'alpha', 'bravo', 'cal', 'devo',
            'ele']
    s = 'abcdefghijklmnop'
    cache = MyCache()
    for i, key in enumerate(keys):
        if key in cache:
            continue
        else:
            value = ''.join([random.choice(s) for i in range(20)])
            cache.update(key, value)
        print("#%s iterations, #%s cached entries" % (i+1, cache.size))
    print

In this example, we set up a bunch of predefined keys and loop over them. We add keys to the cache if they don’t already exist. The piece missing is a way to update the date accessed, but I’ll leave that as an exercise for the reader. If you run this code, you’ll notice that when the cache fills up, it starts deleting the old entries appropriately.

Now, let’s move on and take a look at another way of creating caches using Python’s built-in functools module!

Using functools.lru_cache

The functools module provides a handy decorator called lru_cache. Note that it was added in 3.2. According to the documentation, it will "wrap a function with a memorizing callable that saves up to the maxsize most recent calls." Let’s write a quick function based on the example from the documentation that will grab various web pages. In this case, we’ll be grabbing pages from the Python documentation site.

import urllib.error
import urllib.request

from functools import lru_cache


@lru_cache(maxsize=24)
def get_webpage(module):
    """
    Gets the specified Python module web page
    """    
    webpage = "https://docs.python.org/3/library/{}.html".format(module)
    try:
        with urllib.request.urlopen(webpage) as request:
            return request.read()
    except urllib.error.HTTPError:
        return None

if __name__ == '__main__':
    modules = ['functools', 'collections', 'os', 'sys']
    for module in modules:
        page = get_webpage(module)
        if page:
            print("{} module page found".format(module))

In the code above, we decorate our get_webpage function with lru_cache and set its max size to 24 calls. Then we set up a webpage string variable and pass in which module we want our function to fetch. I found that this works best if you run it in a Python interpreter, such as IDLE. This allows you to run the loop a couple of times against the function. What you will quickly see is that the first time it runs the code, the output is printed our relatively slowly. But if you run it again in the same session, you will see that it prints immediately which demonstrates that the lru_cache has cached the calls correctly. Give this a try in your own interpreter instance to see the results for yourself.

There is also a typed parameter that we can pass to the decorator. It is a Boolean that tells the decorator to cache arguments of different types separately if typed is set to True.

Wrapping Up

Now you know a little about writing your own caches with Python. It’s a fun tool and very useful if you’re making a lot of expensive I/O calls or if you want to cache something like login credentials. Have fun!

Cache (computing) Python (language)

Published at DZone with permission of Mike Driscoll, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Using Python Libraries in Java
  • Scaling DevOps With NGINX Caching: Reducing Latency and Backend Load
  • Start Coding With Google Cloud Workstations
  • Beyond ChatGPT, AI Reasoning 2.0: Engineering AI Models With Human-Like Reasoning

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!