Python Collections Abstract Base Classes

Data structures and their setup are important in any language. Have a look as we explore Python's Collections Abstract Base Classes in depth.

I love thinking about data structures, and how to organize them most efficiently for a specific task. In the normal course of programming in Python, we don't have to think about it very much - the choice between list and dict is obvious, and that's usually as far as things go.

When things get more complex, though, the Collections Abstract Base Classes (found in the collections.abc module) can be extremely useful. In my experience, they aren't widely known, so in this post I'll show a couple of interesting uses for them.

List-Based Set

Using a set requires that the items held within are all hashable (that is, they implement the __hash__ method).

This isn't always the case, though. For example, Django models that don't have a PK yet are unhashable, as are dicts. In these situations, it can be useful to have a data structure which acts like a set, but which is backed by a list to sidestep that requirement. Performance will be worse, but in some cases this is acceptable.
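The difference is easy to check in a few lines. The following is only a small sketch (the variable names are made up for illustration): building a built-in set from dicts fails, while a list-based membership test only needs equality.

```python
# Dicts define __eq__ but not __hash__, so they cannot go into a built-in set.
items = [{'id': 1}, {'id': 2}, {'id': 1}]

try:
    set(items)
    hashable = True
except TypeError:
    # Raised because dict is an unhashable type.
    hashable = False

# A list only needs equality, at the cost of a linear scan per lookup.
unique = []
for item in items:
    if item not in unique:
        unique.append(item)
```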

>>> s = ListBasedSet([
...     {
...         'id': 1,
...     },
...     {
...         'id': 2,
...     },
... ])
>>> len(s)
2

This can be easily achieved using the MutableSet Abstract Base Class:

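from collections.abc import MutableSet

class ListBasedSet(MutableSet):
    """A set-like container backed by a list, so items need not be hashable."""

    def __init__(self, items=()):
        self.store = []
        for item in items:
            # Go through add() so duplicates are dropped on construction.
            self.add(item)

    def __contains__(self, item):
        return item in self.store

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def add(self, item):
        # Linear scan instead of a hash lookup.
        if item not in self.store:
            self.store.append(item)

    def discard(self, item):
        # Like set.discard, a missing item is silently ignored.
        try:
            self.store.remove(item)
        except ValueError:
            pass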

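This exposes most of the built-in set API: MutableSet supplies clear, pop, remove, and the set operators (|, &, -, ^) as mixin methods built on the five methods above. Named methods such as union() and intersection() are not included.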

>>> s.add({
...     'id': 3,
... })
>>> len(s)
3
>>> s.clear()
>>> len(s)
0
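The mixin methods that MutableSet derives from those five methods go beyond add and clear. Here is a minimal sketch of the inherited operators, repeating a compact ListBasedSet so the block runs on its own. It assumes the constructor drops duplicates, which matters because the operators build their results by passing an iterable back to the constructor.

```python
from collections.abc import MutableSet


class ListBasedSet(MutableSet):
    """Compact copy of the list-backed set, repeated so this block runs alone."""

    def __init__(self, items=()):
        self.store = []
        for item in items:
            self.add(item)  # assumption: de-duplicate on construction

    def __contains__(self, item):
        return item in self.store

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def add(self, item):
        if item not in self.store:
            self.store.append(item)

    def discard(self, item):
        try:
            self.store.remove(item)
        except ValueError:
            pass


a = ListBasedSet([{'id': 1}, {'id': 2}])
b = ListBasedSet([{'id': 2}, {'id': 3}])

union = a | b        # inherited __or__
common = a & b       # inherited __and__
only_a = a - b       # inherited __sub__

a.remove({'id': 1})  # inherited; raises KeyError if the item is missing
```

Each operator returns a new ListBasedSet rather than a built-in set, so the results can themselves hold unhashable items.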

Lazy-Loading and Pagination

If you have an API that paginates results, but you'd like to expose it as a simple list that can be iterated over, the Collections Abstract Base Classes are a good way to do that.

As an example, APIs often return a response with a list of objects and the total number of objects available:

{
    "objects": [
        {
            "id": 1
        },
        {
            "id": 2
        }
    ],
    "total": 2
}

In such a case, a class like the following could be used to load the data lazily, when an item in the list is accessed:

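from collections.abc import Sequence

import requests

class LazyLoadedList(Sequence):

    def __init__(self, url):
        self.url = url
        self.page = 0
        self.num_items = 0
        self.store = []

    def load_data(self):
        # Fetch the current page (assumes a JSON response shaped like the
        # example above) and append its objects to the local store.
        data = requests.get(self.url, params={
            'page': self.page,
        }).json()
        self.num_items = data['total']
        objects = data.get('objects', [])
        self.store += objects
        return len(objects)

    def __getitem__(self, index):
        # Load pages until the index is covered or a page comes back empty.
        while index >= len(self.store):
            self.page += 1
            if not self.load_data():
                break
        # Raises IndexError past the last item, which ends iteration.
        return self.store[index]

    def __len__(self):
        # Total reported by the API; 0 until the first page has been loaded.
        return self.num_items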

With this implementation, you can simply iterate over the list as normal and have the paginated data loaded automatically:

>>> l = LazyLoadedList('http://api.example.com/items')
>>> for item in l:
...     process_item(item)
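To watch the lazy loading happen without a real API, here is a rough sketch that swaps the HTTP call for a hypothetical in-memory fetch_page function. The names LazyList, fetch_page, PAGES and calls are all made up for this illustration; the paging logic follows the same idea as the class above.

```python
from collections.abc import Sequence

# Hypothetical stand-in for the remote API: pages of two, two and one objects.
PAGES = {
    1: [{'id': 1}, {'id': 2}],
    2: [{'id': 3}, {'id': 4}],
    3: [{'id': 5}],
}
calls = []  # records which pages were requested, in order


def fetch_page(page):
    """Pretend HTTP call; returns the same shape as the JSON response."""
    calls.append(page)
    return {'objects': PAGES.get(page, []), 'total': 5}


class LazyList(Sequence):
    def __init__(self, fetch):
        self.fetch = fetch
        self.page = 0
        self.num_items = 0
        self.store = []

    def load_data(self):
        data = self.fetch(self.page)
        self.num_items = data['total']
        objects = data.get('objects', [])
        self.store += objects
        return len(objects)

    def __getitem__(self, index):
        # Keep fetching until the index is covered or a page comes back empty.
        while index >= len(self.store):
            self.page += 1
            if not self.load_data():
                raise IndexError(index)
        return self.store[index]

    def __len__(self):
        return self.num_items


items = LazyList(fetch_page)
first = items[0]                    # triggers a single request, for page 1
calls_after_first = list(calls)
ids = [obj['id'] for obj in items]  # iteration pulls the remaining pages
```

Accessing the first item requests only page 1, and iterating requests the rest on demand. The sketch makes one extra request at the end, because it only learns that it has run out of data when a page comes back empty.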

At Zapier, we use something very similar to this to wrap Elasticsearch responses.

I hope these examples show some of the things that can be achieved with Python's Collections Abstract Base Classes!

Published at DZone with permission of Rob Golding, DZone MVB.
