Async Support in Django

This article is about Django, the web framework for perfectionists with deadlines, and its lack of async support. It reads like an informal enhancement proposal or RFC.

Alexey Shepelev

Sep. 20, 21 · Analysis

Hello, my dear readers! Yes, this article is about the web framework for perfectionists with deadlines, as well as Django’s lack of async support. It’s more like an Enhancement Proposal (less formal than it could be) or RFC. So, if you like that sort of thing, you might be interested.

The Django project has also considered the issue of adding async support. Those discussions resulted in DEP-09, which describes the current approximate roadmap. I have even discovered that my post doesn’t contradict it. The DEP just has very little information about async-native support, which it treats as the last stage, still to be reached. This reminds me of the meme about how to draw an owl: first, we draw two circles, and then we draw the rest of the owl.

Anyway, let’s try to make Django asynchronous, or the Django ORM, to be exact. I should explain why: I think the ORM is the main obstacle to adding async support in Django, since it is the largest part of the framework. Just as important, the ORM contains the set of assumptions and features that make Django recognizable in the first place.

So, we’re talking about the ORM. We need asynchronous database drivers. Of course, we already have them. Schematically, the typical ORM code looks like this:

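    def get_query(*query):
        # Build the SQL and its parameters from the query description.
        compiler, sql, params = process_query(*query)
        # Blocking call: this is where the database driver is hit.
        rows = compiler.execute(sql, params)
        return objects_from(rows)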

How do we make it asynchronous? That’s right – we add await before compiler.execute, and add async before def get_query.
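As a minimal sketch of that change, here is the same function with the two keywords added. `FakeCompiler` and the bodies of `process_query` and `objects_from` are hypothetical stand-ins invented for illustration, not Django APIs; the `asyncio.sleep(0)` call only imitates waiting on an asynchronous driver.

```python
import asyncio


class FakeCompiler:
    """Hypothetical stand-in for an ORM SQL compiler backed by an async driver."""

    async def execute(self, sql, params):
        await asyncio.sleep(0)  # pretend to wait on the database socket
        return [(1, "alice"), (2, "bob")]


def process_query(*query):
    # Hypothetical: building SQL is pure CPU work, so it stays synchronous.
    sql = "SELECT id, name FROM users WHERE id IN (%s, %s)"
    return FakeCompiler(), sql, list(query)


def objects_from(rows):
    # Hypothetical: turn raw rows into objects (plain dicts here).
    return [{"id": row_id, "name": name} for row_id, name in rows]


async def get_query(*query):  # `async` added before `def`
    compiler, sql, params = process_query(*query)
    rows = await compiler.execute(sql, params)  # `await` added before the call
    return objects_from(rows)


users = asyncio.run(get_query(1, 2))
```

Only the function that touches the driver and its callers change; the SQL-building and object-building steps stay synchronous.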

By the way, I have to mention one thing that I recently learned about. You can use an asynchronous database driver without converting compiler.execute to asynchronous. To do this, you need the greenlet library. In general, we switch from synchronous code to another greenlet using other_greenlet.switch(), and eventually, we end up in an asynchronous function. This approach to porting code to asynchronous requires less effort. In fact, this method is used by SQLAlchemy (this is how I learned about this feature).

Although it is tempting, I don’t really like this method. First, it is bad for debugging: greenlets contain C code, and switching between them is opaque to Python. Second, it is essentially a hack on top of asyncio. And the ORM is, of course, quite important (a little bit of irony), but not important enough to adjust the runtime. But if you like this option, please vote for it at the end of the article!

Forking

I would recommend another option, which is less radical (or maybe even more radical, depending on how you look at it). Let’s be generous and allocate a separate repository for our asynchronous version (in other words, let’s fork Django). First, let’s convert the driver API (the execute, fetchall, and fetchone functions) to asynchronous, then we’ll try to use it instead of the synchronous driver. We’ll need to mark many other functions as asynchronous and add await to their calls. These will be the main changes required. Otherwise, the Django ORM will remain the same, and in most cases each object will keep its attributes.
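The following sketch shows how that propagation might look. It assumes a hypothetical `AsyncCursor` facade that exposes `execute`, `fetchone`, and `fetchall` as coroutines. To stay self-contained it wraps the standard library's `sqlite3` with `asyncio.to_thread` (Python 3.9+), which is only a stand-in for a real asynchronous driver; `execute_sql` and `get_names` are likewise illustrative names, not Django functions.

```python
import asyncio
import sqlite3


class AsyncCursor:
    """Hypothetical async facade exposing execute/fetchone/fetchall."""

    def __init__(self, connection):
        self._cursor = connection.cursor()

    async def execute(self, sql, params=()):
        # A real async driver would await network I/O here.
        await asyncio.to_thread(self._cursor.execute, sql, params)

    async def fetchone(self):
        return await asyncio.to_thread(self._cursor.fetchone)

    async def fetchall(self):
        return await asyncio.to_thread(self._cursor.fetchall)


# Once the driver calls are coroutines, every caller up the stack
# has to become `async def` and `await` the layer below it.
async def execute_sql(cursor, sql, params=()):
    await cursor.execute(sql, params)
    return await cursor.fetchall()


async def get_names(cursor):
    rows = await execute_sql(cursor, "SELECT name FROM person ORDER BY id")
    return [name for (name,) in rows]


async def main():
    # check_same_thread=False lets worker threads use this connection.
    connection = sqlite3.connect(":memory:", check_same_thread=False)
    connection.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    connection.executemany(
        "INSERT INTO person (name) VALUES (?)", [("Ada",), ("Grace",)]
    )
    cursor = AsyncCursor(connection)
    names = await get_names(cursor)
    await cursor.execute("SELECT COUNT(*) FROM person")
    count = await cursor.fetchone()
    connection.close()
    return names, count


names, count = asyncio.run(main())
```

The bodies of `execute_sql` and `get_names` are the same as their synchronous versions would be, apart from the added `async` and `await` keywords.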

Of course, there will be exceptions. For example, since you need to write await to run a database query, you will probably have to give up on lazy queries that are magically executed when you access an attribute. Maybe that’s for the better: either we use await, and the query is always executed, or we omit it, in which case the object must already be cached (for example, as a result of select_related or prefetch_related). Either way, it is clear from the code what happens.
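One way this rule could look in practice is sketched below. Everything here is an assumption made for illustration: `Book`, `Author`, `NotLoaded`, `fetch_author`, and the in-memory `AUTHORS` table are hypothetical and do not mirror how Django implements related-object descriptors.

```python
import asyncio

AUTHORS = {1: "Ursula K. Le Guin"}  # hypothetical stand-in for a database table


class NotLoaded(Exception):
    """Raised when a relation is read before it was fetched."""


class Author:
    def __init__(self, name):
        self.name = name


class Book:
    def __init__(self, title, author_id):
        self.title = title
        self.author_id = author_id
        self._author_cache = None

    @property
    def author(self):
        # Plain attribute access never touches the database.
        if self._author_cache is None:
            raise NotLoaded("author is not cached; use `await book.fetch_author()`")
        return self._author_cache

    async def fetch_author(self):
        # Explicit await: the query always runs.
        await asyncio.sleep(0)  # stands in for the database round trip
        self._author_cache = Author(AUTHORS[self.author_id])
        return self._author_cache


async def demo():
    book = Book("The Dispossessed", author_id=1)
    try:
        book.author
        lazy_access_failed = False
    except NotLoaded:
        lazy_access_failed = True
    author = await book.fetch_author()
    # After the explicit fetch, the cached attribute is safe to read.
    return lazy_access_failed, author.name, book.author.name


result = asyncio.run(demo())
```

Raising on an uncached access, rather than silently running a query, keeps every database round trip visible as an `await` in the calling code.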

A few details on the implementation: let it be a fork, but it should have a different name since we don’t need a name conflict. It should be possible to use Django and our new package in the same environment without any problems. That is, we’ll need some kind of script (or just PyCharm) that uses static analysis to rename the Django package into something else (for example, teapot).
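A rough sketch of such a renaming script, using the standard library's `ast` module (`ast.unparse` needs Python 3.9+), might look like this. `PackageRenamer` and `rename_package` are hypothetical names. The approach is deliberately naive: it rewrites imports and bare `django` names, drops comments and original formatting, and ignores dotted paths inside strings such as settings entries, so a real tool would need more care.

```python
import ast

OLD, NEW = "django", "teapot"


def _rename(dotted_name):
    # Rename `django` and `django.something`, but not `djangorestframework`.
    if dotted_name == OLD or dotted_name.startswith(OLD + "."):
        return NEW + dotted_name[len(OLD):]
    return dotted_name


class PackageRenamer(ast.NodeTransformer):
    def visit_Import(self, node):
        for alias in node.names:
            alias.name = _rename(alias.name)
        return node

    def visit_ImportFrom(self, node):
        # Only absolute imports (level 0) name the top-level package.
        if node.module and node.level == 0:
            node.module = _rename(node.module)
        return node

    def visit_Name(self, node):
        # Simplification: assumes a bare `django` name is always the package.
        if node.id == OLD:
            node.id = NEW
        return node


def rename_package(source):
    tree = PackageRenamer().visit(ast.parse(source))
    return ast.unparse(tree)


SAMPLE = (
    "import django\n"
    "from django.db import models\n"
    "import djangorestframework\n"
    "django.setup()\n"
)
renamed = rename_package(SAMPLE)
```

Because the fork must merge from upstream regularly, the rename would have to be repeatable, which is one reason to script it rather than do it by hand.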

There is a widespread belief that forking is bad and that a fork cannot be maintained for long. That’s a moot point. First, I would like to note that we have a special case: a new syntax for asynchronous functions. That is, the changes are predictable and transparent. Second, the asynchronous version is completely self-contained, so compatibility with Django is not vital to it. In my opinion, forking is okay. It’s just a powerful tool that requires some responsibility.

Goals for the First Version

The first version should be concerned with porting to asynchronous and nothing else. No new features. We only port the ORM, and we cut off everything we can. That is, for example, we exclude forms and the Django admin panel. Despite Django’s rather big size, I’m sure there will be relatively few changes. The goals to pursue are convenient merging from Django and the reuse of almost all Django tests with almost no changes.

Compatibility

Honestly, I did not intend to do anything specifically for compatibility with the synchronous version. The asynchronous version alone is a great achievement. But considering the above, it will remain compatible with the sync version anyway, at least at the database level: I mean mainly the correspondence between the model fields and the database schema. It would be wrong not to use this compatibility, and just as wrong to let it be broken for insignificant reasons. So, there will be a certain degree of compatibility. For example, Django models can be used in the async version. That is, if the user wants to use the models in both synchronous and asynchronous contexts, they should be declared in models.py (and inherited from django.db.models.Model).

Development Prospects

If the async version of Django is one day the only one, I don’t see any big problem with that. As for Django itself, I don’t have any serious claims or complaints about it. I just would like to have out-of-the-box support for zero-downtime migrations. Modern open-source relational databases allow for this (to perform long-running operations, such as indexing, without locks), which means this should be a default for modern ORMs. Especially since zero-downtime is a mandatory requirement for many services today. However, this topic is beyond the scope of this text.

DEP?

In fact, yes. The thing is that some kind of Django integration is still needed, and it will be. But in general, enhancement proposals are changes to the project itself (Django), and anything that can exist as a third-party package or a fork isn’t a DEP.

But this is a DEP in the sense that it only aims to add asynchronous support in Django, with no other changes. Anyway, there is almost nothing original from the author here: probably everyone knows what should happen and what needs to be done.

Opinions expressed by DZone contributors are their own.