Python Development With Asynchronous SQLite and PostgreSQL
Solve SQL security, database connector, and prepared statement problems when using FastAPI with Asynchronous PostgreSQL and SQLAlchemy.
Join the DZone community and get the full member experience.
Join For FreeAfter years of working from the comfort of Python and Django, I moved to the wild asynchronous world of FastAPI to improve latency in web-based AI applications. I started with FastAPI and built an open-source stack called FastOpp, which adds command-line and web tools similar to Django.
Initially, things went smoothly using SQLite and aiosqlite to add AsyncIO to SQLite. I used SQLAlchemy as my Object Relational Mapper (ORM) and Alembic as the database migration tool. Everything seemed to work easily, so I added a Python script to make things similar to Django’s migrate.py.
As things were going smoothly, I added Pydantic for data validation and connected Pydantic to the SQLAlchemy models with SQLModel. Although I was pulling in open source packages that I wasn’t that familiar with, the packages were popular, and I didn’t have problems during initial use.
Django comes with an opinionated stack of stable, time-tested tools, which I was starting to miss. However, I was really attracted to FastAPI features such as auto-documentation of APIs and the async-first philosophy. I continued forward by integrating SQLAdmin for a pre-configured web admin panel for SQLAlchemy.
I also implemented FastAPIUsers. At this point, I ran into problems using FastAPIUsers in the same way I used Django authentication. I got my first glimpse of the complex world outside of the Django comprehensive ecosystem. I needed to implement my own JWT authentication and used FastAPIUsers as the hash mechanism.
The FastAPI project has a full-stack-fastapi-template that I assessed as a starting point. I chose not to use it since my primary goal was focused on using Jinja2Templates for a streaming application from an LLM with heavy database use, both SQL and eventually a vector database using pgvector with PostgreSQL, or for simpler deployments, FAISS with SQLite and the FTS5 extension. My goal is to provide a more Django-like experience for FastAPI and provide the opportunity in the future to use the built-in API and auto-documentation of FastAPI instead of implementing something like Django REST framework, which I've found difficult to set up for automatic documentation of the API endpoints.
I've considered for a long time whether it’s better to just use Django with asyncio from the beginning and not build a Django-like interface around FastAPI. In Django 6, there is some support for background tasks. My primary motivation for moving to FastAPI occurred when I was using Django for asynchronous communication with LLM endpoints. Although Django works fine with asynchronous communication, its default synchronous communication style created a number of problems for me. For average people like me, it’s difficult to keep a method asynchronous and not have any synchronous calls to other libraries that might be synchronous or other synchronous communication channels, like a database access.
At this point, I wanted to simplify my code architecture and committed to FastAPI and make my code asynchronous from the beginning.
It seemed simple. I thought I just needed to use an asynchronous driver with PostgreSQL and everything would work.
I was wrong.
Problems Moving to Asynchronous Database Connections psycopg2, psycopg3, or asyncpg
The default way to connect to Python for many people is psycopg2. This is a very proven way. It is the default usage in most Django applications. Unfortunately, it is synchronous. The most common asynchronous PostgreSQL connector is asyncpg, but I couldn't get it to work in my deployment to Leapcell. As Leapcell had a psycopg2 example for SQLAlchemy, I used psycopg2 and rewrote the database connection to be synchronous while keeping everything around the connection asynchronous. As the latency with the LLM is much higher than the latency with the database, this seemed like a reasonable solution at the time. I just had to wait for the database to send me back the response, and then I was free to deal with other asynchronous problems, such as LLM query and Internet search status updates. The database latency was likely going to be less than 1,500ms in most queries, which was okay for my application.
Using a synchronous connection to the database is great in theory, and I’m sure that other, more experienced Python developers can easily solve this problem and keep the synchronous and asynchronous code nicely separated with clean use of async and await.
However, I ran into problems with organizing my code to use synchronous connections to the database within asynchronous methods that were talking to the LLM and storing the history in the database.
As I was familiar with async/await from using Dart for many years, I was surprised I was having these problems. The problem I had might have been due to my lack of experience in understanding which pre-made Python modules were sending back synchronous versus asynchronous responses.
I think that other Python developers might be able to understand my pain.
To keep to an asynchronous database connection for both SQLite and PostgreSQL, I moved from the synchronous psycopg2 to asyncpg.
SSL Security Not Needed in SQLite, But Needed in PostgreSQL Production
The asyncpg connector worked fine in development, but not in production.
Although establishing an SSL network connection seems obvious, I didn’t really appreciate this because I had been deploying to sites like Fly.io, Railway, and DigitalOcean Droplets with SQLite. For small prototype applications, SQLite works surprisingly well with FastAPI. I was trying to deploy to the free hobby tier of Leapcell to set up a tutorial for students who didn’t want to pay or didn’t want to put their credit card into a hosting service to go through a tutorial.
There’s no way to write to the project file system on the Leapcell service engine. Leapcell recommends using Object Storage and PostgreSQL for persistent data. They do offer a free tier that is pretty generous for PostgreSQL. Leapcell requires SSL communication between their PostgreSQL database and their engine, which they call the service.
Unfortunately, the syntax is different for the SSL mode between psycopg2 and asyncpg. I couldn’t just add ?sslmode=require to the end of the connection URL.
Leapcell did not have an example for asyncpg. Likely due to my limited skills, I wasn’t able to modify my application completely enough to put the SSL connections in all the required places.
In order to just use the URL connection point with sslmode=require, I decided to use psycopg3.
Prepared Statements Caused Application to Crash With SQLAlchemy
As I needed to use an async ORM in Python, I used SQLAlchemy. I didn’t have too much experience with it initially. I didn’t realize that even though I wasn’t making prepared statements in my Python application, the communication process between psycopg and PostgreSQL was storing prepared statements.
Due to the way the connections were pooled on Leapcell, I had to disable the prepared statements. It took me a while to isolate the problem and then implement the fix.
The problem never occurred when using SQLite because SQLite runs prepared statements in the same process using the same memory space as the Python program. This is different from PostgreSQL, where the network and session state can change.
As I was worried about the performance impact of disabling prepared statements, I did some research, and it appears that SQLAlchemy does statement caching on the Python side.
The real-world impact of disabling the prepared statement in PostgreSQL appears to be negligible.
Summary
Using SQLite in asynchronous mode has been quite easy. Getting PostgreSQL to work has been more difficult. There were three areas that I had trouble with for PostgreSQL:
- Asynchronous connection – how to write asynchronous Python code effectively to await the return data.
- Security – how to deal with both SQLite, which doesn’t require an SSL, and PostgreSQL in production, which does require an SSL.
- Prepared statements – I needed to learn to rely on the SQLAlchemy statement caching instead of the built-in prepared statements on the PostgreSQL server.
I like FastAPI, and there are many huge advantages to using it that I got in the first hour of use. I’m going to continue using it instead of Django. However, I’m starting to really appreciate how much Django shielded me from much of the infrastructure setup for my applications.
FastAPI is unopinionated in areas such as the database, connectors, authentication, and models. I find it difficult to gain expertise in any one area. Thus, I am focusing on a smaller set of open source components that work with FastAPI to gain a deeper understanding of their use.
I feel that many other Python developers are on a similar journey to experiment more with asynchronous Python web applications. I would appreciate feedback and ideas on which open source components or techniques to use to build effective asynchronous AI applications.
Resources
Opinions expressed by DZone contributors are their own.
Comments