JIT Compilation of SQL in NoSQL

Go on a journey of JIT compilation for SQL queries.

By Georgiy Lebedev · Jun. 18, 22 · Tutorial


Hi everyone! My name is Georgiy Lebedev, and I'm part of Tarantool's kernel development team. In 2021, we participated in the Google Summer of Code (GSoC) for the first time: one of the proposed projects was the migration of SQL from VDBE to a JIT platform, and that's where my journey in Tarantool began.

With a year of developing various toolchain components as educational projects under my belt, and armed with the support of my mentors (Nikita Pettik, Timur Safin, and Igor Munkin), I took on this project. While building a platform for JIT compilation of SQL queries in Tarantool virtually from scratch in just one summer, I encountered some pitfalls and acquired what I believe is interesting knowledge and experience, which I would like to share. This article will be of interest first and foremost to those who want to keep maintaining this project, as well as to those who are considering implementing JIT compilation in their own SQL engine.

What I Had to Work With

To support the SQL language, Tarantool uses a virtual machine (VM) inherited from SQLite, also known as the Virtual Database Engine (VDBE). A schematic of this component can be found here:

https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-explaining-the-virtual-machine-2550ef6c5db


One of the most popular ways to implement virtual machines is to operate with a bytecode stream of instructions; VDBE is based on that principle. In this article, we will be looking at bytecode as an intermediate representation for virtual machines and its interpretation. Further on, speaking about bytecode and SQL VM implementation in Tarantool, I will be referring to VDBE.

Bytecode interpretation boils down to a for loop over an array of opcodes (the bytecode program generated from a SQL query) and a switch on the opcode's type:

C
 
/* Simplified VDBE main loop: walk the opcode array and dispatch
 * on each opcode's type. */
for (pOp = &aOp[p->pc]; 1; pOp++) {
    switch (pOp->opcode) {
    ...
    }
}
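
To make the technique concrete, here is a minimal, self-contained toy interpreter built the same way (the opcodes and register file are invented for illustration; this is not actual VDBE code):

C
 
#include <stdio.h>

/* Toy opcode set, only to illustrate the dispatch-loop technique. */
enum { OP_LOAD, OP_ADD, OP_HALT };

struct Op { int opcode; int p1; int p2; };

static int run(const struct Op *aOp)
{
    int reg[4] = {0};
    for (const struct Op *pOp = aOp; 1; pOp++) {
        switch (pOp->opcode) {
        case OP_LOAD: reg[pOp->p1] = pOp->p2;  break;  /* reg[p1] = p2        */
        case OP_ADD:  reg[pOp->p1] += reg[pOp->p2]; break; /* reg[p1] += reg[p2] */
        case OP_HALT: return reg[pOp->p1];             /* result in reg[p1]   */
        }
    }
}

int main(void)
{
    const struct Op prog[] = {
        {OP_LOAD, 0, 20}, {OP_LOAD, 1, 22}, {OP_ADD, 0, 1}, {OP_HALT, 0, 0},
    };
    printf("%d\n", run(prog)); /* prints 42 */
    return 0;
}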

Our Task

As an application server, Tarantool already has a virtual machine: LuaJIT. Another handwritten VM (built without any frameworks for the execution environment) just to support SQL doesn't look very graceful from an architectural perspective, not to mention the performance issues it introduces.

The project suggested exploring one of the following possible alternatives to the VDBE:

  • LuaJIT infrastructure
  • MIR framework for JIT compilation
  • LLVM ORC framework

Expected results:

  • Test the idea of migrating VDBE to the selected JIT platform
  • Examine different scenarios for applying JIT compilation in the context of SQL
  • Measure the performance of JIT compilation via different SQL benchmarks

Since it seemed impossible to solve this problem within the given timeframe of three summer months, and Tarantool already had a first approach to this task in the form of a three-year-old patch based on LLVM ORC, I decided to use this framework and continue developing Nikita Pettik's idea. In addition, there was the successful example of Postgres, which had already implemented such a platform on top of this framework.

By now there is another DBMS, ClickHouse, that also implements a JIT platform for SQL based on LLVM ORC.

JIT Compilation of SQL

JIT compilation is by no means a cure-all: this approach has its drawbacks and does not always provide results superior to bytecode interpretation, which is why JIT compilation has a rather limited set of applications.

Its main advantages are:

  • Fewer jump instructions and better locality of the native code, which makes better use of the branch predictor
  • Lower number of branches and less machine code due to specialization
  • Higher performance for bytecode hot spots, which are bottlenecks in CPU processing
  • Better use of CPU caches
  • Generation of instructions for the target processor
  • Room for optimization of the generated IR (intermediate representation)

In the context of SQL, we can distinguish the main scenario of using JIT compilation: heavy analytical DQL (data query language) queries that contain lots of logic and expressions.

Example of a VDBE bytecode listing: a Next instruction forms a loop starting with instruction number 4 (Column).


Such queries are essentially bytecode sequences executed repeatedly in a loop. Such a sequence can be merged into a single opcode that constitutes a call to a native-code function, making it an ideal case for applying JIT compilation, as the sketch below illustrates.
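
For intuition, here is a minimal, self-contained sketch of that idea in plain C (the cursor, next_row, and load_column names are invented for illustration; this is not Tarantool code). The per-row work that the interpreter would perform by repeatedly dispatching Column and Next opcodes is collapsed into one native loop:

C
 
#include <stdio.h>

/* Toy "merged" scan: one native loop instead of re-dispatching
 * Column and Next opcodes for every row. */
struct cursor { const long *rows; int nrows; int pos; };

static int next_row(struct cursor *cur) { return ++cur->pos < cur->nrows; }
static long load_column(const struct cursor *cur) { return cur->rows[cur->pos]; }

static long scan_sum(struct cursor *cur)
{
    long sum = 0;
    while (next_row(cur))            /* no opcode dispatch per row */
        sum += load_column(cur);
    return sum;
}

int main(void)
{
    const long data[] = {1, 2, 3, 4};
    struct cursor cur = {data, 4, -1};
    printf("%ld\n", scan_sum(&cur)); /* prints 10 */
    return 0;
}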

We can highlight a handful of computational scenarios where this would be useful:

  • Decoding tuples (Tarantool term for rows)
  • Arithmetic and logical expressions of various kinds
  • Aggregate functions

Here, aggregate functions are an especially interesting case as they are computed via separate loops — again, a very fitting condition for JIT compilation.

I highlighted some of the project's additional angles and challenges in my final report for GSoC.

The following is a summary of what I believe to be the most important conclusions I drew from my work on this project.

LLVM ORC

Using a third-party JIT compilation framework, albeit one with its own IR, allowed me to abstract away the tasks of generating, optimizing, and storing native code, and to focus on generating LLVM IR and embedding the JIT compiler into the existing SQL infrastructure.
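
For orientation, this is roughly what the basic LLJIT workflow looks like through the LLVM C API of that era (around LLVM 12): build a module, hand it to ORC, and look up the compiled symbol. This is a standalone sketch, not code from the Tarantool patchset:

C
 
#include <stdio.h>
#include <stdlib.h>
#include <llvm-c/Core.h>
#include <llvm-c/Error.h>
#include <llvm-c/LLJIT.h>
#include <llvm-c/Target.h>

static void check(LLVMErrorRef err)
{
    if (err) {
        char *msg = LLVMGetErrorMessage(err); /* consumes the error */
        fprintf(stderr, "LLVM error: %s\n", msg);
        LLVMDisposeErrorMessage(msg);
        exit(1);
    }
}

int main(void)
{
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    /* Build a trivial module: i64 jit_add(i64 a, i64 b) { return a + b; } */
    LLVMOrcThreadSafeContextRef tsctx = LLVMOrcCreateNewThreadSafeContext();
    LLVMContextRef ctx = LLVMOrcThreadSafeContextGetContext(tsctx);
    LLVMModuleRef mod = LLVMModuleCreateWithNameInContext("demo", ctx);
    LLVMTypeRef i64 = LLVMInt64TypeInContext(ctx);
    LLVMTypeRef params[2] = { i64, i64 };
    LLVMTypeRef fnty = LLVMFunctionType(i64, params, 2, 0);
    LLVMValueRef fn = LLVMAddFunction(mod, "jit_add", fnty);
    LLVMBuilderRef b = LLVMCreateBuilderInContext(ctx);
    LLVMPositionBuilderAtEnd(b, LLVMAppendBasicBlockInContext(ctx, fn, "entry"));
    LLVMBuildRet(b, LLVMBuildAdd(b, LLVMGetParam(fn, 0), LLVMGetParam(fn, 1), "sum"));
    LLVMDisposeBuilder(b);

    /* Hand the module over to ORC LLJIT and look up the compiled symbol. */
    LLVMOrcLLJITRef jit;
    check(LLVMOrcCreateLLJIT(&jit, NULL)); /* NULL builder = default settings */
    check(LLVMOrcLLJITAddLLVMIRModule(jit, LLVMOrcLLJITGetMainJITDylib(jit),
                                      LLVMOrcCreateNewThreadSafeModule(mod, tsctx)));
    LLVMOrcJITTargetAddress addr = 0;
    check(LLVMOrcLLJITLookup(jit, &addr, "jit_add"));

    long (*jit_add)(long, long) = (long (*)(long, long))addr;
    printf("%ld\n", jit_add(40, 2)); /* prints 42 */

    check(LLVMOrcDisposeLLJIT(jit));
    return 0;
}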

To Inline, or not to Inline, That is the Question

Embedding JIT compilation in SQL infrastructure raises the question of how to generate LLVM IR that is functionally equivalent to interpreting opcodes in C. There are two fundamentally different approaches:

  • Completely manual generation of LLVM IR for each opcode (Postgres)
  • Automatic generation via clang (Postgres)

In Postgres, the second approach is used only for built-in functions and operators.

In my opinion, the first approach is equivalent to writing assembler code: quite time-consuming and error-prone. Besides, it would be simply dull to do mechanical work instead of an innovative summer project. I chose the following strategy: wrap the existing code blocks for SQL opcodes in callbacks that are called from the JIT-compiled code; this also avoids the rather non-trivial code duplication the first approach would require.
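
Schematically, the idea looks like the sketch below (the names jit_op_result_row and emit_opcode_call are invented for illustration; the actual patchset is, of course, more involved). The interpreter's case body for an opcode becomes a plain C callback, and the IR generator only declares that callback and emits a call to it:

C
 
#include <llvm-c/Core.h>

/* Hypothetical extracted opcode handler: the code that used to live under
 * `case OP_ResultRow:` in the interpreter loop. */
struct Vdbe;
struct Op;
int jit_op_result_row(struct Vdbe *p, struct Op *pOp);

/* IR generation side: declare the callback in the module (if it isn't there
 * yet) and emit a call to it with the VM and opcode pointers. */
static LLVMValueRef emit_opcode_call(LLVMModuleRef mod, LLVMBuilderRef b,
                                     LLVMContextRef ctx,
                                     LLVMValueRef vdbe, LLVMValueRef op)
{
    LLVMTypeRef i8p = LLVMPointerType(LLVMInt8TypeInContext(ctx), 0);
    LLVMTypeRef args_ty[2] = { i8p, i8p };
    LLVMTypeRef cb_ty = LLVMFunctionType(LLVMInt32TypeInContext(ctx),
                                         args_ty, 2, 0);
    LLVMValueRef cb = LLVMGetNamedFunction(mod, "jit_op_result_row");
    if (cb == NULL)
        cb = LLVMAddFunction(mod, "jit_op_result_row", cb_ty);
    LLVMValueRef args[2] = { vdbe, op };
    /* The returned status can then drive branching between the per-opcode
     * basic blocks, mirroring the interpreter's control flow. */
    return LLVMBuildCall2(b, cb_ty, cb, args, 2, "rc");
}

At runtime, the callback symbol also has to be resolvable from the JIT-compiled code, for example by letting the main JITDylib search the host process's symbols.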

A reasonable question arises here: what about code specialization and other advantages of JIT compilation, which we seem to be deprived of by using such an approach? That's where the inlining features of LLVM were supposed to come to the rescue: as it turned out, the C API I was using, a small subset of the C++ API of LLVM, had no such features at all, and they were only introduced in version 13 of LLVM.

In addition, even to apply the C++ API, you need to build a whole infrastructure to:

  • Store, index and load the bitcode generated by clang
  • Search through bitcode modules (units of translation in LLVM) to find the necessary callbacks and pass their IR between modules
  • Explicitly inline callbacks

A good example of how to implement such infrastructure is Postgres.

Patching the Generated LLVM IR

There is an important detail in JIT compilation of SQL in Tarantool: some VDBE opcodes are patched when the bytecode is generated. For instance, this is useful for optimization: direct table reads can be replaced by reads from an index:

VDBE bytecode listing before and after patching.


You can see that the first argument corresponding to the cursor number has changed for all Column opcodes.

Another reason to patch bytecode instructions might be coroutines used for subqueries. The bytecode that loads columns from the coroutine table must be changed to a bytecode that copies columns from the registers of the returned coroutine values:

VDBE bytecode listing before and after patching.


The same changes have to be reflected in the generated LLVM IR. From the IR point of view, a solution to this problem is evident: use a separate basic block for each opcode. This makes analyzing, finding, and patching the required instructions much easier. It even allows throwing entire blocks away and replacing them with other blocks, and none of this is constrained by the linear layout of the bytecode.

LLVM IR for loading a column before and after patching.


The LLVM IR above for the select A from DEMO query corresponds to loading a column into a register. You can see how the value of the store instruction argument in the basic block OP_Column_begin has changed:

store i32 1, i32* %tab, align 4
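
A sketch of how such a patch could be performed through the LLVM C API, assuming every opcode gets its own, predictably named basic block (the helper below is illustrative, not the actual patchset code): it finds the block by name and rewrites the constant stored by the first store instruction in it:

C
 
#include <string.h>
#include <llvm-c/Core.h>

/* Find the basic block generated for one opcode (e.g. "OP_Column_begin")
 * and replace the constant stored by its first store instruction, e.g.
 * turning `store i32 0, i32* %tab` into `store i32 1, i32* %tab`. */
static int patch_store_const(LLVMValueRef fn, const char *bb_name,
                             long long new_val)
{
    for (LLVMBasicBlockRef bb = LLVMGetFirstBasicBlock(fn); bb != NULL;
         bb = LLVMGetNextBasicBlock(bb)) {
        if (strcmp(LLVMGetBasicBlockName(bb), bb_name) != 0)
            continue;
        for (LLVMValueRef ins = LLVMGetFirstInstruction(bb); ins != NULL;
             ins = LLVMGetNextInstruction(ins)) {
            if (LLVMGetInstructionOpcode(ins) != LLVMStore)
                continue;
            /* Operand 0 of a store instruction is the stored value. */
            LLVMTypeRef ty = LLVMTypeOf(LLVMGetOperand(ins, 0));
            LLVMSetOperand(ins, 0, LLVMConstInt(ty, new_val, 1));
            return 0;
        }
    }
    return -1; /* block or store instruction not found */
}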

The LLVM IR below is generated for the select * from (select A from DEMO) query using a coroutine:

LLVM IR for loading a column from the coroutine table before patching.


LLVM IR for copying a column from the coroutine return value register after patching.


SQL Benchmarks

The final part of the project was measuring performance on standard SQL benchmarks. I took TPC-H as the most representative one: it reflects typical analytical SQL queries.

These measurements also include the analysis and compilation time of the queries, although JIT compilation is normally meant to be used only for prepared queries. If we neglect these overheads, we can say that JIT compilation introduces no performance loss.

Unfortunately, since I did not have time to implement JIT compilation of arithmetic expressions, and most TPC-H queries contain them in large numbers, there wasn't much to see in action: only the Q17, Q19, and Q20 queries were suitable. The most remarkable among them is Q19: it contains a huge number of column references and literals, and the performance increase was almost twofold!

Results

  • A patchset with a platform for JIT compilation of SQL based on version 12 of LLVM ORC.
  • JIT compilation of different SELECT expressions: literals, column references and aggregate columns.
  • JIT compilation of aggregate functions (e.g., SUM, COUNT, AVG) with some restrictions on query complexity.
  • The TPC-H benchmark showed a significant increase in performance on some types of queries and no noticeable degradation in the rest.