IValue: Efficient Representation of Dynamic Types in C++
Learn how IValue represents dynamic types efficiently.
Join the DZone community and get the full member experience.Join For Free
In traditional SQL systems, a column's type is determined when the table is created and never changes while executing a query. If you create a table with an integer-valued column, the values in that column will always be integers (or possibly
Rockset, however, is dynamically typed, which means that we often don't know the type of a value until we actually execute the query. This is similar to other dynamically typed programming languages where the same variable may contain values of different types at different points in time:
Bytes: taking a cue from Python, we distinguish between sequences of valid Unicode characters.(
string, which is internally represented as UTF-8) and sequences of arbitrary
- Date- and time-specific types (
There are other types that we use internally (and are never exposed to our users); also, the type system is extensible, with planned support for
decimal (base-10 floating-point), geometry/geography types, and others.
In the following example, the collection,
ivtest, has documents containing one field,
a, which takes a variety of types.
This post shows one of many challenges that we encountered while building a fully-dynamically typed SQL database: how we manipulated values of unknown types in our query execution backend (written in C++), while approaching the performance of using native types directly.
At first, we used protocol buffers similar to the definition below (simplified to only show integers, floats, strings, arrays, and objects; the actual
oneof that we use has a few extra fields).
However, we quickly realized that this is inefficient, both in terms of speed and memory usage. First, Protobuf requires a heap memory allocation for every object; creating a
Value that contains an array of 10 integers would perform:
- A memory allocation for the top-level
- An allocation for the
- An allocation for the list of values (
ArrayValue.values, which is a
- An allocation for each of the 10 values in the array.
for a total of 13 memory allocations.
Also, the 10 values in the array are not allocated contiguously in memory. This causes a further decrease in performance due to cache locality.
It was quickly clear that we needed something better, which we called
IValue. Compared to the Protobuf version,
- More memory efficient: while not as efficient as using native types directly,
IValuemust be small, and must avoid heap allocations wherever possible.
IValueis always 16 bytes and does not allocate heap memory for integers, booleans, floating-point numbers, or short strings.
- Faster: arrays of scalar
IValues are allocated contiguously in memory, leading to better cache locality. This is not as efficient as using native types directly, but it is a significant improvement over Protobuf.
Most of Rockset's query execution engine operates on
IValues (there are some parts that have specialized implementation for specific types, and this is an area of active improvement).
We'd like to share an overview of the
IValue design. Note that
IValue is optimized for Rockset's needs and is not meant to be portable — we use Linux and x86_64-specific tricks and assume a little-endian memory layout.
The idea is in itself not novel; the techniques that we use date back to at least 1993, as surveyed in “Representing Type Information in Dynamically Typed Languages”. We decided to make
IValue 128 bits instead of 64, as it allows us to avoid heap allocations in more cases (including all 64-bit integers); using the taxonomy defined in the paper,
IValue is a double-wrapper scheme with qualifiers.
IValue is represented as a 128-bit (16-byte) value, consisting of:
- A 64-bit field (called
- A 48-bit field (called
pointer, as it often, but not always, stores a pointer).
- Two 8-bit discriminator fields (called
Tag1 indicates the type of the value.
Tag0 is usually a subtype, and the meaning of the other two fields changes depending on the type. The
pointer field is often a pointer to some other data structure, allocated on the heap for the cases where heap allocations can't be avoided; as pointers are only 48 bits on x86_64, we are able to fit a pointer and the two discriminator fields in the same
We recognize two types of
- Immediate values: those that can fit within the 16 bytes of the
IValueitself (while still leaving room for the 16-bit tag):
- all scalars that fit within 128 - 16 = 112 bits:
NULL, integers, floating-point values, booleans, date, time, etc.
- strings shorter than 16 bytes.
- all scalars that fit within 128 - 16 = 112 bits:
- Non-immediate values, which require heap allocation.
Tag1 has bit seven clear (
tag1 < 0x80) for all immediate values and set (
tag1 >= 0x80) for all non-immediate values. This allows us to distinguish between immediate and non-immediate values very quickly, using one simple bit operation. We can then copy, hash, and compare for equality immediate values by treating them as a pair of
The representation for most scalar types is straightforward:
tag0 is usually zero,
tag1 identifies the type,
pointer is usually zero, and
data contains the value.
NULL is all zeros, which is convenient (
memset()ing a chunk of memory to zero makes it
NULL when interpreted as
data = 0 for
data = 1 for
tag1 = 0x01
Integers have the value stored in
tag1 = 0x02
And so on. The layouts for other scalar types (floating point, date/time, etc) are similar.
We handle character strings and byte strings similarly; the value of
tag1 is the only exception. For the rest of the section, we'll only focus on character strings.
IValue strings are immutable, as they maintain the string's length explicitly and are not null-terminated. In line with our goal to minimize heap allocations,
IValuedoesn't use any external memory for short strings (less than 16 bytes).
Instead, we implement the small string optimization. We store the string contents (padded with nulls) in the
tag0 fields; we store the string length in the
n is the string's length.
An empty string has
0x10 and all other bytes zero:
And, for example, the 11-byte string “Hello world” has
0x1b (note the little-endian representation; the byte
'H' is first):
Strings longer than 15 bytes are stored out-of-line:
Pointer points to the beginning of the string (allocated on the heap using
datacontains the string length. (There is also the possibility of referencing a “foreign” string, where
IValue doesn't own the memory but points inside a preallocated buffer, but that is beyond the scope of this post.)
For example, the 19-byte string “Rockset is awesome!”:
Vectors (which we call “arrays”, adopting JSON's terminology) are similarly allocated on the heap. They are similar to vectors in most programming languages (including C++'s
pointer points to the beginning of the vector (allocated on the heap using
datacontains the vector's size and capacity (32 bits each). The vector itself is a contiguously allocated block of
capacity() * 16bytes); when reallocation is needed, the vector grows exponentially (with a factor that is less than 2, for the reasons described in Facebook's
Maps (which we call “objects”, adopting JSON's terminology) are also allocated on the heap. We represent objects as open-addressing hash tables with quadratic probing; the size of the table is always a power of two, which simplifies probing. We probe with triangular numbers, just like Google's sparse hash, which, as Knuth tells us in The Art of Computer Programming (volume 3, chapter 6.4, exercise 20), automatically covers all slots.
Each hash table slot is 32 bytes — two
IValues, one for the key, one for the value. As is usually the case with open-addressing hash tables, we need two special keys — one to represent empty slots, and one to represent deleted elements (tombstones). We reserve two values of
tag1 for that purpose (
pointer field points to the beginning of the hash table (a contiguous array of slots, allocated on the heap using
malloc()). We store the current size of the hash table in the least-significant 32 bits of the
data field. The
tag0 field contains the number of allocated slots, as it's always a power of two, we store
log2(number of slots) + 1, or zero if the table is empty.
capacity field (most significant 32 bits of
data) deserves further interest. It is the number of slots available for storing user data. Initially, it is the same as the total number of slots, but, as in all open-addressing hash tables, erasing an element from the table marks the slot as “deleted” and renders it unusable until the next rehash. So erasing an element actually decreases the table's capacity.
IValue gives a substantial performance improvement over the old protobuf-based implementation:
- Creating arrays of strings is between 2x and 7x faster (depending on the string size; because of the small-string optimization,
IValueis significantly faster for small strings).
- Creating arrays of integers is also 7x faster (because we no longer allocate memory for every individual array element).
- Iterating over large arrays of integers is 3x faster (because the values in the array are now allocated contiguously).
Even though Rockset documents are allowed to contain data of multiple types in the same field, the situation shown in the introduction is relatively rare. In practice, most of the data is of the same type (or
NULL), and, to recognize this, we are extending
IValue to support homogeneous arrays.
All elements in a homogeneous array are of the same type (or
NULL). The structure is similar to the regular (heterogeneous) arrays (described above), but the
pointer field points directly to an array of the native type (
int64_t for an array of integers,
double for an array of floating-point values, etc). Similar to systems like Apache Arrow, we also maintain an optional bitmap that indicates whether a specific value is
NULL or not.
The query execution code recognizes the common case where it produces a column of values of the same type, in which case it will generate a homogeneous array. We have efficient, vectorized implementations of common database operations on homogeneous arrays, allowing us significant performance improvements in the common case.
This is still an area of active work, and benchmark results are forthcoming.
Published at DZone with permission of Tudor Bosman, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.