Before reading, you may find a detailed explanation of Ignite.NET distributed queries helpful: Getting Started With Apache Ignite.NET Part 3: Cache Queries. If you are new to Ignite, please read that first.
Let’s get straight to the results!
Method | Median | StdDev |
------------------ |------------ |---------- |
QueryLinq | 175.8261 us | 9.9202 us |
QuerySql | 62.2791 us | 5.4908 us |
QueryLinqCompiled | 57.9274 us | 3.1307 us |
This is a comparison of equivalent queries run via SQL, LINQ, and compiled LINQ. The query is very simple (select Age from SqlPerson where (SqlPerson.Id < ?)) and the data set is very small (40 items, 20 returned), which exposes LINQ overhead more clearly.
We can see right away that LINQ is a lot slower than raw SQL (roughly 2.8 times by median), but compiled LINQ is a bit faster (about 7%). Note that the results are in microseconds; real-world queries may take tens or even hundreds of milliseconds, so LINQ overhead will hardly be noticeable there.
Anyway, how can we explain these results? Why is compiled LINQ faster than raw SQL?
How Ignite LINQ Works
ICache<int, SqlPerson> cache = ignite.GetCache<int, SqlPerson>("persons");
IQueryable<int> qry = cache.AsCacheQueryable().Select(x => x.Value.Age);
IList<int> res = qry.ToList();
If we run the above code in the Visual Studio debugger and look at the qry variable, we can see what LINQ has built behind the scenes.
The compiler has translated .Select(x => x.Value.Age) to an expression tree and passed it to CacheFieldsQueryProvider, which, as the debugger shows, turns it into a regular Ignite.NET SqlFieldsQuery. Expression tree processing is not free, and that is where the overhead comes from.
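To make the expression tree idea more concrete, here is a minimal sketch that uses only System.Linq.Expressions and does not touch Ignite at all. The Person class and the use of KeyValuePair as a cache entry are hypothetical stand-ins for SqlPerson and the real Ignite.NET entry type, and DescribeSelector is an illustrative helper, not part of the Ignite LINQ provider. It shows that a lambda assigned to an Expression<...> is data that a query provider has to walk at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for the SqlPerson type used in this post.
public class Person
{
    public int Age { get; set; }
}

public static class ExpressionTreeDemo
{
    // Illustrative helper: returns the member path a selector reads, e.g. "Value.Age".
    public static string DescribeSelector<T, TResult>(Expression<Func<T, TResult>> selector)
    {
        var parts = new Stack<string>();

        // Walk the chain of property accesses back towards the lambda parameter.
        for (var node = selector.Body as MemberExpression;
             node != null;
             node = node.Expression as MemberExpression)
        {
            parts.Push(node.Member.Name);
        }

        return string.Join(".", parts);
    }

    public static void Main()
    {
        // Because the target type is Expression<...>, the compiler emits a data
        // structure describing the lambda instead of executable code.
        Expression<Func<KeyValuePair<int, Person>, int>> selector = x => x.Value.Age;

        Console.WriteLine(selector.Body.NodeType);      // MemberAccess
        Console.WriteLine(DescribeSelector(selector));  // Value.Age
    }
}
```

A real LINQ provider does far more than this (it has to produce SQL text and parameters), but the same kind of tree walk happens on every non-compiled query execution.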
We can get that SqlFieldsQuery and run it manually:
IQueryable<int> qry = cache.AsCacheQueryable().Select(x => x.Value.Age);
SqlFieldsQuery fieldsQry = ((ICacheQueryable)qry).GetFieldsQuery();
IQueryCursor<IList> res = cache.QueryFields(fieldsQry);
However, LINQ produces a typed IQueryable<int> instead of an untyped IQueryCursor<IList>. How is this achieved? You may think that the LINQ engine iterates over the IQueryCursor returned from QueryFields and populates a List<int>, but it is more clever than that.
The code that LINQ generates for reading results produces zero extra allocations and zero type casts. That is where the LINQ advantage comes from: it is aware of the resulting data types and can generate specialized deserialization code, while a regular SQL query reads all field values as objects, which causes excessive allocations (an IList for each row, boxing of value types) and requires type casting.
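The difference between the two reading strategies can be sketched without Ignite. The example below is an illustration under stated assumptions, not the actual Ignite.NET implementation: a System.IO.BinaryReader over a MemoryStream stands in for the query response stream, and all names (TypedReaderDemo, ReadUntyped, ReadTyped, BuildInt32Reader, CreateReader) are hypothetical. The untyped path boxes every value into a per-row IList, while the typed path uses a delegate built at run time from an expression tree and writes values straight into a List<int>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Linq.Expressions;

public static class TypedReaderDemo
{
    // Untyped path: every value is boxed and stored in a freshly allocated row list.
    public static List<IList> ReadUntyped(BinaryReader reader, int rows)
    {
        var result = new List<IList>(rows);
        for (var i = 0; i < rows; i++)
        {
            result.Add(new ArrayList { reader.ReadInt32() }); // boxing + row allocation
        }
        return result;
    }

    // Builds a delegate equivalent to `r => r.ReadInt32()` at run time.
    public static Func<BinaryReader, int> BuildInt32Reader()
    {
        var r = Expression.Parameter(typeof(BinaryReader), "r");
        var call = Expression.Call(r, typeof(BinaryReader).GetMethod(nameof(BinaryReader.ReadInt32)));
        return Expression.Lambda<Func<BinaryReader, int>>(call, r).Compile();
    }

    // Typed path: values go straight into a List<int>, with no boxing and no casts.
    public static List<int> ReadTyped(BinaryReader reader, int rows, Func<BinaryReader, int> readValue)
    {
        var result = new List<int>(rows);
        for (var i = 0; i < rows; i++)
        {
            result.Add(readValue(reader));
        }
        return result;
    }

    // Serializes the given ages into memory, standing in for a query response.
    public static BinaryReader CreateReader(params int[] ages)
    {
        var stream = new MemoryStream();
        var writer = new BinaryWriter(stream);
        foreach (var age in ages) writer.Write(age);
        writer.Flush();
        stream.Position = 0;
        return new BinaryReader(stream);
    }

    public static void Main()
    {
        var ages = new[] { 25, 31, 47 };

        foreach (IList row in ReadUntyped(CreateReader(ages), ages.Length))
        {
            Console.WriteLine((int) row[0]); // the caller must unbox and cast
        }

        foreach (int age in ReadTyped(CreateReader(ages), ages.Length, BuildInt32Reader()))
        {
            Console.WriteLine(age); // already an int
        }
    }
}
```

Building the delegate has a one-time cost, which is why it pays off only when the same reader is reused across many rows or many query executions.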
LINQ is not only much nicer to work with than SQL, but it can also be on par or faster when used properly! Just don’t forget to use CompiledQuery when on a hot path.