Compiling Kotlin in Runtime

JSR 223: compile Kotlin code dynamically, after the application has started.

Everybody knows tasks that would be easy to solve if you could generate and execute code on the fly inside the JVM. Sometimes, however, the solution has to live in a separate library, so the code it works with isn't known when the library is compiled.

Here we look at an approach to generating code and executing it after the application has started. We will use the JSR 223 standard for this.

As the task we take a popular one: we create an AOP system that parses the result table of an SQL query. The developer adds annotations to the code, and our application then generates and executes the parser code at runtime. The approach is much wider than this, however. You can create programmable configuration (like the TeamCity Build DSL), or you can optimize existing code (the generated code can be constructed after the settings are read, so it will be branch-free). And of course we can use this approach to avoid copy-paste when the expressive power of the language isn't enough to extract a generalized block.

All the code is available on GitHub; you need to install Java 11 (or later) first. The article shows a simplified version, without logging, diagnostics, tests, etc.

Task

First of all: if you want to solve exactly this task, please check the existing libraries. In most cases there is an already developed and supported solution. I can recommend Hibernate or Spring Data, which can do the same.

What we would like to have: the ability to mark a data class with annotations, so that an "SQL query result to DTO" converter can turn rows into instances of our class.

For instance, our client code could look like this:


As you know, the usual way to read a database response with Spring JDBC is through the ResultSet interface.

There are at least two ways to extract a String from a column:
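The two variants correspond to the two overloads of `ResultSet.getString` in `java.sql`. A minimal sketch (the helper function names are only illustrative):

```kotlin
import java.sql.ResultSet

// Variant 1: look the column up by its label on every call.
fun readByName(rs: ResultSet, columnName: String): String? =
    rs.getString(columnName)

// Variant 2: address the column by its 1-based position.
fun readByIndex(rs: ResultSet, columnIndex: Int): String? =
    rs.getString(columnIndex)

// The position can be resolved once with findColumn and reused for every row.
fun resolveIndex(rs: ResultSet, columnName: String): Int =
    rs.findColumn(columnName)
```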


Let's complicate our task a little:

  • For a huge query result it is better to use the index-based approach: we retrieve the indexes of all columns before the first row arrives, remember them, and then use these indexes for each row. This matters for high-performance applications, because the string-based methods have to compare column names first, which costs at least N×M unnecessary string comparisons, where N is the row count and M is the column count.
  • For another performance boost, we shouldn't use reflection. Therefore we have to avoid BeanPropertyRowMapper and similar classes, because they are reflection-based and too slow.
  • Property types are not limited to simple ones like String or Int. They can also be complex, like a self-written NonEmptyText (a class with a single String field, which can't be null or empty).

As we observed above, it is better to extract our solution into a separate library. Therefore we don't know all the available types when our library is compiled. And we'd like the database response parsing to look like this:


One more reminder: please don't use this approach in your project without searching the internet first. Moreover, if your task is to parse database rows, even if you want to work with JDBC directly (without Hibernate or the like), you can achieve this without code generation. That is left as homework: find a way to do it.

Kotlin Script Evaluation

For now there are two easy approaches to compiling Kotlin at runtime: you can use the Kotlin compiler directly, or you can use the JSR 223 wrapper. The first one allows you to compile multiple files together and is more extensible, but it is more complex to use. The second one simply lets you add a new type to the current class loader. Of course this isn't safe, so please execute only trusted code there. (The Kotlin script compiler runs code in a separate, restricted class loader, but the default security configuration doesn't prevent creating new processes or accessing the file system, so please be careful there too.)

First of all, let's define our interface. We don't want to generate new code for each SQL query, so let's do it once per object type that will be read from the database. For instance, the interface could be:
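A minimal sketch of such a factory, under assumptions: `ResultSetMapper` is declared here as a plain stand-in (in the article it extends a Spring interface), `createForType` is a hypothetical member name, and the reified helper is written as an extension function.

```kotlin
import java.sql.ResultSet
import kotlin.reflect.KClass

// Stand-in for the mapper type so the sketch compiles without Spring.
interface ResultSetMapper<TMappingType> {
    fun extractData(rs: ResultSet): List<TMappingType>
}

interface ResultSetMapperFactory {
    // Builds a mapper for the given class (hypothetical member name).
    fun <TMappingType : Any> createForType(clazz: KClass<TMappingType>): ResultSetMapper<TMappingType>
}

// Reified helper: lets callers write mapperFactory.createMapper<MyClass>().
inline fun <reified TMappingType : Any> ResultSetMapperFactory.createMapper(): ResultSetMapper<TMappingType> =
    createForType(TMappingType::class)
```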


The inline method (with a reified type parameter) is needed to create the illusion that we have real generics on the JVM. It lets us construct a ResultSetMapper with code like: return mapperFactory.createMapper<MyClass>().

ResultSetMapper inherits a standard Spring interface:


The factory implementation is responsible for generating the code from the class annotations and then executing it. So we have a skeleton like:


We have to return ResultSetMapper<TMappingType>. It is better to create a class without generic parameters, so that the JVM has full type knowledge (in this case the GraalVM and C2 compilers can apply more optimization techniques). Therefore, we compile code like:


Compiling the code takes three steps:

  1. Add all necessary dependencies to the classpath.
  2. Tell Java which script engines are available (the Kotlin script compiler in our case).
  3. Use ScriptEngineManager to execute the code, which returns the object above.

For the first item, let's add the following lines to the Gradle script:
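One possible dependency set, as an assumption rather than the article's exact list: the `kotlin-scripting-jsr223` artifact published by JetBrains. Artifact names have changed between Kotlin versions, so check the ones that match your compiler.

```kotlin
// build.gradle.kts
dependencies {
    // Assumption: the JSR 223 engine artifact from JetBrains.
    // kotlin("...") resolves to org.jetbrains.kotlin:kotlin-<module>
    // and reuses the version of the applied Kotlin plugin.
    implementation(kotlin("scripting-jsr223"))
}
```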


For the second item, let's add the file "src/main/resources/META-INF/services/javax.script.ScriptEngineFactory" to the jar, with the following line:


That leaves the last item: executing the script at runtime:
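A minimal sketch of this step using only `javax.script` from the JDK. The helper names `evalKotlinScript` and `evalKotlinScriptAs` are hypothetical, and the sketch assumes a Kotlin engine is registered for the `kts` extension; without one, the lookup returns null and the helper fails with an explicit message.

```kotlin
import javax.script.ScriptEngineManager

// Evaluates Kotlin source and returns the value of its last expression.
fun evalKotlinScript(source: String): Any {
    // getEngineByExtension returns null when no matching engine is registered.
    val engine = ScriptEngineManager().getEngineByExtension("kts")
        ?: error("No Kotlin script engine on the classpath")
    return engine.eval(source) ?: error("The script did not return a value")
}

// Typed wrapper: the script is expected to end with an expression of type T.
inline fun <reified T : Any> evalKotlinScriptAs(source: String): T =
    evalKotlinScript(source) as? T
        ?: error("Script result is not a ${T::class.java.name}")
```

The generated mapper source would be passed to `evalKotlinScriptAs`, and the returned object is then used like any other instance.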

Preparing the Model

As I wrote above, let's complicate our task: let's generate code not only for built-in JVM types, but also for self-written ones. Therefore, let's dig deeper into our data model.

Let's imagine that we are trying to write strongly typed code that rejects invalid data as early as possible. Therefore:

  1. Instead of the field userName: String we have userName: UserName, where the class UserName has just one field.
  2. UserName can't be empty, therefore we should check this value in the constructor.
  3. We plan to have a lot of such classes, therefore this logic should be extracted into a common block.

One way to implement this is as follows.

Create the class NonEmptyText, which has the necessary field and performs all required checks in the constructor:


Next, let's add one more way to construct the type:


Next we can create the UserName class:


Here we have UserName, which is strongly typed. Its companion object is able to construct instances, so now we can create instances without calling the constructor directly:


Now we can give this interface to anyone who wants to create an instance from an input string. For instance, a call to the method fun <TValue> createText(input: String?, constructor: NonEmptyTextConstructor<TValue>): TValue? is createText("123", UserName), which is intuitive. It looks like type classes for the JVM.

Let's define Email in the following way:
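The pieces described in this section can be put together as follows. This is only a sketch: the member name `create` and the exact checks are assumptions, while `createText` follows the signature quoted above.

```kotlin
// Base class: wraps a string that is guaranteed to be non-empty.
abstract class NonEmptyText(val value: String) {
    init {
        require(value.isNotEmpty()) { "Empty text is prohibited for ${javaClass.simpleName}" }
    }

    override fun toString(): String = value
    override fun equals(other: Any?): Boolean =
        other is NonEmptyText && other.javaClass == javaClass && other.value == value
    override fun hashCode(): Int = value.hashCode()
}

// Constructor abstraction that a companion object can implement.
interface NonEmptyTextConstructor<out TResult> {
    fun create(input: String): TResult
}

class UserName(value: String) : NonEmptyText(value) {
    companion object : NonEmptyTextConstructor<UserName> {
        override fun create(input: String) = UserName(input)
    }
}

// Returns null for null or empty input instead of throwing.
fun <TValue> createText(input: String?, constructor: NonEmptyTextConstructor<TValue>): TValue? =
    if (input.isNullOrEmpty()) null else constructor.create(input)
```

Because the companion object implements the interface, the class name itself can be passed as the constructor argument: `createText("123", UserName)`.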


We split it into two different types here just to have an example of a complex type. We don't need this for email in real life; here we do it to test the "read a single object from two columns" approach. Not every ORM implementation can do this, but we can.

Next, let's create the DbUser type. It is our DTO, which we read from the database:
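A sketch of such a two-part type. The names `EmailUser` and `EmailDomain` are hypothetical, and `NonEmptyText` is reduced to a minimal stand-in so the example compiles on its own.

```kotlin
// Minimal stand-in for the non-empty wrapper.
abstract class NonEmptyText(val value: String) {
    init {
        require(value.isNotEmpty()) { "Empty text is prohibited" }
    }
}

// Hypothetical names for the two parts of an address.
class EmailUser(value: String) : NonEmptyText(value)
class EmailDomain(value: String) : NonEmptyText(value)

// Complex type assembled from two values, i.e. from two database columns.
class Email(val user: EmailUser, val domain: EmailDomain) {
    override fun toString(): String = "${user.value}@${domain.value}"
}
```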


To generate the code that parses the database result, we must:

  1. Define the column matching: for the name field we have to define one column name.
  2. For the email field we need to define two column names.
  3. Define the database reading method (even a String can be read in different ways).

If we have a "one column, one type" matching, then the database reading method can be defined with a simple interface:


So while reading the ResultSet, we can do the following:

  1. Remember once which index corresponds to which column.
  2. For each row:
    1. Call getValue for each cell.
    2. Create the object from the results of the previous step.

As we observed before, let's assume that the project has a lot of types that can be described as "non-empty string". Therefore, we can create a common mapper for them:


As we can see, we pass the object constructor into this class. Next we can easily create mappers for the concrete classes:
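A sketch of how the single-column interface, the common mapper and a concrete mapper could fit together. `getValue` is the method name used in the text; the other names (`SingleValueMapper`, `NonEmptyTextValueMapper`, `UserNameMapper`) are assumptions, and `UserName` is a reduced stand-in.

```kotlin
import java.sql.ResultSet

// One column in, one value out.
interface SingleValueMapper<out TValue> {
    fun getValue(resultSet: ResultSet, columnIndex: Int): TValue?
}

interface NonEmptyTextConstructor<out TResult> {
    fun create(input: String): TResult
}

// Reduced stand-in for the strongly typed wrapper.
class UserName(val value: String) {
    init {
        require(value.isNotEmpty()) { "Empty user name" }
    }

    companion object : NonEmptyTextConstructor<UserName> {
        override fun create(input: String) = UserName(input)
    }
}

// Common mapper for every "non-empty string" type: the constructor is passed in.
abstract class NonEmptyTextValueMapper<out TResult>(
    private val textConstructor: NonEmptyTextConstructor<TResult>
) : SingleValueMapper<TResult> {
    override fun getValue(resultSet: ResultSet, columnIndex: Int): TResult? {
        val raw: String? = resultSet.getString(columnIndex)
        return if (raw.isNullOrEmpty()) null else textConstructor.create(raw)
    }
}

// The mapper for a concrete class becomes a one-liner.
object UserNameMapper : NonEmptyTextValueMapper<UserName>(UserName)
```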


Unfortunately, I didn't find a way to express the mapper with extension methods, e.g. to have some kind of extension type. In Scala you can achieve this via implicits. However, that approach isn't explicit.

As we noticed, we have a complex type, Email, and it requires two columns. Therefore, the interface above isn't applicable to it. As an option, we can create a separate one:


Here we have two input columns and a single result object. This is exactly what we need; however, we have to copy-paste these interfaces for each column count.

Now we can have a combined mapper, which looks like this:


And then Email can be read in the following way:
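A sketch of the two-column interface, a combined mapper built from two single-column mappers, and an Email mapper on top of it. All names are assumptions, and `Email` is simplified to two plain strings so the example stays self-contained.

```kotlin
import java.sql.ResultSet

interface SingleValueMapper<out TValue> {
    fun getValue(resultSet: ResultSet, columnIndex: Int): TValue?
}

// Two columns in, one value out.
interface DoubleValuesMapper<out TResult> {
    fun getValue(resultSet: ResultSet, columnIndex1: Int, columnIndex2: Int): TResult?
}

// Combines two single-column mappers; the result is null when either part is missing.
open class CombinedMapper<T1 : Any, T2 : Any, out TResult>(
    private val first: SingleValueMapper<T1>,
    private val second: SingleValueMapper<T2>,
    private val combine: (T1, T2) -> TResult
) : DoubleValuesMapper<TResult> {
    override fun getValue(resultSet: ResultSet, columnIndex1: Int, columnIndex2: Int): TResult? {
        val a = first.getValue(resultSet, columnIndex1) ?: return null
        val b = second.getValue(resultSet, columnIndex2) ?: return null
        return combine(a, b)
    }
}

// Simplified stand-in: plain strings instead of wrapper types.
class Email(val user: String, val domain: String)

// Reads a column as a non-empty string, or null.
object TextMapper : SingleValueMapper<String> {
    override fun getValue(resultSet: ResultSet, columnIndex: Int): String? =
        resultSet.getString(columnIndex)?.takeIf { it.isNotEmpty() }
}

object EmailMapper : CombinedMapper<String, String, Email>(TextMapper, TextMapper, ::Email)
```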


We have the last remaining items: defining our annotations and writing the code generation.
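A sketch of what such annotations could look like. The names `SingleColumn` and `DoubleColumns`, the column names and the placeholder mapper objects are all hypothetical. Reflection is used here only once, to collect metadata for the generator, not while reading rows.

```kotlin
import kotlin.reflect.KClass

// Hypothetical annotation for a property read from one column.
@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class SingleColumn(val columnName: String, val mapper: KClass<*>)

// Hypothetical annotation for a property read from two columns.
@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class DoubleColumns(val columnName1: String, val columnName2: String, val mapper: KClass<*>)

// Placeholder mapper types, referenced only by class.
object UserNameMapper
object EmailMapper

class DbUser(
    @field:SingleColumn("name", UserNameMapper::class) val name: String?,
    @field:DoubleColumns("email_user", "email_domain", EmailMapper::class) val email: String?
)

// Collects the column names a class expects. The order follows declaredFields,
// which the JVM does not strictly guarantee.
fun expectedColumns(type: Class<*>): List<String> =
    type.declaredFields.flatMap { field ->
        field.getAnnotation(SingleColumn::class.java)?.let { listOf(it.columnName) }
            ?: field.getAnnotation(DoubleColumns::class.java)?.let { listOf(it.columnName1, it.columnName2) }
            ?: emptyList<String>()
    }
```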

Code Generation From the Annotations

First of all, let's define what code we'd like to see. I used the following, which complies with all the initial criteria:


This code is generated as a monolith (variables are defined first and only then used), so let's at least split it into several blocks by idea:

  1. We have N input columns, which belong to different mappers. Therefore we need different variables for them (the same column can be used in different mappers).
  2. First of all we should verify what we received from the database. If the column count differs from the expected one, it is better to raise an exception with plenty of details: what we received, what we expected, etc.
  3. SQL cursors work via an approach like while(rs.next()) { do }, so let's create a mutable list. Ideally we can set its size initially if we know how many rows the database returns.
  4. On each loop iteration we read all field values and then create the resulting object.
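The four ideas above can be illustrated with a hand-written mapper of the shape the generator is meant to produce. This is a sketch under assumptions: `DbUser` is reduced to two nullable strings, and the column names are invented for the example.

```kotlin
import java.sql.ResultSet

// Simplified stand-in for the DTO.
data class DbUser(val name: String?, val email: String?)

object DbUserMapper {
    private const val EXPECTED_COLUMNS = 2

    fun extractData(rs: ResultSet): List<DbUser> {
        // Verify what the database returned before touching any row.
        val actual = rs.metaData.columnCount
        check(actual == EXPECTED_COLUMNS) {
            "Expected $EXPECTED_COLUMNS columns, but the query returned $actual"
        }
        // One variable per input column, resolved once before the first row.
        val nameIndex = rs.findColumn("name")
        val emailIndex = rs.findColumn("email")
        // Cursor-style loop that fills a mutable list.
        val result = mutableListOf<DbUser>()
        while (rs.next()) {
            // Read every field by index, then build the object.
            result.add(DbUser(rs.getString(nameIndex), rs.getString(emailIndex)))
        }
        return result
    }
}
```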

Finally, we have the following code:
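A much reduced sketch of such a generator, not the library's actual implementation. `ColumnBinding` and `generateMapperSource` are hypothetical names, and every property is assumed to be a String read with `getString`. The function only builds the source text; evaluating it is the job of the script engine.

```kotlin
// Hypothetical description of one mapped property: which column feeds it.
data class ColumnBinding(val propertyName: String, val columnName: String)

// Builds Kotlin source for a non-generic mapper of the given class.
fun generateMapperSource(className: String, bindings: List<ColumnBinding>): String {
    // One index variable per column, declared before the loop.
    val indexLines = bindings.joinToString("\n") {
        "        val ${it.propertyName}Index = rs.findColumn(\"${it.columnName}\")"
    }
    // Named constructor arguments, read by index inside the loop.
    val ctorArgs = bindings.joinToString(", ") {
        "${it.propertyName} = rs.getString(${it.propertyName}Index)"
    }
    return """
        |object : (java.sql.ResultSet) -> List<$className> {
        |    override fun invoke(rs: java.sql.ResultSet): List<$className> {
        |$indexLines
        |        val result = mutableListOf<$className>()
        |        while (rs.next()) {
        |            result.add($className($ctorArgs))
        |        }
        |        return result
        |    }
        |}
    """.trimMargin()
}
```

The generated text is an object expression with concrete types only, so the script's last expression is the mapper instance itself.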

Why Do We Need This?

As you can see, it is easy to generate executable code at runtime. I spent just several hours on this small example library (and several more on the article). However, here we have working code which is able to read rows from the database faster than most of the intuitive approaches on Stack Overflow. Moreover, because the code generation is fully controlled, we can also add object interning, performance measurement and a lot of other improvements and optimizations. And the most important point: we know exactly which code will be executed here.

Kotlin DSL can also be used for programmable configuration. If you love your users, you can stop forcing them to use JSON/XML/YAML files and just give them a DSL. It defines the configuration in a type-safe way. For example, take a look at the TeamCity Build DSL: you can develop your build, and you can write a condition or a loop to avoid copying a step 10 times. You have full code highlighting in the IDE. In the end the application only needs the configuration model, and there aren't any real restrictions on how it is created.

Not every idea can be expressed in your programming language. And often you don't want to copy-paste code that isn't so simple to verify. Code generation can help us here too. If you can define your implementation with annotations, why not do it in a common way and hide it behind an interface? This approach is highly useful for the JIT compiler, which gets code with all types explicit instead of generic ones (where some optimizations, such as stack allocation, are impossible).

However, the most important point: please first estimate whether it is really necessary to play with code generation and runtime code execution. In some projects a reflection-based approach is fast enough, which means it is better to avoid non-standard techniques and not overcomplicate the project.

Topics:
compiler, dsl, jdbc, kotlin, runtime, spring

Published at DZone with permission of Igor Manushin. See the original article here.
