How To Handle 100k Rows Decision Table in Drools (Part 2)

In this article, I created a prototype to demonstrate how to handle large rows in a decision table with reasonable performance.

By Ryan ZhangCheng · Mar. 02, 21 · Analysis

As described in my previous article, we are dealing with a performance problem when evaluating decision tables with 100k rows.

Solution 2: Precompile the SpreadSheet Rule

Following the same line of thinking as solution 1, I think we can address its two problems directly:

  1. Do not load the Excel data dynamically at runtime; precompile it at build time.
  2. Use a Drools spreadsheet decision table so that it can be version-controlled by KIE Workbench.

When Drools fires rules from a rule template plus an Excel file, this is what actually happens under the hood:

  1. ExternalSpreadsheetCompiler compiles the rule template and the rule data (i.e., the Excel file) into DRL (Drools Rule Language).
  2. The Drools engine compiles the DRL into Java bytecode.
  3. The rules, now in Java bytecode form, are fired in the JVM.
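
Step 1 is essentially text generation: every data row is substituted into the rule template to produce one DRL rule. The sketch below imitates that expansion in plain Java so the cost is easy to see. It is not the ExternalSpreadsheetCompiler implementation; the simplified template text, the class name DrlExpansionSketch, and the sample rows are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of step 1: one DRL rule is generated per spreadsheet row. */
final class DrlExpansionSketch {

    // Hypothetical, simplified rule template; the real template lives in the project.
    static final String TEMPLATE =
        "rule \"row_%d\"\n" +
        "when\n" +
        "    f1: ClientObject(descr matches \"%s\")\n" +
        "then\n" +
        "    f1.setPass(%s);\n" +
        "end\n";

    /** Expands each {pattern, passValue} row into the text of one rule. */
    static List<String> expand(List<String[]> rows) {
        List<String> rules = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String[] row = rows.get(i);
            rules.add(String.format(TEMPLATE, i + 1, row[0], row[1]));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {".*keyword1.*", "true"});   // made-up sample rows
        rows.add(new String[] {".*keyword2.*", "true"});
        List<String> rules = expand(rows);
        System.out.println(rules.size() + " rows -> " + rules.size() + " rules");
        System.out.println(rules.get(0));
    }
}
```

Because the number of generated rules equals the number of rows, a 100k-row sheet means 100k rules to generate, parse, and compile, which is why moving this work out of runtime is attractive.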

So can we do the first step before runtime? Even better, can we do the first two transformation steps ahead of time?

Fortunately, the answer is yes. Drools provides a Maven plugin (kie-maven-plugin) that precompiles DRL, or any other rule format Drools understands, into Java bytecode. This feature is called the Drools executable model.

This kills two birds with one stone. To apply the executable model, we need to convert the raw Excel file into a spreadsheet decision table in the format Drools understands. Such a table can be managed by KIE Workbench, so problem 2 is resolved as well. All we need to do is add the Drools syntax 'header' to the decision table.

The decision table

The key cells are:

B7 contains f1: ClientObject, which is the fact declaration.

B8 contains descr matches $param, which is the condition logic.

C8 contains f1.setPass($param), which is the action logic.

That’s it.
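
To make those three cells concrete: each data row beneath them becomes one rule whose condition is a regular-expression match on descr and whose action calls setPass. The plain-Java sketch below mimics that behaviour without the Drools engine. The ClientObject fields, the sample patterns, and the class name DecisionRowSketch are assumptions for illustration, and a real session evaluates rules through the engine rather than through a simple loop.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java imitation of what one decision-table row expresses. */
final class DecisionRowSketch {

    /** Hypothetical fact class; only descr and pass are implied by the table header. */
    static final class ClientObject {
        private final String descr;
        private boolean pass;
        ClientObject(String descr) { this.descr = descr; }
        String getDescr() { return descr; }
        boolean isPass() { return pass; }
        void setPass(boolean pass) { this.pass = pass; }
    }

    /** One data row: the condition parameter (column B) and the action parameter (column C). */
    static final class Row {
        final String pattern;
        final boolean passValue;
        Row(String pattern, boolean passValue) {
            this.pattern = pattern;
            this.passValue = passValue;
        }
    }

    /** Applies every row whose pattern matches the whole descr; returns how many rows fired. */
    static int fire(List<Row> rows, ClientObject f1) {
        int fired = 0;
        for (Row row : rows) {
            // 'descr matches $param' is a full-string Java regex match.
            if (f1.getDescr() != null && f1.getDescr().matches(row.pattern)) {
                f1.setPass(row.passValue);   // 'f1.setPass($param)'
                fired++;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row(".*gold.*", true), new Row(".*silver.*", true));
        ClientObject f1 = new ClientObject("a gold customer");
        System.out.println("fired rules: " + fire(rows, f1));
        System.out.println("Is Object Pass:" + f1.isPass());
    }
}
```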

Let’s have a quick review of the change, then test the performance improvements.

Testing Improvements

Solution 2 is stored in the precompile-rule-solution branch.

1. In your rules pom.xml, drools-model-compiler is required.

XML

<!-- This is required to compile the executable model -->
<dependency>
  <groupId>org.drools</groupId>
  <artifactId>drools-model-compiler</artifactId>
  <version>7.39.0.Final</version>
</dependency>

2. Update kmodule.xml

Since we have converted the Excel file into a Drools-format spreadsheet decision table, we can get rid of the rule template and update kmodule.xml as follows:

XML

<kmodule xmlns="http://www.drools.org/xsd/kmodule">
  <kbase name="template-db-KBase" default="true" packages="com.myspace.spreadsheet_decisiontable">
    <ksession name="mykiesession" default="true" />
  </kbase>
</kmodule>

3. Run mvn clean install

In the build log you can see the kie-maven-plugin goals that precompile the rules:

[INFO] --- kie-maven-plugin:7.39.0.Final:generateModel (default-generateModel) @ spreadsheet-decisiontable ---
[INFO] Artifact not fetched from maven: org.drools:drools-model-compiler:7.39.0.Final. To enable the KieScanner you need kie-ci on the classpath
[INFO] Found 10206 generated files in Canonical Model
[INFO] Generating /wdc/github/ryanzhang/drools-bigtable/rules/target/generated-sources/drools-model-compiler/main/java/./com/myspace/spreadsheet_decisiontable/P36/LambdaPredicate36A1EF91A79A800E8DCE48467E3FB5EF.java
…
[INFO] DSL successfully generated
[INFO]
[INFO] --- kie-maven-plugin:7.39.0.Final:generateDMNModel (default-generateDMNModel) @ spreadsheet-decisiontable ---
[INFO]
[INFO] --- maven-compiler-plugin:3.8.1:compile (default-compile-1) @ spreadsheet-decisiontable ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 10207 source files to /wdc/github/ryanzhang/drools-bigtable/rules/target/classes
[INFO]
[INFO] --- kie-maven-plugin:7.39.0.Final:build (default-build) @ spreadsheet-decisiontable ---
[INFO] Artifact not fetched from maven: org.drools:drools-model-compiler:7.39.0.Final. To enable the KieScanner you need kie-ci on the classpath
[INFO] kieMap not present
[INFO] KieModule successfully built!
[INFO]
[INFO] --- maven-jar-plugin:3.2.0:jar (default-jar) @ spreadsheet-decisiontable ---
[INFO] Building jar: /wdc/github/ryanzhang/drools-bigtable/rules/target/spreadsheet-decisiontable-1.0-SNAPSHOT.jar
[INFO]
------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:35 min

What kie-maven-plugin does for us is:

  1. It uses drools-model-compiler to convert the spreadsheet decision table into 10206 generated Java source files (note that this goes one step further than producing a DRL file).
  2. The generated Java code is then compiled into bytecode and packaged into a jar.

So when the myapp client code loads the jar file, it calls the bytecode directly, without having to parse the Excel file, run the DRL parser, and so on.

Importantly, the business logic is still packaged as its own project and does not leak into the generic application code or its lifecycle.

Performance

Let’s see the performance of solution 2:

10K rows table scenario:

Shell

cd rules
mvn clean install -DskipTests
# kie-maven-plugin generates about 10000 Java/class files here
# The rule package is spreadsheet-decisiontable-1.0-SNAPSHOT.jar
cd ../myapp
mvn clean package -DskipTests
# Output when myapp runs (elapsed times in milliseconds):
Initialize Kie Session elapsed time: 1826
fired rules: 1 elapsed time: 167
Is Object Pass:false

100K rows table scenario:

Shell

cd rules
mvn clean install -DskipTests
# Compiling took about 16 mins on my laptop; about 100k Java files are generated
# The rule package is spreadsheet-decisiontable-1.0-SNAPSHOT.jar
cd ../myapp
mvn clean package -DskipTests
# Output when myapp runs (elapsed times in milliseconds):
Initialize Kie Session elapsed time: 21885
fired rules: 1 elapsed time: 8603
Is Object Pass:false

Putting the performance data into a table for comparison:

| Solution (row size) | Warm-up time | One-rule execution | Rule compile (package) |
|---|---|---|---|
| Rule Template + XLS (10k) | ~8 s | ~400 ms | 14 s |
| Precompiled spreadsheet decision table (10k) | ~1.7 s (4x faster) | ~150 ms (2x faster) | 1.5 mins (5x slower) |
| Rule Template + XLS (100k) | 99 s | 9500 ms | 1.5 mins |
| Precompiled spreadsheet decision table (100k) | 21 s (4x faster) | 8500 ms (similar) | 15 mins (10x slower) |
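
The "x faster" and "x slower" labels are rounded ratios of the measurements. As a quick check, the sketch below recomputes them from the table values; the class name SpeedupSketch is hypothetical, and the inputs are the approximate figures from the table converted to a common unit.

```java
/** Recomputes the speed-up and slow-down ratios from the comparison table. */
final class SpeedupSketch {

    /** Returns a divided by b; both values must be in the same unit. */
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // 10k rows: rule template + XLS versus precompiled decision table
        System.out.printf("10k warm-up speed-up:    %.1fx%n", ratio(8_000, 1_700)); // ms
        System.out.printf("10k execution speed-up:  %.1fx%n", ratio(400, 150));     // ms
        System.out.printf("10k build slow-down:     %.1fx%n", ratio(90, 14));       // s

        // 100k rows
        System.out.printf("100k warm-up speed-up:   %.1fx%n", ratio(99, 21));       // s
        System.out.printf("100k execution speed-up: %.1fx%n", ratio(9_500, 8_500)); // ms
        System.out.printf("100k build slow-down:    %.1fx%n", ratio(900, 90));      // s
    }
}
```

With these inputs the 10k warm-up and build ratios come out somewhat above the rounded "4x" and "5x" in the table, while the 100k execution ratio stays close to 1, which matches the "similar" entry.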

Pros

Applying the Drools executable model gives us two clear advantages:

  1. Runtime performance improves: warm-up is about 4x faster at both table sizes, and single-rule execution is about 2x faster at 10k rows.
  2. The spreadsheet decision table can be governed by KIE Workbench.

The second point is often not obvious to users who are just starting to adopt a rule-oriented application framework. But it is quite important from a rules governance perspective: version-controlling your rule data, testing deployments, and releasing your business rules. With KIE Workbench, all of those features are provided out of the box.

Cons

I think solution 2 has two shortcomings.

  1. Compilation time is quite long.

For 10k rows, a compilation time of 1.5 mins seems acceptable, given that the build generates and then compiles about 10k small Java files.

For 100k rows, however, the compile time is not reasonable: it takes ~15 mins to complete. That is awkward for both the developer experience and the CI/CD pipeline. The build cost simply becomes too high once the number of rules reaches this scale.

  2. When the number of rows is very large, such as 100k, the performance improvement is small.

Compared with the effort of precompiling, the gain is modest, as the comparison table shows: single-rule execution takes about 8.5 s versus about 9.5 s.

Summary

For decision tables up to a certain size, precompiling the rules improves runtime performance dramatically, and I think it is worth a try.

At 100k rows, however, it still does not produce very good results.

In reality, it is quite common for the set of keywords or condition values to grow very large, so we still need a better solution for 100k-row decision tables.

In my next article, I will show a different approach that transforms the dimensions of fact and rule to improve performance.
