How To Handle 100k Rows Decision Table in Drools (Part 3)

In this article, I created a prototype to demonstrate how to handle large rows in a decision table with reasonable performance. This is part 3 of the series.

Ryan ZhangCheng

Mar. 02, 21 · Analysis

Likes (5)

Comment

Save

7.6K Views

As described in the previous article, we are facing a very challenging performance issue when solving 100k row decision tables.

To view the previous discussion, please click:

Solution 1: Rule Template + XLS

Solution 2: Precompile Spreadsheet Decision Table

In this article, I am going to change the mindset and change the decision table row data from rule to fact. Again, I am using Drools as my framework.

Solution 3: Convert Row Data to Fact, Not Rule

Solution 2 (Precompile the rules) helps a lot when the decision table is 10k below row size. However, when the row number comes to very big such as 100k, solution 2's improvement is neglectable. What’s worse is that the compilation time is too long, this is because it’s not a normal task to compile 100k Java files in a project.

By its design nature, the decision table is popular since it allows you to manage one rule in one row so that you can manage multiple rows in an easy manner. However, the watch that floats the boat also could swallow it.

Could we somehow reduce the rule numbers if the rule numbers are just too big?

Could we read the Excel file in RAW format, insert those row data as Facts and then compute them in memory? Then you have 100k Facts instead of 100k rules in memory.

Let's try this:

DRL

    Java
   
          x
         
rule "Check Client Object"
    when
        $k:Keyword($v : value)
        $c:ClientObject( descr matches $v )
    then
        $c.setPass($k.isResult());
        System.out.println("Check Client Object fired");
end

Drools Model

    Java
   
xxxxxxxxxx

public class Keyword {
    private java.lang.String value;
    private boolean result;
}

Insert Row as Fact in Java

    Java
   
xxxxxxxxxx

ClientObject o1 = new ClientObject();
for(int i=0;i<100000;i++){
  ksession.insert(new Keyword("DangerObjectxxx", false);
}
o1.setDescr("999");
int fired = ksession.fireAllRules();

With a simple test on the above code, it works, and the performance is <50 ms!

However, we don’t throw away the bathwater with the child. We don’t want to hardcode the Excel Read and Parser in our generic application code. The business logic needs to be decoupled with a generic application.

Could we load Excel data in a DRL file and insert the fact?

The answer is yes!

DRL is very dynamic and flexible, basically, you can do it just like you are writing Java code.

However, in order to simplify the DRL editing, usually, we could provide a clean utility to read Excel files and convert them to a Customized Fact to serve your purpose.

Improved DRL

    Java
   
xxxxxxxxxx

rule "Load keyword"
    when
        $kr:KeywordReader(excelFile matches ".*.xls")
    then
        System.out.println("Load keyword rule fired");
        List<Map<String, String>> list = $kr.getKwList();
        for ( Map<String, String> m : list){
            Keyword k = new Keyword(m.get("1"), Boolean.valueOf(m.get("2")));
            insert(k);
        }
end
rule "Check Client Object"
    when
        $k:Keyword($v : value)
        $c:ClientObject( descr matches $v )
    then
        $c.setPass($k.isResult());
        System.out.println("Check Client Object fired");
end

I also provided a simple Excel utility in my branch, See KeywordReader.java, we can add the Excel parse in the Drools Model Getter and Setter.

    Java
   
xxxxxxxxxx

public class KeywordReader {
    private String excelFile;
    private List<Map<String, String>> kwList;
    public KeywordReader(String excelFile) {
        this.excelFile = excelFile;
        kwList= new ArrayList<Map<String, String>>();
    }
    public String getExcelFile() {
        return excelFile;
    }
    public List<Map<String,String>> getKwList() {
        if(this.excelFile.isEmpty())return null;
        parseFile(getClass().getClassLoader().getResourceAsStream(excelFile));
        return kwList;
    }
    
    private void parseFile( InputStream inStream ) {
        //Parser Excel format and conver them into HashMapList
    }

How to parse Excel file:

    Java
   
xxxxxxxxxx

Sheet sheet = workbook.getSheetAt( 0 );
int maxRows = sheet.getLastRowNum();
DataFormatter formatter = new DataFormatter( Locale.ENGLISH );
FormulaEvaluator formulaEvaluator = sheet.getWorkbook().getCreationHelper().createFormulaEvaluator();
        //define start row in rowFrom variable
        for ( int i = rowFrom; i <= maxRows; i++ ) {
            Row row = sheet.getRow( i );
            int lastCellNum = row != null ? row.getLastCellNum() : 0;
            
            Map<String, String> rowData= new HashMap<String, String>();
            int index =0;
          //define start column in colFrom variable
            for ( int cellNum = colFrom; cellNum < lastCellNum; cellNum++ ) {
                Cell cell = row.getCell( cellNum );
                if ( cell == null ) {
                    continue;
                }
                double num = 0;
                index++;
                rowData.put(String.valueOf(index), cell.getStringCellValue());
         }
            kwList.add(rowData);

Client Code

    Java
   
xxxxxxxxxx

ClientObject o1 = new ClientObject();
KeywordReader kr = new KeywordReader("100kTable.xls");
o1.setDescr("9999");
ksession.insert(o1);
ksession.insert(kr);
int fired = ksession.fireAllRules();

Let’s run these rules in the client application:

    Shell
   
xxxxxxxxxx

mvn clean compile exec:java 
Initial Kie Session elapsed time: 4943 
Load keyword rule fired
fired rules: 1 elapsed time: 2761 
Is Object Pass:true
Check Client Object fired
2second round fired rules: 1 elapsed time: 70

Great, the performance is very good.
As you can see, loading the excel file in rule might take 2.7 seconds. After that, each rule execution only costs 70ms.

Pros

The biggest advantage is good performance. As you can see the performance is over 100 times faster than previous solutions considering the large number of input data in Excel files;

What’s even better, there is no compilation overhead.

The excel data can still be decoupled from application logic. All business rules data can be packaged in the rules project as usual.

Cons

There are two shortcomings as far as I can see:

Raw Excel data can’t be handled in kie-workbench.
There is some complex Excel Reader logic in the rule project.

It might not be comfortable for business users to maintain the Excel parser login in the Drools model, however, I think, you can wrap the Excel Parse logic in a separate jar file which would decouple it from a business rule project.

Database Drools Java (programming language) file IO Data (computing)

Opinions expressed by DZone contributors are their own.

Related

Trending