I was challenged by my team lead at my previous company to create a Java application module that would be integrated into our POC project to act as a rule engine. The idea was to have a module that would determine whether a given URL is malicious, benign, or suspicious based on the information available about it, such as the registrant, IP location, and URL pattern.
I started formal development from the business requirements, and as I tried to hard-code all the conditions using if-else statements, I realized that the code I was writing was not going to be maintainable. I asked myself, "What if one of the conditions suddenly needs to change?" The first thing that came to mind was to externalize those conditions so that they could be written in a file or config. I stopped development right then and there and looked for something that would address the problem.
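To make the problem concrete, here is a minimal sketch of the hard-coded style described above. It is not the original project code: the class name, the flagged email address, and the score threshold are all assumptions chosen for illustration.

```java
// Hypothetical illustration of the hard-coded approach: every condition
// lives in compiled code. Names and values are assumptions, not the
// original project's code.
final class HardCodedClassifier {

    static String classify(String registrantEmail, int reputationScore) {
        if ("email@example.com".equals(registrantEmail)) {
            // Known bad registrant (assumed value).
            return "malicious";
        } else if (reputationScore > 70) {
            // Threshold is baked into the source (assumed value).
            return "benign";
        } else {
            return "suspicious";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("email@example.com", 88));
    }
}
```

Changing the threshold or adding a new condition here means editing, recompiling, and redeploying the class, which is what pushed me toward keeping the conditions outside the code.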
I began searching the web and came across existing rule engines like Drools and JRules, but I was hesitant because I didn't know how these technologies worked and I really wanted to create my own that would satisfy the requirements. I’m aware that I should not reinvent the wheel by rebuilding things that already exist, but I wanted to put my idea to the test.
Eventually, I came to Prolog, a general-purpose logic programming language associated with artificial intelligence and computational linguistics. The program logic is expressed in terms of relations, represented as facts and rules.
Let's briefly go over what "facts" and "rules" mean. I'll keep it simple so that we can see how Prolog can be used as a rule engine.
Clauses with empty bodies are called facts, and every clause ends with a period. Some examples of facts:
domain("example.com").
ip("192.168.1.1").
registrant_email("email@example.com").
reputation_score(88).
A rule has the form:
Head :- Body.
This reads as "Head is true if Body is true."
Let's create our three rules: malicious, benign, and suspicious.
We could say that a domain is malicious if it belongs to the registrant email@example.com.
malicious :- registrant_email("email@example.com").
We could say that a domain is benign if it does not belong to the registrant email@example.com and its reputation score is above 70. We use negation here. (We could add more conditions to the body of the rule to increase our confidence in it.)
benign :- not(registrant_email("email@example.com")), reputation_score(SCORE), SCORE > 70.
We could say that a domain is suspicious if it is neither malicious nor benign.
suspicious :- not(malicious), not(benign).
Now that we have our facts and rules, we can test them.
I'm using the tuProlog IDE to query the rules against the facts given above.
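For readers without the IDE at hand, the outcome of the three rules can be checked with a plain-Java sketch. It only mirrors the logic as boolean methods and is not a Prolog engine or the tuProlog API. The class and constant names are assumptions, and a null email stands in for "no registrant fact is known", which is roughly how Prolog's negation treats a missing fact.

```java
// Plain-Java mirror of the three rules, for illustration only.
// Class, constant, and method names are assumptions.
final class UrlRules {
    static final String FLAGGED_EMAIL = "email@example.com";
    static final int MIN_BENIGN_SCORE = 70;

    // malicious: the registrant email matches the flagged address.
    static boolean malicious(String registrantEmail) {
        return FLAGGED_EMAIL.equals(registrantEmail);
    }

    // benign: the registrant is NOT the flagged address
    // and the reputation score is above the threshold.
    static boolean benign(String registrantEmail, int score) {
        return !FLAGGED_EMAIL.equals(registrantEmail) && score > MIN_BENIGN_SCORE;
    }

    // suspicious: neither malicious nor benign.
    static boolean suspicious(String registrantEmail, int score) {
        return !malicious(registrantEmail) && !benign(registrantEmail, score);
    }

    public static void main(String[] args) {
        // Same values as the example facts in this post.
        String email = "email@example.com";
        int score = 88;
        System.out.println("malicious:  " + malicious(email));          // true
        System.out.println("benign:     " + benign(email, score));      // false
        System.out.println("suspicious: " + suspicious(email, score));  // false
    }
}
```

With the example facts, only the malicious rule holds. Changing the registrant email while keeping the score at 88 makes the domain benign, and a different registrant with a score of 70 or lower makes it suspicious.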
You can view the source code here for your reference.
P.S. Sorry for the code quality; this was written during the second year of my IT career.