It is a common scenario: your application or business process depends on another system. Equally common, the source system is a legacy system offering few options for integration. For the purposes of this article, assume the legacy system provides status information for an aspect of your business that is vital to your application.
The most common way to handle this requirement is to have a process query the underlying database on a schedule to determine whether any updates have occurred since the last run. The diagram below illustrates this scenario:
The biggest challenge with this approach is that updates are limited by the schedule of the batch process. If the job runs once an hour, an update made a second after the job finishes will not be visible for almost another hour. Of course, running the batch process more frequently is likely to raise other concerns, such as increased load on the legacy database.
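The scheduled approach above can be sketched in plain Java. This is a minimal illustration, not production code: the class and method names are hypothetical, and an in-memory map stands in for the legacy database table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical batch poller: each scheduled run queries for rows changed
// since the previous run, then advances its high-water mark.
public class StatusPoller {
    private long lastChecked = 0L;

    // Stand-in for the legacy database: record id -> last-modified timestamp.
    private final Map<String, Long> statusTable = new LinkedHashMap<>();

    public void recordUpdate(String id, long modifiedAt) {
        statusTable.put(id, modifiedAt);
    }

    // One scheduled run at time 'now': return every record modified
    // after the previous run and up to now.
    public List<String> poll(long now) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> e : statusTable.entrySet()) {
            if (e.getValue() > lastChecked && e.getValue() <= now) {
                changed.add(e.getKey());
            }
        }
        lastChecked = now; // an update arriving after 'now' waits for the next run
        return changed;
    }
}
```

Note how an update stamped just after a run is invisible until the following run, which is exactly the latency problem described above.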
How Apache Kafka Can Help
While often considered a messaging system, Kafka operates at a different level of abstraction: a structured commit log of updates. As transactions are written to the database's logs, an Apache Kafka Producer monitors the log for entries that meet specific criteria:
In our example, as database commits related to status changes are made to the legacy system, Kafka monitors and captures the updates. At that point, Kafka can be instructed on what to do with each event. Often, Kafka feeds a stream data platform. In our case, using Java with Kafka, it is possible to place a message on a queue for processing by our Enterprise Service Bus (ESB). Our flow could then be illustrated as shown below:
With the above flow in place, as the changes are committed to the legacy system's relational database, the Kafka Producer identifies the updates via the transaction logs and places the messages on a message queue within the ESB. From there, listeners on the queue take the appropriate action.
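Before such a producer can publish anything, it needs a small amount of configuration. A minimal version might look like the following; the broker address and the `acks` choice are placeholders for your environment, while the serializer classes are the standard Kafka string serializers.

```properties
# Broker address is a placeholder for your environment
bootstrap.servers=localhost:9092
# Status-change events are sent as simple string key/value pairs
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
# Wait for the partition leader to acknowledge each write
acks=1
```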
Unlike the batch process, the Kafka Producer listens by monitoring the transaction logs, providing near real-time notification when status changes occur.
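The contrast with polling can be sketched in plain Java as well. Here the transaction log feed and the ESB message queue are simulated with in-memory collections, and all names are hypothetical; the point is that each committed change is pushed through immediately rather than waiting for a scheduled run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical log tailer: invoked as each entry is appended to the
// transaction log, so downstream listeners see changes without a batch delay.
public class StatusChangeTailer {
    // Stand-in for the ESB message queue that listeners consume from.
    private final BlockingQueue<String> esbQueue = new LinkedBlockingQueue<>();

    // Called for every committed transaction-log entry.
    public void onLogEntry(String table, String operation, String recordId) {
        // Only status changes meet our criteria; everything else is ignored.
        if ("STATUS".equals(table) && "UPDATE".equals(operation)) {
            esbQueue.add(recordId);
        }
    }

    // What a queue listener would pick up.
    public List<String> drain() {
        List<String> out = new ArrayList<>();
        esbQueue.drainTo(out);
        return out;
    }
}
```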
It is likely that Big Data is on the road map for your IT infrastructure. It is also likely that your product of choice includes an option for streaming data. When building your list of desired Big Data functionality for your organization, it is important to look beyond the analytical wins and consider how Big Data can assist your transactional applications.
Have a really great day!