Simplifying AI With the 3 Ms Approach
I've been dealing with data for several years now, and the 3 Ms approach to AI that I recently tried has proven to be reliable, easy to maintain, robust, and optimized. Can you guess what the 3 Ms are?
With the growing need for and popularity of AI, crawling data from the web and summarizing this data based on business requirements is one of the most common problems many people have to deal with. The problem becomes more challenging when the data sources can be any website, or even a large set of websites, as providing a generic solution that meets all business requirements is very difficult.
I have been dealing with such problems in the last couple of years and recently, I tried an approach that I found to be very reliable, easy to maintain, robust, and optimized.
The approach is the 3 Ms approach, where the 3 Ms stand for the following:
Microservices
MessageQ
Multithreading
I used a microservices approach for separation of concerns: I segregated the AI algorithms and business logic from the engineering work to make the application easier for the team to maintain. There are two main reasons behind this segregation:
Data scientists/AI-ML researchers and product engineers usually have different specializations. The former are experts in AI/ML, while the latter are experts in infrastructure and engineering.
For most data scientists and ML researchers, the preferred programming languages are Python and R, because they offer a huge list of AI libraries and are fairly easy to use compared to other languages. On the other hand, for enterprise server-side applications, Java is the most preferred, as it has robust and powerful frameworks like Spring and Hibernate, with plenty of features that make developers' jobs easy and simple.
Data crawling from the web is a tedious and unreliable process, as every website is different from every other. The time consumed in crawling also varies significantly from one website to another and depends on many factors. To make this process reliable, I used RabbitMQ to queue the requests and process them asynchronously. This approach helped me process requests in a controlled way and spared the user from waiting a long time while a request was being processed.
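As a minimal sketch of the producer side of this pattern, the snippet below enqueues a crawl request and acknowledges it immediately instead of blocking until the crawl completes. An in-memory `BlockingQueue` stands in for RabbitMQ so the example is self-contained; in a real Spring AMQP setup the enqueue would be a message publish to the broker, and the class and method names here are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical producer: accepts a crawl request, queues it, and returns
// immediately so the caller never blocks on the (slow) crawl itself.
// A bounded in-memory queue stands in for RabbitMQ in this sketch.
public class CrawlRequestProducer {

    private final BlockingQueue<String> messageQ;

    public CrawlRequestProducer(BlockingQueue<String> messageQ) {
        this.messageQ = messageQ;
    }

    // offer() does not block: the request is either queued or rejected,
    // so the user gets an instant acknowledgement either way.
    public String submit(String url) {
        return messageQ.offer(url) ? "ACCEPTED" : "QUEUE_FULL";
    }

    public static void main(String[] args) {
        BlockingQueue<String> q = new LinkedBlockingQueue<>(100);
        CrawlRequestProducer producer = new CrawlRequestProducer(q);
        System.out.println(producer.submit("https://example.com"));
    }
}
```

Because the queue is bounded, overload shows up as an explicit `QUEUE_FULL` response rather than an ever-growing backlog, which is one way to keep the processing "controlled."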
Multithreading is used in the application for parallel processing of the requests queued in the message queue, with a thread pool of configurable size. The MessageQ listener continuously observes the MessageQ and asks the thread manager to start a worker whenever a thread is available in the pool.
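The consumer side described above can be sketched as follows: a dispatcher drains queued requests and processes them in parallel on a fixed-size, configurable thread pool. As in the previous sketch, an in-memory `BlockingQueue` replaces RabbitMQ for self-containment, and the class, method, and result format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

// Hypothetical consumer: plays the role of the MessageQ listener plus
// thread manager, fanning queued crawl requests out to a pool of workers.
public class CrawlDispatcher {

    private final ExecutorService pool;

    // Pool size is configurable, e.g. read from application properties.
    public CrawlDispatcher(int poolSize) {
        this.pool = Executors.newFixedThreadPool(poolSize);
    }

    // Drain everything currently queued, crawl each URL on a pool thread,
    // and wait for all results.
    public List<String> dispatch(BlockingQueue<String> queue) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        List<Future<String>> futures = batch.stream()
                .map(url -> pool.submit(() -> crawl(url)))
                .collect(Collectors.toList());
        return futures.stream().map(f -> {
            try {
                return f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }).collect(Collectors.toList());
    }

    // Placeholder for the real crawling/summarization work.
    private String crawl(String url) {
        return "crawled:" + url;
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

A fixed pool caps how many crawls run at once, so one slow website ties up a single worker instead of the whole service.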
The entire application is divided into the three components mentioned below, along with the technologies used in them:
Advisor-app: Java, Spring Boot, Spring Data, MySQL, and MongoDB
Advisor-Msg: Spring Boot, and RabbitMQ
Advisor-AI: Python, Django, NLP, NER, and various AI algorithms
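As an illustrative sketch (not taken from the original setup), a Spring Boot service like Advisor-Msg could externalize its RabbitMQ connection and the thread pool size in `application.yml`. The `spring.rabbitmq.*` keys are standard Spring Boot properties; the `advisor.crawl.thread-pool-size` key is a hypothetical custom property for the configurable pool mentioned above.

```yaml
spring:
  rabbitmq:
    host: localhost      # RabbitMQ broker hosting the crawl-request queue
    port: 5672
    username: guest
    password: guest

advisor:
  crawl:
    thread-pool-size: 8  # hypothetical custom property: worker threads per instance
```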
Opinions expressed by DZone contributors are their own.