What It Takes to Build a Production-Ready AI Solution
Learn about the process one organization followed when building an AI solution for the first time, and the lessons it learned along the way.
When we decided to open up SmartCat, we already had various experiences working mostly for outsourcing web development companies. At first, we tried to apply the same processes and steps to build data science solutions. We realized soon enough that this would be a learning journey and we needed to adjust our way of thinking in order to bring the AI project to production successfully. What makes this process even harder is the team itself, consisting of engineers and scientists from different backgrounds, speaking completely different languages.
This article describes our experience so far. The story is not complete, since this field is changing rapidly and we are adjusting to changes on a daily basis.
Initial Idea vs. Reality
We wanted to build a data science company and we wanted to help various businesses solve their problems by using smart solutions. Data science and big data were gaining popularity. Everybody was reading about machine learning and AI solutions and how those solutions had made other companies successful. Most of the clients that contacted us back then came with the same problems: "We are collecting data and we want to do something smart but we do not have a clear vision. Can you help us?"
This is how we came up with a business-insights-as-a-service offering, where we request the data and data specifications, jump on it, and do exploratory analysis in order to gain insights and produce key conclusions, visualizations, and a couple of ideas for the future. After such an engagement, clients mostly declined to pursue those ideas because they realized this was not a priority for them at the time. And that was the majority of our data science engagements back then.
On the other front, we were mostly successful, since data engineering and data ops services were something that our clients requested often. They recognized the power of data, and they wanted to equip themselves with the right infrastructure and tools. So they came to us for help with building pipelines and storage solutions. At the beginning of SmartCat, our two teams — data science and data engineering/ops — were separated, working on different projects and not collaborating much.
Our vision from the start was to build a company that could deliver data science solutions end-to-end: working on ideas with the client in workshops, where we choose the right idea and create a detailed plan, then prototyping, and finally integration. But those projects were nowhere to be found. We were patient since we firmly believed there was a gap in the market, and that companies not big enough to have an internal data science team could profit from our expertise.
And after a year or so, an end-to-end project appeared: a recommender system that would work on user analytics and real-time offers to produce personalized recommendations. Finally, all aspects we wanted in a single solution: a recommender to build personalized recommendations was the perfect task for the data science team, given that real-time offers are something our data engineering team solves each day, and all of this had to run on a reliable infrastructure, which was perfect for our data ops team.
But things did not go as smoothly as we had hoped.
The Process, or Lack of It
So, we started. The first phase was easy. A single data scientist sat down with the data, started writing Jupyter notebooks in Python, and started reasoning about the data. We were confident that we understood the problem, so we did not involve the client that much. We gained some key insights, plotted several graphs, and composed a document with insights, features, and ideas about implementation. We organized a call with the client, eager to present our findings. During the call, we realized we had not involved the client enough in this phase: some of our findings were not relevant for the domain, some were not relevant for this particular company, and we had missed important details. Our report was half-finished, and it was back to the drawing board. Still, that call turned out to be really important for us.
Lesson 1: We should involve the client more often in this phase.
We prepared a good report on the second attempt, and the client was happy with the results. We chose one of the approaches and decided to start with prototyping. Again, a lone data scientist started creating a web demo to showcase the idea. We tried a couple of different algorithms and settled on the one that made sense to us and gave the best results. We put together a small web demo and the client started testing. They were satisfied with the results, apart from a few small remarks that we addressed, and we got the approval to proceed to the production phase.
The production phase, in our minds, should have been easy: we had the web demo, so we would just build the API and integrate it with the final solution. We hugely underestimated this phase. The prototype code was not ready for production at all. It had been written by a data scientist. I mean no offense, but data scientists are focused on their own problem: clustering users and showing the best possible recommendations to each user. They are all about precision; if a test set shows above 90% accuracy, that is considered a good job. They do not think about the number of users, ways to integrate, dataset size, performance, availability of the systems around the data science solution, etc. One example is our use of Flask as the Python web framework. It is good for this phase. You can put a demo together easily and quickly. But we ran into problems once the demo had to serve more than one user. If we had done all this in Django, we could have avoided double work. Also, working with a database dump is not the same as working with the database itself. You are not the only user of the system, and you must take that into account.
Lesson 2: Involve data engineers with data scientists earlier, already in the prototyping phase, to avoid double work.
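One way to surface such production concerns while still prototyping is to measure how the prototype behaves under a realistic number of requests, not only how accurate it is. The following is a minimal sketch under stated assumptions, not the code we used: `recommend` stands for any hypothetical prototype function that takes a user ID, and `measure_latencies` and `percentile` are illustrative helpers written with the Python standard library only.

```python
import math
import time


def measure_latencies(recommend, user_ids):
    """Call `recommend` once per user ID and return per-call latencies in seconds.

    `recommend` is a hypothetical prototype function supplied by the caller.
    """
    latencies = []
    for user_id in user_ids:
        start = time.perf_counter()
        recommend(user_id)
        latencies.append(time.perf_counter() - start)
    return latencies


def percentile(values, percent):
    """Nearest-rank percentile; `percent` is a number from 0 to 100."""
    if not values:
        raise ValueError("no measurements to summarize")
    ordered = sorted(values)
    # Nearest-rank method: take the value at rank ceil(percent / 100 * n).
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[max(0, rank - 1)]


if __name__ == "__main__":
    # Stand-in recommender that only simulates work; replace with the prototype.
    def fake_recommend(user_id):
        return sorted(range(1000), key=lambda item: (item * user_id) % 97)[:10]

    timings = measure_latencies(fake_recommend, range(1, 501))
    print(f"p95 latency: {percentile(timings, 95) * 1000:.2f} ms")
```

Running a check like this against production-sized data, rather than a small dump, gives data scientists and data engineers a shared number to discuss early. The 95th percentile used here is an arbitrary choice for illustration; the right threshold depends on the client's requirements.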
You should have heard how we communicated about problems. It took us a while to explain to each other how the system worked. The data science team could not grasp how data was ingested into the system, what should be cached, what hurt our performance, etc., while data engineers asked a bunch of times how the whole pipeline for calculating recommendations worked. We spoke different languages. Where the data science team used the word "bean," the data engineering team used the word "bucket," and they were explaining basically the same thing. We paid the price for not working closely together previously and not sharing knowledge enough.
Lesson 3: Do more internal knowledge-sharing sessions and adjust vocabulary and processes.
The other thing we realized in this phase was that we were building a complex system, and we had taken that complexity for granted. You must do a good job of automation and monitoring so that you can maintain the system successfully. When working on data science systems, you need to monitor your tools and infrastructure, but you also need to monitor how your algorithm performs in production (e.g., click rate, goals achieved, revenue generated). These are the things we know now and tell clients upfront, but which we did not know back then.
Lesson 4: Do not underestimate the complexity of the system, and speak upfront about all the details that are needed for a production-ready system with the client.
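Monitoring how the algorithm performs in production can start very simply. The sketch below is an illustration under our own assumptions rather than the system we built: `RecommendationStats`, `needs_attention`, and the thresholds are hypothetical names and values, and a real deployment would feed these counters from its event pipeline and export them to whatever monitoring tool is already in place.

```python
from dataclasses import dataclass


@dataclass
class RecommendationStats:
    """Running business counters for one recommender (hypothetical structure)."""

    impressions: int = 0   # recommendations shown to users
    clicks: int = 0        # recommendations that users clicked
    revenue: float = 0.0   # revenue attributed to clicked recommendations

    def record_impression(self) -> None:
        self.impressions += 1

    def record_click(self, revenue: float = 0.0) -> None:
        self.clicks += 1
        self.revenue += revenue

    @property
    def click_rate(self) -> float:
        # Return 0.0 until at least one recommendation has been shown.
        return self.clicks / self.impressions if self.impressions else 0.0


def needs_attention(stats: RecommendationStats,
                    min_click_rate: float,
                    min_impressions: int = 100) -> bool:
    """Flag the recommender only once there is enough traffic to judge it."""
    if stats.impressions < min_impressions:
        return False
    return stats.click_rate < min_click_rate


if __name__ == "__main__":
    stats = RecommendationStats()
    for _ in range(200):
        stats.record_impression()
    for _ in range(4):
        stats.record_click(revenue=2.5)
    print(stats.click_rate, stats.revenue, needs_attention(stats, 0.05))
```

Keeping business metrics next to the usual infrastructure metrics makes it possible to notice a model that is up and responding but no longer producing useful recommendations.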
The first phase is really important; make sure to do the best job here to understand the client's needs and their business. In this phase, communication is the key ingredient. Our face-to-face workshop helps to build a strong relationship with the customer, understand the problem, and define an implementation plan in this phase. All upcoming phases depend on the success of this phase.
Today we know much better what types of engagements there are, what the steps in each phase are, and which team members are needed in each phase to finish the project successfully.
We have divided our projects into three groups: business insights as a service, prototyping, and production. For each phase, it is important to have people from different teams involved working together to have a successful delivery. From the initial idea, one more role has appeared in our technical team — a data wrangler, someone who does not have to be a data scientist with rich knowledge about algorithms and approaches but someone with strong business knowledge; a glorified business analyst armed with Jupyter and Python knowledge. They play an important role in the initial phase, work with clients in the business domain, and extract key findings from data. Their sole task in the initial phase is to come up with the best possible idea that can improve the business of the client.
We have improved our internal communication. We now know that "bucket" for engineers is "bean" for scientists. We are learning the vocabulary of each team and we are more efficient. What makes us better is the fact that we need to explain our solution to different team members not speaking our language, and in the end, this means we will do a better job explaining what we do to our client. It is easy to explain an approach you took to a fellow data scientist, but not so much when speaking with a business stakeholder interested in the profit that the said solution will realize. You need to adjust your language, so we started practicing that internally. Diagrams also look much better now, since people from different teams need to understand what is going on, so visualizations that we make have improved a lot these days.
As I said at the beginning, this is a field that is changing a lot nowadays, so I expect this is not the end of our learning journey. What is certain is that we are a much better company now than we were two years ago, thanks to projects done with mixed teams.
Published at DZone with permission of Nenad Bozic, DZone MVB.