Are We There Yet? Effort Estimation in AI Projects

Deep Learning is hitting the mainstream, but if your traditional software team has limited experience of AI, you have a few engineering challenges ahead!

Artificial Intelligence (AI) is regularly in the news, particularly Machine Learning and Deep Learning. We may not realize it, but most of us are using AI through applications from technology titans such as Google, Facebook, Netflix, Amazon, and Apple. Our uptake is enabled by the relatively huge capabilities of our home computers and mobile devices when compared to the computing power available to previous generations. The availability of large datasets, coupled with recent rapid advances in scientific research, has driven AI to deliver irresistible benefits to technology providers able to take advantage of them.

As Artificial Intelligence comes of age, it’s not just the software behemoths who are adopting it; businesses across the spectrum are considering expanding their conventional software projects to reap the rewards of AI. And why not? It shouldn’t be the preserve only of scientific researchers and large technology companies. However, if you or your organization are about to join the gold rush, you need to be aware of some of the potential challenges that many teams experience as they introduce Artificial Intelligence.

In this article, I’m going to summarise one such challenge, that of being able to estimate how much time a project using AI will need to reach maturity. My inspiration is drawn from a white paper to which I recently contributed, and which discusses a range of engineering challenges experienced by conventional software engineering teams that transition into the field of Deep Learning. Graphics and quotations included herein are by kind permission of Peltarion, with whom I authored the paper.

Common Terminology

“Artificial Intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals.”

The definition above is taken from the technical writer’s usual source, Wikipedia, but in this case, I feel it is somewhat lacking in depth (although the accompanying article is comprehensive). The problem of defining AI is many years old, and outside the scope of this article, and anyway, it has been tackled by experts with far better understanding and capacity to answer the question than I possess. Perhaps the best I can do is to use another famous definition, which is that computer intelligence is a simulation of human intelligence, and the AI in question passes the Turing test if it is able to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. Rather than provide anything more detailed here, I’d advise you to seek out a resource devoted to the subject at the level you need, of which plenty are available.

Machine Learning (ML) is a particular branch of AI that uses statistical techniques to parse data and learn from it, building a model from sample inputs that can then make predictions based on what it has learned. Machine Learning is used in a range of computing tasks such as email filtering and online recommendation engines.

Deep Learning (DL) is a subfield of ML in which neural networks with many layers learn useful features directly from raw data, rather than relying on features designed by hand. It is still a nascent technology, although it is gaining huge momentum. A few examples include Google Translate, DeepFace from Facebook, and Apple’s virtual personal assistant, Siri.

ML and DL models can be constructed using supervised learning, where the computer is presented with example inputs and their desired outputs and is “trained” to learn a general rule that maps the labelled inputs to the outputs, so that it can make predictions when presented with new, unseen data. For example, if you wanted to teach an AI model to identify every picture of a walrus on the Internet, you would input thousands of pictures, each labelled according to whether or not it contains a walrus. After the model has been trained on a large number of photos, it should be able to determine whether an unlabelled photo that it has not previously “seen” contains a walrus. Much like humans who practice a skill intensely, the model generally becomes more accurate as the amount of training data grows.
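To make the idea of learning a rule from labelled examples concrete, here is a minimal sketch in Python. It is not how a real image classifier works: the two numeric features per photo are invented for illustration, and the nearest-centroid rule is simply one of the easiest supervised techniques to show in a few lines.

```python
# Toy supervised learning: learn a rule from labelled examples, then predict
# a label for an input the model has not seen before.
# Assumption: each "photo" is reduced to two made-up numeric features.

def train(examples):
    """Compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical labelled training data: (features, label)
labelled_photos = [
    ([0.9, 0.8], "walrus"),
    ([0.8, 0.9], "walrus"),
    ([0.1, 0.2], "not walrus"),
    ([0.2, 0.1], "not walrus"),
]

model = train(labelled_photos)
print(predict(model, [0.7, 0.75]))  # an example the model was not trained on
```

The "model" here is nothing more than one average point per label. A Deep Learning model has vastly more parameters, but the workflow of training on labelled data and then predicting on unseen data is the same.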

In unsupervised learning, no labels are specified and the learning algorithm is left to find its own pattern or structure in the input data. The most common use case is cluster analysis, which is used to find hidden patterns within data. It may build clusters of walruses and non-walruses, but may equally build clusters of images where the sun is shining and those where it is cloudy, or group the images according to their most prevalent colours.
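The following sketch illustrates clustering with a bare-bones k-means on one-dimensional values. The "brightness" feature, the data and the fixed starting centres are all assumptions made for the example. Note that the algorithm is never told what the groups mean.

```python
# Toy unsupervised learning: k-means on 1-D values, no labels involved.
# Assumption: each photo is summarised by a single made-up brightness score.

def k_means(points, centres, rounds=10):
    """Assign each point to its nearest centre, then move each centre to the
    mean of its assigned points. Repeat for a fixed number of rounds."""
    clusters = []
    for _ in range(rounds):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters


brightness = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
centres, clusters = k_means(brightness, centres=[0.0, 1.0])
print(clusters)
```

The two groups that emerge separate dark photos from bright ones, which may have nothing to do with walruses. This is exactly the behaviour described above: the algorithm finds structure, but not necessarily the structure you had in mind.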

Pile on the Data

For an AI model to succeed, it requires training on plenty of good data. In our example of learning to pick out a walrus, the data would not only be thousands of photos, it would be thousands of photos that are labeled with the presence or absence of a walrus.

As a Deep Learning model is trained, it effectively guesses whether each photo contains a walrus or not. Each layer in the model works on a different level of walrus identification, starting from the image’s pixels and moving from simple lines and colors to higher-level shapes and shades. When the model is told whether it guessed right or wrong, the connections in the model adjust their weights accordingly. After a sufficient number of guesses, the model is weighted to the point that it has a good idea of what a walrus looks like.
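The guess-then-adjust cycle can be sketched with a single artificial neuron (a perceptron). This is a deliberate simplification: a Deep Learning model has many layers and uses a more sophisticated update rule, and the features, learning rate and number of passes below are assumptions chosen only to keep the example small.

```python
# Toy guess-and-adjust training loop for one artificial neuron.
# Assumption: two made-up features per photo; target 1 = walrus, 0 = no walrus.

def train_perceptron(examples, epochs=20, learning_rate=0.1):
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            guess = 1 if activation > 0 else 0
            error = target - guess  # 0 if the guess was right, +1 or -1 if wrong
            # Weights only change when the guess was wrong.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def classify(weights, bias, features):
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0


examples = [
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
    ([0.1, 0.2], 0),
    ([0.2, 0.1], 0),
]

weights, bias = train_perceptron(examples)
print([classify(weights, bias, features) for features, _ in examples])
```

On this tiny, cleanly separable data set the weights settle after a handful of passes. With real data you cannot know in advance how many passes will be needed, or whether the chosen model can reach the target at all.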

Effort Estimation

Estimating the time and resources required to complete a traditional software project can be relatively easy when it is based on a modular design and worked upon by an experienced team following modern development practices. It is much harder to estimate the effort needed when you initiate an AI project. Although the goals may be well defined, there is no way of guaranteeing when a model will arrive at weightings that deliver the desired results; an unknown number of iterations will be needed before results reach acceptable levels. It is not usually possible to decrease scope and run the project in a time-boxed setting with a predefined delivery date, since the model may not have met its performance goals by then.

It is essential to recognize this uncertainty from the outset. An added complication is the lack of transparency inherent in many models. While statistical techniques such as logistic regression and Bayesian inference are rather well understood, Deep Learning models are powerful but complex and poorly understood, which makes their behavior hard to predict.

One approach that can mitigate delays is to ensure that the input data is in the optimum format for the AI. Additionally, running experimental models that are trained and evaluated in swift prototyping cycles may not eliminate all uncertainty, but is likely to help in hitting milestones earlier.
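One way to picture swift prototyping cycles is as a loop with a target metric and a fixed budget of cycles. The sketch below is purely illustrative: the function names are hypothetical, and the accuracy figures are made up to stand in for real training and evaluation.

```python
# Sketch of budgeted prototyping cycles: train, evaluate, record, repeat.
# Assumption: train_and_evaluate(cycle) returns the accuracy of that cycle's
# prototype. Here it is a stand-in that returns made-up numbers.

def run_prototyping_cycles(train_and_evaluate, target_accuracy, max_cycles):
    history = []
    for cycle in range(1, max_cycles + 1):
        accuracy = train_and_evaluate(cycle)
        history.append(accuracy)
        if accuracy >= target_accuracy:
            return {"reached_target": True, "cycles_used": cycle,
                    "history": history}
    return {"reached_target": False, "cycles_used": max_cycles,
            "history": history}


def fake_train_and_evaluate(cycle):
    # Invented scores; a real project would train and test a model here.
    made_up_scores = [0.62, 0.71, 0.78, 0.83, 0.86, 0.88]
    return made_up_scores[min(cycle, len(made_up_scores)) - 1]


short_budget = run_prototyping_cycles(fake_train_and_evaluate,
                                      target_accuracy=0.85, max_cycles=4)
longer_budget = run_prototyping_cycles(fake_train_and_evaluate,
                                       target_accuracy=0.85, max_cycles=6)
print(short_budget["reached_target"], longer_budget["reached_target"])
```

With a budget of four cycles the made-up results keep improving but never reach the target, while six cycles would have been enough. Recording the metric after every cycle at least gives you evidence of progress to weigh against the remaining budget, even though it cannot tell you how many more cycles are required.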

Case Study: Poker Bot Identification

Online poker grew very quickly in the early 2000s, prompting nefarious attempts to use software “bots” to make a profit, successfully in some cases. Those running online poker sites needed to keep games free from bots for their real customers to stay loyal. One such organization initiated a Deep Learning project to detect the bots and quickly lock down their accounts. The data used included the statistics from gameplay activity (actions taken according to the state of the game) and game client technical details (connection IP, hardware fingerprints, player clicking/timing statistics).

Despite promising intermediate results, the project was canceled before it was completed because the model didn’t reach expected levels in the first few iterations. Although the goals of an ML project can be well defined, there is no way of guaranteeing when a satisfactory model will be reached, and it is not until that point that you can claim to have accomplished anything of value for the client. Being unable to set a final delivery date can lead to such projects being put on hold or canceled, even when intermediate results look promising.

Additional Challenges of AI Projects

The white paper I mentioned earlier discusses a range of engineering challenges, including effort estimation. These can be categorized into three areas: development, production, and organization, as shown in the graphic, which further subdivides them according to people, process and technology.

Image courtesy of Peltarion


Some large companies have begun to use AI and Deep Learning routinely in their products and services. However, there are a number of challenges to designing and implementing AI systems that are not always obvious to organizations unfamiliar with Deep Learning projects. They can be difficult to resolve if a team is inexperienced in the approach necessary to solve these kinds of challenges. While traditional software engineering teams may be experts in the use of high-quality tools and processes for coding, reviewing, debugging and testing, these are rarely sufficient for building production-ready systems containing Deep Learning components.

Teams familiar with traditional software engineering, together with the Deep Learning community, need to combine their knowledge to find solutions to these challenges. Only then will the advantages of Deep Learning technology become available to the majority of companies around the world.

artificial intelligence, deep learning, estimation, machine learning, supervised learning, unsupervised learning

Opinions expressed by DZone contributors are their own.
