Concerns Using Machine Learning in Software Development
Data, bias, and skillsets.
To learn about the current and future state of machine learning (ML) in software development, we gathered insights from IT professionals from 16 solution providers. We asked, "Do you have any concerns regarding using machine learning in the SDLC?" Here's what we learned:
- We’re all at the ideation and learning stage. Take the leap and find opportunities. Data is not specific to software development; it needs to be clean and governed. Be able to generate trusted insights with high-quality data.
- It’s a new area that requires companies to adjust how they collect and analyze data around the SDLC. Move from qualitative to quantitative procedures. Selenium was never architected to use ML.
- Some training data sets carry inherent biases. Similarly, some models can produce more false positives, and this is true when employing ML in the SDLC as well. Selecting the right algorithms and tooling will be a critical aspect of leveraging ML in the SDLC. Now for a more aggressive view of the future: as we live in an era where AI and ML are becoming commonplace (think autonomous vehicles), companies will start to wonder whether there is a need for a human developer at all. Can AI produce AI? Can intelligent systems self-generate executable code to achieve a set objective? If so, the need for humans in the loop is greatly diminished. Would AI and ML replace the current breed of software developers? I see this becoming a relevant discussion in the not-so-distant future; however, at present, there is still a need for human engagement to train systems to make the right algorithmic choices, apply the right gating measures, and so on.
- Think of ML differently than an app. The underlying systems are changing because they are inherently dynamic. As people deploy ML, they need oversight. Make sure experts are competent and they receive continuous training and certification. Experts need to demonstrate expertise on an ongoing basis. Put processes in place to make sure the system is doing what you expect and doing well.
- Skills are a concern. Having the expertise and knowledge to apply ML correctly is key and top of mind for all tech companies today. Another big topic is having the right data sets, with clean, labeled data that can be used for the applications. Certain issues that arise when applying AI and ML to end-user applications are also inherent in the SDLC, such as the introduction of unintended bias into the algorithms, which might skew the outcomes you are working towards, or the lack of good data, whether for training or for optimization. Another aspect we all must be careful of is the ‘black box’ approach. We need to protect against ‘key person dependency’ and avoid scenarios where, if a colleague leaves the business, the rest of the team is unaware of the outcomes that programmer was working towards. Working in silos is dangerous and contributes to the ‘black box’ metaphor.
- There can be a propensity to view ML as a be-all, end-all solution, but it’s not. It’s imperative developers adhere to traditional SDLC protocols to produce quality products.
- Developing and optimizing an ML model involves too many hyperparameters. It is hard to distinguish a failure of the method from a bad parameter choice. If the model is used for multiple tasks, it is hard to make sure incremental improvements for one task are not going to break others.
- When it fails over, there’s too much hype and not enough knowledge about the way it works. AI ops does not translate to development. All three cloud platforms have great tutorials. You can learn how to do algorithm training and model development.
- ML is an incredible technology, but as we use it in critical situations like medical diagnosis or self-driving cars, we need to think about the deeper questions of what happens when something goes wrong, including how to track and determine the root cause. More work and focus need to be given to this. The cost of making a bad decision needs to be considered.
- One main concern I have is understanding the problem we are trying to solve. First, there needs to be an understanding of whether ML in the SDLC is truly needed. You can do a lot with basic rule-based approaches, and ML can create noise, especially when you’re trying to do something very general and broad. I think people tend to start with something that’s overkill for what they need. The second problem is building a “one size fits all” solution. It is really hard to build something that can be applied everywhere the problem exists, because the context is always important. Always focus on very specific and tailored use cases first. If you can solve those well, then see if you can expand and generalize to others afterward.
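
One of the concerns above, that an incremental improvement for one task can quietly break another, can be guarded against with a simple per-task regression check before a new model is accepted. What follows is only a minimal sketch, not something the contributors described: the function name `find_regressions`, the task names, the scores, and the 0.01 tolerance are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.01,  # hypothetical threshold; tune per project
) -> List[str]:
    """Return the tasks whose score dropped by more than `tolerance`.

    A task that is missing from `candidate` is also reported, since an
    unevaluated task cannot be shown to be safe.
    """
    regressions = []
    for task, old_score in baseline.items():
        new_score = candidate.get(task)
        if new_score is None or old_score - new_score > tolerance:
            regressions.append(task)
    return sorted(regressions)


# Hypothetical per-task scores (higher is better) for the current model
# and for a candidate that was tuned to improve only the first task.
baseline = {"flaky_test_detection": 0.90, "defect_prediction": 0.80}
candidate = {"flaky_test_detection": 0.95, "defect_prediction": 0.70}

print(find_regressions(baseline, candidate))  # ['defect_prediction']
```

A check like this could run as a gate in the same pipeline that already runs the test suite, so a model change is held to the same "do not break what works" standard as a code change.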
Here’s who we heard from:
- Dipti Borkar, V.P. Products, Alluxio
- Adam Carmi, Co-founder & CTO, Applitools
- Dr. Oleg Sinyavskiy, Head of Research and Development, Brain Corp
- Eli Finkelshteyn, CEO & Co-founder, Constructor.io
- Senthil Kumar, VP of Software Engineering, FogHorn
- Ivaylo Bahtchevanov, Head of Data Science, ForgeRock
- John Seaton, Director of Data Science, Functionize
- Irina Farooq, Chief Product Officer, Kinetica
- Elif Tutuk, AVP Research, Qlik
- Shivani Govil, EVP Emerging Tech and Ecosystem, Sage
- Patrick Hubbard, Head Geek, SolarWinds
- Monte Zweben, CEO, Splice Machine
- Zach Bannor, Associate Consultant, SPR
- David Andrzejewski, Director of Engineering, Sumo Logic
- Oren Rubin, Founder & CEO, Testim.io
- Dan Rope, Director, Data Science and Michael O’Connell, Chief Analytics Officer, TIBCO
Opinions expressed by DZone contributors are their own.