Machine Learning in Software Development — Techniques and Tools

The ability to version-control ML models, automate testing, and provide better feedback.

By Tom Smith, DZone Core · Sep. 26, 19 · Analysis

Machine learning techniques and tools

To learn about the current and future state of machine learning (ML) in software development, we gathered insights from IT professionals from 16 solution providers. We asked, "What machine learning techniques and tools are most effective for the SDLC?" Here's what we learned:

Tools

  • MLflow, Bugspots, Helium, and Appvance are powerful tools. I particularly like MLflow for its ease of use and its ability to version-control ML models.
  • We adopted MLflow for our data platform as an ML management system. The platform is an operational database, real-time and transactional, with in-database ML, and MLflow tracks the workflow of the data scientists. If you adopt a culture of experimentation and create 50 experiments a day, each running and producing a different result, you need to keep track of every one. You need the ability to tag runs with parameters and metrics so you can go back and see why one model performed better than another.
  • We’re building those tools as part of our platform. We use open source tools like scikit-learn, PyTorch, and TensorFlow, and we also build our own.
  • Many modern test automation tools offer self-healing tests, automated tests, and automated crawlers that find bugs. Logging systems find anomalies for security alerts. Most of the focus is on maintenance.
  • Tools simplify infrastructure and data engineering for developers. With ML, an explosion of things needs to happen, so easy integration into the application matters. Debugging is more difficult because ML models are living entities, and drift occurs as the data and the learning change. The biggest challenge is the debuggability of the code and the application. Make sure you have traceability of your model decisions, and evaluate model performance over time.
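The point above about tagging experiments with parameters and metrics can be illustrated with a minimal sketch. It uses only the Python standard library and does not call MLflow's API; `Run`, `ExperimentLog`, and the sample parameter names and values are hypothetical, chosen only to show the idea of recording runs and later asking why one beat another.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run, tagged with its parameters and resulting metrics."""
    name: str
    params: dict
    metrics: dict = field(default_factory=dict)


class ExperimentLog:
    """Hypothetical in-memory stand-in for a real tracking tool."""

    def __init__(self):
        self.runs = []

    def record(self, name, params, metrics):
        """Store a run together with its parameters and metrics."""
        run = Run(name, dict(params), dict(metrics))
        self.runs.append(run)
        return run

    def best(self, metric):
        """Return the run with the highest value of the given metric."""
        return max(self.runs, key=lambda r: r.metrics[metric])

    def param_diff(self, a, b):
        """Return the parameters whose values differ between two runs."""
        keys = set(a.params) | set(b.params)
        return {
            k: (a.params.get(k), b.params.get(k))
            for k in sorted(keys)
            if a.params.get(k) != b.params.get(k)
        }


# Illustrative values only, not results from any real experiment.
log = ExperimentLog()
baseline = log.record("baseline", {"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
deeper = log.record("deeper", {"lr": 0.1, "depth": 6}, {"accuracy": 0.86})

winner = log.best("accuracy")
difference = log.param_diff(baseline, winner)
print(winner.name, difference)
```

With 50 runs a day, the same two queries (best run by a metric, parameter difference against a baseline) are what make the history useful rather than just large.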

Feedback

  • The most effective technique is to define the task at hand as clearly as possible and immediately come up with an automatic evaluation method. After that, collect and label a small dataset for your problem, overfit to that dataset with any method, and try to close the whole production loop: dataset collection, training, evaluation, and deployment. Most of the time you’ll realize that your evaluation method does not measure what you intended for your product, and you will have to go through these stages again.
  • The answer for everything is DevOps, but a better answer is to think in terms of providing useful feedback loops. We tend to focus on ceremony and mechanics without instrumenting ops in a way that lets a developer get value from the metrics. To prevent analysis paralysis, include ML at the ops level to give developers the information they need. You want to see anomaly rates that diverge from projections. Build anomaly detection models based on code. Ops is creating better feedback data for developers.
  • Python is the default language for scripting the frameworks. There are a lot of models that can be used, or you can build your own. Reinforcement learning (deep adversarial, Q-learning), semi-supervised learning, and closed-loop ML techniques have proven beneficial in different phases of the SDLC. When organizations build models, the underlying premise is that the model’s accuracy and efficiency rest on certain assumptions and depend on the training data set it is privy to. If there is a change in data patterns or an unanticipated scenario, the model’s accuracy and efficiency may diminish over time. For example, in a manufacturing plant, a model can be deployed to detect defects in parts being manufactured and assembled on the assembly line. Over time, the model’s ability to accurately identify the errors may diminish. This results in severe challenges if the software uses traditional analytics exclusively. However, when equipped with closed-loop functionality, smart agents can auto-detect the decline and trigger a re-learning and re-training process to improve the accuracy and performance of the models automatically, leading to increased productivity, efficiency, and cost savings. The closed-loop ML technique for the SDLC can use reinforcement or unsupervised algorithms to train, test, and validate ML models to improve accuracy. After the initial deployment, the model can, as needed, self-learn, self-adjust, and detect variations in its own accuracy and performance. In short, it will tune itself so that the output is optimal.
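As a rough illustration of the closed-loop idea described above, the sketch below watches a model's accuracy over a rolling window of labeled outcomes and signals when retraining looks warranted. It is a simplified assumption of how such a trigger could work, not any contributor's implementation; `DriftMonitor`, the window size, the threshold, and the sample defect stream are all hypothetical.

```python
from collections import deque


class DriftMonitor:
    """Tracks rolling accuracy and signals when retraining looks warranted."""

    def __init__(self, window=5, threshold=0.8):
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = wrong
        self.threshold = threshold

    def observe(self, predicted, actual):
        """Record whether the latest prediction matched the true label."""
        self.outcomes.append(1 if predicted == actual else 0)

    def rolling_accuracy(self):
        """Accuracy over the current window (1.0 before any observation)."""
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the threshold."""
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.rolling_accuracy() < self.threshold


# Hypothetical (predicted, actual) pairs from a defect-detection line:
# the model starts missing defects partway through the stream.
stream = [
    ("ok", "ok"), ("ok", "ok"), ("defect", "defect"),
    ("ok", "defect"), ("ok", "defect"), ("ok", "defect"),
]

monitor = DriftMonitor(window=5, threshold=0.8)
retrain_triggered_at = None
for i, (predicted, actual) in enumerate(stream):
    monitor.observe(predicted, actual)
    if monitor.needs_retraining():
        retrain_triggered_at = i  # a real system would start retraining here
        break

print(retrain_triggered_at, monitor.rolling_accuracy())
```

Waiting for a full window before deciding is a deliberate choice: it keeps a single early mistake from triggering a costly retraining cycle.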

Other

  • ML is becoming standardized across the SDLC: people are learning how to use it, gaining visibility into where things are going, and becoming more distributed.
  • We're seeing more around deep learning and specific ML methods.
  • It depends on the business case. Classic data science is needed to understand the right algorithm and ensure data management. You may need to choose a model that’s almost as good but computationally less expensive. Incorporate a desirability function to consider the cost of planning and deployment.
  • Techniques I am seeing include learning techniques such as concept learning, decision trees, neural networks (and convolutional neural networks), if/then rules, reinforcement learning, inductive logic programming, and the like.
  • Here are the main elements:
    • 1) Ensure business requirements and expectations are set from the beginning. This helps define the ROI for the project and what you’re looking to solve for (e.g., better customer engagement, reduced churn, etc.).
    • 2) Convert the business problem into a technical problem. This lets you define what data is needed, the approach, where to start, etc., so you can set the scope of the solution. You take the business problem of improving customer satisfaction or gaining market share and you turn it into a data science problem, such as prediction of customer conversion or churn, user segmentation, or product recommendation, which is something you can solve using data and a model.
    • 3) Establish what data is actually available to solve the problem. This can be one of the biggest limiting factors in applying ML in the SDLC. There needs to be sufficient and relevant data to solve the problem, and there needs to be a base level of normalization. Given the technical problem, you need to identify which entities can be relevant features to plug into the model.
    • 4) Design the iteration process. Given your toolkit, start with the simplest approach possible and see how it performs. Based on those results, you have a sense of direction for where to go and how to add complexity.
    • 5) Experimentation and quality: Design experiments so you can test performance, make modifications, re-evaluate, then rinse and repeat. Make sure you pick the right metrics, so you measure what really matters.
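The earlier remark about choosing a model that is almost as good but computationally cheaper, using a desirability function, can be sketched as follows. The scoring formula, the `cost_weight` value, and the candidate models with their numbers are assumptions made up for illustration; a real desirability function would be defined by the business case.

```python
def desirability(accuracy, cost, cost_weight=0.5):
    """Hypothetical score: reward accuracy, penalize normalized compute cost.

    Both accuracy and cost are assumed to be scaled to the range 0..1.
    """
    return accuracy - cost_weight * cost


# Illustrative candidates only, not measurements of real models.
candidates = {
    "large_net": {"accuracy": 0.92, "cost": 0.90},
    "small_tree": {"accuracy": 0.90, "cost": 0.10},
}

# Picking purely on accuracy favors the expensive model.
best_by_accuracy = max(candidates, key=lambda n: candidates[n]["accuracy"])

# Picking on desirability favors the model that is nearly as accurate but cheaper.
chosen = max(candidates, key=lambda n: desirability(**candidates[n]))

print(best_by_accuracy, chosen)
```

Raising or lowering `cost_weight` is how the trade-off between accuracy and deployment cost would be tuned in this sketch.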

Here’s who we heard from:

  • Dipti Borkar, V.P. Products, Alluxio
  • Adam Carmi, Co-founder & CTO, Applitools
  • Dr. Oleg Sinyavskiy, Head of Research and Development, Brain Corp
  • Eli Finkelshteyn, CEO & Co-founder, Constructor.io
  • Senthil Kumar, VP of Software Engineering, FogHorn
  • Ivaylo Bahtchevanov, Head of Data Science, ForgeRock
  • John Seaton, Director of Data Science, Functionize
  • Irina Farooq, Chief Product Officer, Kinetica
  • Elif Tutuk, AVP Research, Qlik
  • Shivani Govil, EVP Emerging Tech and Ecosystem, Sage
  • Patrick Hubbard, Head Geek, SolarWinds
  • Monte Zweben, CEO, Splice Machine
  • Zach Bannor, Associate Consultant, SPR
  • David Andrzejewski, Director of Engineering, Sumo Logic
  • Oren Rubin, Founder & CEO, Testim.io
  • Dan Rope, Director, Data Science and Michael O’Connell, Chief Analytics Officer, TIBCO