Intelligently Automate Machine Learning, Artificial Intelligence, and Data Science
Automation will need to be augmented with human intelligence for the foreseeable future.
It was great speaking with Michael Berthold, Founder and CEO at KNIME, during their fall summit. Michael created KNIME after seeing how much valuable data pharmaceutical companies were generating, and how hard it was for them to extract insights because of the difficulty of preparing and analyzing that data.
KNIME is an open platform that enables organizations to put their data to good use. Open data science platforms enable:
Seamless integration of data and tools
Freedom to use the frameworks and languages you choose
Intuitive ease of use by data scientists, analysts, developers, and business owners
Transparent visibility into the models being used, which promotes honesty and trust in the process
Flexibility and agility
Recently, we've seen a proliferation of tools that claim to automate all or part of the data science cycle. Typically, those tools automate only a few phases of the cycle, support only a small subset of available models, and are limited to relatively simple data formats.
KNIME's vision for automation is no black boxes. Everything is open, so tools and data can be automated and combined seamlessly. If the data science team works on a well-defined analysis scenario, then more automation may make sense. However, more often than not, the interesting analysis scenarios are not that easy to control, and a certain amount of interaction with users is highly desirable.
KNIME has laid out the principles of guided analytics in a thoughtful and thorough way, including interaction steps such as:
Uploading the dataset
Selecting the target
Establishing feature engineering settings
This way of creating analytical applications allows automation and interaction to be mixed and matched based on the needs of the analysis. KNIME's workflow serves as the blueprint for anyone to build their own version of a guided analytics application. The workflow provides reusable pieces for data transformation and cleaning, feature selection and engineering, and model optimization and selection, and it even allows users to download and inspect the resulting scoring workflow.
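To make the mix of interaction and automation concrete, here is a minimal sketch in plain Python. It is not KNIME's workflow or API: the names `GuidedSettings`, `clean`, `select_features`, and `run_guided_analysis` are hypothetical, the sample data is made up for illustration, and ranking features by correlation with the target is only a simple stand-in for real feature selection. The user supplies the choices (target column, number of features to keep), and the remaining steps run automatically.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GuidedSettings:
    """Choices the user makes at the interaction points (hypothetical names)."""
    target: str
    max_features: int = 2


def clean(rows):
    """Automated step: drop rows that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_features(rows, settings):
    """Automated step: rank features by absolute correlation with the target."""
    target = [r[settings.target] for r in rows]
    candidates = [c for c in rows[0] if c != settings.target]
    ranked = sorted(
        candidates,
        key=lambda c: abs(correlation([r[c] for r in rows], target)),
        reverse=True,
    )
    return ranked[: settings.max_features]


def run_guided_analysis(rows, settings):
    """Chain the automated steps, driven by the user's settings."""
    cleaned = clean(rows)
    features = select_features(cleaned, settings)
    return {"rows_used": len(cleaned), "features": features}


# Illustrative data only; the fourth row has a missing value and is dropped.
data = [
    {"dose": 1.0, "age": 30.0, "noise": 5.0, "response": 2.1},
    {"dose": 2.0, "age": 41.0, "noise": 1.0, "response": 3.9},
    {"dose": 3.0, "age": 35.0, "noise": 4.0, "response": 6.2},
    {"dose": 4.0, "age": None, "noise": 2.0, "response": 8.0},
    {"dose": 5.0, "age": 52.0, "noise": 3.0, "response": 9.8},
]
result = run_guided_analysis(data, GuidedSettings(target="response", max_features=1))
print(result)
```

Changing only the `GuidedSettings` values reruns the same automated steps with different user choices, which is the mix-and-match idea described above in miniature.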
The workflow is available on the KNIME workflow hub. The following video walks through the different steps and explains the underlying techniques.
According to Berthold, "Guided Analytics for Machine Learning Automation" is a starting point from which KNIME will provide more custom variations, while encouraging its community to share their own on the community workflow hub.
He foresees AI hype cooling down in the coming year, with some applications succeeding through pragmatic AI that makes AI/ML part of the mix. He expects the hype around automation to continue, with real potential for automating well-defined problems. Moving big data projects into production will remain a challenge, hence the importance of flexible platforms that productionize data science.