Take a look at what active learning is and how it engages AI to train AI.
Besides its many practical benefits, we love machine learning (ML) for the fun it brings: the solutions found for tasks across the ML model lifecycle often perform in ways that look like magic. After more than half a century of development at a Moore's-law pace, artificial intelligence (AI) and machine learning still have untapped depths to explore for better performance and impressive findings. Active and proactive learning are remarkable techniques that engage AI to train an AI.
Active Learning (AL) is a technique for supervised model training. To start from the beginning: a basic training iteration consists of feeding a labeled data sample to the model's inputs, assessing the output (a prediction or inference), and correcting the model's weights and biases. The iteration then repeats with the next labeled sample. After a number of iterations, the model is evaluated on data it has not seen, drawn from a separate pool prepared beforehand along with the training data pool. Then it either goes into production or back for another set of training iterations.
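A single training iteration as described above can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a one-neuron logistic classifier, and the names `predict`, `train_step`, and `labeled_pool` as well as the toy data are hypothetical, not taken from any particular framework.

```python
import math

def predict(weights, bias, features):
    """Forward pass of a one-neuron logistic model (an assumed toy model)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train_step(weights, bias, features, label, lr=0.1):
    """One iteration: feed a labeled sample, assess the output, correct weights and bias."""
    error = predict(weights, bias, features) - label  # gradient of log loss w.r.t. z
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias

# Hypothetical labeled pool: the label is 1 when the first feature dominates.
labeled_pool = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([0.9, 0.2], 1), ([0.1, 0.8], 0)]

weights, bias = [0.0, 0.0], 0.0
for _ in range(200):                       # repeat the iteration over the pool
    for features, label in labeled_pool:
        weights, bias = train_step(weights, bias, features, label)
```

Every pass through the inner loop consumes one labeled sample, which is why the cost of labeling matters so much in what follows.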
The key to a model good enough for real-world tasks lies in the quality of the training data pool (more on that below) and in the quality of the labeling.
Labeling data samples requires domain expertise. The more complicated the domain, the deeper (and more expensive) the expertise required and the more time each sample takes. For example, in biomolecular research, labeling one sample takes up to 15 seconds. Imagine the headache of labeling thousands of samples for one training iteration. Active Learning helps here by minimizing the time a domain expert spends in the training loop. The main idea is to use so-called query strategies to pass the expert only the most informative (hence most useful for training) samples to label. Those query strategies are backed by statistical or AI algorithms, such as query by committee, which relies on a committee of specially trained AI models.
The other place where intelligent techniques apply in the Active Learning loop is the model validation step. Techniques similar to query strategies, called selecting strategies, are used to move the most salient unlabeled samples from the validation data pool into the training data pool, sustaining the quality of the latter.
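To make the idea of a query strategy concrete, here is a minimal sketch of query by committee using vote entropy as the disagreement measure. The committee of three threshold classifiers, the pool values, and the function names are assumptions made up for illustration; a real committee would consist of trained models.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Disagreement among committee votes for one sample (0 means unanimous)."""
    total = len(votes)
    return sum(-(c / total) * math.log(c / total) for c in Counter(votes).values())

def query_by_committee(committee, unlabeled_pool, batch_size):
    """Return the batch_size samples on which the committee disagrees most."""
    scored = [(vote_entropy([member(x) for member in committee]), x)
              for x in unlabeled_pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in scored[:batch_size]]

# Hypothetical committee: three classifiers with slightly different thresholds.
committee = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.6),
]
unlabeled_pool = [0.1, 0.45, 0.55, 0.9]

# Only the samples the committee argues about go to the expert.
to_label = query_by_committee(committee, unlabeled_pool, batch_size=2)
```

The committee agrees on the samples far from its thresholds, so the expert is only asked about the two samples in the contested region.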
Long story short, Active Learning is supervised learning of an AI with AI (and an expert in the loop). See Exhibit 1 for a schematic of the AL cycle.
Exhibit 1: regular Active Learning cycle
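The cycle above (train, select the most informative sample, ask the expert, retrain) can be sketched as a loop. Everything here is an assumption for illustration: the model is a hypothetical one-dimensional threshold classifier, the query strategy is plain uncertainty sampling, and the `expert` function stands in for the human in the loop.

```python
def fit_threshold(labeled):
    """Hypothetical 1-D model: the midpoint between the two classes."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    return (max(negatives) + min(positives)) / 2

def most_informative(unlabeled, threshold):
    """Uncertainty sampling: the sample closest to the decision boundary."""
    return min(unlabeled, key=lambda x: abs(x - threshold))

def active_learning_loop(labeled, unlabeled, oracle, budget):
    """Spend at most `budget` expert labels, retraining after each one."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)               # train
        query = most_informative(unlabeled, threshold)   # select
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))           # expert labels
    return fit_threshold(labeled), labeled

def expert(x):
    """Stand-in for the domain expert; the true boundary is unknown to the model."""
    return int(x >= 0.62)

seed = [(0.0, 0), (1.0, 1)]                # two labeled samples to start with
pool = [i / 20 for i in range(1, 20)]      # nineteen unlabeled samples
threshold, labeled = active_learning_loop(seed, pool, expert, budget=4)
```

With a budget of four labels out of nineteen candidates, the learned threshold ends up much closer to the expert's boundary than the initial midpoint guess of 0.5.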
OK, done with training; let's move on to production. There's a hell of a story there too, but let us say we've made it through: the model is running in production. As in human life, training cannot prepare perfectly for a real-life environment. The scope of training cases is limited and relies on the skilled choices of a tutor, which in our case is the ML engineer. Say you've trained a natural language processing (NLP) model. To what degree is it prepared to deal with cultural specifics? How about Cockney rhyming slang?
For more complicated tasks, like financial fraud detection or early-stage disease recognition, such occasional novelties (in fact, anomalies relative to the expected production conditions) are invisible to the naked eye and cause unpredictable inferences at the application's output. Such an inference is essentially faulty, and the error spreads to all the subsequent processes and operations that depend on the ML application.
It would be great if an ML application could react to that kind of situation and learn reactively and proactively, preventing faults from occurring or, if they have occurred, from repeating.
That would take three capabilities:
- detecting anomalies, novelties, and concept drifts, and building a high-quality sample pool for retraining,
- retraining (Active Re-Learning),
- redeploying a new version of the model into production on the go.
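The three capabilities listed above can be sketched as one small serving loop. This is a toy illustration, not a description of any real ML operations product: `ModelSlot`, `serve`, `is_anomaly`, and `retrain` are hypothetical names, and `retrain` stands in for the whole Active Re-Learning step, expert labeling included.

```python
import threading

class ModelSlot:
    """Holds the live model; swap() replaces it while predict() keeps serving."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()
        self.version = 1

    def predict(self, x):
        with self._lock:
            model = self._model      # take a reference, then release the lock
        return model(x)

    def swap(self, new_model):
        with self._lock:
            self._model = new_model
            self.version += 1

def serve(slot, stream, is_anomaly, retrain, pool_size):
    """Detect anomalies, pool them, retrain, and redeploy on the go."""
    retraining_pool, outputs = [], []
    for x in stream:
        outputs.append(slot.predict(x))
        if is_anomaly(x):
            retraining_pool.append(x)               # build the retraining pool
        if len(retraining_pool) >= pool_size:
            slot.swap(retrain(retraining_pool))     # retrain and hot-redeploy
            retraining_pool = []
    return outputs

# Toy setup: version 1 has never seen values of 10 or more.
slot = ModelSlot(lambda x: "small" if x < 10 else "unknown")
outputs = serve(
    slot,
    stream=[1, 12, 15, 3, 20],
    is_anomaly=lambda x: x >= 10,
    retrain=lambda pool: (lambda x: "small" if x < 10 else "large"),
    pool_size=2,
)
```

In this toy run the first two large inputs get an "unknown" answer, trigger a retrain, and the last large input is then handled by the redeployed version without the loop ever stopping.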
Active [re]Learning is a well-known practice today, so no trouble there. Anomaly detection and bumpless redeployment on the go are a bit more complicated, but there is good news: solutions for both have emerged on the ML operations market. There are environments built to run on any infrastructure and to allow hot redeployment as many times a day as needed. Those environments can be seamlessly integrated with monitoring solutions built on statistical methods and ML-based algorithms (e.g. GAN, MADE) to detect anomalies, concept drifts, and faults in the production input and output spaces. Voilà! It's magic, and it looks like free will is the last thing distinguishing human beings from a multilayer perceptron.
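As a rough illustration of the statistical side of such monitoring, the sketch below flags drift when the mean of a recent production window moves too far from the training data. It is a deliberately simple z-test on one feature, not the GAN- or MADE-based detection mentioned above; the function names, the threshold of 3.0, and the sample data are all assumptions.

```python
import statistics

def drift_score(reference, window):
    """Z-score of the window mean under the reference (training) distribution."""
    mu = statistics.fmean(reference)
    sigma = statistics.stdev(reference)
    standard_error = sigma / len(window) ** 0.5
    return abs(statistics.fmean(window) - mu) / standard_error

def needs_retraining(reference, window, z_threshold=3.0):
    """Flag drift when the window mean is implausible for the training data."""
    return drift_score(reference, window) > z_threshold

# Hypothetical values of one input feature.
training_feature = [0.1 * (i % 10) for i in range(100)]   # values from 0.0 to 0.9
same_regime = [0.1 * (i % 10) for i in range(20)]         # production looks like training
shifted_regime = [value + 0.5 for value in same_regime]   # production has drifted

stable = needs_retraining(training_feature, same_regime)
drifted = needs_retraining(training_feature, shifted_regime)
```

A check like this only watches the mean of a single feature; it is meant to show where a monitoring hook sits, not to replace a proper detector.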
Exhibit 2: Proactive Learning
- Chen Y, et al., 2013, Applying active learning to high-throughput phenotyping algorithms for electronic health records data, J Am Med Inform Assoc 2013;20:e253–e259. doi:10.1136/amiajnl-2013-001945
- Lewis D, Catlett J, 1994, Heterogeneous uncertainty sampling for supervised learning. Proceedings of the Eleventh International Conference on Machine Learning;1994.
- Lukas Biewald, Starting a Second Machine Learning Tools Company, Ten Years Later, 2018, [online], https://medium.com/@l2k/starting-a-second-machine-learning-tools-company-ten-years-later-21a40324d091
- Pushkarev S., 2018, “Monitoring AI with AI”, 2018, ODSC East 2018 speech, [online video], https://www.youtube.com/watch?v=4M_3oZlc2B4
- Smith K., Horvath P., 2014, Active Learning Strategies for Phenotypic Profiling of High-Content Screens — Journal of Biomolecular Screening 2014, Vol. 19(5) 685–695