Machine Learning: If It’s Testable, It’s Teachable
Learn how algorithmic robots work with a simple example of teacher bots and student bots.
Ever wondered how you went to YouTube to watch a single five-minute video but ended up staying for three hours? Or saw an advertisement for exactly the thing you'd been planning to buy for a while and finally bought it? Isn’t it great how your computer knows what you want? Well, it’s not your computer but the algorithmic bots that have been watching you all the time, even now! So the question is: how are these bots made, and how do software testing concepts come into play?
Not long ago, humans wrote by hand the algorithms that passed for artificial intelligence. Long but simple code that basically said "Do this if this" or "Do that if that" worked perfectly for a while, and it still works in many day-to-day activities. However, as more and more people came onto the internet, creating bigger and bigger datasets, this approach simply didn’t work everywhere. Some problems are just too big to be solved with hand-written rules.
Let’s consider an example where you are shown two pictures: one of a dog and one of the number 5. You are asked what each one is. Being a human, you just know that this is a dog and that is the number 5, but how will the machine or the bots know?
We have to create a complex algorithm that tells them to crawl the image pixel by pixel, find patterns that match the pixel patterns usually found in known dog images, and make an educated guess. Human brains have a complex wiring of neurons that learns every second. Similarly, we have to create complex bots whose pattern recognition evolves over time as they keep learning.
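The pixel-by-pixel idea can be sketched with nearest-template matching, a much simpler stand-in for what real bots do. Everything here is an assumption made for illustration: the 3x3 grids, the `TEMPLATES` dictionary, and the `similarity` and `guess` helpers are hypothetical, and real images have vastly more pixels.

```python
from typing import Dict, List

Image = List[List[int]]  # 1 = dark pixel, 0 = light pixel

# Hypothetical 3x3 "known" patterns, invented for this sketch.
TEMPLATES: Dict[str, Image] = {
    "dog": [[1, 0, 1],
            [1, 1, 1],
            [0, 1, 0]],
    "5":   [[1, 1, 1],
            [1, 1, 0],
            [0, 1, 1]],
}

def similarity(a: Image, b: Image) -> float:
    """Fraction of pixel positions where the two images agree."""
    pairs = [(pa, pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)]
    return sum(pa == pb for pa, pb in pairs) / len(pairs)

def guess(image: Image) -> str:
    """Return the label whose template shares the most pixels with the image."""
    return max(TEMPLATES, key=lambda label: similarity(image, TEMPLATES[label]))

# An unseen picture that differs from the "dog" template by one pixel.
unknown: Image = [[1, 0, 1],
                  [1, 1, 1],
                  [0, 1, 1]]
print(guess(unknown))
```

The guess is only as good as the templates, which is why the rest of the article is about letting bots discover their own patterns instead of having humans write them.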
How Do They Learn?
Human programmers create a teacher bot and a builder bot that have simpler "brains." The builder bot builds student bots and keeps or discards them based on their test grades.
The teacher bot itself cannot distinguish between a dog and a number 5, but it can test whether student bots are right in identifying them.
Now, you get why automated testing is so important.
We give the teacher bot a bunch of photos of dogs and 5s, along with an answer key saying which is which. With this, the teacher bot tests the student bots and grades them. Based on the test results, the builder bot keeps building different student bots by trying different permutations and combinations of the student bots' algorithm mechanics. Sometimes it even makes changes at random to see what sticks and what does not. The teacher bot keeps giving tests and assigning grades, and the cycle continues.
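The teacher bot's job resembles an automated test with an answer key. A minimal sketch, assuming photos are identified by hypothetical names such as `photo_01` and a student bot is any function that maps a photo name to a label (the two students below are deliberately trivial placeholders):

```python
from typing import Callable, Dict

# Hypothetical answer key: photo name -> correct label.
ANSWER_KEY: Dict[str, str] = {
    "photo_01": "dog",
    "photo_02": "5",
    "photo_03": "dog",
    "photo_04": "5",
}

def grade(student: Callable[[str], str], answer_key: Dict[str, str]) -> float:
    """Teacher bot: percentage of photos the student labels correctly."""
    correct = sum(1 for photo, label in answer_key.items() if student(photo) == label)
    return 100.0 * correct / len(answer_key)

def always_dog(photo: str) -> str:
    """A weak student bot that answers "dog" for every photo."""
    return "dog"

def odd_is_dog(photo: str) -> str:
    """A student bot that answers "dog" for odd-numbered photos, "5" otherwise."""
    return "dog" if int(photo[-2:]) % 2 == 1 else "5"

for student in (always_dog, odd_is_dog):
    print(student.__name__, grade(student, ANSWER_KEY))
```

Note that `grade` never needs to recognize a dog itself. It only compares answers against the key, which is exactly the teacher bot's limitation described above.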
The teacher bot keeps testing, and based on the grades of the student bots, the builder bot keeps the best-performing bots and ruthlessly discards the rest.
The "test, build, test" cycle keeps repeating in a loop, and the grades are constantly assessed. Once a bot with approximately 99.9% accuracy is built (think of this as a grade of 99.9), the cycle is stopped.
Now the question that comes to mind is: how many times is the automated "test, build, test" cycle repeated? It repeats as many times as necessary until a bot with the best grades is built. That best bot is the best algorithm for distinguishing between a dog and a 5.
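The "test, build, test" loop can be sketched as a tiny evolutionary search. This is a toy under stated assumptions, not how production systems are built: each student bot is reduced to a single hypothetical number (a threshold), each photo to a single hypothetical feature value, and the builder's random tweaks use a fixed seed so the run is repeatable.

```python
import random
from typing import List, Tuple

# Hypothetical test data: (feature value of a photo, correct label).
TEST_DATA: List[Tuple[float, str]] = [
    (0.10, "dog"), (0.25, "dog"), (0.40, "dog"),
    (0.60, "5"), (0.75, "5"), (0.90, "5"),
]

def grade(threshold: float) -> float:
    """Teacher bot: percentage of test items a student bot labels correctly."""
    correct = sum(
        1 for value, label in TEST_DATA
        if ("5" if value > threshold else "dog") == label
    )
    return 100.0 * correct / len(TEST_DATA)

def evolve(target: float = 99.9, population_size: int = 10,
           max_cycles: int = 100, seed: int = 0) -> Tuple[float, List[float]]:
    """Builder bot: repeat test, build, test until a student reaches the target."""
    rng = random.Random(seed)
    students = [rng.uniform(0.0, 1.0) for _ in range(population_size)]
    history: List[float] = []  # best grade seen in each cycle
    for _ in range(max_cycles):
        # Test: rank the students by grade, best first.
        students.sort(key=grade, reverse=True)
        history.append(grade(students[0]))
        if history[-1] >= target:
            break
        # Build: keep the top half, discard the rest, add randomly tweaked copies.
        survivors = students[: population_size // 2]
        children = [s + rng.gauss(0.0, 0.1) for s in survivors]
        students = survivors + children
    return students[0], history

best, history = evolve()
print(f"best threshold={best:.3f}, best grade per cycle={history}")
```

Because the best student always survives into the next cycle, the best grade can never go down from one cycle to the next. The `max_cycles` cap is an added safety net so the loop stops even if the target is never reached.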
So, What’s the Problem?
Now we have picked the best algorithm for distinguishing between a dog and the number 5. What problems might still occur? If we give the bot a video of a dog, an upside-down 5, or the letter ‘S’ instead of a 5, will it still be able to figure that out? No.
How to Solve It
To solve this, humans have to create longer automated test suites with more questions for the student bots to pass, covering both the right and the wrong scenarios so that the bots are prepared for the wrong cases, too. Having more tests ensures better bots.
Since there is not one bot answering ten or twenty questions but millions of bots answering zillions of questions, how does the "test, build, test" cycle keep repeating? At that scale, you have to automate the process and keep testing.
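A short sketch of why a bigger test suite matters. The test cases here are hypothetical text descriptions standing in for real images, and `naive_student` is an invented bot that only ever learned the two upright cases:

```python
from typing import Callable, List, Tuple

Case = Tuple[str, str]  # (what is shown to the bot, expected answer)

BASIC_TESTS: List[Case] = [
    ("upright dog", "dog"),
    ("upright 5", "5"),
]

# Extended suite: adds rotated inputs and a case that is neither a dog nor a 5.
EXTENDED_TESTS: List[Case] = BASIC_TESTS + [
    ("upside-down dog", "dog"),
    ("upside-down 5", "5"),
    ("letter S", "neither"),
]

def grade(student: Callable[[str], str], tests: List[Case]) -> float:
    """Teacher bot: percentage of test cases the student answers correctly."""
    correct = sum(1 for shown, expected in tests if student(shown) == expected)
    return 100.0 * correct / len(tests)

def naive_student(shown: str) -> str:
    """A student bot that recognizes only an upright dog and calls the rest a 5."""
    return "dog" if shown == "upright dog" else "5"

print(grade(naive_student, BASIC_TESTS))
print(grade(naive_student, EXTENDED_TESTS))
```

The same student earns a perfect grade on the basic suite and a much lower one on the extended suite, so the builder bot would no longer keep it.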
When the final bot is built, it works. It is the only one that survived, because its algorithm was 0.01% better than those of the other bots. The algorithm that the student bot has built is not known to the teacher bot, to the human overseer, or even to the student bot itself! It. Just. Works!
How the bot thinks or works, what it thinks, is not really knowable.
Let’s come back to the YouTube example that we discussed. We can understand it better now. The task given to the student bots is to keep a user engaged, and the grade is the user's watch time: the student bot that keeps users watching the longest scores the highest. The student bots keep making recommendations to users, and the teacher bots assess all of them. The one that gives the best recommendations and keeps users engaged is the one with the best algorithm.
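In this setting the "grade" is watch time rather than a right-or-wrong answer. A minimal sketch, assuming made-up session logs (the bot names and minute values below are invented purely for illustration and do not reflect any real system):

```python
from statistics import mean
from typing import Dict, List

# Hypothetical logs: minutes watched per session under each bot's recommendations.
WATCH_LOGS: Dict[str, List[int]] = {
    "bot_a": [5, 12, 7],
    "bot_b": [30, 45, 15],
    "bot_c": [20, 10, 25],
}

def average_watch_time(minutes: List[int]) -> float:
    """Teacher bot's grade: mean minutes watched per session."""
    return mean(minutes)

def best_bot(logs: Dict[str, List[int]]) -> str:
    """Return the name of the student bot with the highest average watch time."""
    return max(logs, key=lambda name: average_watch_time(logs[name]))

print(best_bot(WATCH_LOGS))
```

Averaging per session is just one possible grading choice. A different metric would crown a different bot, which is why the choice of test shapes what the bot learns.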
We’ve seen that the teacher bot is just testing the student bots and they are learning from the tests. So, in one line we can say that machine learning is teachable if it is testable.
Published at DZone with permission of Deeksha Agarwal. See the original article here.