
Machine Learning in iOS: Using Core ML for Image Recognition


In June 2017, Apple introduced Core ML, a framework that makes incorporating trained models into apps very easy. Read on to learn how to use it!


Artificial intelligence is a big trend right now, especially when it comes to machine learning. The amount of data available on the internet grows every day, so it makes sense that we would want to analyze it and turn it into useful information. Any insight we can extract helps us discover new ways to enrich the user experience in our apps.

What Is Machine Learning?

Machine learning is a science that allows computers to learn without being explicitly programmed. Many of the techniques used to accomplish this actually try to mimic the processes used by the human brain (like neural networks) or the process of natural selection (like genetic algorithms).

The general process, regardless of the algorithm chosen, consists of taking a set of raw data or “training data” and feeding it to the algorithm so that it can learn and then produce a trained model. This trained model will later serve as a tool to make inferences about new input data.

Machine Learning General Process

The training data could be anything from images to sound or text. This makes it very natural to think of it in a mobile context. At the 2017 WWDC, Apple introduced Core ML, a framework that makes incorporating trained models into apps very easy.

Core ML Architecture

How to Use Core ML

The first step to using this tool is to download one of the available trained models from Apple. If you have a custom model, you can also convert it to the Core ML format with Core ML Tools or Apache MXNet.

For the purpose of this example, let’s download MobileNet, an open-source model for efficient on-device vision. According to Apple’s description, it detects the dominant objects present in an image from a set of 1,000 categories such as trees, animals, food, vehicles, people, and more.

After downloading the model, all you have to do is drag it into your Xcode 9 project, anywhere under the project folder, and Xcode will recognize it. Make sure the model is part of the main target so that Xcode also generates a Swift class for it. Once loaded, you can see the characteristics of the model as well as the inputs it accepts and the outputs it produces.
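If you want to confirm those inputs and outputs in code rather than in the Xcode model viewer, you can inspect the underlying MLModel at runtime. The snippet below is a minimal sketch, assuming the auto-generated MobileNet class; the property names come from the MLModelDescription API.

```
import CoreML

// A minimal sketch: inspecting the model that Xcode generated for MobileNet.
let mobileNet = MobileNet()
let modelDescription = mobileNet.model.modelDescription

// Prints the input the model accepts (a 224x224 color image for MobileNet)
// and the outputs it produces (classLabel and classLabelProbs).
print(modelDescription.inputDescriptionsByName)
print(modelDescription.outputDescriptionsByName)
```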

Once the model is ready, all you have to do to get predictions is instantiate it and set the input data. MobileNet takes an image as input and produces two outputs: classLabel, a String holding the most likely image category, and classLabelProbs, a dictionary mapping every possible category to its probability.

```
func predictResult(pixelBuffer: CVPixelBuffer) {
    // Instantiate the Swift class that Xcode generated for the model.
    let classifier = MobileNet()

    // Run the prediction; the model expects a CVPixelBuffer as its image input.
    guard let classifierOutput = try? classifier.prediction(image: pixelBuffer) else {
        fatalError("Unable to get a prediction from the model")
    }

    // Look up the probability of the most likely category.
    guard let probs = classifierOutput.classLabelProbs[classifierOutput.classLabel] else {
        fatalError("The predicted label is missing from classLabelProbs")
    }

    // Show the result, e.g. "87% Golden Retriever".
    predictionResultLabel.text = "\(Int(probs * 100))% \(classifierOutput.classLabel)"
}
```
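Note that predictResult(pixelBuffer:) expects a CVPixelBuffer rather than a UIImage. If your test pictures come from the photo library or an asset catalog, you first need to render them into a pixel buffer of the size the model expects. The helper below is a sketch of one common way to do that and is not part of the original article; the 224x224 default is an assumption matching MobileNet's input size.

```
import UIKit
import CoreVideo

// Sketch: render a UIImage into the 224x224 CVPixelBuffer that MobileNet expects.
func pixelBuffer(from image: UIImage, width: Int = 224, height: Int = 224) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue!,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue!] as CFDictionary
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32ARGB, attrs, &buffer)
    guard status == kCVReturnSuccess, let pixelBuffer = buffer else { return nil }

    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    // Draw the image into a CGContext backed by the pixel buffer's memory.
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: width, height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else { return nil }

    UIGraphicsPushContext(context)
    context.translateBy(x: 0, y: CGFloat(height))
    context.scaleBy(x: 1, y: -1)
    image.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
    UIGraphicsPopContext()

    return pixelBuffer
}
```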

Let’s run a test with different pictures to see the results:

Results

As we can see from the results, this model does extremely well with clear, prominent subjects but struggles a bit with busy photos. Still, these results are impressive for out-of-the-box functionality.

The applications for machine learning are endless, including sentiment analysis, language detection, and voice recognition. Alongside Core ML, Apple also released Vision, a framework that lets you apply computer vision techniques on the device. Combined with the existing NSLinguisticTagger for natural language processing, this makes it much easier to recognize the dominant language of a text, identify names or places within a string, and even classify nouns, verbs, adjectives, and other parts of speech.
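To give a taste of how Core ML and Vision fit together, the sketch below (an illustration, not code from the article) wraps the same MobileNet model in a VNCoreMLRequest; Vision then takes care of scaling and cropping the input image for you.

```
import Vision
import CoreML

// Sketch: classifying an image by driving the MobileNet model through Vision.
func classify(cgImage: CGImage) {
    guard let visionModel = try? VNCoreMLModel(for: MobileNet().model) else { return }

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // The top observation carries the most likely label and its confidence.
        guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
        print("\(Int(best.confidence * 100))% \(best.identifier)")
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```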

These new APIs are built upon Metal Performance Shaders and Accelerate, so all processing happens directly on the device, which addresses privacy, data consumption, and availability concerns.

Takeaway

All of these APIs make incorporating machine learning into your app straightforward. It can only benefit us to harness the power of this data and enhance the user experience in our apps.


Topics: machine learning, ios, ai, core ml, tutorial, image recognition
