How to Build Your Own Visual Similarity App
I'll be showing Clarifai's Face Embedding model three different photos: one of me, one of my mom, and one of my dad. It will tell me which parent I look most similar to.
We recently released a new public model called Face Embedding, which was developed from our Face Detection model. If you’re not familiar with the term embedding, you can think of it as the numerical representation of a model’s input in 1024-dimensional space. Don’t worry if it’s your first time hearing this terminology; we’ll dive a little deeper later in this tutorial. For now, just know we’ll be using embeddings to create a simple program that compares how visually similar two people’s faces are. More specifically, this program will use the embedding model to see which of my parents I look more similar to.
Here are the photos I will be using for this tutorial. Feel free to replace my photos with those of your own! You can also make your guess now on which parent I look more similar to. The bottom of this post will have Clarifai’s result for you to compare against.
- You need a Clarifai account and can sign up for free here.
- We'll be using our Python Client. The installation instructions are here. Make sure to click Py to see the Python instructions.
- Install NumPy with the instructions for your environment here.
Deeper Dive Into Embeddings
You can think of an embedding as a low-dimensional representation of a model’s input that has rich semantic information. That means the Face Embedding model will receive an image as an input, detect all faces in the image, and output a vector of length 1024 for each detected face. The response returned from the API will contain all the vectors generated from the image. These vectors can be thought of as coordinates in a 1024-dimensional space. Faces that are visually similar will have embeddings that are “closer” to each other in that space.
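To make the shape of that response a little more concrete, here is a minimal sketch that pulls one vector per face out of a response-shaped dictionary. The `sample_response` below is hand-written for illustration and only mirrors the nesting used by the code in this tutorial (outputs, regions, embeddings, vector); the helper name `extract_face_vectors` is hypothetical, and the vectors are cut down to 3 numbers instead of 1024.

```python
def extract_face_vectors(response):
    """Collect one embedding vector per detected face from a response-shaped dict."""
    vectors = []
    for output in response.get("outputs", []):
        for region in output.get("data", {}).get("regions", []):
            for embedding in region.get("data", {}).get("embeddings", []):
                vectors.append(embedding["vector"])
    return vectors


# Hypothetical, hand-written response with two faces (not real API output).
# Real vectors would have 1024 entries each.
sample_response = {
    "outputs": [
        {
            "data": {
                "regions": [
                    {"data": {"embeddings": [{"vector": [0.1, 0.2, 0.3]}]}},
                    {"data": {"embeddings": [{"vector": [0.4, 0.5, 0.6]}]}},
                ]
            }
        }
    ]
}

face_vectors = extract_face_vectors(sample_response)
print(len(face_vectors))  # one vector per detected face
```

Using `.get()` with empty defaults means a photo with no detected faces simply yields an empty list instead of raising a `KeyError`.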
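We can find how visually similar two faces are by calculating the distance between their corresponding embeddings. In this example, we will use the NumPy library's linalg.norm function, which by default gives the Euclidean distance when applied to the difference of two vectors.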
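As a quick illustration of the distance calculation, here is a small sketch with made-up 4-dimensional vectors standing in for real 1024-dimensional embeddings. The helper name `euclidean_distance` and the sample values are assumptions chosen so the arithmetic is easy to check by hand.

```python
import numpy as np


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two vectors of the same length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


# Made-up stand-ins for face embeddings
face_a = [0.0, 0.0, 0.0, 0.0]
face_b = [1.0, 2.0, 2.0, 0.0]  # sqrt(1 + 4 + 4) = 3 away from face_a
face_c = [3.0, 4.0, 0.0, 0.0]  # sqrt(9 + 16) = 5 away from face_a

print(euclidean_distance(face_a, face_b))  # 3.0
print(euclidean_distance(face_a, face_c))  # 5.0
```

Here `face_b` is closer to `face_a` than `face_c` is, so under this measure `face_b` would count as the more similar face.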
Let's Write Some Code!
Start by setting an environment variable for your API Key so our Python module can use it to authenticate
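If you want to confirm the variable is visible to Python before calling the API, a small check like the one below can help. This is only a sketch: it assumes the client reads a variable named `CLARIFAI_API_KEY`, and the helper `get_api_key` is hypothetical, not part of the Clarifai client.

```python
import os


def get_api_key(env=None, name="CLARIFAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    The variable name is an assumption; adjust it if your client expects another.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError("Environment variable %s is not set" % name)
    return key


try:
    get_api_key()
    print("API key found")
except RuntimeError as err:
    print(err)
```

Failing early with a clear message is usually easier to debug than an authentication error returned from the first API call.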
Paste the following into a file named
    from clarifai.rest import ClarifaiApp
    from numpy import array, linalg

    # Initialize Clarifai and get the Face Embedding model
    app = ClarifaiApp()
    model = app.models.get("d02b4508df58432fbb84e800597b8959")

    # Dataset
    kunalPhoto = "http://imageshack.com/a/img922/6780/2ceUHj.jpg"
    momPhoto = "http://imageshack.com/a/img922/2448/tvuLfa.jpg"
    dadPhoto = "http://imageshack.com/a/img923/1862/G1VINZ.png"

    # Function to get the face embeddings from an image
    def getEmbedding(image_url):
        # Call the Face Embedding model
        jsonTags = model.predict_by_url(url=image_url)

        # Storage for all the vectors in a given photo
        faceEmbed = []

        # Iterate through every detected face and store each face embedding in a list
        for faces in jsonTags['outputs'][0]['data']['regions']:
            for face in faces['data']['embeddings']:
                embeddingVector = face['vector']
                faceEmbed.append(embeddingVector)
        return faceEmbed

    # Get embeddings and put them in an array format that NumPy can use.
    # Each photo is expected to contain exactly one face, so the arrays have matching shapes.
    kunalEmbedding = array(getEmbedding(kunalPhoto))
    momEmbedding = array(getEmbedding(momPhoto))
    dadEmbedding = array(getEmbedding(dadPhoto))

    # Get distances using NumPy
    momDistance = linalg.norm(kunalEmbedding - momEmbedding)
    print("Mom Distance: " + str(momDistance))
    dadDistance = linalg.norm(kunalEmbedding - dadEmbedding)
    print("Dad Distance: " + str(dadDistance))

    # Print results
    print("")
    print("**************** Results are In: ******************")
    if momDistance < dadDistance:
        print("Kunal looks more similar to his Mom")
    elif momDistance > dadDistance:
        print("Kunal looks more similar to his Dad")
    else:
        print("Kunal looks equally similar to both his mom and dad")
    print("")
Run the file with
How Does This Work?
The simple program above calls our Face Embedding model and gets the vectors for the three photos in our dataset. It then calculates the distance between my vector and each of my parents' vectors. Finally, it prints which parent I look more visually similar to. The smaller the distance, the more visually similar our faces are, according to the Face Embedding model.
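The same "smallest distance wins" idea can be sketched without calling the API at all. The version below uses made-up 3-dimensional vectors in place of real embeddings, and the function name `closest_match` is hypothetical; it is meant only to show the comparison logic for any number of candidates.

```python
import numpy as np


def closest_match(query, candidates):
    """Return (name, distance) for the candidate embedding nearest to the query."""
    query = np.asarray(query, dtype=float)
    distances = {
        name: float(np.linalg.norm(query - np.asarray(vector, dtype=float)))
        for name, vector in candidates.items()
    }
    best = min(distances, key=distances.get)
    return best, distances[best]


# Made-up stand-ins for real 1024-dimensional face embeddings
me = [0.0, 0.0, 0.0]
parents = {"mom": [1.0, 0.0, 0.0], "dad": [0.0, 2.0, 0.0]}

name, distance = closest_match(me, parents)
print(name, distance)  # mom 1.0
```

On an exact tie, `min` returns whichever candidate comes first in the dictionary, so a real program may want to handle ties explicitly, as the script above does.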
Moment of truth...
The results are in, and my mom is the winner!
To recap, we walked through embeddings and the use case for finding visual similarity. Try out the code above with your own images and let us know the results!
You can also try out some other interesting use cases, such as:
Creating your own search based on facial recognition.
Authenticating using the face embeddings.
Deduplicating similar photos in a large dataset.
...and much more!
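As one example, deduplication could be sketched as a greedy pass that keeps a photo only if its embedding is farther than some threshold from every photo already kept. The vectors and the threshold of 0.1 below are made up for illustration; a useful threshold for real face embeddings would need to be tuned on your own data.

```python
import numpy as np


def deduplicate(vectors, threshold):
    """Return indices of vectors to keep, dropping near-duplicates of earlier ones."""
    arrays = [np.asarray(v, dtype=float) for v in vectors]
    kept = []
    for i, vector in enumerate(arrays):
        # Keep this vector only if it is far enough from everything kept so far
        if all(np.linalg.norm(vector - arrays[j]) > threshold for j in kept):
            kept.append(i)
    return kept


# Made-up 2-dimensional embeddings: items 0 and 1 are near-duplicates, as are 2 and 3
photos = [[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.0, 1.02]]
print(deduplicate(photos, threshold=0.1))  # [0, 2]
```

This compares every photo against every kept photo, which is fine for small collections; a large dataset would likely call for an approximate nearest-neighbor index instead.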
If you have any questions about this post or Clarifai in general, please let me know in the comments section below!
Published at DZone with permission of Kunal Batra, DZone MVB. See the original article here.