
How to Build Your Own Visual Similarity App


I'll be showing Clarifai's Face Embedding model three different photos: one of me, one of my mom, and one of my dad. It will tell me which parent I look most similar to.


We recently released a new public model called Face Embedding, which was developed from our Face Detection model. If you’re not familiar with the term embedding, you can think of it as a numerical representation of a model’s input, in this case a point in 1024-dimensional space. Don’t worry if it’s your first time hearing this terminology; we’ll dive a little deeper later in this tutorial. For now, just know we’ll be using embeddings to create a simple program that compares how visually similar two people’s faces are. More specifically, this program will use the embedding model to see which of my parents I look more similar to.


Here are the photos I will be using for this tutorial. Feel free to replace my photos with your own! You can also guess now which parent I look more similar to. The bottom of this post has Clarifai’s result for you to compare against.


  1. You need a Clarifai account; you can sign up for free here.
  2. We'll be using our Python client. The installation instructions are here. Make sure to click Py to see the Python instructions.
  3. Install NumPy with the instructions for your environment here.

Deeper Dive Into Embeddings

You can think of an embedding as a low-dimensional representation of a model’s input that has rich semantic information. That means the Face Embedding model receives an image as input, detects all faces in it, and outputs a vector of length 1024 for each detected face. The response returned from the API contains every vector generated from the image. These vectors can be thought of as coordinates in a 1024-dimensional space. Faces that are visually similar will have embeddings that are “closer” to each other in that space.
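
To make the shape of that response concrete, here is a minimal sketch that pulls every face vector out of a response-like dictionary. The nesting (outputs, data, regions, embeddings, vector) follows the keys used in the code later in this post. The sample dictionary itself is made up, with 4 values per vector instead of 1024 to stay readable, and extract_face_vectors is a hypothetical helper name. Treating a missing "regions" key as "no faces found" is an assumption, not documented behavior.

```python
# Hand-written stand-in for an API response; a real vector has 1024 values.
sample_response = {
    "outputs": [
        {
            "data": {
                "regions": [
                    {"data": {"embeddings": [{"vector": [0.1, 0.2, 0.3, 0.4]}]}},
                    {"data": {"embeddings": [{"vector": [0.5, 0.6, 0.7, 0.8]}]}},
                ]
            }
        }
    ]
}


def extract_face_vectors(response):
    """Return one embedding vector per detected face (hypothetical helper)."""
    vectors = []
    # Assumption: "regions" may be absent when no face is detected,
    # so fall back to an empty list instead of raising KeyError.
    for region in response["outputs"][0]["data"].get("regions", []):
        for embedding in region["data"]["embeddings"]:
            vectors.append(embedding["vector"])
    return vectors


vectors = extract_face_vectors(sample_response)
print(len(vectors), "faces found; vector length:", len(vectors[0]))
```

Returning every vector, rather than only the first, leaves it up to the caller to decide what to do with group photos.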

We can find how visually similar two faces are by calculating the distance between their corresponding embeddings. In this example, we will compute the Euclidean distance with the NumPy library.
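
As a rough illustration of what that distance means, the sketch below computes the Euclidean distance by hand and checks it against NumPy's linalg.norm, which is the call used in this post's code. The three-value vectors are toy data, not real embeddings, and euclidean_distance is a hypothetical helper.

```python
import numpy as np

# Toy 3-dimensional "embeddings"; real ones have 1024 values.
face_a = np.array([1.0, 2.0, 3.0])
face_b = np.array([1.0, 2.0, 4.0])  # close to face_a
face_c = np.array([4.0, 6.0, 3.0])  # farther from face_a


def euclidean_distance(u, v):
    """Square root of the summed squared differences between u and v."""
    return float(np.sqrt(np.sum((u - v) ** 2)))


dist_ab = euclidean_distance(face_a, face_b)
dist_ac = euclidean_distance(face_a, face_c)

# np.linalg.norm of the difference vector gives the same values.
print(dist_ab, np.linalg.norm(face_a - face_b))
print(dist_ac, np.linalg.norm(face_a - face_c))
```

Here face_b ends up closer to face_a than face_c does, which is the sense in which one face would be called "more similar" than another.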

Let's Write Some Code!

Start by setting an environment variable for your API key so our Python module can use it to authenticate:

export CLARIFAI_API_KEY=your_API_key
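
It can help to confirm the variable is set before running the script. The following is a minimal sketch using only the standard library; require_api_key is a hypothetical helper and is not part of the Clarifai client.

```python
import os


def require_api_key(env=os.environ, name="CLARIFAI_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            "%s is not set; run: export %s=your_API_key" % (name, name)
        )
    return key


# Example with an explicit mapping so the sketch runs without a real key.
print(require_api_key({"CLARIFAI_API_KEY": "your_API_key"}))
```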

Paste the following into a file named faceEmbed.py.

from clarifai.rest import ClarifaiApp
from numpy import linalg
from numpy import array

# Initialize Clarifai (uses the CLARIFAI_API_KEY environment variable) and get the Face Embedding model
app = ClarifaiApp()
model = app.models.get("d02b4508df58432fbb84e800597b8959")

# Dataset
kunalPhoto = "http://imageshack.com/a/img922/6780/2ceUHj.jpg"
momPhoto = "http://imageshack.com/a/img922/2448/tvuLfa.jpg"
dadPhoto = "http://imageshack.com/a/img923/1862/G1VINZ.png"

# Function to get embedding from image
def getEmbedding(image_url):
    # Call the Face Embedding Model
    jsonTags = model.predict_by_url(url=image_url)

    # Storage for all the vectors in a given photo
    faceEmbed = []

    # Iterate through every detected face and store each face embedding in the list
    for faces in jsonTags['outputs'][0]['data']['regions']:
        for face in faces['data']['embeddings']:
            embeddingVector = face['vector']
            faceEmbed.append(embeddingVector)

    # Return the embedding of the first detected face
    return faceEmbed[0]

# Get embeddings and put them in an array format that Numpy can use
kunalEmbedding = array(getEmbedding(kunalPhoto))
momEmbedding = array(getEmbedding(momPhoto))
dadEmbedding = array(getEmbedding(dadPhoto))

# Get distances using NumPy (Euclidean norm of the difference)
momDistance = linalg.norm(kunalEmbedding-momEmbedding)
print("Mom Distance: "+str(momDistance))

dadDistance = linalg.norm(kunalEmbedding-dadEmbedding)
print("Dad Distance: "+str(dadDistance))

# Print results
print("")
print("**************** Results are In: ******************")
if momDistance < dadDistance:
    print("Kunal looks more similar to his Mom")
elif momDistance > dadDistance:
    print("Kunal looks more similar to his Dad")
else:
    print("Kunal looks equally similar to both his mom and dad")
print("")

Run the file with python faceEmbed.py.

How Does This Work?

The program above calls our Face Embedding model and gets the vectors for the three photos in our dataset. It then calculates the distance between my vector and each parent's vector and prints which parent I look more visually similar to. The smaller the distance, the more visually similar our faces are, according to the Face Embedding model.
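
The same comparison generalizes to any number of candidates. The sketch below ranks labeled embeddings by their distance to a query embedding. The vectors are toy values rather than real API output, and rank_by_similarity is a hypothetical helper.

```python
import numpy as np


def rank_by_similarity(query, candidates):
    """Return (label, distance) pairs sorted from most to least similar."""
    distances = [
        (label, float(np.linalg.norm(np.asarray(query) - np.asarray(vector))))
        for label, vector in candidates.items()
    ]
    # Smaller distance means more similar, so sort ascending by distance.
    return sorted(distances, key=lambda pair: pair[1])


# Toy 3-dimensional vectors standing in for 1024-dimensional embeddings.
query = [0.0, 0.0, 0.0]
candidates = {
    "mom": [1.0, 0.0, 0.0],
    "dad": [0.0, 2.0, 0.0],
}

ranking = rank_by_similarity(query, candidates)
for label, distance in ranking:
    print(label, distance)
```

With real data, each candidate vector would come from getEmbedding, and the first entry in the ranking would be the closest match.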

Moment of truth...


The results are in, and my mom is the winner!


To recap, we walked through embeddings and the use case for finding visual similarity. Try out the code above with your own images and let us know the results!

You can also try out some other interesting use cases, such as:

  • Creating your own search based on facial recognition.

  • Authenticating using the face embeddings.

  • Deduplicating similar photos in a large dataset.

...and much more!
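
As a starting point for the deduplication idea, the sketch below flags pairs of embeddings whose distance falls under a threshold. The vectors are toy values and the threshold of 0.5 is an arbitrary placeholder; a real cutoff would need to be tuned on actual Face Embedding output. find_near_duplicates is a hypothetical helper.

```python
from itertools import combinations

import numpy as np


def find_near_duplicates(embeddings, threshold):
    """Return pairs of names whose embeddings are closer than threshold."""
    pairs = []
    # Compare every unordered pair once, in a stable name order.
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        distance = np.linalg.norm(np.asarray(vec_a) - np.asarray(vec_b))
        if distance < threshold:
            pairs.append((name_a, name_b))
    return pairs


# Toy 3-dimensional vectors; the file names are made up for illustration.
photos = {
    "a.jpg": [0.10, 0.20, 0.30],
    "b.jpg": [0.10, 0.20, 0.31],  # nearly identical to a.jpg
    "c.jpg": [0.90, 0.10, 0.50],
}

duplicates = find_near_duplicates(photos, threshold=0.5)
print(duplicates)
```

Comparing every pair is fine for a handful of photos; a large dataset would likely need an approximate nearest-neighbor index instead.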

If you have any questions about this post or Clarifai in general, please let me know in the comments section below!


ai, face recognition, embedding, face detection, tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
