Object Recognition and Spatial Awareness for a SPOT Robotics System
Using a knowledge graph for a search and rescue robot.
TNO is the Netherlands Organisation for Applied Scientific Research, where Joris and his team look to combine robotics and artificial intelligence.
They first started working with Grakn about two years ago. Up to that point, there was no suitable database for robotics that could accurately represent the real world. This is essential in autonomous systems that need to perform tasks and make decisions within a real-world environment. Too often, robotics projects are run in curated environments, and in TNO’s case, they wanted to get as close to real-world scenarios as possible.
Search and Rescue — Project SNOW
The project that TNO is working on is called SNOW, focusing on autonomous systems. The team defines autonomous systems as the combination of robotics with AI.
In practice, this means that the robot is given more autonomy, with less human intervention from a user point of view. This then enables the robot to operate in a more complex environment.
Traditionally in robotics, you would have remotely controlled systems. However, with SNOW, TNO is trying to reduce human intervention as much as possible, especially when the complexity of the environment increases.
The team working on the SNOW project comprises 15 scientists, primarily focused on three challenges:
- Situational Awareness — does the robot understand where it is operating
- Self Awareness — what can the robot do at this moment given circumstances
- Situational Planning — based on what the robot knows, how should it complete a task
The use case that TNO chose for the project is meant to be semi-realistic. Normally, their customers would ask for only a narrow Artificial Intelligence task to explore. However, by creating their own case, they’re able to test the full set of technical situations and integrate all technical solutions.
What is the environment and situation SPOT will be operating within?
TNO makes use of an on-premise villa, made up of four rooms, a hall, and a porch. There are four victims and a fireman on site. SPOT is sent in to get accurate information on the whereabouts of the victims, and any additional information on the situation useful for search and rescue.
The robot — SPOT from Boston Dynamics — is then sent into the villa to retrieve or localise the victims.
SPOT is then tasked with a set of objectives:
- locate family members
- medically assess them
For example, the robot might report that the daughter is in the hallway going to the kitchen, and that she is responsive. The action, then, would be to continue to search for other victims.
Some time later, the robot reports that it found the father in the living room, next to a chair, and that he was not responsive. In this instance, the action is to stay put and make noise for the rescue team.
There are some challenges that SPOT faces when operating in a real-world environment:
- SPOT shouldn’t get confused by a toy dog — it should differentiate a real dog from a toy
- SPOT should assess when the situation is too dangerous — based on conditions in the room, how search and locate capabilities are diminished
- SPOT should be able to make a trade off of whom to rescue based on victims’ condition or other variables like location, or proximity to danger, etc.
SNOW’s Robotics System
How does SPOT observe and collect data — what’s the hardware used in SNOW?
- PTZ camera — pan tilt zoom camera mounted on top of SPOT
- speech array and speaker for speech interaction
These components are mounted on top of a mini PC. This mini PC has a lot of computing power, as most of the computations are done on the device itself to ensure as much autonomy for the robot as possible.
What Does a Typical Robotics System Look Like at TNO?
First, an image is captured from the PTZ camera. It is then passed to an image recognition module, whose output is used for room characterization so that the robot can localize itself within the room.
The association module takes all the detections from the camera and associates them so that they can create better tracks instead of individual detections.
Grakn is used as the database, to orchestrate all the data and knowledge in their system.
Traditionally in robotics, ROS and Python are used to enable communication between the modules in the system. If you take one of your modules or software components and create it in Python, ROS then adds another layer to the Python code for communication with the rest of the system.
Because of this, they need to create a dedicated ROS client for Grakn as well, along with using the native Python client.
Joris notes that their database, while relatively small, is extremely dynamic in terms of handling administrative burden and adding new knowledge into the database.
Grakn ROS Client
What is ROS? What role does ROS play in the process?
ROS (Robot Operating System) provides a publish/subscribe mechanism. A setup consists of a camera, image recognition, an association layer, a planner, and a controller. ROS facilitates the merging of that input data into your database. Let’s look at this example where a dog is identified:
- an image is published by the PTZ camera
- the Grakn client subscribes to the image in order to feed it back to the database
- the image is then passed to the recognition module
- a dog is detected
- the dog is then published on the ROS bus
- the Grakn client again subscribes to this output (dog) in order to send and write it back to the database
- the same happens with the association module where the speed and position of the dog are written into the database
- then the planner might request some piece of information
- the Grakn client will also subscribe to these requests coming in from the planner (object velocity) and vice-versa
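The flow above can be sketched in a few lines of Python. This is a minimal in-process stand-in and not ROS itself: the Bus class, the topic names, and the recognise function are assumptions made for illustration, and a plain list plays the role of the database.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class Bus:
    """Tiny in-process publish/subscribe bus, standing in for ROS topics."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        # Deliver the message to every callback registered on this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


bus = Bus()
database: List[tuple] = []  # stand-in for the knowledge graph


def recognise(image: str) -> None:
    # Hypothetical recognition module: pretends every image shows a dog.
    bus.publish("/detections", {"label": "dog", "source": image})


# The database client listens to both raw images and detections.
bus.subscribe("/camera/image", lambda msg: database.append(("image", msg)))
bus.subscribe("/camera/image", recognise)
bus.subscribe("/detections", lambda msg: database.append(("detection", msg["label"])))

# The camera publishes one frame, which triggers the whole chain.
bus.publish("/camera/image", "frame-001")
```

The point of the design is that the database client never calls the camera or the recognition module directly. It only subscribes to topics, so modules can be added or swapped without touching the client.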
How Do You Go About Building the ROS Grakn Client?
The team, setting out to build this ROS client, divided it into two parts: the ROS wrapper and the Grakn client session.
The ROS wrapper handles the publish/subscribe mechanisms, while the Grakn client session is built on top of the Grakn client as an abstraction layer.
What are the requirements for each?
- The ROS wrapper initializes the client and the topics to publish or subscribe to.
- The Grakn Client should automatically start Grakn and delete any existing keyspace. Deleting the keyspace is necessary as in robotics we want to start with an empty keyspace. We want to have a fresh scenario for the dynamically changing environmental situations that the robot finds itself in.
- The Grakn client session should also load the schema and instances that are already known, back in.
After a while, when things have been running for some time, a new ROS message comes in (e.g. a request for object_velocity).
Here, they use Grakn utilities, which are based on the Grakn Python client.
Joris and his team are currently looking at one addition that may automate the query generation. This is noted as future work.
This slide shows us what a read function looks like under the hood. The function, request_all_humans, fetches all the humans that are known in the database. They’ve set this up to read the query from the database using Grakn utilities.
The data is then retrieved, and put in a nice format that can be given to the ROS wrapper which then publishes it again as a ROS topic.
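The original code for this function is not reproduced here, so the following is only a sketch of the shape such a read function might take. The query string uses the human type and name attribute mentioned in the talk. The run_query callable and the fake_run_query stand-in are hypothetical and are not part of the Grakn client.

```python
from typing import Callable, Dict, List

# Graql-style read query; `human` and `name` are taken from the talk's schema.
ALL_HUMANS_QUERY = "match $h isa human, has name $n; get $n;"


def request_all_humans(run_query: Callable[[str], List[Dict[str, str]]]) -> List[str]:
    """Run the read query and return the names as a plain, sorted list.

    `run_query` is a hypothetical callable that executes a query string and
    returns one dict per answer, keyed by variable name.
    """
    answers = run_query(ALL_HUMANS_QUERY)
    return sorted(answer["n"] for answer in answers)


def fake_run_query(query: str) -> List[Dict[str, str]]:
    # Stand-in for a real database call so the sketch runs on its own.
    if query != ALL_HUMANS_QUERY:
        return []
    return [{"n": "Sam"}, {"n": "George"}]
```

Calling request_all_humans(fake_run_query) yields a simple list of names, which is the kind of flat structure a ROS wrapper could publish as a message.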
This is a fairly simple function to create. However, it gets a bit more complex when we look at adding additional variables for the robot to observe.
Before we get there, let’s look at their schema — Joris walked through a portion of it as a schema diagram in the slide below.
Here we see that they have defined a living_being, which is the parent entity for a human or an animal. These two are then the parent entities for:
child ; and
SPOT will need to report on the state of the living_beings, both as a physical state and mental state. Specifically, SPOT is tasked with identifying the mental-responsiveness of an identified
Let’s look at how they modeled relations in this snippet of the schema.
We can also see how an instance of this model looks, referring to the lower section of the slide above.
First, they instantiate the mental concept with either true or false. They have some people and a dog, fluffy. Notice that in this instance there are two adults, man (with name Sam) and man (with name George). Sam is the fireman, and George is the father of the family.
We can see that Sam is mentally responsive and George is not mentally responsive. How do we create the mental responsiveness in code?
Let’s say the mother is mentally responsive:
Here’s where you can see that the code grows quickly, and the reason for wanting to explore automated query generation.
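To give a feel for why automated query generation is attractive, here is a small sketch of a helper that builds such a query as a string. It is not the team's code. The mental and mental-responsiveness names come from the talk, while the responsive attribute and the subject and state role names are assumptions made for illustration.

```python
def responsiveness_query(entity_type: str, name: str, responsive: bool) -> str:
    """Build a Graql-style match-insert string linking a person to a mental state.

    The attribute `responsive` and the roles `subject` and `state` are
    hypothetical; the real schema may name them differently.
    """
    if '"' in name:
        # Keep the sketch simple: refuse names that would break the quoting.
        raise ValueError("name must not contain double quotes")
    flag = "true" if responsive else "false"
    return (
        f'match $p isa {entity_type}, has name "{name}"; '
        f"$m isa mental, has responsive {flag}; "
        "insert (subject: $p, state: $m) isa mental-responsiveness;"
    )


# Sam is responsive and George is not, as in the example instance.
QUERIES = [
    responsiveness_query("man", "Sam", True),
    responsiveness_query("man", "George", False),
]
```

With a helper like this, each new observation becomes one function call instead of a hand-written query, which keeps the client code from growing with every variable SPOT has to report.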
Small but Dynamic
Remember when we talked about the database being small but dynamic? Here we get to see what happens as SPOT operates in a real-world scenario.
Setting the scene in the real world, we have a family, a fire-brigade, and a house with a hall and a set of rooms.
Here is, more or less, the full schema diagram for the SNOW project.
You can see all the relations in green and all the attributes in dark blue. All the concepts, or entities, are in light blue and organized into a type hierarchy.
At first glance, the schema may not appear especially large or complex, but as Joris goes on to describe, once the robot is in a room the complexity grows quickly as the robot needs to orient itself.
They are able to model this by representing polygons in the database.
Looking at the slide below we see that the kitchen, kitchen door, and four walls are mapped as points and edges of a polygon.
A polygon is a mathematical concept that you can use to describe the borders of a room, like the kitchen. Written as a polygon, you have a set of points and edges or lines. These edges then correspond to either walls or the kitchen door.
A polygon is constructed of lines, and a line is defined by two points. If you were to model this polygon into Grakn, the polygon relates to a set of lines, each related to two points and a structural part.
Modeling this in Grakn we get the entities:
points, and two sub-types of the entity
These lines can also take some kind of physical form like a structural part: walls or doors.
Why is this useful information to know?
If SPOT is in the kitchen it should be able to localize itself within that room.
If it is in the kitchen and needs to exit the room via the kitchen door, it must know the position of the kitchen door. Using a lidar system, SPOT is able to measure the distances to the walls and thereby map the array to the polygon. Next, it should locate the door by retrieving the end-points of the door. Finally, to head towards the door, it finds a waypoint to exit.
In this way, polygons are handy when you are working with robots in real environments described by a floor plan.
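The polygon idea can be illustrated outside the database with a few Python dataclasses. This is a sketch under assumptions: the coordinates are invented, the class and function names are hypothetical, and taking the midpoint of the door edge as the exit waypoint is one simple choice rather than the method used in SNOW.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Line:
    start: Point
    end: Point
    part: str  # structural part this edge represents: "wall" or "door"


@dataclass
class Polygon:
    name: str
    lines: List[Line]


def door_waypoint(room: Polygon) -> Point:
    """Return the midpoint of the first door edge as a waypoint to exit the room."""
    for line in room.lines:
        if line.part == "door":
            (x1, y1), (x2, y2) = line.start, line.end
            return ((x1 + x2) / 2, (y1 + y2) / 2)
    raise ValueError(f"{room.name} has no door")


# A kitchen with four walls and one door, using made-up coordinates in metres.
kitchen = Polygon(
    "kitchen",
    [
        Line((0.0, 0.0), (4.0, 0.0), "wall"),
        Line((4.0, 0.0), (4.0, 3.0), "wall"),
        Line((4.0, 3.0), (1.0, 3.0), "wall"),
        Line((1.0, 3.0), (0.0, 3.0), "door"),
        Line((0.0, 3.0), (0.0, 0.0), "wall"),
    ],
)
```

Here door_waypoint(kitchen) returns the centre of the door edge, which a planner could use as the target when leaving the room.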
Reasoning How to Move From One Room to Another
We saw how Joris modeled a specific room in order for SPOT to move within it; but what about moving throughout the building? How should we model a building such that SPOT can move freely between rooms and halls?
First, Joris needed to model the composition of a building in the schema — we can see how his diagram might look in Graql:
This means that when we have an instance of a res-house, like the villa in this case, it is composed of rooms:
living. These rooms are composed of
As Joris explains, we can make use of Grakn’s hyper-graph and rule-based reasoning to create room connections (relations) based on commonly associated
If we know that the kitchen-door is a role-player in a relation (c#2 in the slide above) with the room kitchen, as well as the room living; then we can infer a relation between the rooms
living via the
This gives SPOT the knowledge that it can move from the kitchen to the hall via the kitchen door. As you might infer yourself, we can then make a relation between the rooms: kitchen and the living, via the living-door. This is an example of a transitive relation in Grakn.
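The same reasoning can be mimicked in plain Python to show what the rule does. This is an illustration and not Grakn's rule engine. The layout below (a kitchen-door shared by the kitchen and the hall, a living-door shared by the hall and the living room) is an assumption, as are the function names.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Assumed layout: the rooms each door is a structural part of.
door_rooms: Dict[str, List[str]] = {
    "kitchen-door": ["kitchen", "hall"],
    "living-door": ["hall", "living"],
}


def infer_connections(doors: Dict[str, List[str]]) -> Set[Tuple[str, str, str]]:
    """Rooms that share a door are directly connected via that door."""
    connections = set()
    for door, rooms in doors.items():
        for a, b in combinations(sorted(rooms), 2):
            connections.add((a, b, door))
    return connections


def reachable(start: str, connections: Set[Tuple[str, str, str]]) -> Set[str]:
    """All rooms reachable from `start` through any chain of doors (transitive)."""
    neighbours: Dict[str, Set[str]] = {}
    for a, b, _door in connections:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        for nxt in neighbours.get(room, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}
```

The first function corresponds to the direct rule (a shared door implies a connection), and the second to the transitive step: from the kitchen, both the hall and the living room become reachable.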
Traditionally, a robotics team might use SLAM or other navigation techniques to achieve this mobility through spaces. In Grakn this becomes fairly simple to do.
You can see what this looks like in Grakn Workbase — Grakn’s IDE:
How do we address the fact that SPOT doesn’t yet know where the living-beings are located when the robot enters a building? How do we model this lack of knowledge in our database?
In Grakn, we use a locating relation with two sub relations:
actually-locating. These allow us to address the negative: once a space is searched, the database captures that a living_being is not located in that room. We don't need SPOT re-checking rooms and wasting valuable time. This requires frequent updates to the database during the active search.
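The bookkeeping behind this idea can be sketched as follows. This is a plain-Python illustration, not the Grakn model: the LocationTracker class and its attribute names are hypothetical, with `actual` playing the part of the actually-locating relation and `possible` holding the rooms not yet ruled out.

```python
from typing import Dict, Iterable, Set


class LocationTracker:
    """Track where each living being could still be, and where it was found."""

    def __init__(self, rooms: Iterable[str], beings: Iterable[str]) -> None:
        all_rooms = set(rooms)
        # Before the search starts, every being could be in any room.
        self.possible: Dict[str, Set[str]] = {b: set(all_rooms) for b in beings}
        self.actual: Dict[str, str] = {}

    def mark_searched(self, room: str, found: Iterable[str] = ()) -> None:
        """Record a searched room and any beings found in it."""
        for being in found:
            self.actual[being] = room
            self.possible[being] = {room}
        # Everyone not yet found is now known NOT to be in this room.
        for being, rooms in self.possible.items():
            if being not in self.actual:
                rooms.discard(room)

    def rooms_to_search(self) -> Set[str]:
        """Rooms that may still contain someone who has not been found."""
        remaining: Set[str] = set()
        for being, rooms in self.possible.items():
            if being not in self.actual:
                remaining |= rooms
        return remaining
```

After each searched room the set returned by rooms_to_search shrinks, so the planner never sends SPOT back to a room that has already been cleared.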
What About Adding New Knowledge to the Database?
In real-world scenarios, new facts are presented during an active search and rescue. Imagine a fire commander identifying that one of the bedrooms has a pinball machine, data that may not be currently known in the database.
Rather than adding this new knowledge to the database as an instance, you should update the underlying knowledge. You want to add this to the schema so that the knowledge can be reasoned over and used to help SPOT achieve its objectives.
Joris notes that this is something that would potentially be happening on a regular basis. Doing this in a suitable and automated way is essential. Grakn’s dynamic schema — able to be updated at any time without needing to do any migrations — makes this quite simple.
For me, robotics are extremely cool. Also, knowledge graphs are hot. So combining those, in my sense, gives me the fever. You can see some of my fevers here.
Concluding his talk, Joris talks about his interest in using Machine Learning over a Knowledge Graph to localize yourself through object recognition. For example, if we recognize two objects: an oven and a sink, we should be able to know that we are in the kitchen.
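The oven-and-sink example can be made concrete with a very simple lookup. To be clear, this is a rule-based stand-in and not the machine learning approach Joris describes. The table of typical objects per room and the guess_room function are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical knowledge of which objects are typical for which room.
ROOM_OBJECTS: Dict[str, Set[str]] = {
    "kitchen": {"oven", "sink", "fridge"},
    "bathroom": {"sink", "shower"},
    "living": {"sofa", "chair", "television"},
}


def guess_room(seen: Iterable[str]) -> Optional[str]:
    """Return the room whose typical objects best match what was recognised.

    Returns None when nothing matches or when two rooms tie.
    """
    seen_set = set(seen)
    scores = {room: len(objects & seen_set) for room, objects in ROOM_OBJECTS.items()}
    best = max(scores.values())
    if best == 0:
        return None
    winners = [room for room, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None
```

A sink alone is ambiguous between kitchen and bathroom in this table, which is why the function declines to answer until a second object, such as the oven, settles it.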
Joris and his team are currently collaborating with James Fletcher, Principal Scientist at Grakn Labs, whose research on knowledge graph convolutional networks (KGCN) is utilized in this project.
Special thank you to Joris and his team at TNO for their enthusiasm, inspiration and thorough explanations.
You can find the full presentation on the Grakn Labs YouTube channel here.
Published at DZone with permission of Daniel Crowe. See the original article here.