Real-Time Object Detection at the Edge: AWS IoT Greengrass and YOLOv5
Real-time object detection at the edge using YOLOv5 and AWS IoT Greengrass enables fast, offline, and scalable processing in bandwidth-limited or remote environments.
Edge computing has transformed how we process and respond to data. By bringing compute capability to the point where data is generated, such as cameras, sensors, and machines, businesses can make decisions faster, reduce latency, save bandwidth, and enhance privacy. AWS empowers this shift with a set of edge-capable services, most notably AWS IoT Greengrass.
In this article, we'll walk through how to run a machine learning model (YOLOv5) on an edge device with AWS IoT Greengrass v2 to identify objects in real time in a retail setting. The result is a fault-tolerant, scalable solution suited to environments with intermittent cloud connectivity. Let's start with the use case: retail store video analytics at the edge.
Consider a chain of retail outlets wanting to:
- Detect shoplifters in real time
- Count customer traffic and interaction
- Run analytics offline during internet outages
Rather than streaming video continuously to the cloud and bearing significant cost and latency, the solution is to run an ML model at the edge for real-time insights.
Architecture Overview
Real-time object detection at the edge calls for a well-structured system that balances workload between local infrastructure and cloud services such as AWS. Besides enabling ultra-low-latency inference, such a system must keep operating in environments without consistent connectivity, such as retail stores, factories, or transportation hubs.
The architecture consists of:
- Edge Device (such as NVIDIA Jetson Xavier, Raspberry Pi, and AWS Snowcone)
- IP Camera Feed (IP cameras installed within the store)
- ML Model (YOLOv5 for object detection)
- AWS IoT Greengrass v2 for local inference and Lambda function control
- AWS IoT Core to process events on the cloud side
Let's break the architecture down layer by layer.
1. Edge Device Layer
The main component of edge architecture is the edge device, a small yet versatile compute node close to data sources such as a security camera. Some examples of supported devices are:
- NVIDIA Jetson Nano/Xavier AGX: Ideal for Machine Learning Acceleration.
- Raspberry Pi 4: Ideal for light applications and prototyping.
- AWS Snowcone: Managed and rugged edge device for challenging environments.
This layer is responsible for ingesting video frames from mounted cameras and running inference with pre-deployed machine learning models (in our case, YOLOv5). It's also responsible for applying local decision logic and distributing actionable insights.
By processing video on the device, we remove the need to stream raw video to the cloud, dramatically reducing bandwidth consumption and latency.
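As a rough, back-of-the-envelope illustration of the savings (the stream bitrate, event size, and event rate below are assumed for illustration, not measured):
# Illustrative comparison: streaming raw video vs. publishing only detection events
stream_mbps = 4.0                      # assumed 1080p H.264 camera stream
event_bytes, events_per_sec = 300, 2   # assumed JSON event size and rate
video_gb_per_day = stream_mbps / 8 * 86400 / 1000               # ~43 GB/day of video
events_gb_per_day = event_bytes * events_per_sec * 86400 / 1e9  # ~0.05 GB/day of events
print(f"video: {video_gb_per_day:.1f} GB/day, events: {events_gb_per_day:.2f} GB/day")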
2. Edge Runtime Using AWS IoT Greengrass v2
AWS IoT Greengrass v2 is deployed on the edge device as a lightweight, feature-rich edge runtime that acts as the glue between local applications and the cloud.
Its core capabilities include:
- Secure Lambda execution at edge: Allows for execution of a Python (or Java/Node.js) function on an event trigger like sensor change or camera input.
- Component management: You have your object detection code packaged and deployed as a Greengrass component. These components can be updated, rolled back, and monitored directly from the AWS Management Console or the CLI.
- Offline mode: Even when the device loses its internet connection, Greengrass keeps running inference and queues messages to be sent later.
This makes it extremely robust and well suited for retail spaces with poor Wi-Fi or cellular connectivity.
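The inference script later in this article publishes with boto3, but code running as a Greengrass component can also publish through the Greengrass IPC SDK, which lets the nucleus spool messages while offline. A minimal sketch, assuming the awsiotsdk package is installed and the script runs inside a component:
# Sketch: publish from inside a Greengrass component via the local IPC interface
import json
from awsiot.greengrasscoreipc.clientv2 import GreengrassCoreIPCClientV2
from awsiot.greengrasscoreipc.model import QOS

ipc = GreengrassCoreIPCClientV2()          # connects over the local Greengrass IPC socket
event = {"label": "person", "confidence": 0.91}
ipc.publish_to_iot_core(
    topic_name='edge/camera/events',
    qos=QOS.AT_LEAST_ONCE,
    payload=json.dumps(event).encode()
)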
3. Machine Learning Inference Pipeline
We deploy a pre-trained PyTorch YOLOv5 object detection model on the edge device for the relevant image data (e.g., person detection, customer behavior, and interaction with products). Such a model is:
- Either trained from scratch or fine-tuned with Amazon SageMaker, then optimized for edge deployment via Amazon SageMaker Neo (converted to ONNX or TorchScript).
- Embedded within the inference script, which runs either as a Lambda function or as a system process. The script processes each video frame and produces a list of detected objects, along with bounding boxes, class labels, and confidence scores.
Predictions are parsed and filtered locally. For instance, detections with confidence scores below 80% may be discarded to eliminate noise, as in the sketch below.
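As a standalone illustration of that filtering step, here is a minimal sketch using the stock yolov5s weights from the PyTorch Hub; 'store_frame.jpg' is a placeholder for a captured camera frame:
# Sketch: keep only confident person detections from a single frame
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('store_frame.jpg')
df = results.pandas().xyxy[0]   # columns: xmin, ymin, xmax, ymax, confidence, class, name
people = df[(df['name'] == 'person') & (df['confidence'] >= 0.8)]
print(people[['name', 'confidence', 'xmin', 'ymin', 'xmax', 'ymax']])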
4. Publishing Events to AWS IoT Core
The actionable data is then published to the cloud via MQTT, a lightweight publish/subscribe messaging protocol well suited to edge devices. A message might look like this:
{
"label": "person",
"confidence": 0.91,
"timestamp": "2025-01-01T20:04:12Z"
"location": "Store#99 / location code 0x34"
}
This message is published to a topic such as edge/camera/events, which AWS IoT Core subscribes to for downstream routing and analytics. Optional services such as Amazon Timestream and Amazon QuickSight are worth considering for time-series analytics, dashboards, and richer visualization. Now let's walk through the implementation.
Steps
Step 1: Set Up the Edge Device
Install AWS IoT Greengrass Core v2 on your edge device:
sudo apt update
sudo apt install openjdk-11-jdk python3-pip -y
wget https://d2s8p88vqu9w66.cloudfront.net/releases/greengrass-nucleus-latest.zip
unzip greengrass-nucleus-latest.zip -d GreengrassInstaller
sudo -E java -Droot="/greengrass/v2" -Dlog.store=FILE \
  -jar ./GreengrassInstaller/lib/Greengrass.jar \
  --aws-region us-west-2 \
  --thing-name EdgeCamera001 \
  --thing-group-name EdgeCameras \
  --component-default-user ggc_user:ggc_group \
  --provision true \
  --setup-system-service true
This registers the device as a thing in AWS IoT Core, installs the Greengrass nucleus as a system service, and prepares the device for component deployments. (With --provision true, the installer needs AWS credentials that can create the required IoT resources.)
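If you prefer to verify from your workstation, a small boto3 sketch (assuming your AWS credentials and the us-west-2 region are configured, and the thing name from above) can confirm the registration:
# Sketch: confirm the edge device exists as an IoT thing and a Greengrass core device
import boto3

iot = boto3.client('iot', region_name='us-west-2')
gg = boto3.client('greengrassv2', region_name='us-west-2')

print(iot.describe_thing(thingName='EdgeCamera001')['thingArn'])
print(gg.get_core_device(coreDeviceThingName='EdgeCamera001')['status'])  # expect HEALTHY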
Step 2: Write the Edge Logic Script
Using YOLOv5 and boto3, our edge inference logic looks like this:
# inference.py
import json
from datetime import datetime

import boto3
import cv2
import torch

# Load the YOLOv5 small model from the PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# AWS IoT Core data-plane client used to publish detections
client = boto3.client('iot-data', region_name='us-west-2')


def detect():
    cap = cv2.VideoCapture(0)  # USB camera; use an RTSP URL for IP cameras
    while True:
        ret, frame = cap.read()
        if not ret:
            continue  # skip dropped frames
        # YOLOv5 expects RGB input; OpenCV captures frames in BGR
        results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for obj in results.xyxy[0]:  # each row: x1, y1, x2, y2, confidence, class
            label = model.names[int(obj[5])]
            confidence = float(obj[4])
            if confidence > 0.8:  # drop low-confidence detections
                payload = {
                    "label": label,
                    "confidence": confidence,
                    "timestamp": datetime.utcnow().isoformat() + "Z"  # e.g. 2025-01-01T20:04:12Z
                }
                client.publish(
                    topic='edge/camera/events',
                    qos=1,
                    payload=json.dumps(payload)
                )


if __name__ == "__main__":
    detect()
This script:
- Captures live frames from a USB or RTSP camera.
- Runs object detection locally.
- Sends results to the edge/camera/events topic in AWS IoT Core.
Step 3: Package and Deploy
Zip your Python script:
zip object-detection.zip inference.py
Upload to S3:
aws s3 cp object-detection.zip s3://your-bucket-name/greengrass/
Create the component recipe (the Unarchive flag extracts the zip on the device, and {artifacts:decompressedPath} points to the extracted folder):
{
  "RecipeFormatVersion": "2020-01-25",
  "ComponentName": "com.example.objectdetection",
  "ComponentVersion": "1.0.0",
  "ComponentDescription": "YOLOv5 object detection at the edge",
  "Manifests": [
    {
      "Platform": { "os": "linux" },
      "Lifecycle": {
        "Run": "python3 -u {artifacts:decompressedPath}/object-detection/inference.py"
      },
      "Artifacts": [
        {
          "URI": "s3://your-bucket-name/greengrass/object-detection.zip",
          "Unarchive": "ZIP"
        }
      ]
    }
  ]
}
Deploy the component:
aws greengrassv2 create-deployment \
--target-arn arn:aws:iot:us-west-2:123456789012:thing/EdgeCamera001 \
--components '{"com.example.objectdetection": {"componentVersion": "1.0.0"}}' \
--deployment-name "EdgeObjectDetection"
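Once the deployment is created, you can confirm from the cloud side that it reached the device. A short sketch using the boto3 greengrassv2 API, reusing the thing name and region from earlier:
# Sketch: list the deployments effective on the core device and their execution status
import boto3

gg = boto3.client('greengrassv2', region_name='us-west-2')
resp = gg.list_effective_deployments(coreDeviceThingName='EdgeCamera001')
for d in resp['effectiveDeployments']:
    print(d['deploymentName'], d['coreDeviceExecutionStatus'])  # e.g. SUCCEEDED / IN_PROGRESS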
You can subscribe to the edge/camera/events topic in AWS IoT Core to review the incoming detections.
Example payload:
{
"label": "person",
"confidence": 0.93,
"timestamp": "2025-06-08T14:21:35.123Z"
}
Use this data to:
- Create alerts using Amazon SNS (see the rule sketch below).
- Store event streams in Amazon Timestream.
- Build dashboards in Amazon QuickSight.
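For the SNS alerting path, one option is an AWS IoT topic rule that forwards high-confidence person detections to an SNS topic. A sketch using boto3; the rule name, SNS topic ARN, and IAM role ARN are placeholders you would replace with your own:
# Sketch: route high-confidence person detections from the MQTT topic to Amazon SNS
import boto3

iot = boto3.client('iot', region_name='us-west-2')
iot.create_topic_rule(
    ruleName='EdgePersonAlerts',  # placeholder rule name
    topicRulePayload={
        'sql': "SELECT * FROM 'edge/camera/events' WHERE label = 'person' AND confidence > 0.9",
        'awsIotSqlVersion': '2016-03-23',
        'ruleDisabled': False,
        'actions': [{
            'sns': {
                'targetArn': 'arn:aws:sns:us-west-2:123456789012:edge-alerts',      # placeholder
                'roleArn': 'arn:aws:iam::123456789012:role/iot-rule-sns-publish',   # placeholder
                'messageFormat': 'RAW'
            }
        }]
    }
)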
Conclusion
In this article, we examined a sample use case and showed how AWS can be used to deploy an end-to-end, robust edge AI application using well-known services such as AWS IoT Greengrass and Amazon SageMaker.
With just a few hundred lines of code, you can deploy a production-ready solution that runs real-time object detection on-site, responds immediately to events, and plugs into AWS for analytics and visualization. This pattern is not specific to retail; it also applies to factories, smart cities, logistics, and healthcare settings where data must be processed close to where it is created.