Serverless Image Processing Patterns With AWS
Take a look at how serverless image processing on AWS can work, from event-driven processing to on-the-fly processing with or without a cache.
Today, a wide range of web, mobile, and desktop applications rely on images to provide better user interfaces and experiences. This brings challenges in processing images for color enhancement, scaling, size optimization, and more. With the advancement of serverless technologies and the adoption of the cloud, some of the common challenges in building highly scalable and reliable image processing applications have become easier to overcome. Development effort can go into the image processing that solves a specific problem, while the architecture and the underlying services take care of the heavy lifting.
This article provides insight into several serverless image processing techniques on AWS that are useful for different types of image processing applications.
Event-Driven (Asynchronous) Processing
This is a widely used method of processing images asynchronously using Amazon S3 and AWS Lambda. The image processing flow works as follows:
Upload image to S3.
An S3 event triggers a Lambda function.
Lambda Function processes the image and saves it back to S3 (possibly to a different bucket).
This approach requires minimal code to process the images and, using the AWS SDK for S3, it's easy to read from and write back to the S3 bucket in your programming language of choice. Another advantage is that, because the original image is saved to S3 first, the processing operation can be reapplied to the raw image, which is especially useful if the image processing algorithm changes later on.
Refer to the Thumbnail Generation Tutorial: Using AWS Lambda with Amazon S3 for an example of this image processing workflow.
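The three steps above can be sketched as a small Python Lambda function. This is a minimal sketch, not the tutorial's code: the processed/ key prefix, the DEST_BUCKET environment variable, and the identity transform are assumptions made for illustration, and a real function would plug in an actual image operation.

```python
import os
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications
        # (for example, spaces become '+'), so decode them first.
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def destination_key(source_key, prefix="processed/"):
    """Hypothetical naming scheme: store results under a separate prefix."""
    return prefix + source_key


def process_event(event, s3_client, transform, dest_bucket):
    """Read each uploaded object, transform its bytes, and write the result."""
    written = []
    for bucket, key in iter_s3_objects(event):
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        out_key = destination_key(key)
        s3_client.put_object(Bucket=dest_bucket, Key=out_key, Body=transform(body))
        written.append(out_key)
    return written


def lambda_handler(event, context):
    import boto3  # imported lazily; provided by the AWS Lambda Python runtime

    dest_bucket = os.environ["DEST_BUCKET"]  # hypothetical environment variable
    # Identity transform as a placeholder for a real resize or optimization step.
    return process_event(event, boto3.client("s3"), lambda data: data, dest_bucket)
```

Writing the output to a different bucket, as suggested in the steps above, also keeps the function from being triggered again by its own output.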
Image Processing on the Fly with RESTful API
If you plan to implement an image processing RESTful API, you can use Amazon API Gateway, Lambda, and S3 to implement the workflow in request-response mode as follows:
Send an API request with a binary or Base64-encoded image to API Gateway.
API Gateway forwards the data to a Lambda Function.
Lambda Function processes the image and saves it to S3.
Although for an API consumer this approach is as straightforward as consuming an ordinary RESTful API, it requires converting the image to binary or Base64 format before sending it to API Gateway. In addition, since API Gateway has a payload size limit of 10 MB, only smaller images can be processed. However, since API Gateway supports GZip compression, you can compress the image on the client and send it to API Gateway, which decompresses it and makes it available for the Lambda function to process.
From a usage perspective, a RESTful API is easy to integrate, since the implementation hides the internal complexity and ordering of the processing steps and returns a direct response once the image is processed. Because of this, it is a common pattern in third-party services for digital asset management and headless content management. However, these services also apply internal optimizations, such as caching and using specialized hardware (e.g., GPUs), to improve performance and reduce the cost of the required image processing operations.
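As an illustration of the request-response flow, here is a minimal Python sketch of the Lambda side, assuming API Gateway's Lambda proxy integration, where the event carries body and isBase64Encoded fields. The store callback and the response shape are hypothetical; in practice store would process the image and save it to S3.

```python
import base64
import json

MAX_PAYLOAD_BYTES = 10 * 1024 * 1024  # the 10 MB API Gateway limit mentioned above


def decode_image_body(event):
    """Return raw image bytes from an API Gateway proxy-integration event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")


def handle_request(event, store):
    """Validate the payload, hand it to `store`, and build a proxy response."""
    image = decode_image_body(event)
    if not image:
        return {"statusCode": 400, "body": json.dumps({"error": "empty body"})}
    if len(image) > MAX_PAYLOAD_BYTES:
        return {"statusCode": 413, "body": json.dumps({"error": "image too large"})}
    # `store` is a hypothetical callback that processes the image,
    # saves it to S3, and returns the resulting object key.
    key = store(image)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"key": key, "bytes": len(image)}),
    }
```

Passing store in as a parameter keeps the handler logic testable without AWS access; a deployed handler would wrap handle_request and supply a function backed by the S3 SDK.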
Image Processing on the Fly with Cache
There are use cases where you need to resize images dynamically, make them available on the fly, and cache and store the results for future use. For example, a website may need different sizes of the same image to improve SEO and page performance, which is a good use case for this approach. Implementing this workflow requires Amazon CloudFront, Lambda@Edge, and S3, and it works as follows. Let's assume there is a raw image available in S3 that needs to be processed on demand based on query parameters.
Send a request to the CloudFront URL with the query parameters (for example, pathPrefix/image-name?d=widthxheight).
Optionally, you can normalize the request parameters by configuring a Lambda@Edge function triggered on the CloudFront Viewer-Request event.
CloudFront forwards the request to S3 to fetch the image.
Once CloudFront receives a response from S3, another Lambda@Edge function, configured as a trigger on the CloudFront Origin-Response event, does the following (based on the blog article Resizing Images with Amazon CloudFront & Lambda@Edge).
The function gets invoked when CloudFront receives a response from the origin and before hitting the cache. The following is the sequence of steps followed in this function:
Check if the object exists in the Amazon S3 bucket by inspecting the status code from the origin response. If the object exists, simply proceed with the CloudFront response cycle.
If the object does not exist in the S3 bucket, fetch the source image into a buffer, apply the resize transformation, and store the resized image back in the S3 bucket with the correct prefix and metadata.
If the image was resized, a binary response is generated using the resized image in memory and sent back with appropriate status code and headers.
Once the image is returned, it will be cached for any future request based on the Time to Live (TTL) configurations.
CloudFront will return the processed image.
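The parameter normalization in the Viewer-Request step and the existence check in the Origin-Response step might look like the following Python sketch. It is not the code from the referenced blog article: the list of allowed sizes and the rewritten path layout (prefix/WIDTHxHEIGHT/image-name) are assumptions chosen for illustration, and the actual resize and S3 write are left out.

```python
import re
from urllib.parse import parse_qs

ALLOWED_SIZES = [(100, 100), (200, 200), (400, 400)]  # hypothetical allow-list


def parse_dimensions(querystring):
    """Parse 'd=WIDTHxHEIGHT' from a query string; return (w, h) or None."""
    values = parse_qs(querystring).get("d")
    if not values:
        return None
    match = re.fullmatch(r"([0-9]+)x([0-9]+)", values[0].lower())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def snap_to_allowed(width, height):
    """Pick the allowed size closest to the request to bound the number of variants."""
    return min(ALLOWED_SIZES, key=lambda s: abs(s[0] - width) + abs(s[1] - height))


def viewer_request_handler(event, context):
    """Viewer-Request: rewrite /prefix/name?d=WxH to /prefix/WxH/name."""
    request = event["Records"][0]["cf"]["request"]
    dims = parse_dimensions(request.get("querystring", ""))
    if dims is not None:
        width, height = snap_to_allowed(*dims)
        prefix, _, name = request["uri"].rpartition("/")
        request["uri"] = f"{prefix}/{width}x{height}/{name}"
    return request


def needs_resize(event):
    """Origin-Response: True when S3 reported that the resized object is missing."""
    response = event["Records"][0]["cf"]["response"]
    # Lambda@Edge passes the status as a string. S3 may answer 403 rather than
    # 404 for a missing key, depending on bucket permissions.
    return response["status"] == "404"
```

Snapping requests to a small set of sizes is one way to keep arbitrary query values from producing an unbounded number of stored and cached variants.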
Image Processing on the Fly Without Cache
It is also possible to implement on-the-fly image processing without the CloudFront cache, using Amazon S3, API Gateway, and Lambda. The image processing workflow is as follows:
Request an image from S3 with the required parameters (routing rules are part of S3 static website hosting, so the request goes to the bucket's website endpoint).
Configure an S3 redirection routing rule so that, if the image is not available, the request is redirected to API Gateway. If the image with the requested parameters is already available, S3 returns it immediately.
API Gateway receives the required parameters and the requested image name and forwards them to a Lambda function, which processes the image and sends the response on the fly.
Refer to Resize Images on the Fly with Amazon S3, AWS Lambda, and Amazon API Gateway for an example with more details.
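To make the redirect step more concrete, the Python sketch below builds a routing rule in the dictionary shape accepted by S3 website configuration and parses the redirected key inside the Lambda function. The key layout (WIDTHxHEIGHT/image-name), the host name, the path prefix, and the choice to answer with a redirect back to S3 are all assumptions for illustration.

```python
import re


def build_routing_rule(api_host, path_prefix):
    """Routing rule that redirects missing objects (HTTP 404) to an API endpoint."""
    return {
        "Condition": {"HttpErrorCodeReturnedEquals": "404"},
        "Redirect": {
            "Protocol": "https",
            "HostName": api_host,  # hypothetical API Gateway host name
            "ReplaceKeyPrefixWith": path_prefix,  # e.g. "prod/resize?key="
            "HttpRedirectCode": "307",
        },
    }


def parse_resize_key(key):
    """Split a key such as '200x300/cat.jpg' into (width, height, original_key)."""
    match = re.fullmatch(r"([0-9]+)x([0-9]+)/(.+)", key)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def redirect_response(website_url, key):
    """One option after storing the resized image: send the client back to S3."""
    return {"statusCode": 301, "headers": {"Location": f"{website_url}/{key}"}, "body": ""}
```

A rule built this way could be placed in the RoutingRules list of the bucket's website configuration. Once the resized image is stored under the requested key, later requests for the same key are served by S3 without invoking the function.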
Although several common approaches are described above, new image processing workflows can be built by combining and integrating these approaches and by using other AWS services that address different use cases. For example, if a workflow needs to process images and also use machine learning for image identification or classification, other services such as SageMaker can be integrated into it.
Overall, the main advantage of serverless image processing workflows is that they let you build both simple and complex workflows while focusing mainly on the use case rather than the technology, which reduces the challenges that usually come with implementing these kinds of systems.