As of today, there are a handful of container cluster management platforms available for deploying applications in production using containers: Kubernetes, OpenShift Origin, DC/OS, and Docker Swarm, just to name a few. Almost all of them can be deployed on any infrastructure, including AWS. Nevertheless, AWS also provides its own container cluster management platform called EC2 Container Service (ECS). At a glance, some may think that ECS would be the right choice, as it is tightly integrated with other AWS services. However, before making a decision it might be worthwhile to go through the ECS architecture and see how things work internally. In this article, we will go through ECS's features, the resources required for setting up an ECS cluster, and finally, when ECS is best suited for a container-based deployment.
EC2 Container Service Architecture
ECS uses tasks for scheduling containers on the container cluster, similar to DC/OS. A task definition specifies the container image, port mappings (container ports, protocols, host ports), networking mode (bridge, host), and memory limits. Once a task definition is created, tasks can be created using the service scheduler, a custom scheduler, or by running tasks manually. The service scheduler is used for long-running applications, and manual task creation can be used for batch jobs. If any business-specific scheduling is needed, a custom scheduler can be implemented. When run, a task creates a container on one of the container cluster hosts by pulling the container image from the given container registry and applying the port mappings, networking configuration, and resource limits.
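As a sketch, a task definition for the scenario above can be expressed as the JSON payload that would be passed to the task registration API. The family name, image, and ports below are hypothetical placeholders, not values from any real deployment:

```python
# A minimal ECS task definition sketched as a Python dict mirroring the JSON
# payload. Image name, family, and ports are hypothetical.
task_definition = {
    "family": "web-app",                      # logical name for this definition
    "networkMode": "bridge",                  # bridge or host, as described above
    "containerDefinitions": [
        {
            "name": "web",
            "image": "example/web-app:1.0",   # placeholder image reference
            "memory": 512,                    # hard memory limit in MiB
            "portMappings": [
                {
                    "containerPort": 8080,    # port the application listens on
                    "hostPort": 0,            # 0 = dynamic host port (bridge mode)
                    "protocol": "tcp",
                }
            ],
        }
    ],
}

print(task_definition["networkMode"])
```

Setting `hostPort` to 0 asks ECS to assign a dynamic host port in bridge mode, which matters for the load balancing behavior discussed later.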
Once a container is created, the ECS service will use the health checks defined in the load balancer and automatically recover containers that become unhealthy. The healthy and unhealthy conditions of the containers can be fine-tuned according to the application's requirements by changing the health check configuration.
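The health check settings live on the load balancer's target group. The following sketch shows the kind of parameters that can be tuned; the path and thresholds are illustrative values, not defaults:

```python
# Illustrative target group health check settings of the kind ECS relies on.
# The endpoint path and thresholds are hypothetical and tuned per application.
health_check = {
    "HealthCheckProtocol": "HTTP",
    "HealthCheckPath": "/health",        # hypothetical application health endpoint
    "HealthCheckIntervalSeconds": 30,    # how often each target is probed
    "HealthyThresholdCount": 2,          # consecutive successes -> healthy
    "UnhealthyThresholdCount": 3,        # consecutive failures -> unhealthy
}

print(health_check["HealthCheckPath"])
```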
In ECS, CloudWatch alarms need to be used for setting up autoscaling. Here, AWS has utilized existing monitoring features for measuring the resource utilization and making scaling up/down decisions. It also seems to support scaling the EC2 instances of the ECS cluster.
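The decision a pair of CloudWatch alarms encodes can be sketched as a simple threshold check: scale out when average utilization crosses a high-water mark, scale in below a low one. The thresholds here are hypothetical examples, not ECS defaults, and real alarms also apply evaluation periods and cooldowns:

```python
# Illustrative sketch of the scaling decision encoded by a pair of CloudWatch
# alarms. Thresholds are hypothetical; real alarms add evaluation periods
# and cooldowns on top of this.
def scaling_action(avg_cpu_percent, high=75.0, low=25.0):
    if avg_cpu_percent > high:
        return "scale_out"   # high-utilization alarm fires -> add capacity
    if avg_cpu_percent < low:
        return "scale_in"    # low-utilization alarm fires -> remove capacity
    return "no_change"       # within the band -> keep current capacity

print(scaling_action(90.0))  # -> scale_out
```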
Currently, in ECS, container ports are exposed using dynamic host port mappings and do not use an overlay network. As a result, each container port will have an ephemeral host port (between 49153 and 65535) exposed on the container host if the networking mode is set to bridge. If the host network mode is used, the container port will be directly opened on the host, and subsequently, only one such container will be able to run on a container host. Load balancing for the above host ports can be done by creating an application load balancer and linking it to an ECS service. The load balancer will automatically update the listener ports based on the dynamic host ports provided via the service.
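Since the dynamic host ports fall in the ephemeral range mentioned above, a small helper like the following can be useful when wiring security group rules or debugging target group registrations:

```python
# The dynamic host ports assigned in bridge mode fall in the ephemeral range
# 49153-65535 described above; this helper checks whether a port is in it.
EPHEMERAL_RANGE = range(49153, 65536)  # upper bound 65535 inclusive

def is_dynamic_host_port(port):
    return port in EPHEMERAL_RANGE

print(is_dynamic_host_port(51234))  # a dynamically assigned host port
print(is_dynamic_host_port(8080))   # a fixed port, e.g. host networking mode
```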
It is important to note that due to this design, containers on different hosts might not be able to directly communicate with each other without discovering their corresponding host ports. The other solution would be to use the load balancer to route traffic if the relevant protocols support load balancing. Protocols such as JMS, AMQP, MQTT, and Apache Thrift, which use client-side load balancing, might not work with a TCP load balancer and would need to discover the host ports dynamically.
Container Image Management
ECS supports pulling container images from both public and private container registries that are accessible from AWS. When accessing private registries, Docker credentials can be provided via environment variables. ECS also provides a container registry service for managing container images within the same AWS network. This service would be useful for production deployments for avoiding any network issues that may arise when accessing external container registries.
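Private registry credentials are supplied to the ECS agent through environment variables in its configuration file on each container instance. The sketch below shows the general shape of that configuration; the registry URL and credentials are placeholders, and the exact variable formats should be checked against the ECS agent documentation:

```python
import json

# Sketch of the ECS agent configuration for private registry authentication,
# typically written to /etc/ecs/ecs.config on each container instance.
# Registry URL and credentials are placeholders.
auth_data = {
    "registry.example.com": {
        "username": "deploy-user",
        "password": "<secret>",          # placeholder; keep real secrets out of code
        "email": "ops@example.com",
    }
}

ecs_config = (
    "ECS_ENGINE_AUTH_TYPE=docker\n"
    f"ECS_ENGINE_AUTH_DATA={json.dumps(auth_data)}\n"
)
print(ecs_config)
```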
AWS recommends setting up any deployment on AWS within a Virtual Private Cloud (VPC) for isolating its network from other deployments which might be running on the same infrastructure. The same applies to ECS. The ECS instances may need to use a security group for restricting the ephemeral port range only to be accessed by the load balancer. This will prevent direct access to container hosts from any other hosts. If SSH is needed, a key pair can be given at the ECS cluster creation time and port 22 can be added to the security group when needed. For both security and reliability, it would be better to use ECS container registry and maintain all required container images within ECS.
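The security group rules described above can be sketched as the ingress permissions one would attach to the container hosts' group. The group ID and admin CIDR below are placeholders:

```python
# Sketch of ingress rules for the container hosts' security group: the
# ephemeral host-port range is reachable only from the load balancer's
# security group, and SSH is added only when needed. IDs are placeholders.
container_host_ingress = [
    {
        "IpProtocol": "tcp",
        "FromPort": 49153,               # start of the ephemeral host-port range
        "ToPort": 65535,
        "UserIdGroupPairs": [{"GroupId": "sg-loadbalancer"}],  # placeholder SG id
    },
    {
        "IpProtocol": "tcp",
        "FromPort": 22,                  # optional SSH access
        "ToPort": 22,
        "IpRanges": [{"CidrIp": "203.0.113.0/24"}],  # placeholder admin network
    },
]

print(len(container_host_ingress))
```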
Depending on the deployment architecture of the solution, the load balancer security group might need to be configured to restrict inbound traffic from a specific network or open it to the internet. This design would ensure only the load balancer ports are accessible from the external networks.
Any container-based deployment would need a centralized logging system for monitoring and troubleshooting issues, as all users may not have direct access to container logs or container hosts. ECS provides a solution for this using CloudWatch Logs. At the moment, it does not seem to provide advanced query features such as the Apache Lucene query syntax available in Elasticsearch. Nevertheless, Amazon Elasticsearch Service or a dedicated Elasticsearch container deployment could be used as an alternative.
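Routing a container's stdout/stderr to CloudWatch Logs is done per container definition through a log configuration block. The log group, region, and prefix below are placeholders:

```python
# Sketch of the log configuration fragment of a container definition that
# sends container output to CloudWatch Logs via the awslogs log driver.
# Group name, region, and prefix are placeholders.
log_configuration = {
    "logDriver": "awslogs",
    "options": {
        "awslogs-group": "/ecs/web-app",    # log group, created beforehand
        "awslogs-region": "us-east-1",      # region of the log group
        "awslogs-stream-prefix": "web",     # prefixes each container's log stream
    },
}

print(log_configuration["logDriver"])
```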
EC2 Resources Needed for ECS
ECS pricing is calculated based on the EC2 resources used by a deployment. A typical ECS deployment would need the following AWS resources:
- A virtual private cloud (VPC).
- An ECS cluster definition.
- EC2 instances for the ECS cluster.
- A security group for the above EC2 instances.
- ECS task definitions for containers to be deployed.
- An ECS service for each task definition.
- An application load balancer.
- Target groups for the load balancer.
- A security group for the load balancer.
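The resource list above can be captured in a CloudFormation template. The skeleton below names the resource types involved; the logical names are hypothetical, and a real template would add the properties and wiring between resources:

```python
# Skeleton of a CloudFormation template covering the resources listed above.
# Logical names are hypothetical; properties are omitted for brevity.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "Vpc":               {"Type": "AWS::EC2::VPC"},
        "Cluster":           {"Type": "AWS::ECS::Cluster"},
        "ContainerInstances": {"Type": "AWS::AutoScaling::AutoScalingGroup"},
        "HostSecurityGroup": {"Type": "AWS::EC2::SecurityGroup"},
        "TaskDefinition":    {"Type": "AWS::ECS::TaskDefinition"},
        "Service":           {"Type": "AWS::ECS::Service"},
        "LoadBalancer":      {"Type": "AWS::ElasticLoadBalancingV2::LoadBalancer"},
        "TargetGroup":       {"Type": "AWS::ElasticLoadBalancingV2::TargetGroup"},
        "LbSecurityGroup":   {"Type": "AWS::EC2::SecurityGroup"},
    },
}

print(sorted(template["Resources"]))
```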
Choosing ECS on AWS Over Other Container Cluster Managers
At the time this article was written, I was only able to identify one advantage of using ECS on AWS over other container cluster managers: with ECS, the container cluster manager controller is provided as a managed service. If Kubernetes, OpenShift Origin, DC/OS, or Docker Swarm is used on AWS, a set of EC2 instances would be needed to run the controller and its dependent components with high availability. A similar advantage applies to Kubernetes on Google Cloud Platform (GCP), where the master and etcd nodes are provided as managed services. Nevertheless, in terms of container cluster management features, ECS still lacks some of the key features provided by other vendors, such as overlay networking, service discovery via DNS, rollouts/rollbacks, secret/configuration management, and multi-tenancy.
In conclusion, it is evident that ECS provides the core container cluster management features required for deploying containers in production. Most of them have been implemented by reusing existing AWS services such as EC2 instances, elastic load balancing, CloudWatch alarms/logs, and security groups. Therefore, a collection of AWS resources is needed for setting up a complete deployment. Nevertheless, a CloudFormation template can be used for automating this process. For someone who is evaluating ECS, it might be better to first identify the infrastructure requirements of the applications and verify their availability in ECS. If applications need direct container-to-container communication, rely on client-side load-balanced protocols, or expose multiple ports per container, ECS might not work well for those types of applications at the moment.