Application architectures are evolving from the era of large monoliths toward a more distributed design model. One of the key drivers of this shift is the advent of cloud computing and the ability it brings to handle ever-increasing scale. For an enterprise whose people and processes are steeped in building and managing monolithic applications, the journey to building new distributed systems requires relearning some older design techniques and adopting some new patterns. Here, I detail certain architecture concerns that become prominent when moving to a distributed application model:
- Scheduler/Orchestration management: Going from managing hundreds of instances to managing thousands requires the ability to schedule and orchestrate service instances/containers across hosts in a seamless manner. To handle increasing scale, workload scheduling/orchestration is a key ingredient of a distributed system. Docker Swarm, Kubernetes, Mesos, and Marathon are some of the leading products in this space.
- Service Discovery/Registration: As container-based services go up and down, there needs to be a mechanism to register and unregister services, along with a mechanism to discover service endpoints at runtime. Consul, ZooKeeper, etcd, confd, and Eureka are some of the leading products in this space. Some of these products also support, or can be combined with, load balancing of incoming traffic across the service instances.
- System State Management/Cluster Management: As the cluster grows, there is a need to manage its system state. What are the SRV records for each service, how many instances are running, on which hosts, and under what load? Answering these questions requires cluster management that keeps track of the system state. Docker Swarm Agents, Kubernetes Nodes/Masters, Mesos Slaves, and Containership are some of the leading products in this space.
- Data storage: Container storage is ephemeral, which means any data that needs to be retained beyond the container lifecycle must be persisted outside the container. Docker Volume Plugins, Flocker, and Kubernetes Persistent Volumes are some of the key projects in this space.
- Network: With each container running a different process, there is a need to manage and, at times, restrict which container services can access which other services. Because multiple containers run on the same host and share its network resources, security groups might need to be created for container isolation. Similarly, containers might need to discover services hosted on other hosts and require a simple model for accessing them. Flannel, Weave (from Weaveworks), and Calico are some of the products in this space.
- Monitoring/Auditing/Logging: With thousands of containers running, monitoring, auditing, and logging each container becomes a tough problem. Data and logs need to be pulled from each container for analysis. Loggly, Fluentd, Logentries, Datadog, and the ELK stack are some of the key products in this space.
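To make the service discovery/registration concern from the list above a little more concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not how Consul, etcd, or Eureka actually work: the `ServiceRegistry` class, its method names, and the endpoint addresses are all hypothetical, and a real registry would also replicate its state across nodes and run health checks.

```python
import itertools
from collections import defaultdict


class ServiceRegistry:
    """Toy, single-process registry (hypothetical; for illustration only)."""

    def __init__(self):
        # service name -> list of "host:port" endpoints, in registration order
        self._endpoints = defaultdict(list)
        # service name -> ever-increasing counter used for round-robin selection
        self._counters = defaultdict(itertools.count)

    def register(self, service, endpoint):
        """Called when a service instance comes up."""
        if endpoint not in self._endpoints[service]:
            self._endpoints[service].append(endpoint)

    def unregister(self, service, endpoint):
        """Called when a service instance goes down."""
        if endpoint in self._endpoints[service]:
            self._endpoints[service].remove(endpoint)

    def discover(self, service):
        """Return one registered endpoint, rotating through them round-robin."""
        endpoints = self._endpoints[service]
        if not endpoints:
            raise LookupError(f"no registered instances of {service!r}")
        index = next(self._counters[service]) % len(endpoints)
        return endpoints[index]


# Example usage with made-up addresses
registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")
registry.register("orders", "10.0.0.2:8080")
first = registry.discover("orders")
second = registry.discover("orders")
registry.unregister("orders", "10.0.0.1:8080")
third = registry.discover("orders")
```

In this sketch, successive `discover` calls alternate between the two registered instances, and once one instance is unregistered, callers only ever receive the remaining endpoint. The production tools named above layer persistence, consensus, and health checking on top of this basic idea.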
Besides these, the container OS and container runtime also need to be considered when architecting a distributed application. Other concerns, such as application runtime, deployment management, DNS, security, SSO/OAuth, API gateways, circuit breakers, and performance/scalability patterns, still need to be handled.
In your experience, is there anything else that is a key architecture concern for distributed applications? If so, please share it.