Cloud architecture refers to how technologies and components are built in a cloud environment. A cloud environment comprises a network of servers that are located in various places globally, and each serves a specific purpose. With the growth of cloud computing and cloud-native development, modern development practices are constantly changing to adapt to this rapid evolution. This Zone offers the latest information on cloud architecture, covering topics such as builds and deployments to cloud-native environments, Kubernetes practices, cloud databases, hybrid and multi-cloud environments, cloud computing, and more!
Most existing applications and projects are being modernized toward cloud platforms or on-premise environments using microservices architectures. Assessing the existing application is critical with respect to its complexity and risk, and an appropriate strategy has to be defined to analyze the current application landscape and determine that risk and complexity. The assessment considers factors such as the overall architecture, presentation layer, business layer, data layer, security, deployment, technology, infrastructure, performance, and monitoring of the existing applications. The information collected for these parameters drives the choice among migration strategies such as rehost, refactor, rearchitect, and rebuild. For each parameter, the architect fills in the information by studying the existing application and collecting input from the application stakeholders.

Application assessment helps the organization understand the following:
- The existing application architecture and critical findings.
- Integration points across applications (upstream and downstream).
- Infrastructure and software usage.
- Pain areas reported by business users.
- A road map to the target platform using one of the migration strategies.

There are a number of assessment criteria for which information needs to be gathered, including:
- Business and technical value contributions, such as reliability, scalability, and flexibility.
- Risk exposures, such as strategic, operational, and technology risks.
- Cost of operations with respect to the application, project, and business.
- The attractiveness of the system, business, and delivery.

Each application assessment team creates a template with an appropriate questionnaire covering the criteria above. Based on the assessment report, the application owner can decide whether to migrate the application to the cloud or keep it on-premise. The diagram below shows a very high-level process flow for application assessment, and the table below lists questions for each of the assessment parameters. The architect scores each question based on experience and ultimately decides whether the assessed application can be moved to the target architecture or not. The application architect has to answer the following questions before proposing the target architecture in the assessment phase.

Questions to Ask as Part of the Assessment for Migration, by Assessment Parameter

Application Architecture: Is it a distributed architecture? Does it follow a layered architectural model? Is it loosely coupled? Does it provide the required level of abstraction without vendor lock-in at any layer? Are appropriate integration techniques used between different systems for synchronous and asynchronous operations? Does the architecture follow SOA/microservices design principles?

Presentation Layer: Are the GUI screens appealing in look and feel and easy to navigate, with context-sensitive help enabled? Is the website a portal based on standard products with out-of-the-box portal features? Is the GUI design based on templates, such as Apache Tiles, that can be applied across the GUI screens for faster screen building and roll-out? Does the GUI use reusable render components? Does it support cross-browser compatibility? Does it support mobile-enabled services? Does the view use any caching technique to cache static content and improve performance?

Business Layer: Is the business layer loosely coupled with the presentation layer?
Does it apply separation of concerns and loose coupling? Does the business layer invoke any remote objects or APIs running in a distributed environment, and what is the mode of invocation? Does the business layer use any workflow engine? Does the business layer integrate with external systems, and what is the mode of integration? Does the business layer account for network latency and bandwidth while invoking remote objects? Does the business layer use an ESB for service orchestration or message transformation?

Data Layer: Does the architecture use the DAO pattern? Does this layer leverage the out-of-the-box database connection pooling of the application server? Does this layer use an ORM tool to persist Java objects? Does this layer use application server transaction management? Does this layer use JPA for portability? Does the application use a database per business domain or a single database for all microservices? Is there any downtime while deploying application changes to the production environment?

Security: Does the system enable single sign-on? Is wire communication encrypted with SSL? Is user role administration controlled by a separate administration module? Does the system enforce role-based access to the various screens and menus of the system?

Deployment: Are firewalls in place? Does the system use a load balancer? Are the application servers launched in a cluster configuration? Is the system deployed in containers? Please elaborate.

In conclusion, the questionnaire is not limited to the questions above. The questions can be extended, and values assigned to each question, to arrive at an appropriate assessment of the existing application. Based on the resulting report, the application owner can make an informed migration decision. Finally, the application migration conclusions and recommendations should be provided as part of the migration road map document.
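To make the idea of a scored questionnaire concrete, a minimal assessment template could be captured as a simple YAML sheet like the sketch below; the parameters, questions, weights, and scoring scale are purely illustrative assumptions, and each assessment team will define its own.

YAML
# Illustrative assessment scoring sheet; structure, weights, and scale are assumptions
application: order-management        # hypothetical application under assessment
scoringScale: 1-5                    # 1 = poor fit for migration, 5 = strong fit
parameters:
  - name: Application Architecture
    weight: 3
    questions:
      - text: "Does the architecture follow SOA/microservices design principles?"
        score: 4
  - name: Security
    weight: 2
    questions:
      - text: "Does the system enable single sign-on?"
        score: 3
recommendation: rearchitect          # rehost | refactor | rearchitect | rebuild

Summing the weighted scores gives the architect a defensible basis for the rehost, refactor, rearchitect, or rebuild recommendation.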
Kubernetes vs Docker: Differences Explained

Containerization has existed for decades but has seen increasing adoption in recent years for application development and modernization. This article covers two container solutions and their uses: Docker, the container engine, together with its single-host orchestration tool Docker Compose and its cluster orchestration solution Docker Swarm; and Kubernetes, the alternative cluster orchestration solution, which is compared to Docker Swarm to help you choose the one that best meets your requirements.

What Is Containerization?

Containerization is a form of virtualization at the application level. It packages an application with all its dependencies, runtimes, libraries, and configuration files into one isolated executable package called a container. The operating system (OS) is not included in the container, which distinguishes containers from virtual machines (VMs), which are virtualized at the hardware level and include the OS. While virtualization is based on sharing physical resources between several virtual machines, containers share the kernel of one OS. Because they don't contain an OS, containers are lightweight and take seconds to boot. In addition, containers can easily be deployed on different operating systems (Windows, Linux, macOS) and in different environments (cloud, VM, physical server) without requiring any changes. In 2013, Docker Inc. introduced Docker in an attempt to standardize containers so they could be used widely and on different platforms. A year later, Google introduced Kubernetes as a solution for managing a cluster of container hosts. The definitions of the two solutions will show the difference between Kubernetes and Docker.

What Is Docker?

Docker is an open-source platform for packaging and running applications in standard containers that behave the same way across different platforms. With Docker, containerized applications are isolated from the host, which offers the flexibility of delivering applications to any platform running any OS. The Docker engine manages containers and allows them to run simultaneously on the same host. Docker has a client-server architecture and consists of client- and server-side components (the Docker client and the Docker daemon). The client and the daemon (dockerd) can run on the same system, or you can connect the client to a remote daemon. The daemon processes the API requests sent by the client and manages the other Docker objects (containers, networks, volumes, images, etc.). Docker Desktop is the installer for the Docker client and daemon; it includes other components such as Docker Compose, the Docker CLI (command-line interface), and more, and can be installed on Windows, Linux, and macOS. Developers can design an application to run in multiple containers on the same host, which creates the need to manage multiple containers at the same time. For this reason, Docker Inc. introduced Docker Compose. Docker vs Docker Compose can be summarized as follows: Docker manages a single container, while Compose manages multiple containers on one host.

Docker Compose

Managing a multi-container application on the same host is a complicated and time-consuming task. Docker Compose, the orchestration tool for a single host, manages multi-container applications defined on one host using the Compose file format.
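To illustrate the Compose file format, here is a minimal docker-compose.yml sketch; the service names and images are placeholders rather than parts of any particular application.

YAML
# Minimal, illustrative docker-compose.yml: two services of one application on a single host
version: "3.9"
services:
  web:
    image: nginx:alpine          # hypothetical front-end container
    ports:
      - "8080:80"                # publish the web service on the host
    depends_on:
      - api                      # start the API container first
  api:
    image: example/api:latest    # placeholder image for the application back end

Running a single `docker compose up` command starts both containers together on the same host.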
Docker Compose allows running multiple containers at the same time by creating one YAML configuration file where you define all the containers. Compose lets you split the application into several containers instead of building it as one container: you can split your application into sub-services called microservices and run each microservice in a container. Then you can start all the containers with a single command through Compose.

Docker Swarm

Developers can also design an application to run in multiple containers on different hosts, which creates the need for an orchestration solution for a cluster of containers across hosts. For this reason, Docker Inc. introduced Docker Swarm. Docker Swarm, or Docker in Swarm mode, is a cluster of Docker engines that can be enabled after installing Docker. Swarm allows managing multiple containers on different hosts, unlike Compose, which manages multiple containers on the same host only.

What Is Kubernetes?

Kubernetes (K8s) is an orchestration tool that manages containers on one or more hosts. K8s clusters the hosts whether they are on-premises, in the cloud, or in hybrid environments, and it can integrate with Docker and other container platforms. Google initially developed Kubernetes to automate the deployment and management of containers. K8s provides several features to support resiliency, such as container fault tolerance, load balancing across hosts, and automatic creation and removal of containers. Kubernetes manages a cluster of one or more hosts, which are either master (control plane) nodes or worker nodes. The master nodes run the control plane components of Kubernetes, while the worker nodes run the node components (kubelet and kube-proxy). A reasonable minimum for running your tests is a cluster of four hosts: at least one master node and three worker nodes.

Control Plane Components (Master Node)

The control plane can be spread across multiple machines, but setup tools typically run all of its components on a single machine, and it is recommended to avoid running application containers on the master node. The control plane is responsible for managing the cluster: it responds to cluster events, makes cluster-wide decisions, schedules containers, starts up new Pods (a Pod is a group of containers on the same host and the smallest deployable unit in Kubernetes), runs control loops, and so on.
- The API server (kube-apiserver) is the control plane front end, which exposes the Kubernetes API to the other components and handles their access and authentication.
- etcd is the database that stores all cluster key/value data. Each master node should have a copy of etcd to ensure high availability.
- The scheduler (kube-scheduler) is responsible for assigning a node to newly created Pods.
- The controller manager (kube-controller-manager) is a set of controller processes that run in a single process to reduce complexity. A controller is a control loop that watches the shared state of the cluster through the API server and, when the state of the cluster changes, takes action to bring it back to the desired state. The controller manager monitors the state of nodes, jobs, service accounts, tokens, and more.
- The cloud controller manager is an optional component that allows the cluster to communicate with the APIs of cloud providers. It separates the components that interact with the cloud from those that interact with the internal cluster.

Node Components (Worker Nodes)

The worker nodes are the non-master nodes. There are two node components: kubelet and kube-proxy.
They run on each worker node in addition to container runtime software such as Docker.
- The kubelet is an agent that runs on each worker node to make sure that the containers of each Pod are running. It manages the containers created by Kubernetes and ensures they stay in a healthy state.
- Kube-proxy is a network proxy running on each worker node and is part of the Kubernetes network service. It enables communication between Pods and the rest of the cluster or the external network.

Other Components

A Service is a logical set of Pods that work together at a given time. Unlike a Pod, a Service has a stable IP address, which solves the problem of Pods being deleted and recreated: other Pods or objects communicate with the Service instead. The set of Pods backing a Service is selected by assigning a selector to the Service that filters Pods based on labels. A label is a key/value pair of attributes that can be assigned to Pods, Services, or other objects. Labels allow querying objects based on common attributes and targeting operations at the selection. Each object can have one or more labels, and a key can only be defined once per object.
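To make the label-selection mechanism concrete, here is a minimal Service manifest sketch; the names and ports are placeholders. The Service receives a stable virtual IP and forwards traffic to every Pod carrying the matching label.

YAML
# Illustrative Service manifest: selects Pods by label and exposes them behind a stable IP
apiVersion: v1
kind: Service
metadata:
  name: web-frontend             # hypothetical Service name
spec:
  selector:
    app: web-frontend            # the Service targets all Pods carrying this label
  ports:
    - port: 80                   # port exposed by the Service
      targetPort: 8080           # port the selected Pods listen on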
Kubernetes vs Docker Swarm: What Is Better?

Kubernetes and Docker are solutions with different scopes that can complement each other to make a powerful combination, so Docker vs Kubernetes is not really a valid comparison. Docker allows developers to package applications in isolated containers, which can then be deployed to other machines without worrying about operating system compatibility. Developers can use Docker Compose to manage containers on one host. Docker Compose vs Kubernetes is not an accurate comparison either, since the solutions address different scopes: Compose is limited to one host, while Kubernetes manages a cluster of hosts. When the number of containers and hosts grows, developers can use Docker Swarm or Kubernetes to orchestrate Docker containers and manage them in a cluster. Both Kubernetes and Docker Swarm are container orchestration solutions for a cluster setup. Kubernetes is more widely used than Swarm in large environments because it provides high availability, load balancing, scheduling, and monitoring for an always-on, reliable, and robust solution. The following points highlight the differences that make K8s the more robust solution to consider.

Installation

Swarm is already included in the Docker engine and can easily be enabled with standard Docker CLI (command-line interface) commands. Kubernetes deployment is more complex because you need to learn new, non-standard commands to install and use it, as well as the specific deployment tools used with Kubernetes. The cluster nodes have to be configured manually, such as defining the master, controller, scheduler, and so on. Note: the complexity of Kubernetes installation can be overcome by using Kubernetes as a service (KaaS). Major cloud platforms offer KaaS, including Google Kubernetes Engine (GKE), part of Google Cloud Platform (GCP), and Amazon Elastic Kubernetes Service (EKS).

Scalability

Both solutions support scalability. However, it is easier to achieve with Swarm, while Kubernetes is more flexible. Swarm uses the simple Docker APIs to scale containers and services on demand in an easy and fast way. Kubernetes, on the other hand, supports auto-scaling, which makes scalability more flexible, but because of the unified APIs it uses, scaling is more complex to configure.

Load Balancing

Swarm has a built-in load-balancing feature that works automatically using the internal network: all requests to the cluster are load-balanced across hosts, and Swarm uses DNS to load-balance requests to service names, with no manual configuration needed. Kubernetes has to be configured for load balancing: Pods must be exposed through Services, and Kubernetes uses Ingress, an object that allows access to Kubernetes Services from an external network, for load balancing.

High Availability

Both solutions natively support high availability. The Swarm manager monitors the cluster's state and takes action to reconcile the actual state with the desired state; whenever a worker node crashes, the Swarm manager recreates its containers on another running node. Kubernetes also automatically detects faulty nodes and seamlessly fails over to healthy nodes.

Monitoring

Swarm does not have built-in monitoring and logging tools; it requires third-party tools for this purpose, such as Riemann or Elasticsearch, Logstash, and Kibana (ELK). Kubernetes provides built-in logging and monitoring primitives for observing the cluster state, and a number of monitoring tools are supported for observing other objects such as nodes, containers, and Pods.

Conclusion

Docker is a containerization platform for building and deploying applications in containers independently of the operating system. It can be installed using Docker Desktop on Windows, Linux, or macOS and includes other solutions such as Compose and Swarm. When multiple containers are created on the same host, managing them becomes more complicated; Docker Compose can be used in this case to easily manage the multiple containers of one application on the same host. In large environments, a cluster of multiple nodes becomes necessary to ensure high availability and other advanced features, and this is where a container orchestration solution like Docker Swarm or, alternatively, Kubernetes comes in. The comparison between the two platforms shows that both support scalability, high availability, and load balancing. However, Swarm is easier to install and use, while Kubernetes supports auto-scaling and richer built-in monitoring. This explains why most large organizations use Kubernetes with Docker for applications that are distributed across hundreds of containers.
Any modern website on the internet today receives thousands of hits, if not millions. Without a scalability strategy, the website is either going to crash or significantly degrade in performance, a situation we want to avoid. As is well known, adding more powerful hardware, or scaling vertically, will only delay the problem. However, adding multiple servers, or scaling horizontally, without a well-thought-through approach may not reap the benefits to their full extent. The recipe for creating a highly scalable system in any domain is to use proven software architecture patterns, which enable us to create cost-effective systems that can handle billions of requests and petabytes of data. This article describes the most basic and popular scalability pattern, known as load balancing. The concept of load balancing is essential for any developer building a high-volume traffic site in the cloud. The article first introduces the load balancer, then discusses the types of load balancers, load balancing in the cloud, open-source options, and finally a few pointers for choosing a load balancer.

What Is a Load Balancer?

A load balancer is a traffic manager that distributes incoming client requests across all servers that can process them. The pattern helps us realize the full potential of cloud computing by minimizing request processing time and maximizing capacity utilization. The traffic manager dispatches requests only to available servers, so the pattern works well with scalable cloud systems. Whenever a new server is added to the group, the load balancer starts dispatching requests to it and the system scales up. Conversely, if a server goes down, the dispatcher redirects requests to the other available servers in the group and the system scales down, which helps us save money.

Types of Load Balancers

After getting the basics of the load balancer, the next step is to become familiar with load-balancing algorithms. There are broadly two types.

1. Static Load Balancers

Static load balancers distribute incoming traffic according to a fixed algorithm, without taking the current state of the servers into account.

Round Robin is the most fundamental and the default algorithm for load balancing. It distributes traffic sequentially to a list of servers in a group and assumes that the application is stateless, so each request from a client can be handled in isolation. Whenever a new request comes in, it goes to the next available server in the sequence. Because the algorithm is so basic, it is not suited for most cases.

Weighted Round Robin is a variant of round robin where administrators can assign a weight to each server. A server with higher capacity receives more traffic than the others, which addresses the scenario where a group has servers of varying capacities.

Sticky Session, also known as Session Affinity, is best suited when all the requests from a client need to be served by a specific server. The algorithm works by identifying the requests coming from a particular client, either through cookies or by IP address. It is more efficient in terms of data, memory, and cache usage, but it can degrade heavily if a server gets stuck with excessively long sessions. Moreover, if a server goes down, its session data is lost.

IP Hash is another way to route requests from the same client to the same server. The algorithm uses the client's IP address as a hashing key and dispatches the request based on that key. Another variant of this algorithm uses the request URL to determine the hash key.
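As one concrete illustration of IP-based stickiness, Kubernetes Services expose it through the sessionAffinity setting; the sketch below (names and timeout are placeholders) pins each client IP to the same backend Pod. Cloud load balancers and reverse proxies offer equivalent cookie- or IP-based options.

YAML
# Illustrative Kubernetes Service using ClientIP session affinity (sticky sessions)
apiVersion: v1
kind: Service
metadata:
  name: web-backend              # hypothetical Service name
spec:
  selector:
    app: web-backend
  sessionAffinity: ClientIP      # route a given client IP to the same Pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600       # keep the affinity for one hour of inactivity
  ports:
    - port: 80
      targetPort: 8080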
2. Dynamic Load Balancers

Dynamic load balancers, as the name suggests, consider the current state of each server and dispatch incoming requests accordingly.

Least Connection dispatches traffic to the server with the fewest open connections. The assumption is that all servers are equal and that the server with the minimum number of connections has the most resources available.

Weighted Least Connection is a variant of least connection. It lets an administrator assign a higher weight to servers with more capacity so that requests are distributed according to capacity.

Least Response Time considers the response time along with the number of connections. Requests are dispatched to the server with the fewest connections and the lowest average response time, the principle being to ensure the best service to the client.

Adaptive, or resource-based, load balancing makes decisions based on the resources, i.e., the CPU and memory available on each server. A dedicated program or agent runs on each server to measure its available resources, and the load balancer queries the agent to decide where to allocate each incoming request.

Load Balancing in the Cloud

A successful cloud strategy is to use load balancers together with auto scaling. Typically, cloud applications are monitored for network traffic, memory consumption, and CPU utilization, and these metrics and trends help define scaling policies that add or remove application instances dynamically. A load balancer in the cloud takes this dynamic resizing into account and dispatches traffic to the servers that are currently available. The section below describes a few of the popular solutions in the cloud.

AWS: Elastic Load Balancing (ELB)

Amazon ELB is a highly available and scalable load-balancing solution, ideal for applications running in AWS. There are four types of Amazon ELB to pick from:
- Application Load Balancer, used for load balancing HTTP and HTTPS traffic.
- Network Load Balancer, used for load balancing TCP, UDP, and TLS traffic.
- Gateway Load Balancer, used to deploy, scale, and manage third-party virtual appliances.
- Classic Load Balancer, used for load balancing across multiple EC2 instances.

GCP: Cloud Load Balancing

Google Cloud Load Balancing is a highly performant and scalable offering from Google that can support more than one million queries per second. It is divided into two major categories, internal and external, and each category is further classified based on the incoming traffic. A few of the load balancer types are:
- Internal HTTP(S) Load Balancing
- Internal TCP/UDP Load Balancing
- External HTTP(S) Load Balancing
- External TCP/UDP Network Load Balancing
A complete guide comparing all the available load balancers can be found on the Google load balancer page.

Microsoft Azure Load Balancer

Microsoft Azure's load-balancing solution provides three different types of load balancers:
- Standard Load Balancer: a public and internal Layer 4 load balancer.
- Gateway Load Balancer: a high-performance, high-availability load balancer for third-party network virtual appliances.
- Basic Load Balancer: ideal for small-scale applications.

Open-Source Load Balancing Solutions

Although the default choice is often the vendor-specific cloud load balancer, a few open-source options are available. Below are two of those.
NGINX

NGINX provides NGINX and NGINX Plus, both modern load-balancing solutions. Many popular websites, including Dropbox, Netflix, and Zynga, use load balancers from NGINX. The NGINX load-balancing solutions are high-performance and can help improve the efficiency and reliability of a high-traffic website.

Cloudflare

Cloudflare is another popular load-balancing solution. It offers different tiers of load balancer to meet specific customer needs, with pricing plans based on the services, health checks, and security provided:
- Zero Trust platform plans
- Websites and application services plans
- Developer platform plans
- Enterprise plan
- Network services

Choosing a Load Balancer

It is evident from the sections above that a load balancer can have a big impact on an application, so picking the right solution is essential. Below are a few considerations to guide the decision. Identifying the short-term and long-term goals of the business helps drive the choice: the business requirements should indicate the expected traffic, the growing regions, and the regions the service must cover, as well as the required level of availability, the necessity of encryption, and any other security concerns that need to be addressed. There are ample options available in the market, so identifying the features the application actually needs helps pick the right solution; for example, the load balancer must be able to handle the application's traffic type, such as HTTP/HTTPS, SSL, or TCP, and a load balancer used for internal traffic has different security concerns than an external one. Cloud vendors provide various support tiers and pricing plans, so a detailed comparison of the total cost of ownership, features, and support tiers helps identify the right choice for the project. Most experts agree that it is a best practice to use a load balancer to manage traffic, which is critical for cloud applications. With a load balancer, applications can serve requests better and also save costs.
"An Amazon EC2 instance is a virtual server in Amazon's Elastic Compute Cloud (EC2) for running applications on the Amazon Web Services (AWS) infrastructure." In this article, see a series of video tutorials that goes over how to create an EC2 instance on AWS as well as how to connect EC2 instances with SSH, WinSCP, and PuTTY. Video Tutorials See how to Create EC2 Instance in AWS: See how to connect to an EC2 instance on AWS via SSH: See how to connect to an EC2 instance on AWS using WinSCP: See how to connect to an EC2 instance on AWS using PuTTY:
Cloud computing allows you to experience the benefits of serverless architecture without worrying about the underlying infrastructure. In this article, we'll look at what serverless is and why it's a good fit for your business. We'll also explore some of the best use cases for serverless, including how to implement it in your organization and how to scale up when you need to increase resources. Finally, we'll consider common patterns, and the anti-patterns that are likely to trip up any organization adopting this approach.

What Are Azure Serverless/Functions?

Azure Functions is an on-demand cloud service that provides all the infrastructure and resources needed to run your applications. A function is a small piece of code that runs in the cloud using the Azure Functions service. Functions provide serverless computing for Azure, which means there is no need for you to manage servers or compute resources. This helps developers maintain less infrastructure and save on costs.

Why Do We Need This?

Serverless functions improve your application's performance and the experience of your developers and customers.

Benefits
- They are easy to write and deploy in an Azure environment, with no need to worry about server infrastructure, runtime, and so on.
- They are highly scalable: when demand increases, the required resources are allocated automatically, and when demand falls, the extra resources are dropped automatically.
- They are lightweight and serverless.
- They are event-based, i.e., a function is triggered by an event.
- They are supported by Azure security services such as Azure Active Directory.
- They are fast to execute because there is no large application start-up, initialization, or other events fired before the code is executed.
- They support a variety of programming languages, including C#, F#, Java, JavaScript, TypeScript, and Python.
- They can be built, tested, and deployed in the Azure portal using a browser, and with Visual Studio, developers can test them locally using the Azure Storage emulator.

Serverless Functions Architecture

A serverless architecture separates the code from its hosting environment, allowing you to define triggers, manual or automated, that invoke functions; a trigger results in the execution of your code. In addition, most serverless platforms provide access to pre-defined APIs and bindings to streamline tasks such as writing to a database or queueing results.

Azure Compute Level Comparison (IaaS vs. PaaS vs. Container vs. Serverless)
- Scale unit: VM (IaaS), instance (PaaS), app (container), function (serverless).
- Abstracts: hardware (IaaS), platform (PaaS), OS host (container), runtime (serverless).
- Unit of deployment: VM (IaaS), project (PaaS), image (container), code (serverless).
- Lifetime: months (IaaS), days to months (PaaS), minutes to days (container), milliseconds to minutes (serverless).
- Your responsibility: application, dependencies, runtime, and OS (IaaS); application and dependencies (PaaS); application, dependencies, and runtime (container); the function code (serverless).

Functions Architecture

The WebJobs core provides an execution context for the function and a platform for executing it. The language runtime is responsible for running scripts, executing libraries, and hosting the framework for the target language. For example, Node.js is used to run JavaScript functions, and the .NET Framework is used to run C# functions.

Serverless Architecture Patterns

1. For Web and Mobile Applications

A web application backend for a retail scenario is used to pick up online orders from a queue and process them, with the resulting data stored in a database. Example: the data can be mapped to Cosmos DB or saved to Blob Storage, depending on whether the data is structured or unstructured.
2. Real-Time and Batch Processing of Files

Real-time file processing, such as generating instant invoices and calculating revenues continuously. The files can be processed using OCR detection and added to a database for easy querying. Alternatively, analyzing duplicate data, based on a defined frequency, in applications submitted for any course at an educational institution. Example: batch processing and near real-time data processing.

3. Real-Time Stream Processing: Independent Software Vendor (ISV) Scenario

Near real-time data, such as data generated by air quality sensors, used to determine air quality categories.

Anti-Patterns

Serverless architecture is not suitable for all use cases; there are circumstances in which it is not appropriate:
- Shared code/logic
- Distributed monoliths
- Complex processing
- Serverless big data ETL pipelines
- Long-running processing tasks
- Async calls

Conclusion

Microsoft Azure provides a variety of serverless services that help customers build applications quickly. Azure Functions plays a key role in building, testing, and deploying applications with low latency, and Azure offers high scalability and availability. Hosting single-page applications directly on Azure Blob Storage, without involving any web servers, is easy. There are major cost-control benefits (PaaS) compared to delivering services from servers physically hosted in a data center. Azure Functions, SQL, and Logic Apps are the most common and highly utilized serverless services for designing fault-tolerant applications.
A couple of weeks ago, at AWS re:Invent, Amazon made a lot of innovative announcements, and one of them was the AWS Application Composer service, which allows a user to drag and drop elements onto a canvas to quickly design and deploy serverless applications.

Introduction

The Application Composer service is in the preview phase as this is being written. It allows you to drag and drop a list of resources onto a canvas, make connections between them, and provide the required configuration. You design the workflow on the front end, and in the background it generates the necessary code and template using the AWS Serverless Application Model (SAM). The SAM CLI is the tool you can use to quickly deploy this template to the AWS environment. The Serverless Application Model is an open-source framework for defining serverless applications in YAML format. You can think of SAM as a shorthand representation of a CloudFormation template; SAM syntax allows you to define APIs, databases, functions, and other AWS resources in just a few lines without spelling out all the details (a minimal sketch of such a template is shown at the end of this article). So, when you create an Application Composer project, you are essentially creating a SAM template. Application Composer also allows you to import an existing SAM or CloudFormation template.

Implementation

In this article, to demonstrate the Application Composer service, we are going to create two API Gateway endpoints: one to get a list of employees and another to create an employee. These endpoints will point to two different AWS Lambda functions, CreateEmployee and ListEmployees, backed by an employee DynamoDB table. To follow along with this video, you need to have your environment set up and have Java 11 and Gradle installed on your machine. You also need the AWS Serverless Application Model (SAM) CLI installed on your system to build and deploy the generated template. You can download the SAM CLI from here. Continue with the following video tutorial for step-by-step instructions for the Application Composer service.

Demo Download

You can download the source code for this video from GitHub.

Conclusion

With the introduction of Application Composer, Amazon is taking a step forward in encouraging developers to utilize its infrastructure-as-code tools. The product relies on and builds on top of CloudFormation, which the AWS team is trying to simplify with the SAM model. CloudFormation also provides a designer for generating templates behind the scenes, but Application Composer is far more advanced and takes CloudFormation to the next level. Personally, I liked and enjoyed trying out this service; please let me know what you think about the product in the comments.
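For reference, here is a minimal sketch of the kind of SAM template discussed above, covering just the read endpoint; the resource names, handler class, and code location are illustrative assumptions, not the exact output of Application Composer.

YAML
# Illustrative SAM template: one Lambda behind API Gateway, backed by a DynamoDB table
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  ListEmployeesFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: com.example.ListEmployees::handleRequest   # hypothetical handler class
      Runtime: java11
      CodeUri: build/distributions/app.zip                # hypothetical build artifact
      Events:
        GetEmployees:
          Type: Api                                       # implicit API Gateway endpoint
          Properties:
            Path: /employees
            Method: get
      Policies:
        - DynamoDBReadPolicy:
            TableName: !Ref EmployeeTable
  EmployeeTable:
    Type: AWS::Serverless::SimpleTable                    # simple employee table

With a template along these lines in place, `sam build` followed by `sam deploy --guided` provisions the stack; the template Application Composer actually generates will differ in its details.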
Jakarta EE is a widely adopted and probably the most popular enterprise-grade software development framework for Java. With the industry-wide adoption of microservices-based architectures, its popularity is skyrocketing, and during the last years it has become the preferred framework for professional enterprise application and service development in Java. Jakarta EE applications have traditionally been deployed in run-times or application servers like Wildfly, GlassFish, Payara, JBoss EAP, WebLogic, WebSphere, and others, which have sometimes been criticized for their apparent heaviness and high costs. With the advent and ubiquity of the cloud, these constraints are becoming less restrictive, especially thanks to serverless technology, which provides increased flexibility at standard low costs. This article demonstrates how to lighten Jakarta EE run-times, servers, and applications by deploying them on AWS serverless infrastructure.

Overview of AWS Fargate

As documented in the User Guide, AWS Fargate is a serverless paradigm used in conjunction with AWS ECS (Elastic Container Service) to run containerized applications. In a nutshell, this concept allows us to:
- Package applications in containers
- Specify the host operating system, the CPU architecture and capacity, the memory requirements, the network, and the security policies
- Execute the whole resulting stack in the cloud

Running containers with AWS ECS requires handling a so-called launch type (i.e., an abstraction layer) defining the way standalone tasks and services are executed. There are several launch types for AWS ECS-based containers, and Fargate is one of them: it represents the serverless way to host AWS ECS workloads and consists of components like clusters, tasks, and services, as explained in the AWS Fargate User Guide. The figure below, extracted from the AWS Fargate documentation, illustrates its general architecture. As the figure shows, in order to deploy serverless applications running as ECS containers, we need a fairly complex infrastructure consisting of:
- A VPC (Virtual Private Cloud)
- An ECR (Elastic Container Registry)
- An ECS cluster
- A Fargate launch type per ECS cluster node
- One or more tasks per Fargate launch type
- An ENI (Elastic Network Interface) per task

Now, if we want to deploy Jakarta EE applications in the AWS serverless cloud as ECS-based containers, we need to:
- Package the application as a WAR.
- Create a Docker image containing the Jakarta EE-compliant run-time or application server with the WAR deployed.
- Register this Docker image with the ECR service.
- Define a task to run the Docker container built from the previously defined image.

The AWS console allows us to perform all these operations in a user-friendly way; nevertheless, the process is quite time-consuming and laborious. Using AWS CloudFormation, or even the AWS CLI, we could automate it, of course, but the good news is that we have a much better alternative, as explained below.

Overview of AWS Copilot

AWS Copilot is a CLI (Command Line Interface) tool that provides application-first, high-level commands to simplify modeling, creating, releasing, and managing production-ready containerized applications on Amazon ECS from a local development environment.
The figure below shows its software architecture. Using AWS Copilot, developers can easily manage the required AWS infrastructure from their local machine by executing simple commands, which result in the creation of deployment pipelines provisioning all the required resources enumerated above. In addition, AWS Copilot can also create extra resources like subnets, security groups, load balancers, and others. Here is how.

Deploying Payara 6 Applications on AWS Fargate

Installing AWS Copilot is as easy as downloading and unzipping an archive, with the documentation guiding you through it. Once installed, run the command below to check whether everything works:

~$ copilot --version
copilot version: v1.24.0

The first thing to do in order to deploy a Jakarta EE application is to develop and package it. A very simple way to do that for test purposes is by using the Maven archetype jakartaee10-basic-archetype, as shown below:

mvn -B archetype:generate \
  -DarchetypeGroupId=fr.simplex-software.archetypes \
  -DarchetypeArtifactId=jakartaee10-basic-archetype \
  -DarchetypeVersion=1.0-SNAPSHOT \
  -DgroupId=com.exemple \
  -DartifactId=test

This Maven archetype generates a simple, complete Jakarta EE 10 project with all the required dependencies and artifacts to be deployed on Payara 6. It also generates all the required components to perform integration tests of the exposed JAX-RS API (for more information on this archetype, please see here). Among other generated artifacts, the following Dockerfile will be of real help in our AWS Fargate cluster setup:

Dockerfile
FROM payara/server-full:6.2022.1
COPY ./target/test.war $DEPLOY_DIR

Now that we have our test Jakarta EE application, as well as the Dockerfile required to run Payara Server 6 with this application deployed, let's use AWS Copilot to start the serverless infrastructure creation process. Simply run the following command:

Shell
$ copilot init
Note: It's best to run this command in the root of your Git repository.
Welcome to the Copilot CLI! We're going to walk you through some questions to help you get set up with a containerized application on AWS. An application is a collection of containerized services that operate together.
Application name: jakarta-ee-10-app
Workload type: Load Balanced Web Service
Service name: lb-ws
Dockerfile: test/Dockerfile
parse EXPOSE: no EXPOSE statements in Dockerfile test/Dockerfile
Port: 8080
Ok great, we'll set up a Load Balanced Web Service named lb-ws in application jakarta-ee-10-app listening on port 8080.
* Proposing infrastructure changes for stack jakarta-ee-10-app-infrastructure-roles
- Creating the infrastructure for stack jakarta-ee-10-app-infrastructure-roles [create complete] [76.2s]
- A StackSet admin role assumed by CloudFormation to manage regional stacks [create complete] [34.0s]
- An IAM role assumed by the admin role to create ECR repositories, KMS keys, and S3 buckets [create complete] [33.3s]
* The directory copilot will hold service manifests for application jakarta-ee-10-app.
* Wrote the manifest for service lb-ws at copilot/lb-ws/manifest.yml
Your manifest contains configurations like your container size and port (:8080).
- Update regional resources with stack set "jakarta-ee-10-app-infrastructure" [succeeded] [0.0s]
All right, you're all set for local development.
Deploy: No
No problem, you can deploy your service later:
- Run `copilot env init` to create your environment.
- Run `copilot deploy` to deploy your service.
- Be a part of the Copilot community !
Ask or answer a question, submit a feature request... Visit https://aws.github.io/copilot-cli/community/get-involved/ to see how!

The serverless infrastructure creation process conducted by AWS Copilot is based on a dialog during which the utility asks questions and accepts your answers. The first question concerns the name of the serverless application to be deployed; we choose to name it jakarta-ee-10-app. In the next step, AWS Copilot asks for the workload type of the new service to be deployed and proposes a list of workload types, from which we select Load Balanced Web Service. The name of this new service is lb-ws. Next, AWS Copilot looks for Dockerfiles in the local workspace and displays a list from which you either have to choose one, create a new one, or use an existing image, in which case you need to provide its location (i.e., a Docker Hub URL). We choose the Dockerfile we just created when we ran the Maven archetype. It only remains for us to define the TCP port number that the newly created service will use for HTTP communication: by default, AWS Copilot proposes port 80, but we override it with 8080. Now all the required information is collected, and the infrastructure generation process may start. This process consists of creating two CloudFormation stacks:
- A first CloudFormation stack containing the definition of the required IAM security roles
- A second CloudFormation stack containing the definition of a template whose execution creates a new ECS cluster

In order to check the result of the AWS Copilot initialization phase, you can connect to your AWS console, go to the CloudFormation service, and you will see something similar to this: As you can see, the two mentioned CloudFormation stacks appear in the screenshot above, and you can click on them to inspect the details. We have just finished the initialization phase of our serverless infrastructure creation driven by AWS Copilot. Now, let's create our development environment:

Shell
$ copilot env init
Environment name: dev
Credential source: [profile default]
Default environment configuration? Yes, use default.
* Manifest file for environment dev already exists at copilot/environments/dev/manifest.yml, skipping writing it.
- Update regional resources with stack set "jakarta-ee-10-app-infrastructure" [succeeded] [0.0s]
- Update regional resources with stack set "jakarta-ee-10-app-infrastructure" [succeeded] [128.3s]
- Update resources in region "eu-west-3" [create complete] [128.2s]
- ECR container image repository for "lb-ws" [create complete] [2.2s]
- KMS key to encrypt pipeline artifacts between stages [create complete] [121.6s]
- S3 Bucket to store local artifacts [create in progress] [99.9s]
* Proposing infrastructure changes for the jakarta-ee-10-app-dev environment.
- Creating the infrastructure for the jakarta-ee-10-app-dev environment. [create complete] [65.8s]
- An IAM Role for AWS CloudFormation to manage resources [create complete] [25.8s]
- An IAM Role to describe resources in your environment [create complete] [27.0s]
* Provisioned bootstrap resources for environment dev in region eu-west-3 under application jakarta-ee-10-app.
Recommended follow-up actions:
- Update your manifest copilot/environments/dev/manifest.yml to change the defaults.
- Run `copilot env deploy --name dev` to deploy your environment.
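Before moving on, it is worth glancing at the service manifest that copilot init wrote at copilot/lb-ws/manifest.yml. The sketch below is an abridged approximation of what such a manifest typically contains; the exact fields and defaults depend on the Copilot version you are running.

YAML
# Abridged sketch of copilot/lb-ws/manifest.yml (values reflect the answers given above)
name: lb-ws
type: Load Balanced Web Service
http:
  path: '/'                  # route all load balancer traffic to this service
image:
  build: test/Dockerfile     # the Dockerfile selected during copilot init
  port: 8080                 # the port we entered when prompted
cpu: 256                     # task CPU units
memory: 512                  # task memory in MiB
count: 1                     # number of Fargate tasks to run
exec: true                   # allow `copilot svc exec` into running containers
environments:
  dev:
    count: 1                 # per-environment override (illustrative)

Tweaking this file, for example raising count or memory, and re-running copilot deploy is how the service is evolved later on.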
AWS Copilot starts by asking us what name we want to give to our development environment, and continues by proposing to use either the current user's default credentials or temporary credentials created for the purpose. We choose the first alternative. Then, AWS Copilot updates the stack set named jakarta-ee-10-app-infrastructure with the following infrastructure elements:
- An ECR repository to register the Docker image that results from building the Dockerfile selected during the previous step
- A new KMS (Key Management Service) key, used to encrypt the artifacts belonging to our development environment
- An S3 (Simple Storage Service) bucket, used to store the artifacts belonging to our development environment
- A dedicated IAM role assumed by AWS CloudFormation to manage resources
- A dedicated IAM role to describe the resources in the environment

This operation may take a significant time, depending on your bandwidth, and, once finished, the development environment, named jakarta-ee-10-app-dev, is created. You can see its details in the AWS console, as shown below. Notice that the environment creation can also be performed as an additional operation of the first initialization step: as a matter of fact, the copilot init command, as shown above, ends by asking whether you want to create a test environment, and answering yes allows you to proceed immediately with the test environment's creation and initialization. For pedagogical reasons, we preferred to separate these two actions here. The next phase is the deployment of our development environment:

Shell
$ copilot env deploy
Only found one environment, defaulting to: dev
* Proposing infrastructure changes for the jakarta-ee-10-app-dev environment.
- Creating the infrastructure for the jakarta-ee-10-app-dev environment.
[update complete] [74.2s]
- An ECS cluster to group your services [create complete] [2.3s]
- A security group to allow your containers to talk to each other [create complete] [0.0s]
- An Internet Gateway to connect to the public internet [create complete] [15.5s]
- Private subnet 1 for resources with no internet access [create complete] [5.4s]
- Private subnet 2 for resources with no internet access [create complete] [2.6s]
- A custom route table that directs network traffic for the public subnets [create complete] [11.5s]
- Public subnet 1 for resources that can access the internet [create complete] [2.6s]
- Public subnet 2 for resources that can access the internet [create complete] [2.6s]
- A private DNS namespace for discovering services within the environment [create complete] [44.7s]
- A Virtual Private Cloud to control networking of your AWS resources [create complete] [12.7s]

The CloudFormation template created during the previous step is now executed, and its execution creates and initializes the following infrastructure elements:
- The new ECS cluster, grouping all the required stateless artifacts
- A security group to allow communication between containers
- An Internet Gateway so that the new service is publicly accessible
- Two private and two public subnets
- A new route table with the required rules to allow traffic between the public and private subnets
- A private Route 53 (DNS) namespace
- A new VPC (Virtual Private Cloud) that controls the networking of the AWS resources created during this step

Take some time to navigate through your AWS console pages and inspect the infrastructure that AWS Copilot has created for you. As you can see, it is an elaborate one, and it would have been laborious and time-consuming to create it manually. The sharp-eyed reader has certainly noticed that creating and deploying an environment, like our development one, doesn't deploy any service into it. In order to do that, we need to proceed with our last step: the service deployment. Simply run the command below:

Shell
$ copilot deploy
Only found one workload, defaulting to: lb-ws
Only found one environment, defaulting to: dev
Sending build context to Docker daemon 13.67MB
Step 1/2 : FROM payara/server-full:6.2022.1
---> ada23f507bd2
Step 2/2 : COPY ./target/test.war $DEPLOY_DIR
---> Using cache
---> f1b0fe950252
Successfully built f1b0fe950252
Successfully tagged 495913029085.dkr.ecr.eu-west-3.amazonaws.com/jakarta-ee-10-app/lb-ws:latest
WARNING! Your password will be stored unencrypted in /home/nicolas/.docker/config.json.
Configure a credential helper to remove this warning.
See https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
Using default tag: latest
The push refers to repository [495913029085.dkr.ecr.eu-west-3.amazonaws.com/jakarta-ee-10-app/lb-ws]
d163b73cdee1: Pushed
a9c744ad76a8: Pushed
4b2bb262595b: Pushed
b1ed0705067c: Pushed
b9e6d039a9a4: Pushed
99413601f258: Pushed
d864802c5436: Pushed
c3f11d77a5de: Pushed
latest: digest: sha256:cf8a116279780e963e134d991ee252c5399df041e2ef7fc51b5d876bc5c3dc51 size: 2004
* Proposing infrastructure changes for stack jakarta-ee-10-app-dev-lb-ws
- Creating the infrastructure for stack jakarta-ee-10-app-dev-lb-ws [create complete] [327.9s]
- Service discovery for your services to communicate within the VPC [create complete] [2.5s]
- Update your environment's shared resources [update complete] [144.9s]
- A security group for your load balancer allowing HTTP traffic [create complete] [3.8s]
- An Application Load Balancer to distribute public traffic to your services [create complete] [124.5s]
- A load balancer listener to route HTTP traffic [create complete] [1.3s]
- An IAM role to update your environment stack [create complete] [25.3s]
- An IAM Role for the Fargate agent to make AWS API calls on your behalf [create complete] [25.3s]
- A HTTP listener rule for forwarding HTTP traffic [create complete] [3.8s]
- A custom resource assigning priority for HTTP listener rules [create complete] [3.5s]
- A CloudWatch log group to hold your service logs [create complete] [0.0s]
- An IAM Role to describe load balancer rules for assigning a priority [create complete] [25.3s]
- An ECS service to run and maintain your tasks in the environment cluster [create complete] [119.7s]
Deployments
Revision Rollout Desired Running Failed Pending
PRIMARY 1 [completed] 1 1 0 0
- A target group to connect the load balancer to your service [create complete] [0.0s]
- An ECS task definition to group your containers and run them on ECS [create complete] [0.0s]
- An IAM role to control permissions for the containers in your tasks [create complete] [25.3s]
* Deployed service lb-ws.
Recommended follow-up action:
- You can access your service at http://jakar-Publi-H9B68022ZC03-1756944902.eu-west-3.elb.amazonaws.com over the internet.

The listing above shows the creation of all the resources required to produce our serverless infrastructure, with Payara Server 6 and the test Jakarta EE 10 application deployed into it. This infrastructure consists of a CloudFormation stack named jakarta-ee-10-app-dev-lb-ws containing, among others, security groups, listeners, IAM roles, dedicated CloudWatch log groups and, most important, an ECS task definition with a Fargate launch type that runs the Payara Server 6 platform. This makes our test application, and the JAX-RS API it exposes, available at the public URL shown above. You can test it by simply running the curl utility:

Shell
curl http://jakar-Publi-H9B68022ZC03-1756944902.eu-west-3.elb.amazonaws.com/test/api/myresource
Got it !

Here we have appended the relative path of our JAX-RS API to the public URL displayed by AWS Copilot. You may perform the same test using your preferred browser. Also, if you prefer to run the provided integration test, you may slightly adapt it by amending the service URL. Don't hesitate to go to your AWS console to inspect the serverless infrastructure created by AWS Copilot in detail.
And once finished, don't forget to clean up your workspace by running the command below, which removes the CloudFormation stacks created for the jakarta-ee-10-app application (including jakarta-ee-10-app-dev-lb-ws) together with all their associated resources:

Shell
$ copilot app delete
Sure? Yes
* Delete stack jakarta-ee-10-app-dev-lb-ws
- Update regional resources with stack set "jakarta-ee-10-app-infrastructure" [succeeded] [12.4s]
- Update resources in region "eu-west-3" [update complete] [9.8s]
* Deleted service lb-ws from application jakarta-ee-10-app.
* Retained IAM roles for the "dev" environment
* Delete environment stack jakarta-ee-10-app-dev
* Deleted environment "dev" from application "jakarta-ee-10-app".
* Cleaned up deployment resources.
* Deleted regional resources for application "jakarta-ee-10-app"
* Delete application roles stack jakarta-ee-10-app-infrastructure-roles
* Deleted application configuration.
* Deleted local .workspace file.

Enjoy!
Edge computing is an emerging paradigm leading to a major transformation in the networking world. The prime advantages that edge computing offers are reduced latency, bandwidth optimization, and faster processing of data, which lead to a better user experience; how much they matter depends on how time-critical the application in question is. With COVID-19 and the rise of the work-from-home culture, the criticality of the edge became very visible: streaming applications used for edutech, collaborative tools, online healthcare, live training, and, of course, the OTT platforms that were a savior for people socially disconnected by home isolation or general COVID precautions. All these applications could survive those tough times because networking infrastructures across the world were mature enough to offer really low latency, and edge computing played a pivotal role in getting content closer to the real user. Edge use cases were no longer confined to public safety, military use, or manufacturing; the edge had a much bigger role to play in people's daily lives. This wasn't necessarily evident to consumers directly, but the service providers offering them networks or real-world applications depended heavily on these emerging networking techniques.

Edge Infra: "What All Is Involved?" Customers Ask

The edge offered a lot of monetization opportunities to small and big players alike. There are equipment manufacturers who offer low-footprint edge devices, be it edge-in-a-box solutions or devices embedded with small sensor SoCs. And then there are big players like Amazon, Google, and Microsoft who started offering edge infra as IaaS across the world, both for edge application developers and for service providers who can leverage that infra to offer an immersive and interactive experience to end users. These edge solutions heavily use virtualization techniques (containerized apps, VMs, and so on), which provide more scalability based on load and more reliability, with redundant infra activated only on failures; this is far more cost-effective. Of course, for all of this to work as a whole, most edge solutions rely heavily on two things: automation and orchestration. Edge infra is generally spread out, and you don't expect someone to go and configure boxes every now and then, so it is important to have automated deployments controlled and orchestrated from a few central points across the globe. The right resource distribution and configuration is critical to get the best out of these edge deployments. This edge infra might be placed on the customer premises or in the communication service provider's network; it might be a private cloud deployment based on OpenStack or Kubernetes, or a hybrid cloud deployment.

Edge Infra: Commercial Offerings

The whole idea of the edge is a resource-constrained environment: the push has been toward low-footprint data centers, densification of immense processing power into miniaturized form factors, containerized platforms, and compact, self-contained data center building blocks optimized in terms of CPUs or GPUs, memory, and so on. There have been offerings like edge-in-a-box. All of these offer cost optimizations when one is planning an edge deployment. There are products from vendors like Qualcomm, AWS, and so on; AWS, for example, offers portable edge devices like Snowball and Snowcone.
All these solutions address a market that will keep growing in the time to come. When it comes to market offerings, AWS Wavelength is one popular solution for media streaming or real-time streaming. It offers Wavelength Zones, and the apps deployed in these zones can connect to the rest of the applications running somewhere in the cloud, since these zones connect directly to the AWS infrastructure across the globe. There are 70+ zones across 20+ AWS regions. Another big offering is from Google Cloud. They have started offering the Media CDN platform, which is completely oriented toward media applications. This allows Google Cloud customers to reuse existing infra for edge streaming applications. The Google network is spread across more than 200 countries and covers more than 1,000 cities. The Media CDN platform also has APIs and automation tools that help edge application developers get more information and deploy their applications easily.
Edge Infra: Open Source
With the industry bending toward open-source solutions, the edge infra and edge platform market is also augmented by open source. The OpenInfra Foundation offers StarlingX, a complete private-cloud-based edge platform. Another option is Akraino, which offers blueprints for various edge scenarios. These blueprints can be used to deploy and maintain edge platforms over COTS infrastructure. These blueprints have specific target markets, though — for instance, there is a Radio Edge Cloud blueprint that offers a 5G RAN edge. There are also an edge automation blueprint and a vertical edge applications blueprint, which can be leveraged for various real-time streaming applications. There are more open-source options, such as EdgeX Foundry, applicable to the IoT edge and focused on interoperability between heterogeneous devices and applications, and EVE (Edge Virtualization Engine), which can offer orchestration for cloud-native real-time streaming applications running in containers, VMs, or unikernels. With the kind of involvement and interest across the industry, we expect these open-source edge solutions to see more real-world deployments in the coming five years, reducing cost and covering a wider range of use cases.
If you want to run your Java microservices on a public cloud infrastructure, you should take advantage of multiple cloud regions. There are several reasons why this is a good idea. First, cloud availability zones and regions fail regularly due to hardware issues, bugs introduced after a cloud service upgrade, or simple human error. One of the most well-known S3 outages happened when an AWS employee messed with an operational command! If a cloud region fails, so do your microservices in that region. But if you run microservice instances across multiple cloud regions, you remain up and running even if an entire region such as US East is melting down. Second, you may choose to deploy microservices in the US East, but the application gets traction across the Atlantic in Europe. The roundtrip latency for users from Europe to your application instances in the US East will be around 100ms. Compare this to the 5ms roundtrip latency for user traffic originating from the US East (near the data centers running the microservices), and don't be surprised when European users say your app is slow. You shouldn't hear this negative feedback if microservice instances are deployed in both the US East and Europe West regions. Finally, suppose a Java microservice serves a user request from Europe but requests data from a database instance in the USA. In that case, you might fall foul of data residency requirements (if the requested data is classified as personal by GDPR). However, if the microservice instance runs in Europe and gets the personal data from a database instance in one of the European cloud regions, you won't have the same problems with regulators. This was a lengthy introduction to the article's main topic, but I wanted you to see a few benefits of running Java microservices in multiple distant cloud locations. Now, let's move on to the main topic and see how to develop and deploy multi-region microservices with Spring Cloud.
High-Level Concept
Let's take a geo-distributed Java messenger as an example to form a high-level understanding of how microservices and Spring Cloud function in a multi-region environment. The application (composed of multiple microservices) runs across multiple distant regions: US West, US Central, US East, Europe West, and Asia South. All application instances are stateless. Spring Cloud components operate in the same regions where the application instances are located. The application uses Spring Config Server for distributing configuration settings and Spring Discovery Server for smooth and fault-tolerant inter-service communication. YugabyteDB is selected as a distributed database that can easily function across distant locations. Plus, since it's built on the PostgreSQL source code, it naturally integrates with Spring Data and other components of the Spring ecosystem. I'm not going to review YugabyteDB multi-region deployment options in this article. Check out this article if you're curious about those options and how to select the best one for this geo-distributed Java messenger. The user traffic gets to the microservice instances via a Global External Cloud Load Balancer. In short, the load balancer comes with a single IP address that can be accessed from any point on the planet. That IP address (or a DNS name that translates to the address) is given to your web or mobile front end, which uses the IP to connect to the application backend. The load balancer forwards user requests to the nearest application instance automatically.
I'll demonstrate this cloud component in greater detail below.
Target Architecture
A target architecture of the multi-region Java messenger looks like this: The whole solution runs on the Google Cloud Platform. You might prefer another cloud provider, so feel free to go with it. I usually default to Google for its developer experience, abundant and reasonably priced infrastructure, fast and stable network, and other goodies I'll be referring to throughout the article. The microservice instances can be deployed in as many cloud regions as necessary. In the picture above, there are two arbitrary regions: Region A and Region B. Microservice instances can run in several availability zones of a region (Zones A and B of Region A) or within a single zone (Zone A of Region B). It's also reasonable to have a single instance of the Spring Discovery and Config servers per region, but I purposefully run an instance of each server per availability zone to bring the latency to a minimum. Who decides which microservice instance will serve a user request? Well, the Global External Load Balancer is the decision-maker! Suppose a user pulls up her phone, opens the Java messenger, and sends a message. The request with the message will go to the load balancer, and it might forward it this way: Region A is the closest to the user, and it's healthy at the time of the request (no outages). The load balancer selects this region based on those conditions. In that region, microservice instances are available in both Zones A and B, so the load balancer can pick either zone if both are live and healthy. Let's suppose that the request went to Zone B. I'll explain what each microservice is responsible for in the next section. For now, all you should know is that the Messenger microservice stores all application data (messages, channels, user profiles, etc.) in a multi-region YugabyteDB deployment, and the Attachments microservice uses the globally distributed Google Cloud Storage for user pictures.
Microservices and Spring Cloud
Let's talk more about the microservices and how they utilize Spring Cloud. The Messenger microservice implements the key functionality that every messenger app must possess: the ability to send messages across channels and workspaces. The Attachments microservice uploads pictures and other files. You can check their source code in the geo-messenger's repository.
Spring Cloud Config Server
Both microservices are built on Spring Boot. When they start, they retrieve configuration settings from the Spring Cloud Config Server, which is an excellent option if you need to externalize config files in a distributed environment. The config server can host and pull your configuration from various backends, including a Git repository, Vault, and a JDBC-compliant database. In the case of the Java geo-messenger, the Git option is used, and the following line from the application.properties file of both microservices tells Spring Boot to load the settings from the Config Server:
Properties
spring.config.import=configserver:http://${CONFIG_SERVER_HOST}:${CONFIG_SERVER_PORT}
The server side of this setup is small; a minimal sketch follows.
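The article focuses on the client side, so for orientation only, here is a minimal, hypothetical application.yml for a Git-backed Spring Cloud Config Server (its main class would also carry the @EnableConfigServer annotation). The repository URL, branch, and port are placeholders, not values from the geo-messenger project:
YAML
# Hypothetical config for a Git-backed Spring Cloud Config Server
server:
  port: 8888                  # conventional Config Server port; adjust as needed

spring:
  application:
    name: config-server
  cloud:
    config:
      server:
        git:
          uri: https://github.com/your-org/geo-messenger-config   # placeholder repo
          default-label: main         # branch to read configuration from
          clone-on-start: true        # fail fast if the repo is unreachable
With a server like this running, the spring.config.import line shown above (plus the spring-cloud-starter-config dependency) is typically all a client needs.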
Spring Cloud Discovery Server
Once the Messenger and Attachments microservices are booted, they register with their zone-local instance of the Spring Cloud Discovery Server (part of the Spring Cloud Netflix project). The location of a Discovery Server instance is defined in the following configuration setting, which is delivered by the Config Server instance:
Properties
eureka.client.serviceUrl.defaultZone=http://${DISCOVERY_SERVER_HOST}:${DISCOVERY_SERVER_PORT}/eureka
You can also open this HTTP address in the browser to confirm that the services have successfully registered with the Discovery Server. The microservices register with the server using the name you pass via the spring.application.name setting of the application.properties file. As the picture above shows, I've chosen the following names:
spring.application.name=messenger for the Messenger microservice
spring.application.name=attachments for the Attachments service
The microservice instances use those names to locate and send requests to each other via the Discovery Server. For example, when a user wants to upload a picture in a discussion channel, the request goes to the Messenger service first. The Messenger then delegates this task to the Attachments microservice with the help of the Discovery Server. First, the Messenger service gets an instance of its Attachments counterpart:
Java
// Look up the registered instances of the Attachments service and pick one at random
List<ServiceInstance> serviceInstances = discoveryClient.getInstances("ATTACHMENTS");
ServiceInstance instance = null;
if (!serviceInstances.isEmpty()) {
    instance = serviceInstances
        .get(ThreadLocalRandom.current().nextInt(0, serviceInstances.size()));
}
System.out.printf("Connected to service %s with URI %s\n",
    instance.getInstanceId(), instance.getUri());
Next, the Messenger microservice creates an HTTP client using the Attachments instance's URI and sends a picture via an InputStream:
Java
HttpClient httpClient = HttpClient.newBuilder().build();
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create(instance.getUri() + "/upload?fileName=" + fileName))
    .header("Content-Type", mimeType)
    .POST(HttpRequest.BodyPublishers.ofInputStream(new Supplier<InputStream>() {
        @Override
        public InputStream get() {
            return inputStream;
        }
    })).build();
The Attachments service receives the request via a REST endpoint and eventually stores the picture in Google Cloud Storage, returning the picture's URL to the Messenger microservice:
Java
public Optional<String> storeFile(String filePath, String fileName, String contentType) {
    if (client == null) {
        initClient();
    }
    String objectName = generateUniqueObjectName(fileName);
    BlobId blobId = BlobId.of(bucketName, objectName);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).build();
    try {
        client.create(blobInfo, Files.readAllBytes(Paths.get(filePath)));
    } catch (IOException e) {
        System.err.println("Failed to load the file: " + fileName);
        e.printStackTrace();
        return Optional.empty();
    }
    System.out.printf("File %s uploaded to bucket %s as %s %n", filePath, bucketName, objectName);
    String objectFullAddress = "http://storage.googleapis.com/" + bucketName + "/" + objectName;
    System.out.println("Picture public address: " + objectFullAddress);
    return Optional.of(objectFullAddress);
}
If you'd like to explore the complete implementation of the microservices and how they communicate via the Discovery Server, visit the GitHub repo linked earlier in this article. For reference, a minimal sketch of a standalone Discovery Server configuration is shown below.
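The article doesn't show the Discovery Server's own configuration, so the following is a minimal, hypothetical application.yml for a standalone Eureka server (its main class would carry @EnableEurekaServer). The port and hostname are illustrative assumptions, not values from the project:
YAML
# Hypothetical standalone Eureka (Discovery Server) configuration
server:
  port: 8761                      # conventional Eureka port

eureka:
  instance:
    hostname: localhost           # placeholder; typically the VM's address in each zone
  client:
    register-with-eureka: false   # a standalone server does not register with itself
    fetch-registry: false         # nor does it fetch a registry from peers
In a per-zone setup like the one described above, each zone would run one such instance, and the clients in that zone would point their eureka.client.serviceUrl.defaultZone at it.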
Deploying on Google Cloud Platform
Now, let's deploy the Java geo-messenger on GCP across three geographies and five cloud regions: North America (us-west2, us-central1, us-east4), Europe (europe-west3), and Asia (asia-east1). Follow these deployment steps:
1. Create a Google project.
2. Create a custom premium network.
3. Configure Google Cloud Storage.
4. Create instance templates for VMs.
5. Start VMs with application instances.
6. Configure the Global External Load Balancer.
I'll skip the detailed instructions for the steps above; you can find them here. Instead, let me use the illustration below to clarify why the premium Google network was selected in step #2. Suppose an application instance is deployed in the USA on GCP, and the user connects to the application from India. There are slow and fast routes to the app from the user's location. The slow route is taken if you select the Standard Network tier for your deployment. In this case, the user request travels over the public Internet, entering and exiting the networks of many providers before getting to the USA. Eventually, in the USA, the request reaches Google's PoP (Point of Presence) near the application instance, enters the Google network, and gets to the application. The fast route is taken if your deployment uses the Premium Network tier. In this case, the user request enters the Google network at the PoP closest to the user and never leaves it. That PoP is in India, and the request speeds to the application instance in the USA over a fast and stable connection. Plus, the Global External Load Balancer requires the premium tier; otherwise, you won't be able to intercept user requests at the nearest PoP and forward them to the nearby application instances.
Testing Fault Tolerance
Once the microservices are deployed across continents, you can witness how the Cloud Load Balancer functions at normal times and during outages. Open the IP address used by the load balancer in your browser and send a few messages with photos in one of the discussion channels. Which instances of the Messenger and Attachments microservices served your last requests? Well, it depends on where you are in the world. In my case, the instances from the US East (ig-us-east) serve my traffic. What would happen to the application if the US East region became unavailable, bringing down all microservices in that location? Not a problem for my multi-region deployment. The load balancer will detect issues in the US East and forward my traffic to the next closest location; in this case, the traffic is forwarded to Europe, since I live on the US East Coast, near the Atlantic Ocean. To emulate a US East region outage, I connected to the VM in that region and shut down all of the microservices. The load balancer detected that the microservices no longer responded in that region and started forwarding my traffic to a European data center. Enjoy the fault tolerance out of the box!
Testing Performance
Apart from fault tolerance, if you deploy Java microservices across multiple cloud regions, your application can serve user requests at low latency regardless of the users' location. To make this happen, you first need to deploy the microservice instances in the cloud locations where most of your users live and configure the Global External Load Balancer that does the routing for you. This is what I discussed in "Automating Java Application Deployment Across Multiple Cloud Regions." Second, you need to arrange your data properly in those locations. Your database needs to function across multiple regions, the same as the microservice instances; otherwise, the latency between the microservices and the database will be high, and overall performance will be poor. In the discussed architecture, I used YugabyteDB, as it is a distributed SQL database that can be deployed across multiple cloud regions.
The article "Geo-Distributed Microservices and Their Database: Fighting the High Latency" shows how latency and performance improve when YugabyteDB stores data close to your microservice instances. Think of that article as the continuation of this story, but with a focus on database deployment. As a spoiler: I improved latency from 450ms to 5ms for users who accessed the Java messenger from South Asia.
Wrapping Up
If you develop Java applications for public cloud environments, you should take advantage of the global cloud infrastructure by deploying application instances across multiple regions. This will make your solution more resilient, performant, and compliant with data residency requirements. It's also worth remembering that it's not that difficult to create microservices that function and coordinate across distant cloud locations. The Spring ecosystem provides the Spring Cloud framework, and public cloud providers like Google offer the infrastructure and services needed to make things simple.
A Word About Kubernetes Cluster Resources
Kubernetes is a container orchestration platform, and it is very popular for deploying container-based workloads. A Kubernetes cluster can span many nodes. These nodes are physical or virtual machines spread across geographies and deployed in various data centers, ensuring high availability for the cluster. Together, these machines provide a pool of computing resources, aggregated at the cluster level, that is at the disposal of the workloads deployed in the cluster. These computing resources include CPU and memory. Apart from that, the cluster also has constraints on the number of API objects it can hold; for example, Kubernetes supports at most 110 pods per node by default, partly because of constraints on assigning IP addresses. A Kubernetes cluster is shared among multiple development teams and users. Different teams might have to deploy a different number of workloads, and the resource requirements of those workloads might also vary. Some teams or users might need a higher or lower share of the entire cluster's resources. Without any restrictions, one team might end up consuming another team's legitimate share of resources. One more use case: if two teams try to deploy an API object of the same type and name, the team that deploys it last may end up overriding the first one or failing. So, to manage resources better, a Kubernetes administrator must place some restrictions on the teams so that each team has the resources it needs to carry out its work flawlessly.
How to Control Resources
Fortunately for us, Kubernetes provides two API objects with which we can solve these issues; we don't need any third-party tools. With these API objects, we can isolate teams or users inside the cluster, provide them with the computing resources they need, and limit their consumption. These two API objects are Namespace and ResourceQuota. A namespace is an API object that is created at the cluster level. ResourceQuotas are created and applied at the namespace level. Kubernetes API objects fall into two broad categories: namespaced and non-namespaced. Namespaced objects are created in, and visible within, a particular namespace, while non-namespaced objects are scoped to the cluster.
Namespace
As mentioned earlier, a namespace is a way to isolate multiple teams or users within a single Kubernetes cluster. The idea is to divide the cluster's resources into groups, where each group is a namespace. When a cluster is created, a few namespaces are added to it. There is always a default namespace, and any namespaced object we create without specifying a namespace is created in default. The Kubernetes control plane's objects live in namespaces that begin with the "kube-" prefix: kube-system, kube-public, and kube-node-lease. Aside from the namespaces listed above, administrators can create new namespaces based on their needs. In a production environment, teams create API objects in the namespaces allocated to them, not in the default namespace. Within a namespace, the name of a certain type of object must be unique. That is, two teams could each create a pod named "alpha-pod" in their respective namespaces without causing a collision; this would not be possible if the objects were created at the cluster level. Only namespaced objects like pods, services, and deployments are scoped this way; it does not apply to cluster-level objects like PersistentVolume or StorageClass. A namespace can be created imperatively with kubectl or declaratively from a manifest, as sketched below.
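As a quick illustration, here is a hypothetical namespace manifest; running kubectl create namespace team-a achieves the same result. The team-a name is just an example, not part of the tutorial that follows:
YAML
# Hypothetical namespace manifest; apply with "kubectl apply -f namespace.yaml"
apiVersion: v1
kind: Namespace
metadata:
  name: team-a            # placeholder team namespace
  labels:
    team: team-a          # optional label, handy for selectors and policies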
When a user logs into the cluster, they are assigned to the default namespace. The user can switch to a different namespace by using the "kubectl config set-context" command. When the namespace is changed, it is reflected in the user's current context and in the ~/.kube/config file, and any subsequent commands the user issues apply only to that namespace. Users can also execute kubectl commands against another namespace by appending "-n [namespace-name]", and that includes creating API objects inside other namespaces as well. However, access to namespaces can in turn be controlled by Kubernetes RBAC.
ResourceQuota
ResourceQuota is a Kubernetes API object with which administrators allocate resources per namespace. ResourceQuotas are namespaced objects, meaning their scope is the namespace in which they are created, edited, and applied. A quota restricts the aggregate CPU and memory that the pods in the namespace can request and consume. On top of that, it can also restrict how many objects of each type can be created in the namespace; that is, the maximum number of pods, services, deployments, ConfigMaps, and so on that can exist there. When a ResourceQuota with restrictions on CPU, memory, and object counts is applied in a namespace, the respective controller ensures that the quota is honored. It keeps count of the existing objects and validates, for each new create request, that the counts are not exhausted. Whenever a create request is received for a pod, it also verifies that the pod explicitly requests CPU and memory; if a pod creation request comes in without the necessary resource requests, it fails with a forbidden error, just as it does when an object count is exceeded. A ResourceQuota may be created with just one of these restrictions (CPU, memory, or object counts) or with any combination of the three.
LimitRange
LimitRange is another Kubernetes API object that works well with ResourceQuota. It is burdensome for administrators to ensure that memory and CPU requests are included in every pod specification. If a single pod specification mistakenly omits the memory and CPU requests while the namespace has a quota enforced for CPU and memory, the pod creation will fail, which can lead to unexpected and unpleasant situations if proper care is not taken. The solution is to use LimitRange. A LimitRange object can enforce default, minimum, and maximum compute resources for each pod in the namespace. If a pod specification carries no request for CPU or memory, the quota controller treats the defaults defined by the LimitRange object as the pod's requests, so it no longer complains. LimitRange objects can specify CPU, memory, or both. A sketch of both objects is shown below.
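To make this concrete, below is a hypothetical pair of manifests that mirrors the behavior described above: a ResourceQuota that caps the namespace and a LimitRange that supplies per-container defaults so pods without explicit requests are not rejected. The names and numbers are illustrative only and are not used in the tutorial that follows:
YAML
# Hypothetical ResourceQuota: caps aggregate requests/limits and the pod count
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "1"         # total CPU requested by all pods in the namespace
    requests.memory: 1Gi      # total memory requested by all pods
    limits.cpu: "2"           # total CPU limit across all pods
    limits.memory: 2Gi        # total memory limit across all pods
    pods: "5"                 # maximum number of pods in the namespace
---
# Hypothetical LimitRange: fills in defaults when a pod omits requests/limits
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits
  namespace: team-a
spec:
  limits:
  - type: Container
    default:                  # default limits applied to containers
      cpu: 250m
      memory: 256Mi
    defaultRequest:           # default requests applied to containers
      cpu: 100m
      memory: 128Mi
With both objects applied, a pod that omits its resources section is admitted with the LimitRange defaults and is still counted against the quota.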
A Small Tutorial
To help readers, we will simulate namespace and resource quota usage and share a few kubectl commands that are helpful when dealing with namespaces. It is assumed that readers have access to a Kubernetes cluster and are familiar with kubectl commands. If you don't have access to a Kubernetes cluster of your own, you can use the O'Reilly Kubernetes sandbox, which is available for free; I am using it for demo purposes. Once ready, we will first use the commands related to the namespace.
Shell
kubectl get namespaces
kubectl config get-contexts
kubectl create namespace test
kubectl config set-context --current --namespace=test
kubectl config get-contexts
The following output will be produced initially. If we observe the output, we can see that the value for the namespace was initially blank. But after we set the namespace in the current context, the namespace value becomes "test," which is the new namespace we have created. Now we'll create a few pods and see how they behave. Use the commands below to create the pods and observe the output.
Shell
kubectl run test1 --image=nginx
kubectl run test2 --image=nginx
kubectl get pods
kubectl get pods -n default
kubectl get pods -n kube-system
With the first two commands, we create two pods named test1 and test2 in the test namespace using the nginx image. The next command retrieves the pods from the test namespace, and the last two commands retrieve pods from the default and kube-system namespaces. So if a user is in the test namespace and wants to execute commands against another namespace, they can do so by appending "-n [namespace-name]" to the command. This applies to any kubectl command that deals with namespaced objects: you can execute create, delete, get, describe, and update commands in another namespace from within your current namespace by appending the namespace flag and namespace name. It doesn't make a difference when operating on cluster-level objects. Now that we have a better idea of how to operate with namespaces, we will create a resource quota in the test namespace and observe the behavior. We need to execute the commands below to do so.
Shell
kubectl get pods
kubectl create quota test-quota --hard=cpu=1,memory=1G,pods=2
kubectl get quota
The last command shows the details of the quota: we can create at most 2 pods, and all pods put together can use 1 CPU core and 1 GB of memory. The output also shows that we have already exhausted the pod count. Let me check whether I can create one more pod; if that fails, I will delete one pod and create another. Let me execute the command below and see what I get.
Shell
kubectl run test3 --image=nginx
It complains about the missing CPU and memory requests in the pod specification. Let's add those to the pod specification and retry. We need to create a pod manifest with the content below in a file called pod.yaml.
YAML
apiVersion: v1
kind: Pod
metadata:
  name: nginx3
spec:
  containers:
  - name: nginx3
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: ".25"
In the above file, we have requested 200MiB of memory and 0.25 CPU (250 millicores). Execute the command below and observe the output.
Shell
kubectl apply -f pod.yaml
This time it no longer complains about CPU or memory, but it clearly says that the pod quota is exhausted. So we need to delete one pod and then create the new one. Let me do that and observe the output.
Shell
kubectl delete pod test2
kubectl apply -f pod.yaml
kubectl get quota
From the output, you can observe the current state of the quota usage.
Conclusion
With this little how-to article, I have tried to give a glimpse of how to manage resources maturely within a Kubernetes cluster. I hope aspiring Kubernetes practitioners will benefit from this and take it forward in their work.
Boris Zaikin
Senior Software Cloud Architect,
Nordcloud GmbH
Ranga Karanam
Best Selling Instructor on Udemy with 1 MILLION Students,
in28Minutes.com
Samir Behara
Senior Cloud Infrastructure Architect,
AWS
Pratik Prakash
Master Software Engineer (SDE-IV),
Capital One