In my previous article, I hinted at explaining how Ansible can be used to expose applications running inside a high availability K8s cluster to the outside world. This post will show how this can be achieved using a K8s ingress controller and load balancer.

This example uses the same setup as last time around: virtual machines running under the default Windows hypervisor (Hyper-V). To make room for the addition of a proxy, each VM had to give up some RAM. With the exception of the initial master and the Ansible runner, each of the remaining nodes received an allocation of 2000MB.

A new version of the sample project is available at GitHub with a new playbook called k8s_boot.yml. This playbook boots up the entire cluster instead of requiring multiple playbooks to be run one after the other. It configures the cluster according to the specification of the inventory file. The flow of execution could be better, but I changed the underlying playbooks as little as possible so readers of previous posts can still find their way. Since the architecture of this post might seem complex at first encounter, an architectural diagram is included towards the very bottom to clarify the landscape.

Master and Commanders

In the previous article, I alluded to the fact that a high availability cluster requires multiple co-masters to provide backup should the current master act up. We will start off by investigating how this redundancy is used to establish high availability.

The moment a co-master loses comms with the master, it nominates itself to become the next master. Each of the remaining masters then has to acknowledge its claim upon receiving news of its candidacy. However, another co-master can also notice the absence of the current master before receiving word of a candidacy and nominate itself. Should 50% of the vote be enough to assume control, it would be possible for two control planes to each attract 50% and think itself the master. Such a cluster would go split-brained, with two masters orchestrating a bunch of very confused worker nodes. For this reason, K8s implements the Raft protocol (by way of etcd), which follows the typical requirement that a candidate should receive a quorum of 50% + 1 before it gains the respect to boss all and sundry. Consequently, a high availability K8s cluster should always comprise an odd number of masters. For the project, this means that the inventory should always contain an even number of co-masters, with the initial master then making the total odd.

The bootup playbook imports the older k8s_comasters.yml playbook into its execution to prepare and execute the well-known "kubeadm join" command on each of the co-masters:

kubeadm join k8scp:6443 --token 9ei28c.b496t8c4vbjea94h --discovery-token-ca-cert-hash sha256:3ae7abefa454d33e9339050bb26dcf3a31dc82f84ab91b2b40e3649cbf244076 --control-plane --certificate-key 5d89284dee1717d0eff2b987f090421fb6b077c07cf21691089a369781038c7b

Joining worker nodes to the cluster uses a similar join command but omits the --control-plane switch, as can be seen in k8s_workers.yml, also imported during bootup. After running the bootup playbook, the cluster will comprise both control-plane and worker nodes.

Control At All Times

At this point in time, all nodes refer to the original master by hostname, as can be seen from the "kubeadm init" command that starts the first master:

kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint k8scp:6443 --upload-certs

Clearly, this node is currently the single point of failure of the cluster.
Should it fall away, the cluster's nodes will lose contact with each other. The Ansible scripts mitigate this by installing the kube config on all masters so kubectl commands can be run from any master by the designated user. Changing the DNS entry to map k8scp to one of the other control planes will hence restore service. While this is easy to do using the hosts file, additional complexities can arise when using proper DNS servers. Kubernetes orthodoxy, consequently, holds that a load balancer should be put in front of the cluster to spread traffic across each of the master nodes. A control plane that falls out will be removed from the duty roster by the proxy. None will be the wiser. HAProxy fulfills this role perfectly. The Ansible tasks that make this happen are:

- name: Install HAProxy
  become: true
  ansible.builtin.apt:
    name: haproxy=2.0.31-0ubuntu0.2
    state: present

- name: Replace line in haproxy.cfg1.
  become: true
  lineinfile:
    dest: /etc/haproxy/haproxy.cfg
    regexp: 'httplog'
    line: "        option tcplog"

- name: Replace line in haproxy.cfg2.
  become: true
  lineinfile:
    dest: /etc/haproxy/haproxy.cfg
    regexp: 'mode'
    line: "        mode tcp"

- name: Add block to haproxy.cfg1
  become: true
  ansible.builtin.blockinfile:
    backup: false
    path: /etc/haproxy/haproxy.cfg
    block: |-
      frontend proxynode
        bind *:80
        bind *:6443
        stats uri /proxystats
        default_backend k8sServers
      backend k8sServers
        balance roundrobin
        server cp {{ hostvars['host1']['ansible_host'] }}:6443 check
      {% for item in comaster_names -%}
        server {{ item }} {{ hostvars[item]['ansible_host'] }}:6443 check
      {% endfor -%}
      listen stats
        bind :9999
        mode http
        stats enable
        stats hide-version
        stats uri /stats

- name: (Re)Start HAProxy service
  become: true
  ansible.builtin.service:
    name: haproxy
    enabled: true
    state: restarted

The execution of this series of tasks is triggered by the addition of a dedicated server to host HAProxy to the inventory file. Apart from installing and registering HAProxy as a system daemon, this snippet ensures that all control-plane endpoints are added to the duty roster. Not shown here is that the DNS name (k8scp) used in the "kubeadm join" command above is mapped to the IP address of the HAProxy during bootup.

Availability and Accessibility

Up to this point, everything we have seen constitutes the overhead required for high-availability orchestration. All that remains is to do a business Deployment and expose a K8s Service to track its pods on whichever node they may be scheduled:

kubectl create deployment demo --image=httpd --port=80
kubectl expose deployment demo

Let us scale this deployment to two pods, each running an instance of the Apache web server. This two-pod deployment is fronted by the demo Service. The other Service (kubernetes) is automatically created and allows access to the API server of the control plane. In a previous DZone article, I explained how this API can be used for service discovery. Both services are of type ClusterIP. This is a type of load balancer, but its backing httpd pods will only be accessible from within the cluster, as can be seen from the absence of an external IP. Kubernetes provides various other service types, such as NodePort and LoadBalancer, to open up pods and containers for outside access. A NodePort opens up access to the service on each node. Although it is possible for clients to juggle IP addresses should a node fall out, the better way is to use a LoadBalancer. Unfortunately, Kubernetes does not provide a load balancer implementation of its own; these are typically provided by cloud providers.
Similarly, an on-premise or bare-metal cluster has to find and run its own. Alternatively, its clients have to make do as best they can by using NodePorts or implementing their own discovery mechanism. We will follow the first approach by using MetalLB to slot K8s load balancing into our high availability cluster. This is a good solution, but it is not the best solution. Since every K8s Deployment will be exposed behind its own LoadBalancer/Service, clients calling multiple services within the same cluster would have to register the details of multiple load balancers. Kubernetes provides the Ingress API type to counter this. It enables clients to request service using the HTTP(S) routing rules of the Ingress, much the way a proxy does it.

Enough theory! It is time to see how Ansible can declare the presence of an Ingress controller and LoadBalancer:

- hosts: masters
  gather_facts: yes
  connection: ssh
  vars_prompt:
    - name: "metal_lb_range"
      prompt: "Enter the IP range from which the load balancer IP can be assigned?"
      private: no
      default: 192.168.68.200-192.168.69.210
  tasks:
    - name: Installing Nginx Ingress Controller
      become_user: "{{ ansible_user }}"
      become_method: sudo
      # become: yes
      command: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.5/deploy/static/provider/cloud/deploy.yaml
      run_once: true
    - name: Delete ValidatingWebhookConfiguration
      become_user: "{{ ansible_user }}"
      become_method: sudo
      # become: yes
      command: kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission
      run_once: true
    - name: Install Metallb1.
      become_user: "{{ ansible_user }}"
      become_method: sudo
      become: yes
      shell: 'kubectl -n kube-system get configmap kube-proxy -o yaml > /home/{{ ansible_user }}/kube-proxy.yml'
    - name: Install Metallb2.
      become_user: "{{ ansible_user }}"
      become_method: sudo
      become: yes
      command: kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.11/config/manifests/metallb-native.yaml
    - name: Prepare L2Advertisement.
      become_user: "{{ ansible_user }}"
      become_method: sudo
      copy:
        dest: "~/l2advertisement.yml"
        content: |
          apiVersion: metallb.io/v1beta1
          kind: L2Advertisement
          metadata:
            name: example
            namespace: metallb-system
    - name: Prepare address pool.
      become_user: "{{ ansible_user }}"
      become_method: sudo
      copy:
        dest: "~/address-pool.yml"
        content: |
          apiVersion: metallb.io/v1beta1
          kind: IPAddressPool
          metadata:
            name: first-pool
            namespace: metallb-system
          spec:
            addresses:
              - {{ metal_lb_range }}
    - pause: seconds=30
    - name: Load address pool
      become_user: "{{ ansible_user }}"
      become_method: sudo
      command: kubectl apply -f ~/address-pool.yml
    - name: Load L2Advertisement
      become_user: "{{ ansible_user }}"
      become_method: sudo
      command: kubectl apply -f ~/l2advertisement.yml
...

First off, it asks for a range of IP addresses that are available for use by the LoadBalancers. It subsequently installs the Nginx Ingress Controller and, lastly, MetalLB to load balance behind the Ingress. MetalLB uses either ARP (IPv4)/NDP (IPv6) or BGP to announce the MAC address of the network adaptor its speaker pods use to attract traffic into the cluster. BGP is probably better, as it has multiple MetalLB speaker pods announcing. This might make for a more stable cluster should a node fall out. ARP/NDP only has one speaker attracting traffic, which causes a slight unresponsiveness should the master speaker fail and another speaker has to be elected. ARP is configured above because I do not have access to a router with a known ASN that can be tied into BGP.
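For reference, once MetalLB is in place, any Service of type LoadBalancer will be handed an external IP from the pool entered at the prompt above. The following is a minimal sketch rather than part of the sample project; the Service name is illustrative, while the app: demo selector matches the label that "kubectl create deployment demo" sets by default:

apiVersion: v1
kind: Service
metadata:
  name: demo-lb               # illustrative name, not part of the sample project
spec:
  type: LoadBalancer          # MetalLB assigns an external IP from the configured IPAddressPool
  selector:
    app: demo                 # label applied by "kubectl create deployment demo"
  ports:
    - port: 80                # port exposed on the external IP
      targetPort: 80          # port the httpd containers listen on

In this article's setup, however, only the Nginx Ingress Controller's own LoadBalancer Service needs an external address; the business services stay as ClusterIPs behind the Ingress.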
Next, we prepare to boot the cluster by designating co-masters and an HAProxy instance in the inventory. Lastly, booting with the k8s_boot.yml playbook ensures the cluster topology declared in the inventory file is enacted.

Each node in the cluster has one MetalLB speaker pod responsible for attracting traffic. As stated above, only one of them will associate one of the available IP addresses with its MAC address when using ARP. The identity of this live wire can be seen at the very bottom of the Ingress Controller service description.

Availability in Action

We can now test cluster stability. The first thing to do is to install an Ingress:

kubectl create ingress demo --class=nginx --rule="www.demo.io/*=demo:80"

Browse the URL, and you should see one of the Apache instances returning a page stating: "It works!" This IP address spoofing is pure magic. It routes www.demo.io to the Apache web server without it being defined using a DNS entry outside the cluster. The Ingress can be interrogated from kubectl. One sees that it can be accessed on one of the IP addresses entered during bootup. The same can also be confirmed using wget, the developer tools of any browser worth its salt, or by inspecting the Ingress controller. Should the external IP remain in the pending state, Kubernetes could not provision the load balancers. The MetalLB site has a section that explains how to troubleshoot this.

We confirmed that the happy case works, but does the web server regain responsiveness in case of failure? We start off by testing whether the Ingress controller is a single point of failure by switching off the node where it ran. Kubernetes realized that the node was no longer in the cluster, terminated all the pods running on that node, and rescheduled them on the remaining worker node. This included the Ingress controller. The website went down for a while, but Kubernetes eventually recovered service. In other words, orchestration in action!

Next up, we remove the MetalLB speaker by taking down the node where it runs. Another speaker will step up to the task!

What about HAProxy? It runs outside the cluster. Surely, this is the single point of failure. Well... Yes and no. Yes, because one loses connection to the control planes. No, because all that is required is to remap k8scp from the IP address of the HAProxy to that of one of the masters. The project has an admin playbook to do this. Run it and wait for the nodes to stabilize into a ready state. Ingress still routes, MetalLB still attracts, and httpd still serves. Because the HAProxy is defined as infrastructure as code, it is also no trouble to boot a new proxy and slot out the faulty or crashed one. The playbook used above to temporarily switch traffic to a master can also be used during such a proxy replacement. Unfortunately, this requires human interaction, but at least the human knows what to monitor with the utmost care and how to quickly recover the cluster.

Final Architecture

The final architecture is as follows. Note that all the MetalLB speakers work as a team to provide load balancing for the Kubernetes Services and their Deployments.

Conclusion

There probably are other ways to install a high availability K8s cluster, but I like this double load balancer approach:
- HAProxy abstracts and encapsulates the redundancy of an odd number of control planes, i.e., it ensures 99.9999% availability for cluster-controlling commands coming from kubectl;
- MetalLB and the Nginx Ingress Controller work together to track the scheduling of business pods.
Keep in mind that the master can move a pod with its container(s) to any worker node depending on failure and resource availability. In other words, the MetalLB LoadBalancer ensures continuity of business logic in case of catastrophic node failure.

In our sample, the etcd key-value store is located as part of the control planes. This is called the stacked approach. The etcd store can also be removed from the control planes and hosted on its own nodes for increased stability.

Our K8s as Ansible project is shaping up nicely for use as a local or playground cloud. However, a few things are outstanding that one would expect in a cluster of industrial strength:
- Role-based access control (RBAC); a minimal example follows below
- A service mesh to move security, observability, and reliability from the application into the platform
- Availability zones in different locations, each with its own set of HAProxy, control planes, and workers, separated from each other using a service mesh
- Secret management
- Running Ansible Lint against the playbooks to identify bad and insecure practices requiring rectification
- Throttling incoming traffic when a high rate of failure is experienced, so business pods can continue to serve or recover gracefully

It should be noted, though, that nothing prevents one from adding these to one's own cluster.
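To give a flavor of the first item on that list, here is a minimal RBAC sketch. It is not part of the sample project; the namespace, role, and user names are purely illustrative:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: demo-reader                 # illustrative role name
  namespace: default
rules:
  - apiGroups: [""]                 # "" refers to the core API group
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"] # read-only access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: demo-reader-binding         # illustrative binding name
  namespace: default
subjects:
  - kind: User
    name: jane                      # hypothetical user from the kube config
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: demo-reader
  apiGroup: rbac.authorization.k8s.io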
In today's fast-evolving technology landscape, the integration of Artificial Intelligence (AI) into Internet of Things (IoT) systems has become increasingly prevalent. AI-enhanced IoT systems have the potential to revolutionize industries such as healthcare, manufacturing, and smart cities. However, deploying and maintaining these systems can be challenging due to the complexity of the AI models and the need for seamless updates and deployments. This article is tailored for software engineers and explores best practices for implementing Continuous Integration and Continuous Deployment (CI/CD) pipelines for AI-enabled IoT systems, ensuring smooth and efficient operations.

Introduction to CI/CD in IoT Systems

CI/CD is a software development practice that emphasizes the automated building, testing, and deployment of code changes. While CI/CD has traditionally been associated with web and mobile applications, its principles can be effectively applied to AI-enabled IoT systems. These systems often consist of multiple components, including edge devices, cloud services, and AI models, making CI/CD essential for maintaining reliability and agility.

Challenges in AI-Enabled IoT Deployments

AI-enabled IoT systems face several unique challenges:
- Resource Constraints: IoT edge devices often have limited computational resources, making it challenging to deploy resource-intensive AI models.
- Data Management: IoT systems generate massive amounts of data, and managing this data efficiently is crucial for AI model training and deployment.
- Model Updates: AI models require periodic updates to improve accuracy or adapt to changing conditions. Deploying these updates seamlessly to edge devices is challenging.
- Latency Requirements: Some IoT applications demand low-latency processing, necessitating efficient model inference at the edge.

Best Practices for CI/CD in AI-Enabled IoT Systems

- Version Control: Implement version control for all components of your IoT system, including AI models, firmware, and cloud services. Use tools like Git to track changes and collaborate effectively. Create separate repositories for each component, allowing for independent development and testing.
- Automated Testing: Implement a comprehensive automated testing strategy that covers all aspects of your IoT system. This includes unit tests for firmware, integration tests for AI models, and end-to-end tests for the entire system. Automation ensures that regressions are caught early in the development process.
- Containerization: Use containerization technologies like Docker to package AI models and application code. Containers provide a consistent environment for deployment across various edge devices and cloud services, simplifying the deployment process.
- Orchestration: Leverage container orchestration tools like Kubernetes to manage the deployment and scaling of containers across edge devices and cloud infrastructure. Kubernetes ensures high availability and efficient resource utilization.
- Continuous Integration for AI Models: Set up CI pipelines specifically for AI models. Automate model training, evaluation, and validation. This ensures that updated models are thoroughly tested before deployment, reducing the risk of model-related issues.
- Edge Device Simulation: Simulate edge devices in your CI/CD environment to validate deployments at scale. This allows you to identify potential issues related to device heterogeneity and resource constraints early in the development cycle.
- Edge Device Management: Implement device management solutions that facilitate over-the-air (OTA) updates. These solutions should enable remote deployment of firmware updates and AI model updates to edge devices securely and efficiently.
- Monitoring and Telemetry: Incorporate comprehensive monitoring and telemetry into your IoT system. Use tools like Prometheus and Grafana to collect and visualize performance metrics from edge devices, AI models, and cloud services. This helps detect issues and optimize system performance.
- Rollback Strategies: Prepare rollback strategies in case a deployment introduces critical issues. Automate the rollback process to quickly revert to a stable version in case of failures, minimizing downtime.
- Security: Security is paramount in IoT systems. Implement security best practices, including encryption, authentication, and access control, at both the device and cloud levels. Regularly update and patch security vulnerabilities.

CI/CD Workflow for AI-Enabled IoT Systems

Let's illustrate a CI/CD workflow for AI-enabled IoT systems (a minimal pipeline sketch appears at the end of this article):
- Version Control: Developers commit changes to their respective repositories for firmware, AI models, and cloud services.
- Automated Testing: Automated tests are triggered upon code commits. Unit tests, integration tests, and end-to-end tests are executed to ensure code quality.
- Containerization: AI models and firmware are containerized using Docker, ensuring consistency across edge devices.
- Continuous Integration for AI Models: AI models undergo automated training and evaluation. Models that pass predefined criteria are considered for deployment.
- Device Simulation: Simulated edge devices are used to validate the deployment of containerized applications and AI models.
- Orchestration: Kubernetes orchestrates the deployment of containers to edge devices and cloud infrastructure based on predefined scaling rules.
- Monitoring and Telemetry: Performance metrics, logs, and telemetry data are continuously collected and analyzed to identify issues and optimize system performance.
- Rollback: In case of deployment failures or issues, an automated rollback process is triggered to revert to the previous stable version.
- Security: Security measures, such as encryption, authentication, and access control, are enforced throughout the system.

Case Study: Smart Surveillance System

Consider a smart surveillance system that uses AI-enabled cameras for real-time object detection in a smart city. Here's how CI/CD principles can be applied:
- Version Control: Separate repositories for camera firmware, AI models, and cloud services enable independent development and versioning.
- Automated Testing: Automated tests ensure that camera firmware, AI models, and cloud services are thoroughly tested before deployment.
- Containerization: Docker containers package the camera firmware and AI models, allowing for consistent deployment across various camera models.
- Continuous Integration for AI Models: CI pipelines automate AI model training and evaluation. Models meeting accuracy thresholds are considered for deployment.
- Device Simulation: Simulated camera devices validate the deployment of containers and models at scale.
- Orchestration: Kubernetes manages container deployment on cameras and cloud servers, ensuring high availability and efficient resource utilization.
- Monitoring and Telemetry: Metrics on camera performance, model accuracy, and system health are continuously collected and analyzed.
- Rollback: Automated rollback mechanisms quickly revert to the previous firmware and model versions in case of deployment issues.
- Security: Strong encryption and authentication mechanisms protect camera data and communication with the cloud.

Conclusion

Implementing CI/CD pipelines for AI-enabled IoT systems is essential for ensuring the reliability, scalability, and agility of these complex systems. Software engineers must embrace version control, automated testing, containerization, and orchestration to streamline development and deployment processes. Continuous monitoring, rollback strategies, and robust security measures are critical for maintaining the integrity and security of AI-enabled IoT systems. By adopting these best practices, software engineers can confidently deliver AI-powered IoT solutions that drive innovation across various industries.
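To ground the workflow outlined above in something concrete, here is a minimal sketch of the build-and-test half of such a pipeline expressed as a GitHub Actions workflow. It is illustrative only: the train_and_eval.py gate script, the registry URL, and the job layout are assumptions, not part of the case study.

name: iot-ai-ci                       # illustrative workflow name
on:
  push:
    branches: [main]

jobs:
  train-and-evaluate:                 # CI for the AI model: train, evaluate, gate on accuracy
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt              # assumed dependency file
      - run: python train_and_eval.py --min-accuracy 0.90 # hypothetical script that fails below the threshold

  build-and-push-image:               # containerize the model and application code
    needs: train-and-evaluate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t registry.example.com/edge/inference:${{ github.sha }} .
      - run: docker push registry.example.com/edge/inference:${{ github.sha }}   # assumes registry credentials are configured elsewhere

Deployment to simulated devices, OTA rollout, and automated rollback would hang off the end of this pipeline in the same declarative style.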
Cloud-native architecture is a transformative approach to designing and managing applications. This type of architecture embraces the concepts of modularity, scalability, and rapid deployment, making it highly suitable for modern software development. Though the cloud-native ecosystem is vast, Kubernetes stands out as its beating heart. It serves as a container orchestration platform that helps with automatic deployments and the scaling and management of microservices. Some of these features are crucial for building true cloud-native applications. In this article, we explore the world of containers and microservices in Kubernetes-based systems and how these technologies come together to enable developers in building, deploying, and managing cloud-native applications at scale.

The Role of Containers and Microservices in Cloud-Native Environments

Containers and microservices play pivotal roles in making the principles of cloud-native architecture a reality.

Figure 1: A typical relationship between containers and microservices

Here are a few ways in which containers and microservices turn cloud-native architectures into a reality:
- Containers encapsulate applications and their dependencies. This encourages the principle of modularity and results in rapid development, testing, and deployment of application components.
- Containers also share the host OS, resulting in reduced overhead and a more efficient use of resources. Since containers provide isolation for applications, they are ideal for deploying microservices.
- Microservices help in breaking down large monolithic applications into smaller, manageable services.
- With microservices and containers, we can scale individual components separately. This improves the overall fault tolerance and resilience of the application as a whole.

Despite their usefulness, containers and microservices also come with their own set of challenges:
- Managing many containers and microservices can become overly complex and create a strain on operational resources.
- Monitoring and debugging numerous microservices can be daunting in the absence of a proper monitoring solution.
- Networking and communication between multiple services running on containers is challenging. It is imperative to ensure a secure and reliable network between the various containers.

How Does Kubernetes Make Cloud Native Possible?

As per a survey by CNCF, more and more customers are leveraging Kubernetes as the core technology for building cloud-native solutions. Kubernetes provides several key features that utilize the core principles of cloud-native architecture: automatic scaling, self-healing, service discovery, and security.

Figure 2: Kubernetes managing multiple containers within the cluster

Automatic Scaling

A standout feature of Kubernetes is its ability to automatically scale applications based on demand. This feature fits very well with the cloud-native goals of elasticity and scalability. As a user, we can define scaling policies for our applications in Kubernetes. Then, Kubernetes adjusts the number of containers and Pods to match any workload fluctuations that may arise over time, thereby ensuring effective resource utilization and cost savings.

Self-Healing

Resilience and fault tolerance are key properties of a cloud-native setup. Kubernetes excels in this area by continuously monitoring the health of containers and Pods.
In case of any Pod failures, Kubernetes takes remedial actions to ensure the desired state is maintained. It means that Kubernetes can automatically restart containers, reschedule them to healthy nodes, and even replace failed nodes when needed.

Service Discovery

Service discovery is an essential feature of a microservices-based cloud-native environment. Kubernetes offers a built-in service discovery mechanism. Using this mechanism, we can create services and assign labels to them, making it easier for other components to locate and communicate with them. This simplifies the complex task of managing communication between microservices running on containers.

Security

Security is paramount in cloud-native systems and Kubernetes provides robust mechanisms to ensure the same. Kubernetes allows for fine-grained access control through role-based access control (RBAC). This certifies that only authorized users can access the cluster. In fact, Kubernetes also supports the integration of security scanning and monitoring tools to detect vulnerabilities at an early stage.

Advantages of Cloud-Native Architecture

Cloud-native architecture is extremely important for modern organizations due to the evolving demands of software development. In this era of digital transformation, cloud-native architecture acts as a critical enabler by addressing the key requirements of modern software development. The first major advantage is high availability. Today's world operates 24/7, and it is essential for cloud-native systems to be highly available by distributing components across multiple servers or regions in order to minimize downtime and ensure uninterrupted service delivery. The second advantage is scalability to support fluctuating workloads based on user demand. Cloud-native applications deployed on Kubernetes are inherently elastic, thereby allowing organizations to scale resources up or down dynamically. Lastly, low latency is a must-have feature for delivering responsive user experiences. Otherwise, there can be a tremendous loss of revenue. Cloud-native design principles using microservices and containers deployed on Kubernetes enable the efficient use of resources to reduce latency.

Architecture Trends in Cloud Native and Kubernetes

Cloud-native architecture with Kubernetes is an ever-evolving area, and several key trends are shaping the way we build and deploy software. Let's review a few important trends to watch out for.

The use of Kubernetes operators is gaining prominence for stateful applications. Operators extend the capabilities of Kubernetes by automating complex application-specific tasks, effectively turning Kubernetes into an application platform. These operators are great for codifying operational knowledge, creating the path to automated deployment, scaling, and management of stateful applications such as databases. In other words, Kubernetes operators simplify the process of running applications on Kubernetes to a great extent.

Another significant trend is the rise of serverless computing on Kubernetes due to the growth of platforms like Knative. Over the years, Knative has become one of the most popular ways to build serverless applications on Kubernetes. With this approach, organizations can run event-driven and serverless workloads alongside containerized applications. This is great for optimizing resource utilization and cost efficiency. Knative's auto-scaling capabilities make it a powerful addition to Kubernetes.
Lastly, GitOps and Infrastructure as Code (IaC) have emerged as foundational practices for provisioning and managing cloud-native systems on Kubernetes. GitOps leverages version control and declarative configurations to automate infrastructure deployment and updates. IaC extends this approach by treating infrastructure as code.

Best Practices for Building Kubernetes Cloud-Native Architecture

When building a Kubernetes-based cloud-native system, it's great to follow some best practices:
- Observability is a key practice that must be followed. Implementing comprehensive monitoring, logging, and tracing solutions gives us real-time visibility into our cluster's performance and the applications running on it. This data is essential for troubleshooting, optimizing resource utilization, and ensuring high availability.
- Resource management is another critical practice that should be treated with importance. Setting resource limits for containers helps prevent resource contention and ensures stable performance for all the applications deployed on a Kubernetes cluster (a minimal example appears at the end of this article). Failure to manage resources properly can lead to downtime and cascading issues.
- Configuring proper security policies is equally vital as a best practice. Kubernetes offers robust security features like role-based access control (RBAC) and Pod Security Admission that should be tailored to your organization's needs. Implementing these policies helps protect against unauthorized access and potential vulnerabilities.
- Integrating a CI/CD pipeline into your Kubernetes cluster streamlines the development and deployment process. This promotes automation and consistency in deployments along with the ability to support rapid application updates.

Conclusion

This article has highlighted the significant role of Kubernetes in shaping modern cloud-native architecture. We've explored key elements such as observability, resource management, security policies, and CI/CD integration as essential building blocks for success in building a cloud-native system. With its vast array of features, Kubernetes acts as the catalyst, providing the orchestration and automation needed to meet the demands of dynamic, scalable, and resilient cloud-native applications. As readers, it's crucial to recognize Kubernetes as the linchpin in achieving these objectives. Furthermore, the takeaway is to remain curious about exploring emerging trends within this space. The cloud-native landscape continues to evolve rapidly, and staying informed and adaptable will be key to harnessing the full potential of Kubernetes.

Additional Reading:
- CNCF Annual Survey 2021
- CNCF Blog "Why Google Donated Knative to the CNCF" by Scott Carey
- Getting Started With Kubernetes Refcard by Alan Hohn
- "The Beginner's Guide to the CNCF Landscape" by Ayrat Khayretdinov
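As a small illustration of the resource management practice mentioned above (this sketch is not from the article; the image name and the request/limit values are placeholders to be tuned per workload):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                     # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25     # placeholder image
          resources:
            requests:           # what the scheduler reserves when placing the Pod
              cpu: "250m"
              memory: "256Mi"
            limits:             # ceilings enforced at runtime to prevent resource contention
              cpu: "500m"
              memory: "512Mi"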
Kubernetes celebrates its ninth year since the initial release this year, a significant milestone for a project that has revolutionized the container orchestration space. During that time span, Kubernetes has become the de facto standard for managing containers at scale. Its influence can be found far and wide, evident from various architectural and infrastructure design patterns for many cloud-native applications. As one of the most popular and successful open-source projects in the infrastructure space, Kubernetes offers a ton of choices for users to provision, deploy, and manage Kubernetes clusters and applications that run on them. Today, users can quickly spin up Kubernetes clusters from managed providers or go with an open-source solution to self-manage them. The sheer number of these options can be daunting for engineering teams deciding what makes the most sense for them. In this Trend Report article, we will take a look at the current state of the managed Kubernetes offerings as well as options for self-managed clusters. With each option, we will discuss the pros and cons as well as recommendations for your team.

Overview of Managed Kubernetes Platforms

Managed Kubernetes offerings from the hyperscalers (e.g., Google Kubernetes Engine, Amazon Elastic Kubernetes Service, Azure Kubernetes Service) remain one of the most popular options for administering Kubernetes. The 2019 survey of the Kubernetes landscape from the Cloud Native Computing Foundation (CNCF) showed that these services from each of the cloud providers make up three of the top five options that enterprises use to manage containers. More recent findings from CloudZero illustrating increased cloud and Kubernetes adoption further solidify the popularity of managed Kubernetes services.

All of the managed Kubernetes platforms take care of the control plane components such as kube-apiserver, etcd, kube-scheduler, and kube-controller-manager. However, the degree to which other aspects of operating and maintaining a Kubernetes cluster are managed differs for each cloud vendor. For example, Google offers a more fully-managed service with GKE Autopilot, where Google manages the cluster's underlying compute, creating a serverless-like experience for the end user. They also provide the standard mode where Google takes care of patching and upgrading of the nodes along with bundling autoscaler, load balancer controller, and observability components, but the user has more control over the infrastructure components. On the other end, Amazon's offering is more of a hands-off, opt-in approach where most of the operational burden is offloaded to the end user. Some critical components like the CSI driver, CoreDNS, VPC CNI, and kube-proxy are offered as managed add-ons but not installed by default.

Figure 1: Managed Kubernetes platform comparison

By offloading much of the maintenance and operational tasks to the cloud provider, managed Kubernetes platforms can offer users a lower total cost of ownership (especially when using something like a per-Pod billing model with GKE Autopilot) and increased development velocity. Also, by leaning into cloud providers' expertise, teams can reduce the risk of incorrectly setting Kubernetes security settings or fault tolerance that could lead to costly outages.
Since Kubernetes is so complex and notorious for a steep learning curve, using a managed platform to start out can be a great option to fast-track Kubernetes adoption. On the other hand, if your team has specific requirements due to security, compliance, or even operating environment (e.g., bare metal, edge computing, military/medical applications), a managed Kubernetes platform may not fit your needs. Note that even though Google and Amazon have on-prem products (GKE on-prem and EKS Anywhere), the former requires VMware's server virtualization software, and the latter is an open-source, self-managed option. Finally, while Kubernetes lends itself to application portability, there is still some degree of vendor lock-in by going with a managed option that you should be aware of.

Overview of Self-Managed Kubernetes Options

Kubernetes also has a robust ecosystem for self-managing Kubernetes clusters. First, there's the manual route of installing "Kubernetes the Hard Way," which walks through all the steps needed for bootstrapping a cluster step by step. In practice, most teams use a tool that abstracts some of the setup, such as kops, kubeadm, kubespray, or kubicorn. While each tool behaves slightly differently, they all automate the infrastructure provisioning, support maintenance functions like upgrades or scaling, as well as integrate with cloud providers and/or bare metal. The biggest advantage of going the self-managed route is that you have complete control over how you want your Kubernetes cluster to work. You can opt to run a small cluster without a highly available control plane for less critical workloads and save on cost. You can customize the CNI, storage, node types, and even mix and match across multiple cloud providers if need be. Finally, self-managed options are more prevalent in non-cloud environments, namely edge or on-prem. On the other hand, operating a self-managed cluster can be a huge burden for the infrastructure team. Even though open-source tools have come a long way to lower the burden, it still requires a non-negligible amount of time and expertise to justify the cost against going with a managed option.

PROS AND CONS OF MANAGED vs. SELF-MANAGED KUBERNETES

Managed
- Pros: Lower TCO; Increased development velocity; Lean on security best practices; Inherit cloud provider's expertise; Less maintenance burden
- Cons: May not be available on-prem or on the edge; Not open to modification; Requires support from service provider in case of outage

Self-managed
- Pros: Fully customizable to satisfy compliance requirements; Can use latest features; Flexible deployment schemes
- Cons: Requires significant Kubernetes knowledge and expertise; Maintenance burden can be high

Table 1

Considerations for Managed vs. Self-Managed Kubernetes

For most organizations running predominantly on a single cloud, going with the managed offering makes the most sense. While there is a cost associated with opting for the managed service, it is a nominal fee ($0.10 per hour per cluster) compared to the engineer hours that may be required for maintaining those clusters. The rest of the cost is billed the same way as using VMs, so cost is usually a non-factor. Also, note that there will still be a non-negligible amount of work to do if you go with a vendor who provides a less-managed offering. There are a few use cases where going with a self-managed Kubernetes option makes sense: If you need to run on-prem or on the edge, you may decide that the on-prem offerings from the cloud providers may not fit your needs.
If you are running on-prem, this likely means that either cost was a huge factor or there is a tangible need to be on-prem (i.e., applications must run closer to where they are deployed). In these scenarios, you likely already have an infrastructure team with significant Kubernetes experience or the luxury of growing that team in-house. Even if you are not running on-prem, you may consider going with a self-managed option if you are running on multiple clouds or are a SaaS provider that must offer a flexible Kubernetes-as-a-Service type of product. While you can run different variants of Kubernetes across clouds, it may be desirable to use a solution like Cluster API to manage multiple Kubernetes clusters in a consistent manner. Likewise, if you are offering Kubernetes as a Service, then you may need to support more than the managed Kubernetes offerings. Also, as mentioned before, compliance may play a big role in the decision. You may need to support an application in regions where major US hyperscalers do not operate (e.g., China) or where a more locked-down version is required (e.g., military, banking, medical). Finally, you may work in industries where there is a need for either cutting-edge support or massive modifications to fit the application's needs. For example, for some financial institutions, there may be a need for confidential computing. While the major cloud providers have some level of support for it at the time of writing, it is still limited.

Conclusion

Managing and operating Kubernetes at scale is no easy task. Over the years, the community has continually innovated and produced numerous solutions to make that process easier. On one hand, we have massive support from major hyperscalers for production-ready, managed Kubernetes services. Also, we have more open-source tools to self-manage Kubernetes if need be. In this article, we went through the pros and cons of each approach, breaking down the state of each option along the way. While most users will benefit from going with a managed Kubernetes offering, opting for a self-managed option is not only valid but sometimes necessary. Make sure your team either has the expertise or the resources required to build it in-house before going with the self-managed option.

Additional Reading:
- CNCF Survey 2019: Deployments Are Getting Larger as Cloud Native Adoption Becomes Mainstream
- "101+ Cloud Computing Statistics That Will Blow Your Mind (Updated 2023)" by Cody Slingerland, Cloud Zero
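For teams that do take the self-managed path with kubeadm, one of the tools mentioned above, cluster bootstrap is usually driven by a configuration file along these lines. This is a hedged sketch: the version, endpoint, and subnet values are illustrative and would need to match your environment.

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0                            # illustrative version
controlPlaneEndpoint: "lb.example.internal:6443"      # a load balancer fronting the control plane
networking:
  podSubnet: "10.244.0.0/16"                          # must match the chosen CNI's expectations
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock

The first control plane node would then be brought up with kubeadm init --config (plus --upload-certs for HA), after which additional control plane and worker nodes join with kubeadm join.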
Kubernetes, a true game-changer in the domain of modern application development, has revolutionized the way we manage containerized applications. Some people tend to think that Kubernetes is an opposing approach to serverless. This is probably because of the management burden involved in deploying applications to Kubernetes: the node management, service configuration, load management, etc. Serverless computing, celebrated for its autoscaling power and cost-efficiency, is known for its easy application development and operation. Yet, the complexities Kubernetes introduces have led to a quest for a more automated approach, and this is precisely where serverless computing steps into Kubernetes. In this exploration, we'll delve into the advantages of the serverless trend and highlight key open-source solutions that bridge the gap between serverless and Kubernetes, examining their place in the tech landscape.

Factors Driving Kubernetes' Popularity

Kubernetes has experienced a meteoric rise in popularity among experienced developers, driven by several factors:
- Extensibility – Kubernetes offers custom resource definitions (CRDs) that empower developers to define and manage complex application architectures according to their requirements.
- Ecosystem – Kubernetes fosters a rich ecosystem of tools and services, enhancing its adaptability to various cloud environments.
- Declarative configuration – Kubernetes empowers developers through declarative configuration, which allows developers to define desired states and lets the system handle the rest.

Kubernetes Challenges: Understanding the Practical Complexities

That being said, experienced developers navigating the intricate landscape of Kubernetes are familiar with the complexities of setting up, configuring, and maintaining Kubernetes clusters. One of the common challenges is scaling. While manual scaling is becoming a thing of the past, autoscaling has become the de facto standard, with organizations that deploy in Kubernetes benefiting from native autoscaling capabilities such as horizontal pod autoscaling (HPA) and vertical pod autoscaling (VPA).

Figure 1: HorizontalPodAutoscaler

Nonetheless, these solutions are not without their constraints. HPA primarily relies on resource utilization metrics (e.g., CPU and memory) for scaling decisions. For applications with unique scaling requirements tied to specific business logic or external events, HPA may not provide the flexibility needed. Furthermore, consider the challenge HPA faces in scaling down to zero Pods. Scaling down to zero Pods can introduce complexity and safety concerns. It requires careful handling of Pod termination to ensure that in-flight requests or processes are not disrupted, which can be challenging to implement safely in all scenarios.

Understanding Serverless Computing

Taking a step back in time to 2014, AWS introduced serverless architectures, which fully exemplified the concept of billing for and using resources only when, in the amount, and for as long as they're needed. This approach offers two significant benefits: Firstly, it frees up teams from worrying about how applications run, enabling them to concentrate solely on business matters; secondly, it minimizes the hardware and environmental footprint, and thus reduces the cost of running applications to the absolute minimum. It's essential to understand that "serverless" doesn't imply that there is no server.
Instead, it means you don't have to concern yourself with the server responsible for executing tasks; your focus remains solely on the tasks themselves.

Serverless Principles in Kubernetes

Serverless computing is on the rise, and in some ways, it's getting along with the growing popularity of event-driven architecture, which makes this pairing quite potent.

Figure 2: Serverless advantages

Event-driven designs are becoming the favored method for creating robust apps that can respond to real-world events in real time. In an event-driven pattern, the crucial requirement is the capability to respond to varying volumes of events at different rates and to dynamically scale your application accordingly. This is where serverless technology perfectly aligns and dynamically scales the application infrastructure accordingly. When you combine the event-driven approach with serverless platforms, the benefits are twofold: You not only save on costs by paying only for what you need, but you also enhance your app's user experience and gain a competitive edge as it syncs with real-world happenings.

Who Needs Serverless in Kubernetes?

In practical terms, serverless integration in Kubernetes is beneficial for software development teams aiming to simplify resource management and reduce operational complexity. Additionally, it offers advantages to organizations looking to optimize infrastructure costs while maintaining agility in deploying and scaling applications.

Explore a Real-World Scenario

To illustrate its practicality, imagine a data processing pipeline designed around the producer-consumer pattern. The producer-consumer pattern allows independent operation of producers and consumers, efficient resource utilization, and scalable concurrency. By using a buffer and coordination mechanisms, it optimizes resource usage and ensures orderly data processing. In this architectural context, Kubernetes and KEDA demonstrate significant potential.

Producer-Consumer Serverless Architecture With KEDA

The system works as follows: producers generate data that flows into a message queue, while consumers handle this data asynchronously. KEDA dynamically fine-tunes the count of consumer instances in response to changes within the message queue's activity, ensuring optimal resource allocation and performance.

Figure 3: Producer-consumer architecture based on KEDA

This efficient serverless architecture includes:
- Message queue – A selected message queue system that is compatible with Kubernetes. Once chosen, it has to be configured to enable accessibility for both producers and consumers.
- Producer – The producer component is a simple service that is responsible for generating the tasks and pushing data into the message queue.
- Consumer – Consumer applications are capable of pulling data asynchronously from the message queue. These applications are designed for horizontal scalability to handle increased workloads effectively. The consumers are deployed as Pods in Kubernetes and utilized by KEDA for dynamic scaling based on queue activity. It's essential to note that while the application operates under KEDA's management, it remains unaware of this fact. In this kind of dynamic scaling, it is also important to prioritize robust error handling, retries, and graceful shutdown procedures within the consumer application to ensure reliability and fault tolerance.
- KEDA – The KEDA system contains scalers that are tailored to the message queue and scaling rules that cater to the system's unique requirements.
KEDA offers multiple options to configure the delivery of events to the consumers based on various metrics such as queue length, message age, or other relevant indicators. For example, if we choose queueLength as the target and one Pod can effectively process 10 messages, you can set the queueLength target to 10. In practical terms, this means that if the actual number of messages in the queue exceeds this threshold, say it's 50 messages, the scaler will automatically scale up to five Pods to efficiently handle the increased workload. In addition, an upper limit can be configured via the maxReplicaCount attribute to prevent excessive scaling. The triggers are configured in the following format:

triggers:
  - type: rabbitmq
    metadata:
      host: amqp://localhost:5672/vhost
      protocol: auto
      mode: QueueLength
      value: "100.50"
      activationValue: "10.5"
      queueName: testqueue
      unsafeSsl: true

Let's go over this configuration: It sets up a trigger for RabbitMQ queue activity. This monitors the testqueue and activates when the queue length exceeds the specified threshold of 100.50. When the queue length drops below 10.5, the trigger deactivates. The configuration includes the RabbitMQ server's connection details, using auto protocol detection and potentially unsafe SSL settings. This setup enables automated scaling in response to queue length changes. The architecture achieves an effortlessly deployable and intelligent solution, allowing the code to concentrate solely on essential business logic without the distraction of scalability concerns.

This was just an example; the producer-consumer serverless architecture can be implemented through a variety of robust tools and platforms other than KEDA. Let's briefly explore another solution using Knative.

Example: Architecture Based on Knative

The implementation of the Knative-based system distinguishes itself by assuming responsibility for data delivery management, in contrast to KEDA, which does not handle data delivery and requires you to set up data retrieval. Prior to deployment of the Knative-based system, it is imperative to ensure its environment is equipped with the Knative Serving and Eventing components.

Figure 4: Producer-consumer architecture based on Knative

The architecture can include:
- Message broker – A selected message queue that seamlessly integrates as a Knative Broker, like Apache Kafka or RabbitMQ.
- Producer – The producer component is responsible for generating the tasks and dispatching them to a designated message queue within the message broker, implemented as a Knative Service.
- Trigger – The Knative trigger establishes the linkage between the message queue and the consumer, ensuring a seamless flow of messages from the broker to the consumer service.
- Consumer – The consumer component is configured to efficiently capture these incoming messages from the queue through the Knative trigger, implemented as a Knative Service.

All of this combined results in an event-driven data processing application that leverages Knative's scaling capabilities. The application automatically scales and adapts to the ever-evolving production requirements of the real world. Indeed, we've explored solutions that empower us to design and construct serverless systems within Kubernetes. However, the question that naturally arises is: What's coming next for serverless within the Kubernetes ecosystem?
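As a concrete point of reference before looking ahead: a trigger like the RabbitMQ example above does not stand alone in KEDA; it is normally embedded in a ScaledObject that points at the Deployment to scale. The following is a hedged sketch in which the Deployment name, namespace, and replica bounds are illustrative rather than taken from the article:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-scaler          # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    name: consumer               # the consumer Deployment described earlier (illustrative name)
  minReplicaCount: 0             # allow scale-to-zero when the queue is empty
  maxReplicaCount: 20            # upper bound to prevent excessive scaling
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://localhost:5672/vhost
        protocol: auto
        mode: QueueLength
        value: "10"              # one Pod per 10 queued messages, as in the example above
        queueName: testqueue

With minReplicaCount at 0, KEDA can scale the consumer all the way down to zero Pods when the queue is idle, which is precisely the case plain HPA struggles with, as noted earlier.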
The Future of Serverless in Kubernetes

The future of serverless in Kubernetes is undeniably promising, marked by recent milestones such as KEDA's acceptance as a graduated project and Knative's incubating project status. This recognition highlights the widespread adoption of serverless concepts within the Kubernetes community. Furthermore, the robust support and backing from major industry players underscores the significance of serverless in Kubernetes. Large companies have shown their commitment to this technology by providing commercial support and tailored solutions. It's worth highlighting that the open-source communities behind projects like KEDA and Knative are the driving force behind their success. These communities of contributors, developers, and users actively shape the projects' futures, fostering innovation and continuous improvement. Their collective effort ensures that serverless in Kubernetes remains dynamic, responsive, and aligned with the ever-evolving needs of modern application development. In short, these open-source communities promise a bright and feature-rich future for serverless within Kubernetes, making it more efficient, cost-effective, and agile.
Tools and platforms form the backbone of seamless software delivery in the ever-evolving world of Continuous Integration and Continuous Deployment (CI/CD). For years, Jenkins has been the stalwart, powering countless deployment pipelines and standing as the go-to solution for many DevOps professionals. But as the tech landscape shifts towards cloud-native solutions, AWS CodePipeline emerges as a formidable contender. Offering deep integration with the expansive AWS ecosystem and the agility of a cloud-based platform, CodePipeline is redefining the standards of modern deployment processes. This article dives into the transformative power of AWS CodePipeline, exploring its advantages over Jenkins and showing why many are switching to this cloud-native tool.

Brief Background About CodePipeline and Jenkins

At its core, AWS CodePipeline is Amazon Web Services' cloud-native continuous integration and continuous delivery service, allowing users to automate the build, test, and deployment phases of their release process. Tailored to the vast AWS ecosystem, CodePipeline leverages other AWS services, making it a seamless choice for teams already integrated with AWS cloud infrastructure. It promises scalability, maintenance ease, and enhanced security, characteristics inherent to many managed AWS services. On the other side of the spectrum is Jenkins – an open-source automation server with a storied history. Known for its flexibility, Jenkins has garnered immense popularity thanks to its extensive plugin system. It's a tool that has grown with the CI/CD movement, evolving from a humble continuous integration tool to a comprehensive automation platform that can handle everything from build to deployment and more. Together, these two tools represent two distinct eras and philosophies in the CI/CD domain.

Advantages of AWS CodePipeline Over Jenkins

1. Integration with AWS Services
AWS CodePipeline: Offers native, out-of-the-box integration with a plethora of AWS services, such as Lambda, EC2, S3, and CloudFormation. This facilitates smooth, cohesive workflows, especially for organizations already using AWS infrastructure.
Jenkins: While integration with cloud services is possible, it usually requires third-party plugins and additional setup, potentially introducing more points of failure or compatibility issues.

2. Scalability
AWS CodePipeline: Being a part of the AWS suite, it natively scales according to the demands of the deployment pipeline. There's no need for manual intervention, ensuring consistent performance even during peak loads.
Jenkins: Scaling requires manual adjustments, such as adding agent nodes or reallocating resources, which can be both time-consuming and resource-intensive.

3. Maintenance
AWS CodePipeline: As a managed service, AWS handles all updates, patches, and backups. This ensures that the latest features and security patches are always in place without user intervention.
Jenkins: Requires periodic manual updates, backups, and patching. Additionally, plugins can introduce compatibility issues or security vulnerabilities, demanding regular monitoring and adjustments.

4. Security
AWS CodePipeline: Benefits from AWS's comprehensive security model. Features like IAM roles, secret management with AWS Secrets Manager, and fine-grained access controls ensure robust security standards.
Jenkins: Achieving a similar security level necessitates additional configurations, plugins, and tools, which can sometimes introduce more vulnerabilities or complexities.
Pricing and Long-Term Value AWS CodePipeline: Operates on a pay-as-you-go model, ensuring you only pay for what you use. This can be cost-effective, especially for variable workloads. Jenkins: While the software itself is open-source, maintaining a Jenkins infrastructure (servers, electricity, backups, etc.) incurs steady costs, which can add up in the long run, especially for larger setups. When Might Jenkins Be a Better Choice? Extensive Customization Needs With its rich plugin ecosystem, Jenkins provides a wide variety of customization options. For unique CI/CD workflows or specialized integration needs, Jenkins' vast array of plugins can be invaluable, including integration with non-AWS services. On-Premise Solutions Organizations with stringent data residency or regulatory requirements might prefer on-premise solutions. Jenkins offers the flexibility to be hosted on local servers, providing complete control over data and processes. Existing Infrastructure and Expertise Organizations with an established Jenkins infrastructure and a team well-versed in its intricacies might find transitioning to another tool costly and time-consuming. The learning curve associated with a new platform and migration efforts can be daunting. The team needs to weigh in on the transition along with other items in their roadmap. Final Takeaways In the ever-evolving world of CI/CD, selecting the right tool can be the difference between seamless deployments and daunting processes. Both AWS CodePipeline and Jenkins have carved out their specific roles in this space, yet as the industry shifts more towards cloud-native solutions, AWS CodePipeline indeed emerges at the forefront. With its seamless integration within the AWS ecosystem, innate scalability, and reduced maintenance overhead, it represents the future-facing approach to CI/CD. While Jenkins has served many organizations admirably and offers vast customization, the modern tech landscape is ushering in a preference for streamlined, cloud-centric solutions like AWS CodePipeline. The path from development to production is critical, and while the choice of tools will vary based on organizational needs, AWS CodePipeline's advantages are undeniably compelling for those looking toward a cloud-first future. As we navigate the challenges and opportunities of modern software delivery, AWS CodePipeline offers a promising solution that is more efficient, scalable, secure, and worth considering.
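To make the CodePipeline side of the comparison concrete, the sketch below shows how a minimal two-stage pipeline might be declared with CloudFormation, so that the pipeline itself is versioned infrastructure. It is an illustrative sketch rather than a production template: the role ARN, artifact bucket, CodeStar connection, repository, and CodeBuild project names are all assumptions to replace with your own resources.
YAML
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  AppPipeline:
    Type: AWS::CodePipeline::Pipeline
    Properties:
      RoleArn: arn:aws:iam::123456789012:role/CodePipelineServiceRole  # assumed service role
      ArtifactStore:
        Type: S3
        Location: my-pipeline-artifacts                                # assumed artifact bucket
      Stages:
        - Name: Source
          Actions:
            - Name: FetchSource
              ActionTypeId:
                Category: Source
                Owner: AWS
                Provider: CodeStarSourceConnection
                Version: "1"
              Configuration:
                ConnectionArn: arn:aws:codestar-connections:us-east-1:123456789012:connection/example  # assumed connection
                FullRepositoryId: my-org/my-app                        # assumed repository
                BranchName: main
              OutputArtifacts:
                - Name: SourceOutput
        - Name: Build
          Actions:
            - Name: BuildAndTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: "1"
              Configuration:
                ProjectName: my-app-build                              # assumed CodeBuild project
              InputArtifacts:
                - Name: SourceOutput
              OutputArtifacts:
                - Name: BuildOutput
Because the pipeline is declared as a template, the scaling, maintenance, and IAM-based security characteristics discussed above come along without any build servers to patch or back up.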
The ELK stack is an abbreviation for Elasticsearch, Logstash, and Kibana, which offers the following capabilities: Elasticsearch: A scalable search and analytics engine that stores and indexes the log data, making it well suited to data-driven applications. Logstash: A log-processing tool that collects logs from various sources, parses them, and sends them to Elasticsearch for storage and analysis. Kibana: A powerful visualization tool that allows you to explore and analyze the data stored in Elasticsearch using interactive charts, graphs, and dashboards. The Infrastructure of Elasticsearch Before we dive into deploying the ELK Stack, let's first understand the critical components of Elasticsearch's infrastructure: Nodes: Elasticsearch runs as a set of server instances called nodes, which carry out the search and analytics work. Shards: The stored data is logically divided into shards, enabling faster data access and distribution across nodes. Indices: Elasticsearch organizes the stored data into indices, facilitating efficient data management. Configuring the ELK Stack: You'll need a Kubernetes cluster to deploy the ELK Stack on Kubernetes. If you already have one, you can proceed with the deployment. Alternatively, you can use the provided GitHub repository with Terraform files to set up a Kubernetes cluster. Deploying Elasticsearch: Utilizing Helm charts, we can efficiently deploy Elasticsearch. Modify the values file to match your specific requirements, such as adjusting the number of replicas or turning certain features on/off. You can download the charts from Artifact Hub. values-elasticsearch.yaml YAML clusterName: "itsyndicateblog" replicas: 1 minimumMasterNodes: 1 createCert: true secret: enabled: true password: "" # generated randomly if not defined image: "docker.elastic.co/elasticsearch/elasticsearch" imageTag: "8.5.1" resources: requests: cpu: "200m" memory: "500Mi" limits: cpu: "300m" memory: "1Gi" ingress: enabled: false # enable ingress only if you need external access to elasticsearch cluster hosts: - host: elastic.itsyndicate.org paths: - path: / Once you've customized the values, use the Helm chart to install Elasticsearch: Shell helm install elasticsearch -f values-elasticsearch.yaml <chart-name> Note: Ensure you have configured the drivers (EBS or EFS) for persistent volumes. Deploying Kibana Kibana deployment is straightforward using Helm charts. In the values file, specify the URL and port of the Elasticsearch service: values-kibana.yaml YAML elasticsearchHosts: "https://elasticsearch-master:9200" enterpriseSearch: host: "https://elasticsearch-master:9200" Shell helm install kibana -f values-kibana.yaml <chart-name> To check whether Kibana is installed correctly, port-forward the container's port to your local machine (I am using K8s Lens). Deploying Logstash and Filebeat To manage logs effectively, we use Logstash and Filebeat. Filebeat collects log records from the nodes, and Logstash processes them and sends them to Elasticsearch. Deploy Logstash Clone the repository with the configs: logstash-k8s. Move to tf-modules/eks/manifests/logstash-k8s. Edit the configmap.yaml file and add the Elasticsearch host, user, and password (you can take them from the Kubernetes "Secrets" resource). Apply the templates: Shell kubectl apply -f logstash-k8s -n $CHANGE_TO_ELASTIC_NS Deploy Filebeat Ensure Filebeat's configuration points to the correct log files on your nodes. Usually, in EKS, it's the /var/log/containers folder.
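For reference, the relevant part of a Filebeat configuration for this setup typically looks like the sketch below. This is an illustrative assumption rather than the exact file from the repository; in particular, the Logstash service name and port depend on how Logstash was deployed in your cluster.
YAML
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log            # default container log location on EKS worker nodes
    processors:
      - add_kubernetes_metadata:             # enrich events with pod, namespace, and label metadata
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

output.logstash:
  hosts: ["logstash:5044"]                   # assumed Logstash service name and Beats port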
To check it, log in to one of your nodes and move to the /var/log/containers directory; if there are no files there, check which directory your container logs are actually written to and adjust the path. If everything is correct, apply the Kubernetes templates: Shell kubectl apply -f filebeat-k8s Deploy a Simple Application to Check How Logs Are Streaming Into Elasticsearch Enter the eks/manifests folder from the cloned repository. Execute the command: Shell kubectl apply -f app -n default After the installation is complete, revisit Kibana and create an index pattern for the Logstash indices. Creating an index pattern: logstash-[namespace]* Now, you should see logs from the deployed application. If not, make some requests to this app and try to troubleshoot the issue; refer to the video guide in case help is required. Conclusion You've successfully deployed the ELK Stack on Kubernetes, empowering your applications with robust log analysis and data-driven insights. Elasticsearch, Logstash, and Kibana seamlessly handle large data streams and provide meaningful visualizations. Now that you have a robust logging solution, you can efficiently manage your logs and gain valuable insights. Happy analyzing! Thank you for reading this guide on deploying the ELK Stack. Feel free to reach out if you have any questions or require further assistance. Happy coding!
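As a closing reference, the Logstash pipeline defined in the ConfigMap edited earlier usually follows the pattern sketched below. Treat it as an assumption to adapt rather than the repository's exact configuration: the credentials come from the Elasticsearch Secret, and the index naming is what makes the logstash-[namespace]* pattern in Kibana match.
YAML
logstash.conf: |                                   # data key inside the Logstash ConfigMap
  input {
    beats {
      port => 5044                                 # Filebeat ships events to this port
    }
  }
  output {
    elasticsearch {
      hosts => ["https://elasticsearch-master:9200"]
      user => "elastic"                            # assumed user; take it from the Elasticsearch Secret
      password => "${ELASTICSEARCH_PASSWORD}"      # injected from the Secret as an environment variable
      ssl_certificate_verification => false        # often needed with the chart's self-signed certificate
      index => "logstash-%{[kubernetes][namespace]}-%{+YYYY.MM.dd}"
    }
  }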
Delivering new features and updates to users without causing disruptions or downtime is a crucial challenge in the quick-paced world of software development. This is where the blue-green deployment strategy is useful. Organizations can roll out new versions of their software in a secure and effective way by using the release management strategy known as “blue-green deployment.” Organizations strive for quick and dependable deployment of new features and updates in the fast-paced world of software development. Rolling out changes, however, can be a difficult task because there is a chance that it will introduce bugs or result in downtime. An answer to this problem can be found in the DevOps movement’s popular blue-green deployment strategy. Blue-green deployment enables continuous software delivery with little interruption by utilizing parallel environments and careful traffic routing. In this article, we will explore the concept, principles, benefits, and best practices of blue-green deployment, shedding light on how it can empower organizations to release software with confidence and how it can revolutionize the software development process. Understanding Blue-Green Deployment Blue-green deployment is a software deployment strategy designed to reduce risks and downtime when releasing new versions or updates of an application. It entails running two parallel instances of the same production environment, with the “blue” environment representing the current stable version and the “green” environment hosting the new version. With this configuration, switching between the two environments can be done without disrupting end users. The fundamental idea behind blue-green deployment is to keep user traffic routed to the blue environment while the new version is prepared, protecting the production system's stability and dependability. Developers and QA teams can validate the new version while the green environment is being set up and thoroughly tested before it is made available to end users. The deployment process typically involves the following steps: Initial Deployment: The blue environment is the initial production environment running the stable version of the application. Users access the application through this environment, and it serves as the baseline for comparison with the updated version. Update Deployment: The updated version of the application is deployed to the green environment, which mirrors the blue environment in terms of infrastructure, configuration, and data. The green environment remains isolated from user traffic initially. Testing and Validation: The green environment is thoroughly tested to ensure that the updated version functions correctly and meets the desired quality standards. This includes running automated tests, performing integration tests, and potentially conducting user acceptance testing or canary releases. Traffic Switching: Once the green environment passes all the necessary tests and validations, the traffic routing mechanism is adjusted to start directing user traffic from the blue environment to the green environment. This switch can be accomplished using various techniques such as DNS changes, load balancer configuration updates, or reverse proxy settings. Monitoring and Verification: Throughout the deployment process, both the blue and green environments are monitored to detect any issues or anomalies.
Monitoring tools and observability practices help identify performance problems, errors, or inconsistencies in real-time. This ensures the health and stability of the application in a green environment. Rollback and Cleanup: In the event of unexpected issues or unsatisfactory results, a rollback strategy can be employed to switch the traffic back to the blue environment, reverting to the stable version. Additionally, any resources or changes made in the green environment during the deployment process may need to be cleaned up or reverted. The advantages of blue-green deployment are numerous. By maintaining parallel environments, organizations can significantly reduce downtime during deployments. They can also mitigate risks by thoroughly testing the updated version before exposing it to users, allowing for quick rollbacks if issues arise. Blue-green deployment also supports scalability testing, continuous delivery practices, and experimentation with new features. Overall, blue-green deployment is a valuable approach for organizations seeking to achieve seamless software updates, minimize user disruption, and ensure a reliable and efficient deployment process. Benefits of Blue-Green Deployment Blue-green deployment offers several significant benefits for organizations looking to deploy software updates with confidence and minimize the impact on users. Here are the key benefits of implementing blue-green deployment: Minimized Downtime: Blue-green deployment significantly reduces downtime during the deployment process. By maintaining parallel environments, organizations can prepare and test the updated version (green environment) alongside the existing stable version (blue environment). Once the green environment is deemed stable and ready, the switch from blue to green can be accomplished seamlessly, resulting in minimal or no downtime for end-users. Rollback Capability: Blue-green deployment provides the ability to roll back quickly to the previous version (blue environment) if issues arise after the deployment. In the event of unforeseen problems or performance degradation in the green environment, organizations can redirect traffic back to the blue environment, ensuring a swift return to a stable state without impacting users. Risk Mitigation: With blue-green deployment, organizations can mitigate the risk of introducing bugs, errors, or performance issues to end-users. By maintaining two identical environments, the green environment can undergo thorough testing, validation, and user acceptance testing before directing live traffic to it. This mitigates the risk of impacting users with faulty or unstable software and increases overall confidence in the deployment process. Scalability and Load Testing: Blue-green deployment facilitates load testing and scalability validation in the green environment without affecting production users. Organizations can simulate real-world traffic and user loads in the green environment to evaluate the performance, scalability, and capacity of the updated version. This helps identify potential bottlenecks or scalability issues before exposing them to the entire user base, ensuring a smoother user experience. Continuous Delivery and Continuous Integration: Blue-green deployment aligns well with continuous delivery and continuous integration (CI/CD) practices. By automating the deployment pipeline and integrating it with version control and automated testing, organizations can achieve a seamless and streamlined delivery process. 
CI/CD practices enable faster and more frequent releases, reducing time-to-market for new features and updates. Flexibility for Testing and Experimentation: Blue-green deployment provides a controlled environment for testing and experimentation. Organizations can use the green environment to test new features, conduct A/B testing, or gather user feedback before fully rolling out changes. This allows for data-driven decision-making and the ability to iterate and improve software based on user input. Improved Reliability and Fault Tolerance: By maintaining two separate environments, blue-green deployment enhances reliability and fault tolerance. In the event of infrastructure or environment failures in one of the environments, the other environment can continue to handle user traffic seamlessly. This redundancy ensures that the overall system remains available and minimizes the impact of failures on users. Implementing Blue-Green Deployment To successfully implement blue-green deployment, organizations need to follow a series of steps and considerations. The process involves setting up parallel environments, managing infrastructure, automating deployment pipelines, and establishing efficient traffic routing mechanisms. Here is a step-by-step guide on how to implement blue-green deployment effectively: Duplicate Infrastructure: Duplicate the infrastructure required to support the application in both the blue and green environments. This includes servers, databases, storage, and any other components necessary for the application’s functionality. Ensure that the environments are identical to minimize compatibility issues. Automate Deployment: Implement automated deployment pipelines to ensure consistent and repeatable deployments. Automation tools such as Jenkins, Travis CI, or GitLab CI/CD can help automate the deployment process. Create a pipeline that includes steps for building, testing, and deploying the application to both the blue and green environments. Version Control and Tagging: Adopt proper version control practices to manage different releases effectively. Use a version control system like Git to track changes and create clear tags or branches for each environment. This helps in identifying and managing the blue and green versions of the software. Automated Testing: Implement comprehensive automated testing to validate the functionality and stability of the green environment before routing traffic to it. Include unit tests, integration tests, and end-to-end tests in your testing suite. Automated tests help catch issues early in the deployment process and ensure a higher level of confidence in the green environment. Traffic Routing Mechanisms: Choose appropriate traffic routing mechanisms to direct user traffic between the blue and green environments. Popular options include DNS switching, reverse proxies, or load balancers. Configure the routing mechanism to gradually shift traffic from the blue environment to the green environment, allowing for a controlled transition. Monitoring and Observability: Implement robust monitoring and observability practices to gain visibility into the performance and health of both environments. Monitor key metrics, logs, and user feedback to detect any anomalies or issues. Utilize monitoring tools like Prometheus, Grafana, or ELK Stack to ensure real-time visibility into the system. Incremental Rollout: Adopt an incremental rollout approach to minimize risks and ensure a smoother transition. 
Gradually increase the percentage of traffic routed to the green environment while monitoring the impact and collecting feedback. This allows for early detection of issues and quick response before affecting the entire user base. Rollback Strategy: Have a well-defined rollback strategy in place to revert back to the stable blue environment if issues arise in the green environment. This includes updating the traffic routing mechanism to redirect traffic back to the blue environment. Ensure that the rollback process is well-documented and can be executed quickly to minimize downtime. Continuous Improvement: Regularly review and improve your blue-green deployment process. Collect feedback from the deployment team, users, and stakeholders to identify areas for enhancement. Analyze metrics and data to optimize the deployment pipeline, automate more processes, and enhance the overall efficiency and reliability of the blue-green deployment strategy. By following these implementation steps and considering key aspects such as infrastructure duplication, automation, version control, testing, traffic routing, monitoring, and continuous improvement, organizations can successfully implement blue-green deployment. This approach allows for seamless software updates, minimized downtime, and the ability to roll back if necessary, providing a robust and efficient deployment strategy. Best Practices for Blue-Green Deployment Blue-green deployment is a powerful strategy for seamless software delivery and minimizing risks during the deployment process. To make the most of this approach, consider the following best practices: Version Control and Tagging: Implement proper version control practices to manage different releases effectively. Clearly label and tag the blue and green environments to ensure easy identification and tracking of each version. This helps in maintaining a clear distinction between the stable and updated versions of the software. Automated Deployment and Testing: Leverage automation for deployment pipelines to ensure consistent and repeatable deployments. Automation helps streamline the process and reduces the chances of human error. Implement automated testing at different levels, including unit tests, integration tests, and end-to-end tests. Automated testing helps verify the functionality and stability of the green environment before routing traffic to it. Infrastructure Duplication: Duplicate the infrastructure and set up identical environments for blue and green. This includes replicating servers, databases, and any other dependencies required for the application. Keeping the environments as similar as possible ensures a smooth transition without compatibility issues. Traffic Routing Mechanisms: Choose appropriate traffic routing mechanisms to direct user traffic from the blue environment to the green environment seamlessly. Popular techniques include DNS switching, reverse proxies, or load balancers. Carefully configure and test these mechanisms to ensure they handle traffic routing accurately and efficiently. Incremental Rollout: Consider adopting an incremental rollout approach rather than switching all traffic from blue to green at once. Gradually increase the percentage of traffic routed to the green environment while closely monitoring the impact. This allows for real-time feedback and rapid response to any issues that may arise, minimizing the impact on users. 
Canary Releases: Implement canary releases by deploying the new version to a subset of users or a specific geographic region before rolling it out to the entire user base. Canary releases allow you to collect valuable feedback and perform additional validation in a controlled environment. This approach helps mitigate risks and ensures a smoother transition to the updated version. Rollback Strategy: Always have a well-defined rollback strategy in place. Despite thorough testing and validation, issues may still occur after the deployment. Having a rollback plan ready allows you to quickly revert to the stable blue environment if necessary. This ensures minimal disruption to users and maintains the continuity of service. Monitoring and Observability: Implement comprehensive monitoring and observability practices to gain visibility into the performance and health of both the blue and green environments. Monitor key metrics, logs, and user feedback to identify any anomalies or issues. This allows for proactive detection and resolution of problems, enhancing the overall reliability of the deployment process. By following these best practices, organizations can effectively leverage blue-green deployment to achieve rapid and reliable software delivery. The careful implementation of version control, automation, traffic routing, and monitoring ensures a seamless transition between different versions while minimizing the impact on users and mitigating risks. Conclusion Deploying software in a blue-green fashion is a potent method for ensuring smooth and dependable releases. Organizations can minimize risks, cut down on downtime, and boost confidence in their new releases by maintaining two parallel environments and converting user traffic gradually. This method enables thorough testing, validation, and scalability evaluation and perfectly complies with the continuous delivery principles. Adopting blue-green deployment as the software development landscape changes can be a game-changer for businesses looking to offer their users top-notch experiences while maintaining a high level of reliability. Organizations can use the effective blue-green deployment strategy to deliver software updates with confidence. This method allows teams to seamlessly release new features and updates by reducing downtime, providing rollback capabilities, and reducing risks. Organizations can use blue-green deployment to achieve quicker and more reliable software delivery if the appropriate infrastructure is set up, deployment pipelines are automated, and traffic routing mechanisms are effective. Organizations can fully utilize blue-green deployment by implementing the recommended best practices discussed in this article. This will guarantee a positive user experience while lowering the risk of deployment-related disruptions. In conclusion, blue-green deployment has a lot of advantages, such as decreased downtime, rollback capability, risk reduction, scalability testing, alignment with CI/CD practices, flexibility for testing and experimentation, and increased reliability. Organizations can accomplish seamless software delivery, boost deployment confidence, and improve user experience throughout the deployment process by utilizing parallel environments and careful traffic routing.
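To ground the traffic-switching step described above in something concrete, here is a minimal Kubernetes sketch of the pattern. The names, labels, and image tag are illustrative assumptions; the essential idea is that the blue and green Deployments run side by side and the Service selector decides which color receives live traffic.
YAML
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    color: blue          # flip this to "green" once the green environment passes validation
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green     # the blue Deployment is identical apart from the color label and image tag
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      color: green
  template:
    metadata:
      labels:
        app: my-app
        color: green
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:2.0.0   # assumed tag of the new release
          ports:
            - containerPort: 8080
Changing the selector from color: blue to color: green performs the cutover, and changing it back is the rollback; the same idea applies whether the switch is made at the DNS, load balancer, or reverse proxy layer instead.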
Kubernetes can be intricate to manage, and companies want to leverage its power while avoiding its complexity. A recent survey found that 84% of companies don’t see value in owning Kubernetes themselves. To address this complexity, Cloud Foundry introduced open-source Korifi, which preserves the classic Cloud Foundry experience of being able to deploy apps written in any language or framework with a single cf push command. But the big difference is that this time, apps are pushed to Kubernetes. In this tutorial, we’ll explore how to use Korifi to deploy web applications written in different languages: Ruby, Node.js, ASP.NET, and PHP. I will also provide insights into Korifi’s functioning and basic configuration knowledge, helping you kick-start your multi-cloud, multitenant, and polyglot journey. Ruby For all the examples in this tutorial, I will use sample web applications that you can download from this GitHub repository, but feel free to use your own. You can also find instructions on installing Korifi in this article, which guides you through the easiest way to achieve that by running two Bash scripts that will set everything up for you. Once you have Korifi installed and have cloned a Ruby sample application, go into the root folder and type the following command: Shell cf push my-ruby-app That’s it! That is all you need to deploy a Ruby application to Kubernetes. Keep in mind that while the first iteration of cf push will take some time as Korifi needs to download a number of elements (I will explain this in the next paragraph); all subsequent runs will be much faster. At any point, if you want to check the status of a Korifi app, you can use the cf app command, which, in the case of our Ruby app, would be: Shell cf app my-ruby-app Node.js Before deploying a Node.js application to Kubernetes using Korifi, let me explain how it works under the hood. One of the key components at play here is Cloud Native Buildpacks. The concept was initially introduced in 2011 and adopted by PaaS providers like Google App Engine, GitLab, Deis, and Dokku. This project became a part of the CNCF in 2018. Buildpacks are primarily designed to convert an application’s source code into an OCI image, such as a Docker image. This process unfolds in two steps: first, it scans the application to identify its dependencies and configures them for seamless operation across diverse clouds. Then, it assembles an image using a Builder, a structured amalgamation of Buildpacks, a foundational build image, a lifecycle, and a reference to a runtime image. Although you have the option to construct your own build images and Buildpacks, you can also leverage those provided by established entities such as Google, Heroku, and Paketo Buildpacks. In this tutorial, I will exclusively use ones provided by Paketo — an open-source project that delivers production-ready Buildpacks for popular programming languages. Let’s briefly demonstrate what Korifi does by manually creating a Buildpack from a Node.js application. You can follow the installation instructions here to install the pack CLI. Then, get into the root folder of your application and run the following command: Shell pack build my-nodejs-app --builder paketobuildpacks/builder:base Your Node.js OCI image is available; you can check this by running the command: Shell docker images Once the Docker image is ready, Korifi utilizes Kubernetes RBAC and CRDs to mimic the robust Cloud Foundry paradigm of orgs and spaces. 
But the beauty of Korifi is that you don’t have to manage any of that. You only need one command to push a Node.js application to Kubernetes: Shell cf push my-nodejs-app That’s it! ASP.NET Now, let’s push an ASP.NET application. If you run cf push my-aspnet-app, the build will fail, and you will get the following error message: Shell BuildFail: Check build log output FAILED 2023-08-11T19:12:58.11+0000 [STG/] OUT ERROR: No buildpack groups passed detection. 2023-08-11T19:12:58.11+0000 [STG/] OUT ERROR: failed to detect: buildpack(s) failed with err These logs tell us that Korifi does not know of a valid Buildpack for packaging an ASP.NET application. We can verify that by running the following command: Shell cf buildpacks You should get the following output, in which we can see that there are no .NET-related buildpacks: Shell position name stack enabled locked filename 1 paketo-buildpacks/java io.buildpacks.stacks.jammy true false paketo-buildpacks/java@9.18.0 2 paketo-buildpacks/go io.buildpacks.stacks.jammy true false paketo-buildpacks/go@4.4.5 3 paketo-buildpacks/nodejs io.buildpacks.stacks.jammy true false paketo-buildpacks/nodejs@1.8.0 4 paketo-buildpacks/ruby io.buildpacks.stacks.jammy true false paketo-buildpacks/ruby@0.39.0 5 paketo-buildpacks/procfile io.buildpacks.stacks.jammy true false paketo-buildpacks/procfile@5.6.4 To fix that, first, we need to tell Korifi which Buildpack to use for an ASP.NET application by editing the ClusterStore: Shell kubectl edit clusterstore cf-default-buildpacks -n tutorial-space Make sure to replace tutorial-space with the value you used during your Korifi cluster configuration. Add the line – image: gcr.io/paketo-buildpacks/dotnet-core; your file should look like this: Shell spec: sources: - image: gcr.io/paketo-buildpacks/java - image: gcr.io/paketo-buildpacks/nodejs - image: gcr.io/paketo-buildpacks/ruby - image: gcr.io/paketo-buildpacks/procfile - image: gcr.io/paketo-buildpacks/go - image: gcr.io/paketo-buildpacks/dotnet-core Then we need to tell Korifi in which order to use Buildpacks by editing our ClusterBuilder: Shell kubectl edit clusterbuilder cf-kpack-cluster-builder -n tutorial-space Add the line – id: paketo-buildpacks/dotnet-core at the top of the spec order list. Your file should look something like this: Shell spec: order: - group: - id: paketo-buildpacks/dotnet-core - group: - id: paketo-buildpacks/java - group: - id: paketo-buildpacks/nodejs - group: - id: paketo-buildpacks/ruby - group: - id: paketo-buildpacks/procfile - group: - id: paketo-buildpacks/go If everything was done right, you should see the .NET Core Paketo Buildpack in the list output by the cf buildpacks command. Finally, you can simply run cf push my-aspnet-app to push your ASP.NET application to Kubernetes. PHP We need to follow the same process for PHP: the paketo-buildpacks/php Buildpack needs to be added to the ClusterStore and ClusterBuilder. For anyone using Korifi version 0.9.0, released a few days ago, the issue that I am about to discuss has been fixed. But in case you are using an older version, running cf push my-php-app will fail and return the following error message: Shell [APP/] OUT php: error while loading shared libraries: libxml2.so.2: cannot open shared object file: No such file or directory The OCI image is missing the libxml2 library, which is required by PHP; this is most likely because the builder does not support PHP.
To check that, let’s look at which builder Korifi is using by running this command: Shell kubectl describe clusterbuilder cf-kpack-cluster-builder | grep 'Run Image' This will output the following: Shell Run Image: index.docker.io/paketobuildpacks/run-jammy-base@sha256:4cf369b562808105d3297296efea68449a2ae17d8bb15508f573cc78aa3b3772a As you can see, Korifi currently uses Paketo Jammy Base, which, according to its GitHub repo description, does not support PHP. You can also check that by looking at the builder’s builder.toml file or by running the command pack builder suggest, which will return the output: Shell Suggested builders: [...] Paketo Buildpacks: paketobuildpacks/builder-jammy-base Ubuntu 22.04 Jammy Jellyfish base image with buildpacks for Java, Go, .NET Core, Node.js, Python, Apache HTTPD, NGINX and Procfile Paketo Buildpacks: paketobuildpacks/builder-jammy-buildpackless-static Static base image (Ubuntu Jammy Jellyfish build image, distroless-like run image) with no buildpacks included. To use, specify buildpack at build time. Paketo Buildpacks: paketobuildpacks/builder-jammy-full Ubuntu 22.04 Jammy Jellyfish full image with buildpacks for Apache HTTPD, Go, Java, Java Native Image, .NET, NGINX, Node.js, PHP, Procfile, Python, and Ruby [...] While Jammy Base does not support PHP, the Jammy Full builder does. There are multiple ways to get Korifi to use another builder; I will just cover one of them in this tutorial. This approach assumes that we used the easy way to install Korifi with the deploy-on-kind.sh script. You need to go to the Korifi source code and edit the file scripts/assets/values.yaml so that the clusterStackBuildImage and clusterStackRunImage fields reference the Jammy full images instead of the base ones, which the following command does: Shell sed -i 's/base/full/g' scripts/assets/values.yaml Then, run the scripts/deploy-on-kind.sh script. That’s it! Korifi will use the Jammy full builder and will be able to deploy your PHP application with a cf push my-php-app command. Summary Hopefully, now you’ve experienced just how easy it is to use Korifi to deploy applications to Kubernetes written in Ruby, Node.js, ASP.NET, and PHP. You can stay up to date with the Korifi project by following the Cloud Foundry X account and joining the Slack workspace.
In my previous posting, I explained how to run Ansible scripts using a Linux virtual machine on Windows Hyper-V. This article aims to ease novices into Ansible IaC at the hand of an example: booting one's own out-of-cloud Kubernetes cluster. As such, the intricacies of the steps required to boot a local k8s cluster are beyond the scope of this article. The steps can, however, be studied at the GitHub repo, where the Ansible scripts are checked in. The scripts were tested on Ubuntu 20.04, running virtually on Windows Hyper-V. Network connectivity was established via an external virtual network switch on an Ethernet adaptor shared between virtual machines but not with Windows. Dynamic memory was switched off from the Hyper-V UI. An SSH service daemon was pre-installed to allow Ansible a tty terminal to run commands from. Bootstrapping the Ansible User Repeatability through automation is a large part of DevOps. It cuts down on human error, after all. Ansible, therefore, requires a standard way to establish a terminal for the various machines under its control. This can be achieved using a public/private key pairing for SSH authentication. The keys can be generated for an elliptic curve algorithm as follows: ssh-keygen -f ansible -t ecdsa -b 521 The Ansible script to create and match an account to the keys is: YAML --- - name: Bootstrap ansible hosts: all become: true tasks: - name: Add ansible user ansible.builtin.user: name: ansible shell: /bin/bash become: true - name: Add SSH key for ansible ansible.posix.authorized_key: user: ansible key: "{{ lookup('file', 'ansible.pub') }}" state: present exclusive: true # to allow revocation # Join the key options with comma (no space) to lock down the account: key_options: "{{ ','.join([ 'no-agent-forwarding', 'no-port-forwarding', 'no-user-rc', 'no-x11-forwarding' ]) }}" # noqa jinja[spacing] become: true - name: Configure sudoers community.general.sudoers: name: ansible user: ansible state: present commands: ALL nopassword: true runas: ALL # ansible user should be able to impersonate someone else become: true Ansible is declarative, and this snippet depicts a series of tasks that ensure that: The ansible user exists; The keys are added for SSH authentication; and The ansible user can execute with elevated privileges using sudo. Towards the top is something very important, and it might go unnoticed under a cursory gaze: hosts: all What does this mean? The answer to this puzzle can be easily explained at the hand of the Ansible inventory file: YAML masters: hosts: host1: ansible_host: "192.168.68.116" ansible_connection: ssh ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible comasters: hosts: co-master_vivobook: ansible_connection: ssh ansible_host: "192.168.68.109" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible workers: hosts: client1: ansible_connection: ssh ansible_host: "192.168.68.115" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible client2: ansible_connection: ssh ansible_host: "192.168.68.130" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible It is the register of all machines the Ansible project is responsible for.
Since our example project concerns a high availability K8s cluster, it consists of sections for the master, co-masters, and workers. Each section can contain more than one machine. The root-enabled account atmin on display here was created by Ubuntu during installation. The answer to the question should now be clear — the hosts: all key above specifies that every machine in the cluster will have an account called ansible created according to the specification of the YAML. The command to run the script is: ansible-playbook --ask-pass bootstrap/bootstrap.yml -i atomika/atomika_inventory.yml -K The locations of the user bootstrapping YAML and the inventory files are specified. The command, furthermore, requests password authentication for the user from the inventory file. The -K switch, in turn, asks that the superuser password be prompted. It is required by tasks that are specified to be run as root. It can be omitted should the script be run as root. Upon successful completion, one should be able to log in to the machines using the private key of the ansible user: ssh ansible@172.28.110.233 -i ansible Note that since this account is not for human use, the bash shell is not enabled. Nevertheless, one can access the home of root (/root) using 'sudo ls /root' The user account can now be changed to ansible and the location of the private key added for each machine in the inventory file: YAML host1: ansible_host: "192.168.68.116" ansible_connection: ssh ansible_user: ansible ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible One Master To Rule Them All We are now ready to boot the K8s master: ansible-playbook atomika/k8s_master_init.yml -i atomika/atomika_inventory.yml --extra-vars='kubectl_user=atmin' --extra-vars='control_plane_ep=192.168.68.119' The content of atomika/k8s_master_init.yml is: YAML # k8s_master_init.yml - hosts: masters become: yes become_method: sudo become_user: root gather_facts: yes connection: ssh roles: - atomika_base vars_prompt: - name: "control_plane_ep" prompt: "Enter the DNS name of the control plane load balancer?" private: no - name: "kubectl_user" prompt: "Enter the name of the existing user that will execute kubectl commands?" private: no tasks: - name: Initializing Kubernetes Cluster become: yes # command: kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint "{{ ansible_eno1.ipv4.address }}:6443" --upload-certs command: kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint "{{ control_plane_ep }}:6443" --upload-certs #command: kubeadm init --pod-network-cidr 10.244.0.0/16 --upload-certs run_once: true #delegate_to: "{{ k8s_master_ip }}" - pause: seconds=30 - name: Create directory for kube config of {{ ansible_user }}. become: yes file: path: /home/{{ ansible_user }}/.kube state: directory owner: "{{ ansible_user }}" group: "{{ ansible_user }}" mode: 0755 - name: Copy /etc/kubernetes/admin.conf to user home directory /home/{{ ansible_user }}/.kube/config. copy: src: /etc/kubernetes/admin.conf dest: /home/{{ ansible_user }}/.kube/config remote_src: yes owner: "{{ ansible_user }}" group: "{{ ansible_user }}" mode: '0640' - pause: seconds=30 - name: Remove the cache directory. file: path: /home/{{ ansible_user }}/.kube/cache state: absent - name: Create directory for kube config of {{ kubectl_user }}.
become: yes file: path: /home/{{ kubectl_user }}/.kube state: directory owner: "{{ kubectl_user }}" group: "{{ kubectl_user }}" mode: 0755 - name: Copy /etc/kubernetes/admin.conf to user home directory /home/{{ kubectl_user }}/.kube/config. copy: src: /etc/kubernetes/admin.conf dest: /home/{{ kubectl_user }}/.kube/config remote_src: yes owner: "{{ kubectl_user }}" group: "{{ kubectl_user }}" mode: '0640' - pause: seconds=30 - name: Remove the cache directory. file: path: /home/{{ kubectl_user }}/.kube/cache state: absent - name: Create Pod Network & RBAC. become_user: "{{ ansible_user }}" become_method: sudo become: yes command: "{{ item }}" with_items: kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml - pause: seconds=30 - name: Configure kubectl command auto-completion for {{ ansible_user }}. lineinfile: dest: /home/{{ ansible_user }}/.bashrc line: 'source <(kubectl completion bash)' insertafter: EOF - name: Configure kubectl command auto-completion for {{ kubectl_user }}. lineinfile: dest: /home/{{ kubectl_user }}/.bashrc line: 'source <(kubectl completion bash)' insertafter: EOF ... From the hosts keyword, one can see these tasks are only enforced on the master node. However, two things are worth explaining. The Way Ansible Roles The first is the inclusion of the atomika_base role towards the top: YAML roles: - atomika_base The official Ansible documentation states that: "Roles let you automatically load related vars, files, tasks, handlers, and other Ansible artifacts based on a known file structure." The atomika_base role is included in all three of the Ansible YAML scripts that maintain the master, co-masters, and workers of the cluster. Its purpose is to lay the base by making sure that tasks common to all three member types have been executed. As stated above, an Ansible role follows a specific directory structure that can contain file templates, tasks, and variable declarations, amongst other things. The Kubernetes and containerd versions are, for example, declared in the YAML of variables: YAML k8s_version: 1.28.2-00 containerd_version: 1.6.24-1 In short, development can be fast-tracked through the use of roles developed and open-sourced by the Ansible community at Ansible Galaxy. Dealing the Difference The second thing of interest is that although variables can be passed in from the command line using the --extra-vars switch, as can be seen higher up, Ansible can also be programmed to prompt when a value is not set: YAML vars_prompt: - name: "control_plane_ep" prompt: "Enter the DNS name of the control plane load balancer?" private: no - name: "kubectl_user" prompt: "Enter the name of the existing user that will execute kubectl commands?" private: no Here, prompts are specified to ask for the user that should have kubectl access and the IP address of the control plane.
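To make the role mechanics described above more tangible, the snippet below sketches a hypothetical tasks/main.yml inside roles/atomika_base that consumes the pinned versions. The real role in the repository may well structure this differently; the point is simply that a role's tasks read the role's variables.
YAML
# Hypothetical roles/atomika_base/tasks/main.yml (illustrative only)
- name: Install containerd at the pinned version
  ansible.builtin.apt:
    name: "containerd.io={{ containerd_version }}"
    state: present
    update_cache: yes

- name: Install the Kubernetes packages at the pinned version
  ansible.builtin.apt:
    name:
      - "kubeadm={{ k8s_version }}"
      - "kubelet={{ k8s_version }}"
      - "kubectl={{ k8s_version }}"
    state: present

- name: Hold the Kubernetes packages at that version
  ansible.builtin.dpkg_selections:
    name: "{{ item }}"
    selection: hold
  loop:
    - kubeadm
    - kubelet
    - kubectl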
Should the script execute without error, the state of the cluster should be: atmin@kxsmaster2:~$ kubectl get pods -o wide -A NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-flannel kube-flannel-ds-mg8mr 1/1 Running 0 114s 192.168.68.111 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-bkzgd 1/1 Running 0 3m31s 10.244.0.6 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-vzkw2 1/1 Running 0 3m31s 10.244.0.7 kxsmaster2 <none> <none> kube-system etcd-kxsmaster2 1/1 Running 0 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-apiserver-kxsmaster2 1/1 Running 0 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-controller-manager-kxsmaster2 1/1 Running 7 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-proxy-69cqq 1/1 Running 0 3m32s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-scheduler-kxsmaster2 1/1 Running 7 3m45s 192.168.68.111 kxsmaster2 <none> <none> All the pods required to make up the control plane run on the one master node. Should you wish to run a single-node cluster for development purposes, do not forget to remove the taint that prevents scheduling on the master node(s). kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule- However, a cluster consisting of one machine is not a true cluster. This will be addressed next. Kubelets of the Cluster, Unite! Kubernetes, as an orchestration automaton, needs to be resilient by definition. Consequently, developers and a buggy CI/CD pipeline should not touch the master nodes by scheduling load on them. Therefore, Kubernetes increases resilience by expecting multiple worker nodes to join the cluster and carry the load: ansible-playbook atomika/k8s_workers.yml -i atomika/atomika_inventory.yml The content of k8s_workers.yml is: YAML # k8s_workers.yml --- - hosts: workers, vmworkers remote_user: "{{ ansible_user }}" become: yes become_method: sudo gather_facts: yes connection: ssh roles: - atomika_base - hosts: masters tasks: - name: Get the token for joining the nodes with Kubernetes master. become_user: "{{ ansible_user }}" shell: kubeadm token create --print-join-command register: kubernetes_join_command - name: Generate the secret for joining the nodes with Kubernetes master. become: yes shell: kubeadm init phase upload-certs --upload-certs register: kubernetes_join_secret - name: Copy join command to local file. become: false local_action: copy content="{{ kubernetes_join_command.stdout_lines[0] }} --certificate-key {{ kubernetes_join_secret.stdout_lines[2] }}" dest="/tmp/kubernetes_join_command" mode=0700 - hosts: workers, vmworkers #remote_user: k8s5gc #become: yes #become_method: sudo become_user: root gather_facts: yes connection: ssh tasks: - name: Copy join command to worker nodes. become: yes become_method: sudo become_user: root copy: src: /tmp/kubernetes_join_command dest: /tmp/kubernetes_join_command mode: 0700 - name: Join the Worker nodes with the master. become: yes become_method: sudo become_user: root command: sh /tmp/kubernetes_join_command register: joined_or_not - debug: msg: "{{ joined_or_not.stdout }}" ... There are two blocks of tasks — one with tasks to be executed on the master and one with tasks for the workers. This ability of Ansible to direct blocks of tasks to different member types is vital for cluster formation. The first block extracts and augments the join command from the master, while the second block executes it on the worker nodes.
The top and bottom portions from the console output can be seen here: YAML janrb@dquick:~/atomika$ ansible-playbook atomika/k8s_workers.yml -i atomika/atomika_inventory.yml [WARNING]: Could not match supplied host pattern, ignoring: vmworkers PLAY [workers, vmworkers] ********************************************************************************************************************************************************************* TASK [Gathering Facts] ************************************************************************************************************************************************************************ok: [client1] ok: [client2] ........................................................................... TASK [debug] **********************************************************************************************************************************************************************************ok: [client1] => { "msg": "[preflight] Running pre-flight checks\n[preflight] Reading configuration from the cluster...\n[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Starting the kubelet\n[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...\n\nThis node has joined the cluster:\n* Certificate signing request was sent to apiserver and a response was received.\n* The Kubelet was informed of the new secure connection details.\n\nRun 'kubectl get nodes' on the control-plane to see this node join the cluster." } ok: [client2] => { "msg": "[preflight] Running pre-flight checks\n[preflight] Reading configuration from the cluster...\n[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Starting the kubelet\n[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...\n\nThis node has joined the cluster:\n* Certificate signing request was sent to apiserver and a response was received.\n* The Kubelet was informed of the new secure connection details.\n\nRun 'kubectl get nodes' on the control-plane to see this node join the cluster." } PLAY RECAP ************************************************************************************************************************************************************************************client1 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 client1 : ok=23 changed=6 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 client2 : ok=23 changed=6 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 host1 : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 Four tasks were executed on the master node to determine the join command, while 23 commands ran on each of the two clients to ensure they were joined to the cluster. The tasks from the atomika-base role accounts for most of the worker tasks. 
The cluster now consists of the following nodes, with the master hosting the pods making up the control plane: atmin@kxsmaster2:~$ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8xclient1 Ready <none> 23m v1.28.2 192.168.68.116 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 kxsclient2 Ready <none> 23m v1.28.2 192.168.68.113 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 kxsmaster2 Ready control-plane 34m v1.28.2 192.168.68.111 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 With Nginx deployed, the following pods will be running on the various members of the cluster: atmin@kxsmaster2:~$ kubectl get pods -A -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default nginx-7854ff8877-g8lvh 1/1 Running 0 20s 10.244.1.2 kxsclient2 <none> <none> kube-flannel kube-flannel-ds-4dgs5 1/1 Running 1 (8m58s ago) 26m 192.168.68.116 k8xclient1 <none> <none> kube-flannel kube-flannel-ds-c7vlb 1/1 Running 1 (8m59s ago) 26m 192.168.68.113 kxsclient2 <none> <none> kube-flannel kube-flannel-ds-qrwnk 1/1 Running 0 35m 192.168.68.111 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-pqp2s 1/1 Running 0 37m 10.244.0.9 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-rh577 1/1 Running 0 37m 10.244.0.8 kxsmaster2 <none> <none> kube-system etcd-kxsmaster2 1/1 Running 1 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-apiserver-kxsmaster2 1/1 Running 1 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-controller-manager-kxsmaster2 1/1 Running 8 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-proxy-bdzlv 1/1 Running 1 (8m58s ago) 26m 192.168.68.116 k8xclient1 <none> <none> kube-system kube-proxy-ln4fx 1/1 Running 1 (8m59s ago) 26m 192.168.68.113 kxsclient2 <none> <none> kube-system kube-proxy-ndj7w 1/1 Running 0 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-scheduler-kxsmaster2 1/1 Running 8 37m 192.168.68.111 kxsmaster2 <none> <none> All that remains is to expose the Nginx pod using an instance of NodePort, LoadBalancer, or Ingress to the outside world. Maybe more on that in another article... Conclusion This posting explained the basic concepts of Ansible at the hand of scripts booting up a K8s cluster. The reader should now grasp enough concepts to understand tutorials and search engine results and to make a start at using Ansible to set up infrastructure using code.
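As a quick preview of that follow-up, the simplest of the three options is a NodePort Service. The sketch below assumes the Nginx Deployment carries the default app: nginx label; adjust the selector and ports to match your own deployment.
YAML
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  selector:
    app: nginx            # assumed label on the Nginx Deployment shown above
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080     # must fall inside the cluster's NodePort range (30000-32767 by default)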