The Challenges of Adopting K8s for Production and Tips to Avoid Them
Take a look at some of the top concerns you might face when implementing Kubernetes in production and how to mitigate or eliminate them.
From its discreet debut in 2000 with the jail command introduced by FreeBSD, container technology now firmly occupies the center stage of modern software delivery. Kubernetes is today's de facto standard for container orchestration and is widely regarded as the best in the containerization space. And the timing is right for the platform, as Gartner has projected that by 2023, over 70% of global enterprises will be running two or more containerized applications, up by 20% over last year.
Yet Kubernetes remains complex to manage at enterprise scale, where workloads are heavy and SLA compliance is critical. Even when Kubernetes runs smoothly in the test environment, running it in production needs to be approached with care to avoid pitfalls.
If you are considering Kubernetes for your environments, here is a list of the primary pitfalls to keep in mind at rollout time:
Mastering the Complexity
When deploying a complete K8s infrastructure, the sheer number of components to configure turns the deployment into a daunting process:
- Applying proper DNS settings: Even with the advent of CoreDNS, conflicts with some Linux distributions occur and need to be fixed manually, DNS search limits need to be factored in, etc.
- Managing load balancing: Many elements need to be taken into consideration, such as choosing between a K8s-hosted and an external load balancer.
- Setting up resource constraints: Resource constraints can be defined at the container, node, or even namespace level, each with its own pros and cons.
- Managing K8s Role-Based Access Control (RBAC): Defining and configuring “Role” and “ClusterRole” across multiple pods, nodes, clusters, and namespaces requires advanced planning and careful implementation.
- Configuring metrics and telemetry: Though K8s provides detailed information about application resource use at the container, pod, service, and cluster level, application monitoring is dispersed among multiple applications and tools.
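To make the RBAC item more concrete, here is a minimal sketch in Python that builds a namespaced Role and a matching RoleBinding as plain dictionaries following the `rbac.authorization.k8s.io/v1` schema. The namespace `staging`, the role name `pod-reader`, and the user `jane` are hypothetical names chosen for illustration; in practice the resulting manifests would be serialized to JSON or YAML and applied to the cluster.

```python
import json
from typing import List


def make_role(namespace: str, name: str, resources: List[str], verbs: List[str]) -> dict:
    """Build a namespaced Role granting `verbs` on core-API `resources`."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": name},
        # "" is the core API group, which is where pods live.
        "rules": [{"apiGroups": [""], "resources": resources, "verbs": verbs}],
    }


def make_role_binding(namespace: str, role_name: str, user: str) -> dict:
    """Bind a single user to an existing Role in the same namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": namespace, "name": f"{role_name}-{user}"},
        "subjects": [
            {"kind": "User", "name": user, "apiGroup": "rbac.authorization.k8s.io"}
        ],
        "roleRef": {
            "kind": "Role",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


# Hypothetical example: read-only access to pods in the "staging" namespace.
role = make_role("staging", "pod-reader", ["pods"], ["get", "list", "watch"])
binding = make_role_binding("staging", "pod-reader", "jane")
print(json.dumps([role, binding], indent=2))
```

A Role is scoped to one namespace; the same rule structure inside a ClusterRole would apply cluster-wide, which is why the choice between the two deserves planning up front.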
Implementing security in a K8s environment needs to factor in the increased attack surface stemming from the deployment of microservices. Microservices may be easy to deploy on K8s, but failing to manage and control them tightly can create new vulnerabilities. Some of the elements that need to be taken into account are:
1. Image integrity
2. Node access
3. Network segmentation and isolation
4. Connections enforcement
5. API management
6. Ingress and Egress IAM
The complexity of integrating all of these elements is often underestimated, which leads to a number of security breaches. According to a 2020 study, “The State of Kubernetes Security”, “94% of respondents have experienced a security incident in their Kubernetes and container environments during the last 12 months.”
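As an illustration of the network segmentation and isolation item, the following Python sketch builds two NetworkPolicy manifests (`networking.k8s.io/v1`) as dictionaries: a default-deny policy for a namespace, and a narrow rule that re-allows one traffic path. The namespace `shop`, the `app` labels, and port 8080 are assumptions made for the example, and enforcement depends on the cluster running a network plugin that supports NetworkPolicy.

```python
def default_deny_policy(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in `namespace`."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress_policy(namespace: str, target_app: str, source_app: str, port: int) -> dict:
    """Allow TCP traffic on `port` from pods labeled app=source_app to app=target_app."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-{source_app}-to-{target_app}",
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical namespace and labels, for illustration only.
deny_all = default_deny_policy("shop")
allow_frontend = allow_ingress_policy("shop", "api", "frontend", 8080)
```

Starting from deny-all and adding explicit allowances keeps the set of permitted connections small and reviewable, at the cost of having to declare every legitimate path.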
- K8s exploits – In containerized environments, attackers can use “privileged containers” to escalate beyond the privileges they already hold. Privileged containers are those that have all the capabilities of the host machine. Gaining access to such a container could therefore give the attacker permissions and access to all of the host’s actions and resources.
- Scaling – Production deployments of K8s often, and quickly, scale up to hundreds of pods. Ineffective monitoring and logging can result in an inability to diagnose severe failures that result in service interruptions.
- Autoscaling – Configuring vertical, horizontal, or combined auto-scaling processes requires in-depth evaluation of the required resource range (minimum and maximum CPU and memory for nodes), which is extremely difficult to anticipate. To date, no auto-scaling feature fits all providers, and some loss of control over pods is often the result.
- Updating – This covers both K8s updates and application updates. When the community releases a new version of K8s, production clusters need to be upgraded, preferably without causing application downtime.
- Resources management – Failure to set resource requests and limits runs the risk of downtime or workload eviction.
- Monitoring – Effective monitoring is critical to avoid poor diagnoses of system failures.
- Human resources – Managing highly complex infrastructures requires high levels of expertise. As K8s is relatively new, finding experts is no easy feat. Indeed, data shows that in the four years between October 2015 and October 2019, the share of Kubernetes jobs per million grew by 2,141.03%.
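To see why the minimum and maximum bounds matter for autoscaling, here is a simplified Python sketch of the replica calculation described in the Kubernetes documentation for the Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of the observed metric to the target, rounded up, then clamped to the configured range. The function name, its parameters, and the sample numbers are illustrative assumptions; the real controller also handles details such as unready pods and stabilization windows that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Simplified horizontal autoscaling rule: scale by metric ratio, then clamp."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count to avoid flapping.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The min/max bounds are the part that has to be estimated up front.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 60% target, bounded to 2..10 replicas.
print(desired_replicas(4, 90, 60, 2, 10))   # 6
# A large spike is capped by the configured maximum.
print(desired_replicas(4, 300, 60, 2, 10))  # 10
```

If the maximum is set too low, a spike like the second call leaves the workload under-provisioned; if it is set too high, a runaway metric can exhaust node capacity, which is the anticipation problem described above.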
There are, however, some best practices that you can implement to adopt K8s for production safely:
- Run Non-Root Containers: Root containers increase the risk of lateral movement in case of a breach.
- Implement RBAC: Configure RBAC policies so that they at least meet K8s minimal requirements and, at best, are granularly defined according to specific roles and privilege access definitions to maximize security.
- Implement a comprehensive testing process: Before deploying to production, comprehensive testing, including functional tests, user tests, and load tests, has to be run systematically. Best practices include separating the deployment into dev, staging, and production environments and running the tests on dev and staging to gain confidence before moving to production.
- Avoid rolling tags: When rolling out multi-container pods into production, rolling tags might lead to failure: “latest” tags are especially to be avoided, as they could lead to running two different versions of the same application.
- Configure a CI/CD pipeline: Properly configuring a CI/CD pipeline is crucial to improving quality and security, as well as to accelerating releases of K8s-based applications. This can be facilitated by automating deployment and adopting a number of plugin tools.
- Registry and deployment management: Storing Docker images securely in private registry servers enables an image management workflow. Proper management ensures consistency in image signing, security, LDAP integration, etc. Using deployment managers, such as the increasingly popular K8s-native operator pattern or Helm, facilitates the definition, installation, and upgrade of K8s-based applications.
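- Use Pod Security Policies (PSP): A cluster-level resource, PSP defines the set of conditions a pod must meet to be accepted into the system. Note that PSP was deprecated in Kubernetes v1.21 and removed in v1.25; on newer clusters, the built-in Pod Security Admission controller fills this role.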
- Scan images: Especially when using external images, scanning them before using them reduces the risk of introducing vulnerabilities. Alternatively, Kubei is a new tool that “scans only images that are deployed in runtime that also include the scanning of non-registry images. It replaces the need to scan the entire images registry which contains many different versions and/or images that are not in use.”
- Use Service Mesh: Securing communications by ensuring they are authorized, authenticated, and stable is facilitated by service meshes.
- Use metrics, CPU status, and a dashboard to measure health: Ineffective monitoring and logging can result in an inability to diagnose severe failures that result in service interruptions and impact customer satisfaction and the business.
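Several of the practices above (non-root containers, no privileged containers, no rolling tags, explicit resource requests and limits) can be checked mechanically before a manifest reaches production. The following Python sketch is one hedged way to do that against a pod spec represented as a dictionary; the function names `image_tag` and `lint_pod_spec`, the finding messages, and the sample images under `registry.example.com` are all hypothetical, and a real pipeline would more likely rely on an admission controller or a dedicated policy tool.

```python
from typing import List, Optional


def image_tag(image: str) -> Optional[str]:
    """Return the image tag, or None when the image is pinned by digest."""
    if "@" in image:
        return None  # e.g. repo/app@sha256:... is immutable
    last_segment = image.rsplit("/", 1)[-1]  # skip any registry host:port part
    if ":" in last_segment:
        return last_segment.split(":", 1)[1]
    return "latest"  # an image without a tag resolves to "latest"


def lint_pod_spec(spec: dict) -> List[str]:
    """Return human-readable findings for a pod spec (the `spec` of a Pod)."""
    findings = []
    pod_sc = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        name = container["name"]
        sc = container.get("securityContext", {})
        # Container-level settings override pod-level ones.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not enforced")
        if sc.get("privileged", False):
            findings.append(f"{name}: runs as a privileged container")
        if image_tag(container["image"]) == "latest":
            findings.append(f"{name}: image uses the rolling tag 'latest'")
        resources = container.get("resources", {})
        for key in ("requests", "limits"):
            if not resources.get(key):
                findings.append(f"{name}: missing resource {key}")
    return findings


# Hypothetical pod spec: "web" follows the practices, "sidecar" does not.
pod_spec = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [
        {
            "name": "web",
            "image": "registry.example.com/shop/web:1.4.2",
            "resources": {
                "requests": {"cpu": "250m", "memory": "128Mi"},
                "limits": {"cpu": "500m", "memory": "256Mi"},
            },
        },
        {
            "name": "sidecar",
            "image": "registry.example.com/shop/proxy",
            "securityContext": {"privileged": True},
        },
    ],
}

for finding in lint_pod_spec(pod_spec):
    print(finding)
```

Run as a step in the CI/CD pipeline, a check of this kind turns the recommendations into a gate rather than a convention.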
Opinions expressed by DZone contributors are their own.