What Is Istio Service Mesh?
Istio makes it easier to scale workloads in Kubernetes across multicloud environments. Learn how Istio can help different IT teams and understand its architecture and benefits.
Most organizations prefer to deploy containerized applications into Kubernetes (K8s) because of its scalability and flexibility. But as the number of microservices increases and application pods are distributed across multiple clusters and cloud providers, managing and scaling them becomes complex.
While scaling, it is harder to configure complex communication logic between microservices. Also, there are additional concerns like securing and observing these connections. Without a proper management system for communication, service downtime and security breaches among distributed services become common.
Service mesh was introduced to address these challenges and complexity while scaling microservices in K8s. It provides a platform for network traffic management and handles service-to-service communication at an infrastructure level.
Challenges of Microservices in Cloud and Kubernetes Environments
Below are a few pressing challenges, along with the teams that face them: developers, security teams, SREs, and platform engineers.
Developers Toil to Create Security Policies for Their Application
Organizations and government bodies set security policies and standards for application development, such as HIPAA and rules for handling personally identifiable information (PII). These help ensure the security of data and applications in a dynamic threat landscape. A web application firewall (WAF) is not enough to secure data in transit. Developers have to write authentication and authorization policies into their business logic (or service) to avoid data breaches during communication with other services.
This is a particular headache for developers because the effort is repeated for every service. And whenever there is a new organizational policy for compliance, developers have to toil to make sure the security changes are applied to their systems.
No Central Point to Secure Multicloud and Multicluster Applications
Other factors, such as the diversity of technologies and container versions, keep security local to each application. It is a pain to enforce consistent security policies for applications across multiple clusters and cloud providers, and with various third-party services and APIs. While developers write security policies for their own applications or services, the responsibility for securing traffic across multiple infrastructures remains an open question.
Traffic Management of Microservices Is a Pain for Cloud Teams
DevOps engineers, cloud engineers, or infra engineers configure the communication logic of an application in the service manifest itself. Every time new services are deployed or any changes happen to existing services, the networking logic has to be changed. For example, if a new service has been deployed, the endpoint to the new service has to be manually added on all the other services that communicate with it. All this traffic management is a time-consuming process.
Struggle to Monitor Application Performance (SREs)
It is essential to have real-time visibility into the health and performance of network infrastructure. It helps SREs to identify and respond to any issues or security threats emerging in the real-time traffic, and ensures the availability of services at all times.
Most of the microservices are distributed across cloud or Kubernetes clusters. Without a single plane of visibility of traffic — north-south and east-west — and the granular performance and behavioral details, it is challenging for SREs to troubleshoot and resolve issues quickly. This often leads to SLA breaches and service unavailability.
Istio addresses the above problems by abstracting the network and security logic from the application layer into its own infrastructure layer. This is done by injecting a sidecar (Envoy proxy) into pods, which in turn helps in managing complex, distributed networks at scale. Let us now discuss Istio and its components and how it can help simplify network challenges for developers, cloud engineers, and SREs.
What Is Istio?
Istio is an open-source service mesh platform that simplifies and secures traffic between microservices. Istio provides a dedicated infrastructure for traffic management, security, and observability, to help developers handle the network of microservices in Kubernetes and multiple clouds, at scale.
Istio works by deploying an Envoy proxy (an L4 and L7 proxy) alongside each microservice. The proxy intercepts and handles service-to-service traffic, and thus abstracts communication logic from the service/application layer into a dedicated infrastructure layer (refer to Fig. A).
Istio Components and Sidecar Architecture
Istio has two main components that control and manage the entire network infrastructure layer (refer to Fig. B):
- Data plane: It is a network of Envoy proxies that handle the communication between services in the mesh. Envoy is a lightweight proxy deployed as a sidecar alongside each service in the mesh, where it intercepts the traffic between that service and other services. Using Envoy proxies, the data plane routes and controls the request flow between services, and offers traffic routing, load balancing, resilience features and ways to test them (retries, timeouts, circuit breakers, fault injection), and metrics for better visibility and observability.
- Control plane: In Istio, the control plane interacts with the data plane and provides a centralized management and configuration layer for data plane proxies. That is, the control plane converts high-level routing rules that define traffic control behavior into Envoy-specific configurations. In earlier releases, the control plane was divided into separate components: Pilot, Galley, Citadel, and Mixer. Now, control plane functionalities are consolidated into a single binary called istiod. Istiod handles service discovery, configuration, and certificate management for service communication in the mesh.
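As a rough illustration of how sidecars end up next to each service, a common setup is to label a namespace so that Istio's injection webhook adds an Envoy sidecar to every new pod created in it. The sketch below uses Python only to build and print such a Namespace manifest as JSON (which `kubectl apply -f` also accepts); the namespace name `shop` is a hypothetical example, and it assumes automatic sidecar injection is enabled in the cluster.

```python
import json


def namespace_with_sidecar_injection(name: str) -> dict:
    """Build a Namespace manifest labeled for automatic sidecar injection."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # With this label, Istio's injection webhook adds an Envoy
            # sidecar container to pods created in the namespace.
            "labels": {"istio-injection": "enabled"},
        },
    }


# "shop" is a hypothetical namespace name used for illustration.
injected_namespace = namespace_with_sidecar_injection("shop")
print(json.dumps(injected_namespace, indent=2))
```

Pods that already exist in the namespace are not changed by the label; they pick up the sidecar only when they are recreated.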
Although injecting Envoy proxies as sidecars is the go-to method to implement Istio, there are certain limitations to this approach. The primary limitation is that the operational cost of deploying and maintaining a sidecar is fixed, regardless of the complexity of the use case. That is, whether the use case is simple transport security or complex L7 policies, both require deploying and maintaining sidecars.
And there are other operational challenges. For example, upgrading a sidecar could be disruptive for workloads since it requires restarting the service pod. It cannot be done without a pod restart because Kubernetes pod spec has to be modified to inject the updated sidecar container. Istio recently introduced an early version of the ambient mesh, which solves some of the challenges with sidecar injection.
Istio Ambient Mesh
Istio ambient mesh is a modified, sidecar-less data plane for Istio. Ambient mesh takes a layered approach and splits the functionality of Istio into two layers: the secure overlay layer and the L7 processing layer. The secure overlay layer suits enterprises with comparatively minimal use cases such as routing, zero trust with mTLS, and L4 processing. The L7 processing layer helps organizations take advantage of the secure overlay, along with accessing advanced L7 processing and the full range of Istio capabilities. This layered approach is useful for enterprises that want to adopt Istio incrementally.
Also, since ambient mesh uses a shared agent (ztunnel) running on each node in the cluster, it provides complete separation from the application service, unlike the sidecar model.
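For comparison with sidecar injection, the sketch below shows how a namespace might be opted into ambient mode. It assumes Istio was installed with ambient support, and the namespace name `shop` is hypothetical. Python is used only to build and print the manifest as JSON.

```python
import json


def namespace_in_ambient_mode(name: str) -> dict:
    """Build a Namespace manifest that opts its workloads into ambient mesh."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # This label tells Istio to route the namespace's traffic through
            # the node-level ztunnel instead of per-pod sidecars.
            "labels": {"istio.io/dataplane-mode": "ambient"},
        },
    }


# "shop" is a hypothetical namespace name used for illustration.
ambient_namespace = namespace_in_ambient_mode("shop")
print(json.dumps(ambient_namespace, indent=2))
```

Because no container is added to the pod spec, workloads in the namespace do not need to be restarted to join the mesh.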
Features of Istio Service Mesh
Istio provides several features for a variety of IT teams in an organization. Traffic management, security, observability, and extensibility are the major ones.
Traffic Management
Istio automatically detects the endpoints of the services in the mesh and stores them in its internal service registry. This lets Istio manage traffic by load balancing between replicas. Istio also supports fine-grained control over traffic splitting between services, such as routing a percentage of traffic to a specific version of a service.
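Percentage-based traffic splitting is usually expressed as a VirtualService with weighted routes. The following is a minimal sketch that builds such a manifest in Python and prints it as JSON. The service name `reviews` and the subsets `v1` and `v2` are hypothetical, the subsets would have to be defined in a matching DestinationRule, and the `networking.istio.io/v1beta1` API version is an assumption that may differ between Istio releases.

```python
import json


def weighted_virtual_service(host: str, weights: dict) -> dict:
    """Build a VirtualService that splits HTTP traffic across subsets by weight."""
    if sum(weights.values()) != 100:
        raise ValueError("route weights must add up to 100")
    routes = [
        # Each subset name must exist in a DestinationRule for the same host.
        {"destination": {"host": host, "subset": subset}, "weight": weight}
        for subset, weight in weights.items()
    ]
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-split"},
        "spec": {"hosts": [host], "http": [{"route": routes}]},
    }


# Hypothetical example: send 90% of traffic to v1 and 10% to v2.
split = weighted_virtual_service("reviews", {"v1": 90, "v2": 10})
print(json.dumps(split, indent=2))
```

Shifting more traffic to the new version is then a matter of changing the two weights and re-applying the manifest.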
Besides, Istio provides the following network resilience and testing methods to ensure the reliability of the applications:
- Timeouts: It is the amount of time an Envoy proxy waits for a reply from a service before treating the call as failed. Timeouts ensure that services do not hang because the proxy waits for replies forever.
- Retries: It is the maximum number of times Envoy tries to connect to a service again when the initial attempt fails.
- Circuit breakers: It is a limit on calls to individual hosts within a service, such as the number of concurrent connections or failed calls. Once the limit has been reached, the circuit breaker trips and further connections to that host are prevented.
- Fault injection: It is a testing method for systems where errors are deliberately introduced to verify their ability to recover from error conditions.
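The four methods above map onto fields of the VirtualService and DestinationRule resources. The sketch below builds one manifest for timeouts and retries, one for fault injection, and one for circuit breaking. The host name `ratings` and all numeric values are hypothetical, and the API version is an assumption. The fault is kept in a separate VirtualService because Istio does not apply timeouts or retries on a route where a fault is configured.

```python
import json

API_VERSION = "networking.istio.io/v1beta1"  # assumed; may differ by Istio release


def timeout_and_retry_rule(host: str) -> dict:
    """VirtualService with a request timeout and a retry policy."""
    return {
        "apiVersion": API_VERSION,
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-resilience"},
        "spec": {"hosts": [host], "http": [{
            "route": [{"destination": {"host": host}}],
            "timeout": "10s",  # overall deadline for the call
            "retries": {"attempts": 3, "perTryTimeout": "2s", "retryOn": "5xx"},
        }]},
    }


def delay_fault_rule(host: str) -> dict:
    """VirtualService that delays 10% of requests by 5 seconds for testing."""
    return {
        "apiVersion": API_VERSION,
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-fault-test"},
        "spec": {"hosts": [host], "http": [{
            "fault": {"delay": {"percentage": {"value": 10.0}, "fixedDelay": "5s"}},
            "route": [{"destination": {"host": host}}],
        }]},
    }


def circuit_breaker_rule(host: str) -> dict:
    """DestinationRule that caps connections and ejects failing hosts."""
    return {
        "apiVersion": API_VERSION,
        "kind": "DestinationRule",
        "metadata": {"name": f"{host}-circuit-breaker"},
        "spec": {"host": host, "trafficPolicy": {
            "connectionPool": {
                "tcp": {"maxConnections": 100},
                "http": {"http1MaxPendingRequests": 10},
            },
            # Eject a host for 30s after 5 consecutive 5xx responses.
            "outlierDetection": {
                "consecutive5xxErrors": 5,
                "interval": "30s",
                "baseEjectionTime": "30s",
                "maxEjectionPercent": 50,
            },
        }},
    }


manifests = [f("ratings") for f in
             (timeout_and_retry_rule, delay_fault_rule, circuit_breaker_rule)]
for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

In practice, only one of the two VirtualServices would be applied to a given host at a time, with the fault-injection rule reserved for test runs.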
Not only does Istio offer all the above traffic management features and network resilience tests, but it also ensures that the traffic flow is secured across the network.
Security
Istio helps to secure the network of microservices by facilitating granular authentication, authorization, and access control policies.
- Authentication: Istio verifies the identity of users (humans or machines) through peer authentication and request authentication policies. Peer authentication involves mTLS, where communication between services is encrypted and authenticated using certificates issued for both the client and the server. Request authentication involves server-side verification, where the client has to attach a JWT (JSON Web Token) to the request.
- Authorization: Authorization policies in Istio verify whether an authenticated user is allowed to access a service and perform specific actions. Authorization policies can be set to allow, deny, or perform custom actions against an incoming request based on different parameters.
- Access control: Istio controls the access of authorized users to resources by implementing the least privilege policy and role-based access control (RBAC). RBAC policies can be set at the method, service, and namespace levels.
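The policies described above correspond to three Istio resources: PeerAuthentication, RequestAuthentication, and AuthorizationPolicy. The following sketch builds one of each in Python and prints them as JSON. The namespace `shop`, the app label `orders`, the service account `frontend`, and the issuer URLs are all hypothetical, and the `security.istio.io/v1beta1` API version is an assumption.

```python
import json

API_VERSION = "security.istio.io/v1beta1"  # assumed; may differ by Istio release


def strict_mtls(namespace: str) -> dict:
    """PeerAuthentication that requires mTLS for all workloads in a namespace."""
    return {
        "apiVersion": API_VERSION,
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},
    }


def jwt_request_authentication(namespace: str, app: str) -> dict:
    """RequestAuthentication that validates JWTs from a hypothetical issuer."""
    return {
        "apiVersion": API_VERSION,
        "kind": "RequestAuthentication",
        "metadata": {"name": f"{app}-jwt", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "jwtRules": [{
                "issuer": "https://issuer.example.com",
                "jwksUri": "https://issuer.example.com/.well-known/jwks.json",
            }],
        },
    }


def allow_read_only(namespace: str, app: str, caller_service_account: str) -> dict:
    """AuthorizationPolicy allowing only GET requests from one service account."""
    principal = f"cluster.local/ns/{namespace}/sa/{caller_service_account}"
    return {
        "apiVersion": API_VERSION,
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"{app}-read-only", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "action": "ALLOW",
            "rules": [{
                # The caller identity comes from its mTLS certificate.
                "from": [{"source": {"principals": [principal]}}],
                "to": [{"operation": {"methods": ["GET"]}}],
            }],
        },
    }


policies = [
    strict_mtls("shop"),
    jwt_request_authentication("shop", "orders"),
    allow_read_only("shop", "orders", "frontend"),
]
for policy in policies:
    print(json.dumps(policy, indent=2))
```

Because an ALLOW policy is present for the `orders` workload, requests that match no rule in it are denied, which is how a least-privilege setup is typically built up.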
Istio helps in further securing the network by automating key and certificate rotation at scale.
Observability
Istio offers observability and real-time visibility into the performance and behavior of applications. It does this by providing detailed telemetry for traffic flow between services in the mesh. Istio generates the following types of telemetry:
- Metrics: Istio generates service metrics for real-time performance monitoring of services. They are based on the four “golden signals” of monitoring, which are latency, traffic, errors, and saturation.
- Distributed traces: Istio collects traces of activity across multiple services in a mesh to better understand service dependency and traffic flow.
- Access logs: Istio can produce a complete record of communication between services in the mesh, making it easier to understand the behavior of each workload.
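Access logging and trace sampling can be switched on declaratively through Istio's Telemetry resource. The sketch below is one possible mesh-wide configuration, built in Python and printed as JSON. The resource name `mesh-default`, the 10% sampling rate, and the `telemetry.istio.io/v1alpha1` API version are assumptions, and `envoy` refers to Istio's built-in access log provider.

```python
import json


def mesh_wide_telemetry(sampling_percentage: float) -> dict:
    """Build a Telemetry resource enabling access logs and trace sampling."""
    return {
        "apiVersion": "telemetry.istio.io/v1alpha1",
        "kind": "Telemetry",
        # A Telemetry resource in the Istio root namespace applies mesh-wide.
        "metadata": {"name": "mesh-default", "namespace": "istio-system"},
        "spec": {
            # Write Envoy access logs for every workload in the mesh.
            "accessLogging": [{"providers": [{"name": "envoy"}]}],
            # Sample a fraction of requests for distributed tracing.
            "tracing": [{"randomSamplingPercentage": sampling_percentage}],
        },
    }


telemetry = mesh_wide_telemetry(10.0)
print(json.dumps(telemetry, indent=2))
```

Traces are only exported if a tracing provider is also configured in the mesh; the sampling rate alone does not send them anywhere.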
Extensibility
Istio provides the ability to extend proxy functionality using WebAssembly (Wasm), a sandboxing technology that can be used to extend the Istio proxy (Envoy). Wasm replaces Mixer as the primary extension mechanism in Istio. With Wasm, users can build support for new protocols, custom metrics, loggers, and other filters. These Wasm modules can be distributed dynamically at runtime.
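A Wasm module is typically attached to proxies through a WasmPlugin resource. The sketch below shows what such a manifest might look like, built in Python and printed as JSON. The module URL, the app label `orders`, and the `pluginConfig` contents are hypothetical placeholders, and the `extensions.istio.io/v1alpha1` API version is an assumption.

```python
import json


def wasm_plugin(name: str, app: str, module_url: str, config: dict) -> dict:
    """Build a WasmPlugin that loads a Wasm module into selected proxies."""
    return {
        "apiVersion": "extensions.istio.io/v1alpha1",
        "kind": "WasmPlugin",
        "metadata": {"name": name},
        "spec": {
            # Only proxies of workloads with this label load the module.
            "selector": {"matchLabels": {"app": app}},
            # Location the proxy fetches the module from at runtime.
            "url": module_url,
            # Free-form settings passed to the module; the schema is defined
            # by the module itself, so this content is purely illustrative.
            "pluginConfig": config,
        },
    }


# All argument values below are hypothetical.
plugin = wasm_plugin(
    name="custom-header",
    app="orders",
    module_url="oci://registry.example.com/filters/custom-header:1.0.0",
    config={"header_name": "x-example"},
)
print(json.dumps(plugin, indent=2))
```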
The following are some of the open-source projects that have emerged around Istio using Wasm:
- Slime – an intelligent service manager to use Istio and Envoy
- MOSN – provides cloud-native edge gateways and agents
- Aeraki – provides support for L7 protocols other than HTTP and gRPC
Benefits of Istio Service Mesh
Istio is considered to be resource-intensive and it has a bit of a learning curve. But the benefits Istio provides outweigh them all. Below are some major benefits of implementing Istio:
- 5X increased developer experience: Using Istio, developers are free to work on business logic rather than writing complex network rules. This ensures better productivity, fosters innovation and improves developer experience.
- 100% zero trust network security: Istio allows DevOps engineers and security managers to set granular security policies for enforcing strict authentication and authorization for communication between services. Coupled with mTLS-based traffic encryption, these significantly improve the security posture of the infrastructure. The centralized way of enforcing policies provided by Istio makes compliance easier for security teams, and the policies work across cluster boundaries. This allows them to seamlessly implement a zero trust network (ZTN) for microservices.
- Zero-hassle progressive delivery: With its ability to perform flexible and fine-grained traffic splitting between services, Istio helps DevOps engineers carry out progressive delivery, such as canary and blue-green releases, without any hassle.
- 10X faster audit: Access logs provided by Istio help auditors analyze the performance of the network over a period of time. The telemetry data helps them quickly identify performance bottlenecks and provide suggestions for improvements.
- 4X faster MTTR from network failures: Istio helps SREs and ops teams have observability and real-time visibility of microservices networks in the cloud. The telemetry data provided by Istio is useful for them to diagnose and troubleshoot any errors, and restore services as early as possible. The data provides SREs and Ops with end-to-end visibility into the request flow and dependencies between services, enabling them to have multicluster and multicloud visibility to analyze the performance and behavior of applications.
- 99.99% resilient infrastructure: With advanced load balancing and traffic management capabilities, cloud architects can ensure a highly available and high-performance network infrastructure for production.
Published at DZone with permission of Md Azmal. See the original article here.