The Importance to DevOps of Navigating the Service Mesh Map
DevOps teams increasingly rely on service meshes to abstract application network functions from the code, but what else do service meshes do?
A “service mesh” is an infrastructure layer that regulates the interactions and relationships between applications and microservices. Rather than a source of fundamentally new features, it repackages functionalities such as request-level load balancing, circuit-breaking, retries, and instrumentation. When developing cloud-native or hybrid applications, DevOps teams increasingly rely on service meshes to abstract application network functions from the code.
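To make the resilience features mentioned above concrete, here is a minimal circuit-breaker sketch in Python. It only illustrates the pattern a mesh sidecar proxy applies on every outbound call; the class name, thresholds, and API are invented for illustration and do not belong to any particular mesh:

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: after `max_failures` consecutive failures the
    circuit "opens" and rejects calls immediately, until `reset_timeout`
    seconds pass and a single trial request is let through again."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # Fail fast instead of hammering an unhealthy backend.
                raise RuntimeError("circuit open: request rejected")
            # Half-open state: allow one trial request through.
            self.opened_at = None
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

A mesh implements retries in a similar proxy-level loop; the point is that none of this logic has to live in the application code itself.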
Born as a facilitator for orchestration in the wake of Kubernetes and other container technologies, service meshes are rapidly becoming an indispensable tool for containerization. They enable DevOps teams to focus on building value-added services in distributed architectures that are ready to scale, with built-in predictability and consistency across platforms.
From a security perspective, service meshes are instrumental in enforcing compliance and best practices -- alleviating the SOC team’s workload and improving resilience while simplifying vulnerability identification and remediation.
The ever-increasing adoption of public cloud services has created a novel set of complexities stemming from the cloud architectural paradigm, in which applications consist of a collection of interconnected microservices in constant communication and collaboration. For example, the exponentially greater number of endpoints and interactions to monitor, secure, and scale can create a debugging bottleneck and a new set of security vulnerabilities. One application of the service mesh is to address these emerging issues.
What Do Service Meshes Do?
When migrating from a monolithic architecture to a hybrid or cloud-native one, DevOps teams need a methodology that can manage communication between a collection of microservices while safeguarding and monitoring a drastically increased number of endpoints, without compromising scalability or inflating debugging time and resource requirements. Service meshes are designed to address these issues. They streamline traffic management, for example by eliminating the need for gateway updates when adding microservices, and they reduce complexity by abstracting common infrastructure-related functionality into a separate layer. These features make them near indispensable for cloud-native and hybrid application development.
Currently, the most popular service mesh capabilities are:
- Traffic management: Connecting and controlling the traffic flow and API calls between services.
- Security: Enforcing authentication to secure bi-directional traffic between client and server.
- Access Control: Applying and enforcing policies and resource distribution.
- Observability: Inferring the system’s internal states from external outputs.
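The traffic-management capability above often takes the form of weighted traffic splitting, e.g. routing a small share of requests to a canary release. A minimal sketch of how a proxy might apply such weights is below; the function name, the weight format, and the injectable `rng` parameter are assumptions for illustration, not any mesh's actual API:

```python
import random


def pick_backend(weights, rng=random.random):
    """Weighted traffic split: `weights` maps backend name to its share
    of traffic (shares should sum to 1.0). Returns the chosen backend.
    `rng` is injectable so the selection can be made deterministic in tests."""
    r = rng()
    cumulative = 0.0
    for backend, share in weights.items():
        cumulative += share
        if r < cumulative:
            return backend
    return backend  # guard against floating-point rounding at the top end
```

With `{"v1": 0.9, "v2": 0.1}`, roughly one request in ten would reach the canary `v2`; a mesh lets operators change those weights without touching application code.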
Depending on the intended application development specifications, the DevOps team needs to select a service mesh that best matches business and technical requirements. Istio remains the best-known service mesh to date, although Linkerd reached the market first. There are other key players too, and it may be worth comparing their architectures and weighing the pros and cons of Consul vs. Istio, Linkerd vs. Istio, or Linkerd vs. Consul, among others.
As service meshes are less than a decade old, there are only a small number of options, with significant overlap in the fundamental concepts. Each emphasizes a different angle, however, and they have varying degrees of interoperability and pricing implications, ranging from entirely free to premium.
The leading solutions today are:
- Istio: A fully open-source solution founded by Google, IBM, and Lyft.
- App Mesh: Exclusive to AWS.
- Linkerd: Developed by Buoyant, a company founded by former Twitter engineers, building on Twitter's Finagle library; it was donated to the CNCF in 2017.
- Consul Connect: Open-source with a premium paid service.
- SMI (Service Mesh Interface): A specification announced by Microsoft at KubeCon in 2019, backed by heavy players such as Linkerd, HashiCorp, Solo.io, and VMware.
When shopping for a service mesh solution, defining a clear set of priorities before the initial exploratory survey can help streamline the process. Some of the priorities to consider before selecting a service mesh solution include:
- Managed or self-managed: Deploying Kubernetes clusters with a managed service is easy but comes at the cost of ceding control over parts of the cluster control plane. Choosing between the two requires weighing the IT-management cost against the benefits of added flexibility.
- Fully open-source, partially open-source, or proprietary: Open-source platforms are typically more flexible but can be harder to operate, whereas proprietary ones are more constrained and not free. There is no one-size-fits-all answer; the optimal option for a specific project depends on factors such as cost, the need for flexibility, the availability of IT resources, and more.
- Multi-cluster expansion: Larger projects might require multi-cluster deployments from the start, and smaller ones might need them to scale. When selecting a service mesh solution, it is good practice to analyze its multi-cluster expansion capabilities.
- Level of automation: Automation saves time and can also tighten security. Different projects require different types of automation, so checking what automation options are included in a service mesh solution should be part of the selection process.
- Level of built-in security functionalities: Kubernetes' built-in security is limited, and tightening it requires additional measures. Service mesh solutions typically provide security functionalities that address different priorities.
- Type and extent of authentication: Authentication is a critical element of security. A project's type, complexity, and scope dictate the authentication features required.
- Observability: Critical for keeping a comprehensive view of service health and performance, observability depends on telemetry data for monitoring latency, traffic, errors, and saturation. Whether to rely on built-in observability, compatibility with external observability solutions, or an in-house observability configuration depends on the project's priorities and should be taken into account when selecting a service mesh solution.
- Interoperability: As the popularity of service mesh grows and new services are emerging, interoperability becomes increasingly critical to enable the interconnection of multiple workloads. Service mesh solutions have various degrees of interoperability that should be factored in when selecting a provider.
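The four observability signals listed above (latency, traffic, errors, and saturation) can be summarized from the per-request telemetry a mesh proxy emits. A rough sketch is below; the function name and record format are invented for illustration, and saturation is omitted because it requires resource metrics (CPU, memory, queue depth) rather than request logs alone:

```python
def golden_signals(requests, window_seconds):
    """Summarize three of the four golden signals from per-request records.
    Each record is a dict like {"latency_ms": float, "status": int}."""
    n = len(requests)
    latencies = sorted(r["latency_ms"] for r in requests)
    errors = sum(1 for r in requests if r["status"] >= 500)
    # Simple nearest-rank p95; real telemetry pipelines use histograms.
    p95 = latencies[max(0, int(0.95 * n) - 1)] if latencies else 0.0
    return {
        "traffic_rps": n / window_seconds,   # traffic
        "latency_p95_ms": p95,               # latency
        "error_rate": errors / n if n else 0.0,  # errors
    }
```

In practice the mesh exports these metrics automatically (e.g. to a Prometheus-style backend), which is exactly why built-in observability weighs heavily in the selection criteria above.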
To accelerate the selection process, an ebook is available that contains an in-depth overview of each of these service mesh solutions, detailing their specific features, pros and cons, and providing a snapshot of their distinctive architecture. The ebook is available for free download here (https://page.portshift.io/the-devops-guide-to-service-meshes).
Opinions expressed by DZone contributors are their own.