Which service mesh is the one for you?


This week I set out to write a post comparing Istio and Linkerd, and I told myself: I’m going to create tables comparing features, and it’s going to be great, and people will love it, and the world will be happier for a few seconds. I promised myself it was going to be a fair comparison without bias toward either side. While the ‘comparison table’ is still here, I shifted the focus of the article: the goal is not to decide which is better, but which is better for you, for your applications, for your organization.

For some time in my career, I worked as a pre-sales Architect for Red Hat, and I remember the many times we were asked to fill out product comparison sheets. I often needed to use my creativity to make sure the product looked good, avoiding, almost at all costs, the unpleasant ‘not supported’ box. But because I put honest work first, I sometimes had to tick it.

Putting myself in the shoes of the evaluators, I understand that they’re (hopefully) aiming for an impartial comparison, and to that degree a comparison sheet seems like a safe approach. We know that a successful project can lead to career progression, and we all like that. But here’s the problem: if the comparison sheet, and not quality software that runs a business and can make organizations more competitive, is the end game for that evaluation, then you’re in the ‘sheet’ business.

Comparing is not the end game; knowing what’s better for your use case is. So let’s dig into seven different areas of the technologies, which are:

Traffic Management

Security

Installation / Configuration

Supported Environments

Observability

Policy Management

Performance

For each of these areas, I’ll present my perspective, which hopefully will help you get closer to making an informed decision.

Traffic Management

The significant difference to be highlighted here is the fact that two different proxying technologies are used for the data plane.

Istio uses Envoy as its proxy. Envoy is written in C++ and was initially built by Lyft to facilitate traffic management of microservices outside of Kubernetes. Many have extended Envoy to also serve as a Kubernetes cluster ingress technology.

Linkerd (v2) uses a purpose-built service mesh proxy called linkerd-proxy. This proxy is written in Rust, and alongside it, many low-level proxying (network client and server) capabilities are being built in Rust in a separate project called Tower. Tower relies on Tokio, an event-driven, non-blocking I/O library for Rust. If, like me, you appreciate statistics, Rust has been the most loved language in the Stack Overflow developer survey for the last four consecutive years (2016, 2017, 2018, 2019).

Istio has the advantage here because it builds on top of Envoy, which already includes capabilities I consider essential, such as subset routing. Users can still achieve canary, blue-green, and A/B deployments with Linkerd, but they have to rely on separate Kubernetes Services and a cluster ingress technology capable of splitting the traffic, such as Gloo (gloo.solo.io).
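To make the idea of traffic splitting concrete, here is a minimal Python sketch of the weighted routing decision that sits behind a canary rollout. It is not how Envoy, Linkerd, or Gloo implement it; the backend names (reviews-v1, reviews-v2) and the 90/10 weights are hypothetical and chosen only for illustration.

```python
import random


def pick_backend(weights, rng=random):
    """Pick a backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical canary rollout: roughly 90% of requests go to v1, 10% to v2.
canary_weights = {"reviews-v1": 90, "reviews-v2": 10}

# Simulate 10,000 requests with a seeded generator so runs are repeatable.
rng = random.Random(42)
counts = {name: 0 for name in canary_weights}
for _ in range(10_000):
    counts[pick_backend(canary_weights, rng)] += 1
```

In a mesh or ingress, the same decision is expressed declaratively as route weights rather than written by hand. The sketch only shows what those weights mean for each request.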

The Linkerd team announced at a recent community meeting that more advanced L7 traffic management features are planned for future releases.

Security

For this context, I’m considering the ability to secure a communication channel. Both offer quite reasonable capabilities.

Both technologies can rely on an external root certificate, which would yield a situation where you have secure communication between Linkerd and Istio. That sounds like a good weekend project. Anyone?

Installation and Configuration

Given that Istio can be installed on many different platforms, the instructions can differ substantially. Regarding Linkerd, while writing this article, I was impressed by the requirements pre-check feature. So many times I have gone to install something in a shared Kubernetes cluster where it wasn’t clear whether I had the necessary privileges to do so. For Linkerd, the pre-check (linkerd check --pre) verifies whether you have permission to create the Kubernetes resources required during the install process.

Supported Environments and Deployment Models

With regard to Kubernetes vs non-Kubernetes, this one is straightforward: Linkerd 2 is being built with a Kubernetes-only mindset, at least for now, while Istio has received contributions from companies that also want to see it running on non-Kubernetes environments.

On the topic of multi-cluster deployments, it’s fair to say that its interpretation can be tricky. Technically speaking, services on multiple different clusters (with distinct control plane installations) that share the same root CA can effectively communicate with each other.

Istio extends the concept above in that it supports multiple clusters under different circumstances:

Single control plane when network connectivity and IP addressing between pods in different clusters is available.

By using cluster border gateways (egress and ingress) with a single control plane that has access to the Kubernetes API server on the multiple clusters.

In both scenarios above, the cluster containing the control plane becomes the SPOF (single point of failure) for mesh management. In our world where a single cluster can be deployed on multiple availability zones under the same region, this becomes less of an issue but still can’t be disregarded entirely.

Observability

An admin console is a missing piece in Istio. The Kiali Observability Console does meet some of the needs an admin would have of a service mesh. Kiali (κιάλι) is a Greek word that means spyglass, and it’s clear from its website that it intends to be an observability console more than a service mesh management one.

The Linkerd console is not complete but the fact that the community decided to have a goal to also build a management dashboard is a plus.

Linkerd fails to deliver on tracing. Users that want a non-intrusive way to see trace spans would have to wait for that feature to be implemented. Istio takes advantage of the fact that Envoy supports the addition of tracing headers.

Users should be reminded that applications need to be prepared to forward the tracing headers. If not, proxies can generate new tracing IDs, which could potentially and unintentionally split a single request into multiple tracing spans without the needed correlation. Most development frameworks have the option to forward headers without a user having to write extensive code blocks.
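As a rough illustration of what forwarding the tracing headers means in application code, here is a small Python sketch. It assumes a Zipkin/B3-style tracing setup and uses the header names that Istio’s tracing documentation lists for propagation; your tracing backend may use a different set. The function name forward_trace_headers is hypothetical.

```python
# Header names commonly propagated in an Envoy/Istio B3 tracing setup.
# Assumption: your tracing backend uses these; adjust the tuple if it does not.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def forward_trace_headers(incoming):
    """Return only the tracing headers from an incoming request's headers.

    HTTP header names are case-insensitive, so keys are lowercased first.
    """
    lowered = {key.lower(): value for key, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


# Example: headers received by a service, to be copied onto its outbound calls.
incoming = {"X-Request-Id": "req-1", "X-B3-TraceId": "abc123", "Accept": "text/html"}
outbound = forward_trace_headers(incoming)
```

The returned dictionary would then be merged into the headers of every outbound call made while handling that request, so the proxies can correlate the spans.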

Policy Management

Istio lovers, rejoice! Istio’s policy management capabilities are impressive.

The project built an extensive policy management mechanism that allows other technologies to integrate with Istio in multiple ways, through a set of supported ‘templates’ and the providers that built integrations for them. You can think of a ‘template’ as a type of integration.

To highlight other policy types, Istio can also apply rate limiting and ships with out-of-the-box support for principal authentication. Linkerd users can rely on cluster ingress controllers to provide rate limiting. Principal authentication is delegated to the application, which I believe should always be the case. Remember fallacy #4 of distributed computing: the network is secure. Think zero trust.
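For readers unfamiliar with rate limiting, here is a minimal Python sketch of a token bucket, one common way to implement it. This is an illustrative technique, not the algorithm Istio or any particular ingress controller uses. The class name and parameters are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens the bucket can hold
        self.tokens = capacity    # start with a full bucket
        self.last = now           # timestamp of the last refill

    def allow(self, now):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: 1 request per second on average, with bursts of up to 2.
bucket = TokenBucket(rate=1, capacity=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

In the example, the first two requests use up the burst allowance, the third is rejected, and the request one second later is admitted again because one token has been refilled.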

The impressive policy management capability in Istio comes at a cost. Given how extensive it is, managing the multitude of options adds to an already high operational cost.

The policy management component of Istio (Mixer) also adds a significant performance hit, which we’ll discuss below.

Performance

Where’s my comparison table? Fortunately, just recently, two great blog posts were published with a performance comparison of Istio and Linkerd. I quote some of their conclusions below:

Istio’s Envoy proxy uses more than 50% more CPU than Linkerd’s, for this synthetic workload. Linkerd’s control plane uses a tiny fraction of Istio’s, especially when considering the “core” components.

Michael Kipper — Benchmarking Istio & Linkerd CPU

and…

In this experiment, both the Linkerd2-meshed setup and Istio-meshed setup experienced higher latency and lower throughput, when compared with the baseline setup. The latency incurred in the Istio-meshed setup was higher than that observed in the Linkerd2-meshed setup. The Linkerd2-meshed setup was able to handle higher HTTP and GRPC ping throughput than the Istio-meshed setup.

Ivan Sim — Linkerd 2.0 and Istio Performance Benchmark

Aware of the added processing times incurred by Mixer, the Istio team is currently working on rewriting the Mixer component: “…Mixer will be rewritten in C++ and directly embedded in Envoy. There will no longer be any stand-alone Mixer service. This will improve performance and reduce operational complexity.” — Source Mixer V2 Design Document.

Conclusion

Yes, Istio has many more features than Linkerd 2.3 has. That is great. A larger feature set can often mean increased ability to address more complex and edge use cases. But there’s no magic here: more features often mean more configuration options and potentially higher resource utilization and operational costs, so here are three pieces of advice:

If you don’t have an idea of what the 80% of your most common workloads look like, you’re going to have a bad time choosing a service mesh. I would even argue that if you don’t know those numbers, your enterprise might not be ready for a service mesh right now. If you’re just exploring, it’s a different story.

If you plan to solve all possible current and future use cases (often called the comparison spreadsheet), you’re going to have a bad time. You will very likely over-acquire, and this is true for any piece of software you procure.

Blindly deciding on one tech or the other will give you a bad time. The hype can be strong, but remember, you’re not in the hype business (well, unless you are).

Try both!

I used SuperGloo because it was super simple to get both services meshes bootstrapped quickly, with almost no effort on my part. We’re not using SuperGloo in production, but it was perfect for a task like this. It was literally two commands per mesh.

Michael Kipper — Benchmarking Istio & Linkerd CPU — Shopify

Solo SuperGloo is a Service Mesh orchestrator that can install and manage popular service mesh technologies with intuitive and straightforward commands. As noted by Michael above, installing either Istio or Linkerd becomes a one-liner activity. SuperGloo does not stop there though. It provides an abstraction on top of distinct service mesh technologies allowing for a consistent and repeatable operation against multiple meshes. SuperGloo is entirely open-source, and you can try it right now.