Late last year Matt Klein wrote a fantastic post exploring “Service mesh data plane vs control plane”. Although I was familiar with the term “control plane”, Matt’s post made me fully appreciate my past experiences with this concept, and also its importance in relation to the continuous delivery of software, particularly in regard to the subtle differences in deployment control between ingress/edge gateways and service meshes.

I’ve written previously about the role a front proxy or API gateway can play in the delivery of software, “Continuous Delivery: How Can an API Gateway Help (or Hinder)?”, and had several great discussions about the impact modern proxies like Envoy are making within the operation of “cloud native” applications. Through this I’ve come to the conclusion that although the use of microservices, containers with dynamic orchestration, and cloud technologies has presented new opportunities, one of the core challenges that remains is that our control planes must adapt in order to keep pace with these changes.

Control Planes and Personas

In Matt’s article he states that the service mesh control plane “provides policy and configuration for all of the running data planes in the mesh”, and that “the control plane turns all of the data planes into a distributed system”. Ultimately, the goal of a control plane is to set policy that will be enacted by the data plane. A control plane can be implemented through configuration files, API calls and user interfaces.

The method of implementation chosen often depends on the persona of the user, and their goals and technical capability. For example, a product owner may want to release a new feature within an application, and here a UI would typically be the most appropriate control plane, as this can display an understandable view of the system and also provide some guard rails. However, for a network operator who wants to configure a series of low-level firewall rules, a CLI or configuration file will provide more fine-grained (power-user style) control, and also facilitate automation.

The choice of control plane can also be influenced by the scope of control required. My colleague Rafi has talked about this before at QCon SF, and the requirements to centralise or decentralise operations can definitely impact the implementation of the control plane. This also directly relates to whether control should be applied locally or globally. For example, an operations team may want to specify globally sensible defaults and safeguards. However, a development team working on the front lines will want fine-grained control for their local services, and potentially (if they are embracing the “freedom and responsibility” model) the ability to override safeguards.

Matt also talked about the local/global interaction in his recent QCon New York presentation, and demonstrated dashboards that the Lyft team had created for their service-to-service and edge/ingress proxies.

North-South vs East-West Traffic

Two typical classifications of traffic flowing within a software application are north-south and east-west. North-south traffic, commonly referred to as ingress traffic, flows into and out of the system from and to external sources. East-west traffic, often referred to as intra-datacenter traffic, flows within a (potentially virtualised) internal network perimeter.

Within modern cloud native applications two separate components often control these flows of traffic: an API gateway or front proxy deals with north-south traffic, and a service mesh handles the east-west traffic. For example, within the Kubernetes domain the Ambassador open source API gateway can deal with ingress traffic, and the Istio open platform can handle cross-service traffic.

The underlying networking technology can be the same for both north-south and east-west proxy components (Envoy, in the examples provided), but the control plane will typically be different, based upon the personas interacting with the system.

The primary persona targeted by the Ambassador control plane is the developer, and the control plane allows simple annotations to be added to Kubernetes config in order to control core deployment functionality such as routing (including traffic shadowing), canary releasing (with Prometheus integration) and rate limiting.
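As a rough illustration, a canary route in Ambassador can be declared via an annotation on a standard Kubernetes Service. This sketch assumes the v0-era annotation syntax; the service and mapping names are purely illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: my_service_canary
      prefix: /my-service/
      # Route a small slice of traffic to the new version (canary)
      service: my-service-v2
      weight: 10
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: my-service
```

The key point is the control-plane ergonomics: the developer expresses routing intent alongside the Service definition they already own, rather than through a separate operations-focused toolchain.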

The primary persona that Istio focuses on is the operator, and the control plane allows the specification of additional Kubernetes resources that facilitate traffic management (including fault injection), security (RBAC and certificate management) and telemetry (including tracing and operational metrics).
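For comparison, Istio expresses traffic policy through its own Kubernetes custom resources. The sketch below uses the v1alpha3 networking API that shipped around the 1.0 release to inject a delay fault into a fraction of requests; the `reviews` service name is illustrative:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - fault:
      delay:
        # Delay 10% of requests by 5 seconds to test resilience
        percent: 10
        fixedDelay: 5s
    route:
    - destination:
        host: reviews
```

Here the resource lives apart from the workload definition, which suits an operator persona setting policy across many services rather than a developer annotating their own Service.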

Conclusion: Divergence or Convergence?

Lyft uses Envoy as both a front proxy and service mesh. I have also heard reports of engineers using Ambassador to manage inter-service (east-west) communication, and also Istio to handle ingress (even before the new Gateway features of the v1.0 release). However, at the moment the two approaches to a control plane for proxy technology exemplified by Ambassador and Istio appear to offer benefits for their respective personas of developer and operator.

I’m not yet confident that there is an easy one-size-fits-all solution given the state of our collective knowledge and experience with modern container networking, and therefore I believe that the control plane solutions for managing north-south and east-west traffic may diverge before they ultimately converge on a final approach.