In the previous part of this Rate Limiting article series I showed you how to build and deploy a Java-based rate limiting service that could be integrated with the open source Ambassador API gateway and Kubernetes (part 2 and part 1 can be found here). Several of you reached out and asked about how best to design a rate limiting service — especially given the unique flexibility of the Envoy proxy rate limiting API that underlies Ambassador — and so this post will help to address that question.

Setting the Scene

If you haven’t looked at part 3 of this rate limiting series, “Implementing a Java Rate Limiting Service for the Ambassador API Gateway”, then I would definitely encourage you to do so now (part 2 and part 1 will also provide context). The key thing to take away is that Ambassador, much like the Envoy Proxy that powers it, implements rate limiting by calling out to another service to determine whether a request should be rate limited. This is a nice implementation of the separation of concerns pattern (and the single responsibility principle), and because Ambassador is a Kubernetes-native API gateway, you also get the benefits of deploying the rate limiter as a standard Kubernetes service: it is managed by the platform for fault tolerance, and it can easily be scaled.

The rest of this post assumes that you have successfully deployed Ambassador to your Kubernetes cluster, and that you have also deployed a rate limiting service as I demonstrated in my previous Medium post. This is what the Kubernetes config for the Java-based rate limiting service looks like:

---
apiVersion: v1
kind: Service
metadata:
  name: ratelimiter
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: RateLimitService
      name: ratelimiter_svc
      service: "ratelimiter:50051"
  labels:
    app: ratelimiter
spec:
  type: ClusterIP
  selector:
    app: ratelimiter
  ports:
  - protocol: TCP
    port: 50051
    name: http
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: ratelimiter
  labels:
    app: ratelimiter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimiter
  template:
    metadata:
      labels:
        app: ratelimiter
    spec:
      containers:
      - name: ratelimiter
        image: danielbryantuk/ratelimiter:0.3
        ports:
        - containerPort: 50051

Descriptors

The rate limiting flexibility within Ambassador comes from the ability to specify descriptors and headers in the Kubernetes config, which are passed through to the rate limiting service. I find it easier to talk about these concepts with an example at hand. Let’s look at a sample Ambassador config for my shopfront app that you have explored in the previous Medium posts:

---
apiVersion: v1
kind: Service
metadata:
  labels:
    service: ambassador
  name: ambassador
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: shopfront_stable
      prefix: /shopfront/
      service: shopfront:8010
      rate_limits:
      - descriptor: Example descriptor
        headers:
        - "X-MyHeader"
      - descriptor: Y header descriptor
        headers:
        - "Y-MyHeader"

You can see from the rate_limits config that we have two elements in our YAML list, each with different descriptor values and header lists. As mentioned in the Ambassador Rate Limiting docs, if headers are defined then they must be present on the request in order for it to be rate limited. So, with this example:

A request made to the shopfront service with no headers will not be eligible for rate limiting (i.e. no call will be made against the rate limiting service that is defined elsewhere in the Ambassador config)

A request made to the shopfront service with the header "X-MyHeader: 123" will be eligible for rate limiting. The rate limiting service will receive the descriptor information (as a "generic_key") associated with the rate_limits element that matches the "X-MyHeader" header (in this case "Example descriptor"), and also the header key and value, i.e. the rate limiting service will receive this request metadata: [{"generic_key", "Example descriptor"}, {"X-MyHeader", "123"}]

A request made to the shopfront service with the header "Y-MyHeader: ABC" will be eligible for rate limiting. The rate limiting service will receive the descriptor information (as a "generic_key") associated with the rate_limits element that matches the "Y-MyHeader" header (in this case "Y header descriptor"), and also the header key and value, i.e. the rate limiting service will receive this request metadata: [{"generic_key", "Y header descriptor"}, {"Y-MyHeader", "ABC"}]

The decision to rate limit a request, or not, is made within your rate limiting service, and you simply return an appropriate value as specified in the Envoy ratelimit.proto gRPC rate limit service interface: OK, OVER_LIMIT or UNKNOWN. Using the descriptor and header combination as described above means that you have two places where you can add request metadata that can be used within the rate limiting service. The descriptor can be added to the Ambassador Kubernetes config at deploy time, and the headers can be added at runtime.
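To make this concrete, here is a minimal sketch of the decision logic such a service might apply to the descriptor metadata described above. It deliberately omits the generated gRPC stubs from ratelimit.proto and models the descriptor entries as a simple map; the naive in-memory counter, the bucket-naming scheme, and the treatment of a missing "generic_key" as UNKNOWN are all illustrative assumptions, not part of the Envoy API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the decision logic behind a rate limiting service. A real
// service would wrap this in the gRPC handler generated from Envoy's
// ratelimit.proto; here the descriptor entries arrive as a plain map.
public class RateLimitDecision {

    public enum Code { OK, OVER_LIMIT, UNKNOWN }

    // Naive in-memory counters, keyed per descriptor combination. A real
    // implementation would use a time-windowed store such as Redis.
    private final Map<String, Integer> counters = new LinkedHashMap<>();
    private final int limitPerWindow;

    public RateLimitDecision(int limitPerWindow) {
        this.limitPerWindow = limitPerWindow;
    }

    public Code decide(Map<String, String> descriptorEntries) {
        String descriptor = descriptorEntries.get("generic_key");
        if (descriptor == null) {
            return Code.UNKNOWN; // no matching rate_limits config (illustrative choice)
        }
        // One counter per (descriptor value + header key/value) combination,
        // so "X-MyHeader: 123" and "X-MyHeader: 456" are limited separately.
        String bucket = descriptor + "|" + descriptorEntries.toString();
        int count = counters.merge(bucket, 1, Integer::sum);
        return count > limitPerWindow ? Code.OVER_LIMIT : Code.OK;
    }

    public static void main(String[] args) {
        RateLimitDecision limiter = new RateLimitDecision(2);
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("generic_key", "Example descriptor");
        entries.put("X-MyHeader", "123");
        System.out.println(limiter.decide(entries)); // OK
        System.out.println(limiter.decide(entries)); // OK
        System.out.println(limiter.decide(entries)); // OVER_LIMIT
    }
}
```

The point to notice is that the descriptor and header values together form the counter key, which is what lets a single rate limiting service apply different limits to different request shapes.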

Working with Example Rate Limiting Metadata

Let’s now look at an example. Say your organisation has created a mobile app that talks to a backend service fronted by the Ambassador API gateway, and you want to rate limit requests differently for regular and beta users, and to rate limit unauthenticated users completely differently again. You have access to UserID and UserType data that could be added to the header of any request:

---
apiVersion: v1
kind: Service
metadata:
  labels:
    service: BackendService
  name: BackendService
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: backend_app
      prefix: /app/
      service: backend_app:8010
      rate_limits:
      - descriptor: Mobile app ingress - authenticated
        headers:
        - "UserID"
        - "UserType"
      - descriptor: Mobile app ingress - unauthenticated

Any request made with the "UserID" and "UserType" headers present will be forwarded to the rate limit service with the header keys and values alongside the (generic_key) descriptor value "Mobile app ingress - authenticated". Requests without these headers will be caught by your second descriptor, and these will be forwarded to the rate limit service with only the (generic_key) descriptor value "Mobile app ingress - unauthenticated". Your rate limiting service can then implement an algorithm (in any language you require) to deliver the appropriate behaviour.
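One such algorithm might look like the following sketch, which keys the limit off the UserType header and applies a stricter shared limit to unauthenticated traffic. The specific limit values, the "beta" UserType value, and the fixed-window counting are all assumptions for illustration; a production service would typically use a token-bucket or sliding-window algorithm backed by a shared store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative per-UserType rate limiting for the mobile app example.
// Limits and the fixed-window counting are assumptions, not a spec.
public class MobileAppRateLimiter {

    public enum Code { OK, OVER_LIMIT }

    private static final int REGULAR_LIMIT = 100; // per regular user
    private static final int BETA_LIMIT = 500;    // beta users get more headroom
    private static final int UNAUTH_LIMIT = 10;   // shared across all unauthenticated traffic

    private final Map<String, Integer> counts = new ConcurrentHashMap<>();

    public Code decide(Map<String, String> entries) {
        String descriptor = entries.getOrDefault("generic_key", "");
        if (descriptor.equals("Mobile app ingress - unauthenticated")) {
            // No UserID available, so all unauthenticated requests share one bucket
            return check("unauth", UNAUTH_LIMIT);
        }
        // Authenticated: one bucket per UserID, limit chosen by UserType
        String userId = entries.get("UserID");
        int limit = "beta".equals(entries.get("UserType")) ? BETA_LIMIT : REGULAR_LIMIT;
        return check("user:" + userId, limit);
    }

    private Code check(String bucket, int limit) {
        int count = counts.merge(bucket, 1, Integer::sum);
        return count > limit ? Code.OVER_LIMIT : Code.OK;
    }
}
```

Because the unauthenticated descriptor carries no headers, all such requests land in a single bucket, which is exactly the behaviour you usually want for anonymous ingress traffic.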

Conclusion

If you are looking for inspiration, or a ready-made Ambassador rate limiting service, then be sure to check out the Envoy documentation and the Lyft GitHub repository. In particular, the Lyft ratelimit reference implementation of an Envoy rate limiting service is very useful, both as a drop-in solution and as a guide on how a custom rate limiting solution can load configuration and runtime data.

You can find the tutorial on installing the Ambassador API Gateway within Kubernetes and configuring rate limiting in the previous post, “Implementing a Java Rate Limiting Service for the Ambassador API Gateway”. As usual, you are welcome to post any questions via the Ambassador Gitter channel.

Continue reading the other articles in this four-part series:

Part 1: Rate Limiting: A Useful Tool with Distributed Systems

Part 2: Rate Limiting for API Gateways

Part 3: Implementing a Java Rate Limiting Service for the Ambassador API Gateway