Transcript

Gibb: My name is Spencer Gibb, I'm the co-founder and leader of the Spring Cloud project, which I've been doing for five years now. Today we will talk about Spring Cloud Gateway and the RSocket protocol. We will talk about reactive architecture, reactive communication, and the RSocket protocol.

Reactive Architecture

Reactive architectures are fundamentally non-blocking; usually there's an event loop. These have been around for a long time - JavaScript, Netty, these kinds of architectures have existed for a while, and Spring has started to adopt them. Reactive architectures don't care if communication is synchronous or asynchronous, but it has to be non-blocking. The really interesting thing about them is back pressure.

What is back pressure? A friend of mine drew this diagram and I'm reusing it. The requester says, "I want it," and the responder says, "here," and then the requester says, "It's too much, I need help." That is back pressure: the requester can slow things down. Well, it doesn't just say "stop"; it says, "I want 5, or 10, or 20," and that's what the responder will give it, which is better.

There are four interfaces in Reactive Streams: publisher, subscriber, subscription, and processor. In Spring, Project Reactor is our implementation of Reactive Streams. If you saw Josh's talk, you saw fluxes and monos and things like that. You won't see much of that here, but it's all built on top of it. There are roadblocks to using reactive architectures everywhere. Data access - not everyone can use MongoDB or Apache Cassandra. Pivotal has taken on projects like R2DBC - Josh showed you a little bit about that - so things are starting to progress with data access, but there's still this problem across processes and across the network: how do you translate these concepts across the network? That's where RSocket comes in.
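The four Reactive Streams interfaces map one-to-one onto `java.util.concurrent.Flow` in the JDK, so the request-n negotiation described above can be sketched without any Spring dependency. This is a minimal illustration, not from the talk: the subscriber asks for items five at a time instead of requesting an unbounded amount.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackPressureDemo {
    // A subscriber that requests items in small batches - the
    // "I want 5, or 10, or 20" negotiation from the diagram.
    static class BatchingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        Flow.Subscription subscription;

        public void onSubscribe(Flow.Subscription s) {
            this.subscription = s;
            s.request(5);                    // ask for 5, not everything
        }
        public void onNext(Integer item) {
            received.add(item);
            if (received.size() % 5 == 0) {
                subscription.request(5);     // ready for 5 more
            }
        }
        public void onError(Throwable t) { done.countDown(); }
        public void onComplete() { done.countDown(); }
    }

    public static List<Integer> run() throws InterruptedException {
        BatchingSubscriber sub = new BatchingSubscriber();
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            for (int i = 0; i < 20; i++) pub.submit(i);
        } // close() completes the subscriber after delivery
        sub.done.await();
        return sub.received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // all 20 items arrive, in batches of 5
    }
}
```

The responder never outruns the requester: items are only delivered while outstanding demand exists.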

RSocket vs HTTP

RSocket is bi-directional and multiplexed. We'll talk about each of these keywords that make RSocket interesting. As a protocol, it supports four different communication models - fire-and-forget, request-response, request-stream, and request-channel - so you can do all sorts of things that you would normally do with, say, HTTP. Another interesting thing about RSocket is that it's transport agnostic. There are implementations that run over TCP and WebSockets, so whatever your network can support, you can usually run RSocket over it.
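As a hedged, non-runnable sketch of how those four models surface in Spring's RSocket support (the `RSocketRequester` API arriving with Spring 5.2): the routes and payload types here are hypothetical, and `requester` is assumed to be an already-connected `RSocketRequester`.

```java
// request-response: one payload in, one payload out
Mono<Pong> response = requester.route("ping")
        .data(new Ping()).retrieveMono(Pong.class);

// fire-and-forget: send and don't wait for a reply
Mono<Void> sent = requester.route("log")
        .data(new LogEvent()).send();

// request-stream: one payload in, a stream of payloads out
Flux<Quote> quotes = requester.route("feed")
        .data(new FeedRequest()).retrieveFlux(Quote.class);

// request-channel: a stream in, a stream out
Flux<Message> replies = requester.route("chat")
        .data(outboundMessages).retrieveFlux(Message.class);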

It may help to understand RSocket by comparing it with HTTP. First, with HTTP your clients generally make a new connection to the server every time. With RSocket, you make a single connection to the server, and then all your requests go over that same connection. The cost of making a connection doesn't matter much, because you only pay it once.

Then there's multiplexing. Given this single connection, all the requests go across it, every stream or request has an ID, and so you can reuse the connection. It's very efficient on the network, because you're only using the one connection, and it's persistent - it stays open for a long time.

The other thing we chatted a little bit about is communicating back pressure. With HTTP, there are really only error codes to deal with, and they're not always very helpful - you just get a 503 from the server. With RSocket, each side of the connection can tell the other side how much data it can send or how much it can receive, and so that back pressure can go across the network. That's in my demo, which I'll show you a little later.

Some network protocols, like TCP, have pieces of back pressure, but there are still places where buffers can overrun. I really like this last statement here that one of my colleagues wrote: "Reactive Streams pull-push back pressure ensures that data is only materialized and transferred when the receiver is ready to process it." That just doesn't happen in HTTP - if you requested it, it's going to get sent no matter what.

Another interesting thing is that once you make a connection, either side of that connection can be the requester, which is different. In HTTP, even in HTTP/2 where there's bi-directional communication, the client still has to ask for data before the server can respond, but with RSocket, once you have the connection, it's bi-directional and it doesn't matter which side initiated it. You have to stop thinking in client-server terms, because you don't know which one is which, and think in requester-responder terms instead.

Another interesting thing about RSocket is being able to cancel requests. The requester makes a request and then determines, "You know what? I already have the information I need, I can cancel that request." Another one is resumption of requests. Facebook is one of the companies behind RSocket, and they take advantage of resumption in their mobile app. When you leave home and the Wi-Fi drops, there's that little window when you don't have a network connection. Facebook used to go grab your whole stream again, which was very expensive, but now they use RSocket and its resumption support. The client says, "I was here," and the responder only returns the difference from what was already delivered rather than the whole thing again - saving real dollars in bandwidth.

Message Driven Binary Protocol

Another interesting property of RSocket is that it is message driven. Think of it more like a message broker, like RabbitMQ or something like that - it has those semantics. In fact, the RSocket integration in Spring doesn't mirror Web MVC the way WebFlux does; it actually hooks into the Spring message mapping API. The protocol is framed, so messages are split up into frames. It's binary, so it's very fast. That can cause some friction - one of the reasons people like JSON is that you can just look at it and read it - but JSON is super inefficient. Someone asked me, "What do all your servers do at work?" They are JSON parsers; we spend all our cycles parsing JSON. This can definitely help with that. The payload can be JSON, it can be XML, Protobufs - whatever you want to send across the wire.
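To make the message-mapping point concrete, here is a hedged, non-runnable sketch of what an RSocket responder can look like with Spring 5.2's message mapping support; the "echo" route and the payload type are illustrative, not from the talk.

```java
// Assumes spring-messaging's RSocket support on the classpath.
@Controller
public class EchoController {

    // Handles requests whose routing metadata names the "echo" route,
    // via the message mapping API rather than WebFlux-style @RequestMapping.
    @MessageMapping("echo")
    public Mono<String> echo(String message) {
        return Mono.just("echo: " + message); // request-response: one in, one out
    }
}
```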

Another interesting thing is that the payload can be encrypted, even if the transport that you're using is not. With each request, there's some metadata surrounding the payload: a MIME type, and then a bag of bits that can be used to carry information about the payload. Pivotal has been working with Netifi and other companies to define some metadata for announcement and routing that I'll show you here in a little bit. That means you can use that bit of metadata to determine which piece of your application handles a message. You can send any kind of message across the same connection - you don't have to open one connection for this type of message and a different connection for that one; they all go over the same connection.

Spring Cloud Gateway RSocket

Spring is adding support for RSocket; it's coming in Spring 5.2. There have already been one or two milestones of the framework, and Spring Boot will auto-configure your RSocket server for you. We're also looking at working with Spring Security to add its goodness for securing the protocol.
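As a hedged sketch (the property name is assumed from the Boot 2.2 milestones, so check it against the release you actually use), the auto-configured server can be as simple as one line of configuration:

```properties
# With spring-boot-starter-rsocket on the classpath, this is enough
# for Spring Boot to start an RSocket server on TCP port 7000.
spring.rsocket.server.port=7000
```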

If you think about when you make an HTTP connection, you really don't know how many machines that connection has traversed. Similarly with RSocket, it really doesn't matter how many machines it's traversed, but there does need to be something smart in the middle that knows how to route your RSocket connections, and that's what we've started to build with Spring Cloud Gateway.

Traditionally, this has been an HTTP gateway, similar to Zuul from Netflix, but in this implementation it's not HTTP, it's RSocket. The RSocket Java library is built on Project Reactor, which is already native to Spring, and Spring Boot auto-configures things for us. We get a lot of goodness out of that, so the gateway RSocket module can be very narrow in what it does.

What does it do? A gateway deployment might look like this: a small cluster of gateway instances, and then RSocket clients that connect to it. Communication is routed from, for example, the Java client on one side, through one or two gateways, to perhaps a JavaScript client on the other side. The other thing that's nice about RSocket is that it's polyglot. Obviously, we on the Spring side were going to do Java, but Facebook are the maintainers of the C++ library. There's a Python one, a Go one, a JavaScript one. I've seen demos where you use the JavaScript library directly in the browser over WebSockets, so that JavaScript client could be a browser - it doesn't matter.

The client makes a connection to the gateway - the gateway doesn't connect to the client - and when it connects, it sends some metadata that says, "Who am I?": a name and an ID. What's interesting is that because the client initiates, it can be secured in such a way that no incoming connections are allowed. We all have these problems in whatever IaaS we're using, whether it's AWS or Azure: setting up the firewall rules to only let certain things in on this port or that port from these networks. Imagine what your security people would say if you could say, "No incoming connections whatsoever." The gateways need them, but the clients do not. You can just totally shut off all access to that machine. That's the ultimate security, isn't it? No connections allowed - that's almost as good as unplugging it from the internet.

The connection metadata also runs through a filter chain. You can set up custom filters in the gateway that ask, "Is this client actually allowed to connect to me?" So you can set up security at the connection level as well as at the request level.

Another interesting thing that happens is, because a client sends this little bit of metadata - its name and its ID - the gateway can build a routing table so that when it gets a request that wants to go to service A, it knows where to send it. In essence, by building this routing table it becomes a service registry and discovery system. You don't have to run something else whose sole job is service registration and discovery; the gateway handles that function for you.

The clients are already connected - we don't have to make another connection - so now we can start making requests. When you make a request, you send a little bit of metadata. The other interesting thing about RSocket is that the protocol is designed to be extensible, so this metadata I'm talking about is something we're working to formalize into an extension of the protocol. It's not just some random thing we're coming up with. When you make a request, you send this little bit of metadata: "I want to talk to service A, and I want to call, for lack of a better term, the echo method on service A."

What does that look like? Here's an example: the Java client is trying to call the JavaScript client, and we want the destination echo. The request will get routed through one or two gateways to get there, but you'll notice that it doesn't need a client-side load balancer - load balancing is built into the gateway. If there's more than one instance of service A, the gateway will be smart enough to pick one. There's no service mesh, there's no sidecar running, so there's no duplication of processes that comes with something like a sidecar. Another interesting thing is that there doesn't need to be a circuit breaker in the client either. That goes back to the back pressure that can be communicated across the network.

There's an interesting situation that you can get yourself into here. You can connect to a gateway and make requests to a service that isn't connected to the gateway yet - so what happens? What we've done is build into the gateway a system such that when a request comes in for a service that doesn't exist yet, we create a placeholder. Basically, we use Reactor and apply back pressure across the network. How many of you have been in the situation of "I'm spinning up a new environment, but I need to start this service first, and then that one, and then I can start the last one"? I see you laughing, but it happens when you spin up test environments for the first time, and it's very difficult. What do you end up doing? You set up retry with exponential backoff, so that hopefully, within the retry window, before it dies, the other thing comes up and it works. In this case, you don't have to deal with any of that, because the gateway will handle it for you.
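For contrast, here is a minimal sketch of the retry-with-exponential-backoff boilerplate the gateway lets you delete; the names are illustrative, not from any Spring API.

```java
import java.util.concurrent.Callable;

public class RetryBackoff {
    // The usual workaround: keep retrying with a doubling delay,
    // hoping the dependency comes up inside the retry window.
    public static <T> T retry(Callable<T> call, int maxAttempts, long baseDelayMs)
            throws Exception {
        long delay = baseDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt == maxAttempts) throw e; // window exhausted, give up
                Thread.sleep(delay);
                delay *= 2;                          // exponential backoff
            }
        }
    }
}
```

With the gateway's placeholder and cross-network back pressure, the request simply waits until the responder registers, so none of this is needed.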

Another thing that happens, just like at the connection level, is that each request runs through a series of filters that are customizable. Again, you can apply security here, such as, "Can this service actually talk to that other service?" There are lots of highly-regulated environments where some segregation needs to happen, and you can do that on arbitrary metadata that you provide, or simply based on the names and destinations. We've tried to provide the hook points you need to add security at whatever layer you need to.

Another thing that we do is collect metrics. We built Micrometer at Pivotal to be the new metrics implementation in Spring Boot 2 - if you've used Spring Boot 2, you've used Micrometer. What we've done in RSocket and in the gateway is set up Micrometer to collect metrics at various levels: at a general level, at a connection level, and at the request level. So you can see, for example, rates of messages flowing from service A to service B - something that can be very useful and is often very hard to get, and we do it all in one central place.

Demo

Let's get to some demos. This is the demo setup that I'm eventually going to get to, and I'm going to show some simpler versions of it first - specifically, without a gateway, to show that these applications work relatively unchanged and can communicate directly with each other. One of them is actually a server; it has to listen for a connection. That's the small change, but other than that, it should just work. My demo application is very complicated: who here has played ping-pong? Let me show you a ping application here.

Most of what this does is not super interesting. This is how I create metadata to say, "I'm a ping," so the gateway knows who I am, and we set that here. Then here is where we actually send a message. I have a method called sendPings, and if you saw Josh's demo, a flux is a stream of things - in this case, of payloads - and every second, I'm going to send a ping to the server. There's an almost identical implementation on the pong side. Like I said, the one little difference is "am I a server or not," because pong has to listen if it's actually going to work.

Here I connected ping and pong directly to each other. You can see both sides sending and receiving. I did a request-channel, which is a long-running request, many inputs to many outputs, and there's no gateway in the middle - that's just me proving that this stuff works. Now I run the gateway, then I run my ping application. There's no pong server; I haven't run that yet. You can see, every second it's printing this statement here. This is the gateway applying back pressure across the network, saying, "I can't respond to that message right now." What I had to do was call one of the onBackpressure operators. If you look at what you can do in Reactor, you can do all sorts of different things when back pressure is applied: you can buffer, you can drop messages, you can throw an error, whatever the case may be. It's really interesting that the semantics of RSocket follow the same semantics as Reactive Streams, such that these methods translate directly across the network.
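A hedged, non-runnable sketch of what that looks like in Reactor terms (the onBackpressure operator names are real Reactor operators; the route, the payloads, and `requester` are illustrative):

```java
// Send one ping per second; when the peer signals back pressure,
// drop pings instead of buffering them without bound.
Flux.interval(Duration.ofSeconds(1))
    .map(i -> "ping " + i)
    .onBackpressureDrop(p -> System.out.println("dropped " + p))
    // alternatives: onBackpressureBuffer(), onBackpressureLatest(),
    // onBackpressureError()
    .flatMap(p -> requester.route("pong").data(p).retrieveMono(String.class))
    .subscribe(System.out::println);
```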

In computer science and programming, things never just work the first time - you try them, they fail, and you figure out what you did wrong. When I did this the first time and it worked, I thought, "Wait, I'm sure I'm missing something in the middle and this didn't really happen," but it just worked. The back pressure just propagated across the network. This is where you would deal with something akin to a circuit breaker fallback. Now, you probably couldn't handle all of your services being down for minutes or even hours, but for startup ordering, like we're showing here, this is pretty useful. Let's go ahead and run pong. All of a sudden, it's receiving pings and it just started - I didn't have to retry, I didn't have to do anything, it just worked.

Now we're getting to the actual picture here. I'll start gateway 1, and in this case the second gateway is going to connect to the first one. Your cluster of gateways has to know about each other, but that should be a fairly small number of instances. We've connected, and again, I'll do the same demo I did before. Neither gateway 1 nor gateway 2 has a pong to respond with, so we get that same back pressure. This time I'll start pong pointing to gateway 2 instead of gateway 1. We can see it's started to work.

What happened is that pong registered with gateway 2, gateway 2 told gateway 1, "I have one," and gateway 1, which was looking for it, forwarded the request from ping over its RSocket connection to gateway 2. It didn't hand the connection to the pong server; it just passed the request along to gateway 2, so it went two hops. Usually people think, "Oh no, another network hop, that's bad." Why do they think that? Usually it's bad because you have to make a connection, which is expensive - but here you don't. The connection was already there, so the request over the already-open connection was very fast. There's still going to be physics involved, but the overhead should be very small.

To Recap: Things You Won’t Need

Here are some of the things we talked about that you don't need in this kind of setup. Your services don't need incoming permissions; they can be totally blocked. They just make an outbound connection to the gateway, and then you only have to secure the gateway - some small number of instances. We also have ideas about running a gateway cluster in one region and another in another region, connecting them over a wide area network or VPN or something like that, and doing some intelligent load balancing. Generally, requests to the other region would be slower, because they're going across the country or across the ocean, but if a service died in the local region, well, slow service is better than no service.

We talked about how there isn't a need for a separate service discovery system. Now, I've built Spring Cloud on integrating service discovery systems, but we don't need something separate for that. We also don't necessarily need a message broker, if your messages don't need durability - if they're just ephemeral messages rather than REST calls. If we think in terms of message-oriented systems instead of request-response based systems, this kind of thing fits right in. You don't need a separate message broker.

We talked about eliminating the circuit breaker - we can use the built-in back pressure from Reactor and RSocket. We don't need a client-side load balancer like Ribbon, and you don't need a client library. This is super helpful when you're polyglot, because with Eureka and Ribbon it's, "How do I integrate that? Because now our front-end guys want to use Node." We don't need that, nor a sidecar, nor a service mesh.

The CTO of Netifi said that they ran a demo: they took the canonical Istio application, put it on some hardware in a cloud, ran it, and benchmarked it. They then replaced Istio with RSocket on the exact same hardware and immediately got lower latency and more requests per second. Then they scaled it up and kept getting better and better performance, and he said they eventually stopped because, thanks to back pressure propagating across the network, nothing would fall over - services would not go down, they would just get a little slower each time.

We talked about startup ordering problems - you don't have to deal with them. Our friends at Netflix talk about an application of theirs that takes tens of minutes, maybe even an hour, to warm up before it can start taking requests. They bring it up, register it with Eureka as out-of-service, and then once it's warm, they send a message. If they used something like RSocket, it could simply be unavailable until it was ready. There are certain cases where this kind of architecture can avoid a thundering herd.

We talked about some of the good things from RSocket: persistent connections that are multiplexed, and multiple transports. One of the things I envision is a couple of RSocket servers that live on the edge and are connected to your private RSocket servers. The ones on the edge may speak WebSockets or TCP; your mobile apps and web apps could talk directly to those, and they in turn could talk to your interior cluster.

Polyglot: write a little JavaScript ping client, and the only thing you need to send is that little bit of metadata with your requests - and then we get all this data coming in from Micrometer.

On the roadmap, we have some clustering enhancements to do. Another interesting thing is that because the cluster knows about all of these services, there are times when it's useful to say, "I want to send a message to all instances of a service" - some kind of coordination or administrative message - and we would like to be able to do that. You could just add a little piece of metadata to your fire-and-forget message that says, "Send this message to everyone." Another interesting possibility is to send a message to multiple servers and take the first reply - you send it to two or three and the fastest one wins. That's called request hedging.

One of the extensions to the RSocket protocol is for tracing. Have you heard of Zipkin before? OpenTracing? It's the ability to pinpoint where in your system requests are slow. There are other things we can do to optimize routing - I talked a little bit about an intelligent load balancer that could say, "It's faster over here." That is work that still has to be done. This will come as part of Spring Cloud Hoxton, together with the work being done in Spring Framework and Spring Boot, and we are targeting the third quarter of this year.