Modern cloud applications are made up of many services implemented in different programming languages. For an application to perform its tasks, the services need to communicate with each other. There are various styles of inter-service communication, such as publish-subscribe, request-response, and notifications. In our earlier post, we described the use of a synchronous request-response communication mechanism based on REST, and how the Netsil Application Operations Center (AOC) can observe and analyze REST service interactions. In this post we describe Apache Thrift, an alternative to REST for request-response communication between services, and Netsil AOC’s approach for analyzing Thrift RPC interactions in real time.

What is Thrift?

Thrift is a remote procedure call (RPC) framework for creating interoperable and scalable services. Thrift allows writing cross-language RPC clients and servers, and supports arbitrary function calls from client to server. Thrift provides an Interface Definition Language (IDL) for defining services. With Thrift’s compiler you can generate client and server code stubs for a variety of languages including C++, Java, Python, PHP, Ruby, Erlang and Node.js. Thrift defines its own data types, which are mapped to the native data types of the target programming language when the code is generated. Thrift also provides a space- and time-efficient serialization mechanism for interoperability between clients and servers implemented in different languages.

Thrift was originally developed by Facebook for its internal use. As Facebook grew over the years, different engineering teams chose the programming languages and software platforms best suited to their requirements. While this approach was deemed optimal for individual teams, it led to interoperability issues between services implemented in different languages. Thrift was designed to assist in building scalable and interoperable services across different languages and platforms. Thrift was later contributed to the Apache Software Foundation and released under the Apache 2.0 license.

Thrift Stack

Figure 1. Thrift networking stack

Thrift comes with a complete networking stack as shown in Figure 1. The top layers in the stack include the code generated from the Thrift interface definition file. Thrift provides a number of servers such as a single-threaded server using blocking I/O (TSimpleServer), a multi-threaded server using blocking I/O (TThreadPoolServer) and a multi-threaded server using non-blocking I/O (TNonblockingServer).

The bottom layers (protocol and transport) are part of Thrift’s runtime library. By decoupling the generated code from the runtime library, Thrift allows you to change the protocol and transport by making only a few changes in the code.

The Thrift transport is responsible for transmitting data over the wire. Thrift supports multiple transports for channels such as raw TCP, HTTP, socket I/O, files or memory. The choice of a transport depends on the requirements of the application. For example, TSocket can be used with blocking servers, whereas TFramedTransport is used with non-blocking servers.

Thrift protocols prepare the data to be transmitted over the wire. Protocols handle serialization of the data structures when sending data and deserialization when receiving data. Thrift supports formats such as binary, JSON and plain text. The choice of a format depends on the type of data that needs to be transmitted. The binary format suits most applications, as it supports all types of data and has low overhead.
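To make the split between protocol and transport concrete, here is a minimal sketch that hand-encodes one call using only the Python standard library. It assumes the strict binary protocol and a framed transport, and a hypothetical method add(1: i32 a, 2: i32 b); the function names are ours, not part of the Thrift library, which normally does all of this for you.

```python
import struct

VERSION_1 = 0x80010000  # version marker used by the strict binary protocol
CALL = 1                # message type for a request
T_I32 = 8               # wire type id for a 32-bit integer field
STOP = b"\x00"          # marks the end of a struct

def encode_message_begin(method: str, seqid: int, msg_type: int = CALL) -> bytes:
    """Protocol layer: serialize the message header."""
    name = method.encode("utf-8")
    return (
        struct.pack("!I", VERSION_1 | msg_type)  # version + message type
        + struct.pack("!i", len(name)) + name    # length-prefixed method name
        + struct.pack("!i", seqid)               # sequence id
    )

def encode_i32_field(field_id: int, value: int) -> bytes:
    """Protocol layer: one i32 field as (type byte, field id, value)."""
    return struct.pack("!bhi", T_I32, field_id, value)

def frame(payload: bytes) -> bytes:
    """Transport layer: a framed transport prefixes each message with its length."""
    return struct.pack("!i", len(payload)) + payload

# Hypothetical call: add(a=2, b=3) with sequence id 1
message = (
    encode_message_begin("add", seqid=1)
    + encode_i32_field(1, 2)
    + encode_i32_field(2, 3)
    + STOP
)
wire = frame(message)
print(wire.hex())
```

Swapping the transport here would mean replacing only frame(), and swapping the protocol would mean replacing only the encode functions, which is the decoupling the runtime library provides.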

Figure 2. Workflow of implementing a service using Thrift

It is quite easy to get started with Thrift. Thrift has a built-in interface definition language (IDL) and code-generation tools that allow you to easily set up working servers and clients. Using the IDL, you define the data types and service interfaces in a Thrift interface definition file. Next, using this file as input, you generate the code for clients and servers in different programming languages.
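The workflow can be pictured with a small, library-free imitation of what generated Python stubs look like. The sketch below is hand-written, not compiler output: it assumes a hypothetical Calculator service with a single add method, and it replaces the real protocol and transport with a direct function call. Generated Python code does follow the same broad shape of an interface class, a client and a processor.

```python
# Hypothetical IDL this sketch imitates (calculator.thrift):
#   service Calculator { i32 add(1: i32 a, 2: i32 b) }
# Real stubs would be generated with: thrift --gen py calculator.thrift

class Iface:
    """Service interface: one method per function declared in the IDL."""
    def add(self, a: int, b: int) -> int:
        raise NotImplementedError

class Processor:
    """Server side: dispatches an incoming call to the user's handler."""
    def __init__(self, handler: Iface):
        self._handler = handler

    def process(self, method: str, args: dict):
        fn = getattr(self._handler, method, None)
        if fn is None:
            return ("exception", f"unknown method {method}")
        return ("reply", fn(**args))

class Client(Iface):
    """Client side: turns a local call into a message for the server."""
    def __init__(self, send):
        self._send = send  # stand-in for the protocol + transport layers

    def add(self, a: int, b: int) -> int:
        kind, value = self._send("add", {"a": a, "b": b})
        if kind == "exception":
            raise RuntimeError(value)
        return value

class CalculatorHandler(Iface):
    """The only part the developer writes by hand: the business logic."""
    def add(self, a: int, b: int) -> int:
        return a + b

processor = Processor(CalculatorHandler())
client = Client(processor.process)  # in-process "network" for illustration
result = client.add(2, 3)
print(result)
```

The point of the design is that only the handler is written by hand; everything else is derived from the interface definition, so a client in one language and a server in another agree on the wire format by construction.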

Thrift vs REST

The following are a few important advantages of Thrift over REST:

REST is designed for accessing and changing representations of web resources using a uniform and predefined set of stateless operations (GET, POST, PUT, DELETE) whereas Thrift is an RPC framework that allows communication between clients and servers implemented in different languages.

While REST is best suited for services which operate on collections of resources, Thrift supports arbitrary function calls from client to server.

REST has higher overhead than Thrift, as each request from a client must contain all the information necessary to service the request. Thrift is more efficient because it requires fewer handshakes and supports a greater degree of parallelism for clients and servers.

Thrift supports multiplexing of requests over different services based on the service context.
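As an illustration of the multiplexing point, the sketch below routes calls to several services over one logical endpoint by prefixing the method name with the service name and a ":" separator, which is the convention Thrift's multiplexed protocol uses. The dispatcher class and the Calculator and Greeter services are hypothetical stand-ins, not the library's own classes.

```python
SEPARATOR = ":"  # separator between service name and method name

class MultiplexedDispatcher:
    """Routes 'Service:method' names to the handler registered for that service."""
    def __init__(self):
        self._services = {}

    def register(self, service_name: str, handler) -> None:
        self._services[service_name] = handler

    def dispatch(self, wire_name: str, *args):
        service_name, sep, method = wire_name.partition(SEPARATOR)
        if not sep or service_name not in self._services:
            raise LookupError(f"no service registered for {wire_name!r}")
        return getattr(self._services[service_name], method)(*args)

class Calculator:  # hypothetical service
    def add(self, a, b):
        return a + b

class Greeter:  # hypothetical service
    def hello(self, name):
        return f"hello, {name}"

dispatcher = MultiplexedDispatcher()
dispatcher.register("Calculator", Calculator())
dispatcher.register("Greeter", Greeter())

total = dispatcher.dispatch("Calculator:add", 2, 3)
greeting = dispatcher.dispatch("Greeter:hello", "thrift")
print(total, greeting)
```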

As a result of the above benefits, Thrift, along with other efficient RPC frameworks such as gRPC and Vanadium, is emerging as a powerful alternative to REST for building microservices-based applications.

Netsil’s approach for real-time analysis of Thrift RPC

While existing solutions for monitoring Thrift require instrumenting the client/server code, Netsil AOC monitors Thrift-based applications by capturing the real-time service interactions over the network. As a result, Netsil AOC requires no code changes on either the client side or the server side. The AOC discovers the RPC calls sent by a client to a server by analyzing the client-server interactions.

Figure 3. Auto-discovered application topology map showing a Thrift service

The golden signals of monitoring for Thrift are the same as for REST (i.e., latency, traffic, errors, and saturation), as described in our earlier post. Figure 4 shows the key performance indicators (KPIs) for a Thrift service, including throughput, latency and request/response sizes. Netsil AOC can measure the round-trip time of client-server interactions and track exceptions or errors.
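To show how such KPIs can be derived from observed traffic, here is a minimal sketch that pairs requests with responses by sequence id on a single connection and computes throughput, average latency and error rate. It is not Netsil's implementation: the event tuples, the summarize function and the sample values are all hypothetical, and it only counts Thrift EXCEPTION messages as errors (exceptions declared in the IDL travel inside a normal REPLY and would need payload inspection).

```python
from statistics import mean

CALL, REPLY, EXCEPTION = 1, 2, 3  # Thrift message types

def summarize(events, window_seconds: float) -> dict:
    """events: iterable of (timestamp_seconds, message_type, seqid) for one connection."""
    pending = {}    # seqid -> timestamp of the outstanding call
    latencies = []  # round-trip times of completed calls, in seconds
    errors = 0
    for ts, msg_type, seqid in events:
        if msg_type == CALL:
            pending[seqid] = ts
        elif seqid in pending:
            latencies.append(ts - pending.pop(seqid))
            if msg_type == EXCEPTION:
                errors += 1
    completed = len(latencies)
    return {
        "throughput_rps": completed / window_seconds,
        "avg_latency_ms": mean(latencies) * 1000 if latencies else None,
        "error_rate": errors / completed if completed else None,
    }

# Hypothetical observations within a one-second window
events = [
    (0.000, CALL, 1),
    (0.010, REPLY, 1),
    (0.020, CALL, 2),
    (0.050, EXCEPTION, 2),
]
kpis = summarize(events, window_seconds=1.0)
print(kpis)
```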

Figure 4. KPIs for Thrift service in Netsil AOC

The network-centric approach allows Netsil to track request method names and response types without having to parse the entire Thrift payload. This approach is well suited to production environments because it is non-intrusive and efficient, and it provides detailed performance and health metrics for each Thrift RPC. In the current version, we capture the client-server interactions over the wire without reading the Thrift interface definition (IDL) file. In an upcoming release, Netsil AOC will accept Thrift IDL files for use cases where deep analytics on method arguments is required.
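The idea of reading method names and message types without decoding the whole payload can be sketched as follows. This is an illustration under stated assumptions, not Netsil's implementation: it assumes a framed transport carrying the strict binary protocol, and peek_header and the sample bytes are hypothetical. Only the fixed-layout header is read; the argument struct that follows is never touched, so no IDL is needed.

```python
import struct

VERSION_MASK = 0xFFFF0000
VERSION_1 = 0x80010000
TYPE_NAMES = {1: "CALL", 2: "REPLY", 3: "EXCEPTION", 4: "ONEWAY"}

def peek_header(frame: bytes):
    """Return (method, message_type, seqid) from a captured frame, or None if it
    does not look like a framed strict-binary Thrift message."""
    if len(frame) < 12:
        return None
    _frame_len, word, name_len = struct.unpack_from("!iIi", frame, 0)
    if (word & VERSION_MASK) != VERSION_1:
        return None
    if name_len < 0 or len(frame) < 12 + name_len + 4:
        return None
    method = frame[12:12 + name_len].decode("utf-8", errors="replace")
    (seqid,) = struct.unpack_from("!i", frame, 12 + name_len)
    return method, TYPE_NAMES.get(word & 0xFF, "UNKNOWN"), seqid

# Hypothetical captured bytes: a CALL to "add" with seqid 7, followed by
# argument bytes that are deliberately left undecoded.
body = (
    b"\x80\x01\x00\x01"                  # version + message type (CALL)
    + b"\x00\x00\x00\x03" + b"add"       # method name
    + b"\x00\x00\x00\x07"                # sequence id
    + b"\x08\x00\x01\x00\x00\x00\x02\x00"  # arguments (ignored)
)
captured = struct.pack("!i", len(body)) + body
header = peek_header(captured)
print(header)
```

Traffic that fails the version check, such as an HTTP request on the same port, is simply skipped rather than misreported.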

Conclusion

The Netsil AOC can observe and analyze Thrift interactions in cloud applications. Each time a Thrift client makes an RPC call to a server, the entire interaction is captured over the network in real time. Netsil uses the network interactions, rather than instrumentation of the client/server code, as the source of truth for application observability. In the future, we plan to support other RPC frameworks such as gRPC and Vanadium.

We welcome your comments on this post and also encourage you to try the Netsil AOC at https://netsil.com