Computing Thoughts

Calling Go from Python via JSON-RPC

by Bruce Eckel

August 27, 2011



Summary

Although it's often uncomfortable, I think the best approach to learning a new language or technology is just to grab your nose and jump feet-first into the hole in the ice.


I've been intrigued by the Go language since I first encountered it about 6 months ago. There are some great online lectures that you can find here.

Go solves a number of problems that C++ and Java didn't. Unlike C++, Go has a garbage collector, which eliminates a large class of memory-leak issues. Go has a much cleaner syntax than either language, eliminating a great deal of programmer overhead when writing and reading code. Indeed, based on syntax alone I personally classify Go, along with Scala, as a next-generation language because they respect the needs of the programmer; they serve the programmer rather than telling the programmer what hoops must be jumped through.

But one of the biggest issues solved by Go is parallel programming. Go has built-in support for Communicating Sequential Processes (CSP). You can start any function f() as a separate process by saying go f(), and then communicate with that process using channels (which safely transport messages back and forth). It's worth noting that Go was initially designed to create servers (not web servers per se, just general-purpose servers as we'll explore in this article) and has expanded to a general-purpose systems language; it creates natively-compiled executables and tends to be more closely aligned to the hardware -- thus "goroutines" tend to be tied to the processors/cores of a single machine. Scala, in contrast, seems to be aiming for distributed multiprocessing systems, especially when you look at the direction that Akka is going (the rumor is that Akka will eventually be folded into Scala). (There's a project for distributing Go across machines).

My interest here is to make the case for hybrid programming. I distinguish hybrid from polyglot programming; the latter means that the programmer has learned and benefits from knowing multiple languages (this is essential, I believe, as it allows you to see solutions in multiple ways rather than being restricted to the lens of a single language). When I say hybrid programming, I mean that a single application is crafted using multiple languages.

In my experience, all languages do some things well but have blind spots. When you use those languages, you can move very quickly when utilizing their "power regions" but the project slows way down and becomes expensive when you encounter the problem areas. Why not use multiple languages and let each one work to its strength?

Python, for example, is an incredibly powerful language for rapid development because of the simplicity and clarity of its syntax and the extensive reach of its libraries. But it has blind spots.

One is building user interfaces. It's not that there are no UI libraries for Python; there are too many! Each one has its own strengths and weaknesses, and you only discover these after you've slogged through and built a UI with each one. The weaknesses are not advertised so you only discover them after putting in significant effort. So here's an example of hybrid programming: in this article and in this one I create a Python application and use a Flash UI written in Flex, thus getting the best of both worlds (these days, however, everything seems to be shifting to HTML5).

Both Python and Ruby support concurrency, which allows you to program as if you have multiple processing units available. This can be very helpful for code organization. However, neither language's standard implementation supports parallelism within a single process: a global interpreter lock prevents threads from making use of multiple processors or cores at the same time. In the aforementioned article, I show how to distribute work over multiple cores by using multiple independent Python processes, but that turns out to be a fair amount of work. What if we hybridize the solution by doing the parallelism in Go?

To do this, we must be able to communicate between the two languages. In the past I've used XML-RPC, but in recent years there's been a move towards JSON-RPC. This is similar in many ways to XML-RPC, but it uses the more efficient (and often easier-to-read) JSON to transport data. (Note that this might also make it easier to work with client-side JavaScript). The standard Go library currently provides support for JSON-RPC (but not yet XML-RPC), so that became the obvious choice.

Here is the Go server, which simply echoes whatever is sent to it. The file is named Server.go:

    package main

    import (
        "rpc/jsonrpc"
        "rpc"
        "os"
        "net"
        "log"
    )

    type RPCFunc uint8

    func (*RPCFunc) Echo(arg *string, result *string) os.Error {
        log.Print("Arg passed: " + *arg)
        *result = ">" + *arg + "<"
        return nil
    }

    func main() {
        log.Print("Starting Server...")
        l, err := net.Listen("tcp", "localhost:1234")
        defer l.Close()
        if err != nil {
            log.Fatal(err)
        }
        log.Print("listening on: ", l.Addr())
        rpc.Register(new(RPCFunc))
        for {
            log.Print("waiting for connections ...")
            conn, err := l.Accept()
            if err != nil {
                // Report the error itself; conn is not usable here
                log.Printf("accept error: %s", err)
                continue
            }
            log.Printf("connection started: %v", conn.RemoteAddr())
            go jsonrpc.ServeConn(conn)
        }
    }

Packaging in Go is reminiscent of other languages. However, main() is a special name and if you have a main(), Go wants the package name to also be main.

You can import other packages one at a time, as in import "os", or you can import them in a block as shown here. The Go compiler gives you an error if you import a package that you don't use.

The type statement defines a new named type; here its underlying type is just another built-in type, but it can also be something complex like a structure. Go built-in types are very specific -- they don't rely on the vagaries of the underlying machine as C and C++ do. Here, uint8 refers to an unsigned 8-bit integer. The only reason we're creating a type in this case is that Go's JSON-RPC library requires a class to be registered, and doesn't allow individual functions.

To create a class in Go, you attach methods to a type (in a more typical class, you'd start with a structure to hold the data fields, but in this case we aren't actually using fields so we don't care what the type is). Go's classes would be called "open classes" in Ruby -- you can attach methods at any point within the package that defines the type. Here, the only method is Echo(). Before the name, you see the receiver in parentheses; this is the type to which you attach the method. In C++ or Java, the receiver is equivalent to the this pointer, and in Python it would be self. Thus, any time you see a receiver right after the func keyword, that's the class this method belongs to. If there's no receiver, it's an ordinary function. (In this case the receiver has no identifier because we are ignoring it in the method body).

Arguments in Go are name first, type second. The return type of the function appears after the closing parenthesis of the argument list. One thing you'll note is that Go has eliminated all redundant syntax -- much like Scala, but it often goes even further. Still, much of the syntax is comfortable for C++, Java and even Python programmers.

Fitting with its server orientation, Go has a built-in log package, with familiar Print, Printf and Fatal functions (the last one also exits the program). While Go has pointers, it doesn't have pointer arithmetic, so it keeps the efficiency of pointers without the pitfalls of the arithmetic. A JSON-RPC method can take a pointer or regular argument, but its result must be a pointer so that you can write into it. It must also return an os.Error, which is nil if the method call succeeds.

In the second line of main() you see something unfamiliar:

l, err := net.Listen("tcp", "localhost:1234")

The call to net.Listen() seems ordinary enough, but it produces a tuple of two values as the result, because Go allows multiple return values. You'll most often see it in this form, with the result followed by an error value. But the := is different, too: it means that l and err are being defined right here and that Go should infer their types. Again, very succinct, dynamic-like behavior for this statically-typed language. In general, you'll find that the Go designers have tried to do everything they possibly can to eliminate coding overhead for the programmer.

The following line introduces a new concept, as well:

defer l.Close()

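The defer keyword schedules a call to run when the surrounding function returns, regardless of how that function is exited. It's a much cleaner version of Java's finally. (One exception: a call to os.Exit(), which log.Fatal() makes internally, ends the program immediately without running deferred calls.)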

For full-blown exceptions, Go has panic and recover (which I have yet to explore). But for situations you can test for and deal with, you check the error condition when you make a call. This might seem uncomfortable at first if you're used to try-catch, but I'm OK with it so far, and it might even have some benefits.

Note that an if statement does not have parentheses around the conditional, but it does require curly braces around the body.

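The built-in function new allocates zeroed storage for a value of the given type and returns a pointer to it; here, new(RPCFunc) produces the pointer that is passed to rpc.Register().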

In Go, for can have three forms: the classic for init; condition; step {}, the Python-like for i, v := range iterable {} and the while-loop-like for condition {}; a special case of the last is the infinite loop for {} that you see in main().

At the end of main() you see:

go jsonrpc.ServeConn(conn)

The go keyword creates a "goroutine" out of the function call, and this runs independently of the main() thread until it completes. So the server can go back to waiting for more requests while the previous calls are being handled.

The Go team has obviously gained inspiration from other languages. You can even see one clear homage to Python in the documentation, where they advise you to "Keep these under your pillow," a phrase that has long been seen in the Python documentation.

Go is open-source and it's already part of the Ubuntu and Debian distributions (I have heard that Ubuntu is quite interested in developing with Go).

To compile Server.go, you'll first need to install Go. Go does not yet have its own build system, but it provides include files for simple creation of Makefiles. To turn the above into a compiled executable, the Makefile looks like this:

    include /go/src/Make.inc

    TARG=Server
    GOFILES=\
        Server.go\

    include /go/src/Make.cmd

You can configure your environment so that make handles this, or you can just run gomake. The resulting executable will be the TARG, which in this case will be called Server.

You can create a Go client to test the server like this (Client.go):

    package main

    // Connect to JSONRPC Server and send command-line args to Echo

    import (
        "rpc/jsonrpc"
        "os"
        "net"
        "fmt"
    )

    func main() {
        conn, e := net.Dial("tcp", "localhost:1234")
        if e != nil {
            fmt.Fprintf(os.Stderr, "Could not connect: %s\n", e)
            os.Exit(1)
        }
        client := jsonrpc.NewClient(conn)
        var reply string
        for i, arg := range os.Args {
            if i == 0 {
                continue // Ignore program name
            }
            fmt.Printf("Sending: %s\n", arg)
            client.Call("RPCFunc.Echo", arg, &reply)
            fmt.Printf("Reply: %s\n", reply)
        }
    }

Dial() is a general-purpose function for making network connections; here, it's configured to use tcp sockets. From the connection, we produce a JSON-RPC client and use that to make the RPC calls.

Note this line:

for i, arg := range os.Args {

This uses the range keyword to step through the command-line arguments provided when the program is invoked. The first element of the tuple returned by range is the index into the os.Args array, and the second is the value at that index. Element zero is the program name, so we ignore that. range can also be used to iterate through other kinds of structures, such as map.
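
For readers coming from Python, the closest counterpart to this loop is probably enumerate(), which likewise yields an index together with the value at that index. The sketch below is only an analogy and not part of the client; args_to_send is a made-up helper name, and the code assumes Python 3.

```python
import sys

def args_to_send(argv):
    """Return every command-line argument except the program name."""
    selected = []
    for i, arg in enumerate(argv):
        if i == 0:
            continue  # element zero is the program name, as with os.Args in Go
        selected.append(arg)
    return selected

if __name__ == "__main__":
    for arg in args_to_send(sys.argv):
        print("Sending: %s" % arg)
```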

Here is the Makefile for the client:

    include /go/src/Make.inc

    TARG=Client
    GOFILES=\
        Client.go\

    include /go/src/Make.cmd

When trying to call Go from Python, I ran into a snag. There are a number of JSON-RPC libraries for Python, but they all assume that they will be making the connection via HTTP, whereas the current JSON-RPC library in Go supports only raw socket connections. I only discovered this by trying all the Python JSON-RPC libraries, and finally asking the mailing list, which is called Go-Nuts. This turned out to be very helpful; after I thrashed around in frustration, someone finally showed me how to do a raw call to my server:

    import json, socket

    s = socket.create_connection(("localhost", 1234))
    s.sendall(
        json.dumps({"id": 1, "method": "RPCFunc.Echo",
                    "params": ["Cupid's Arrow"]}).encode())
    print((s.recv(4096)).decode())

This also gives you a pretty good idea of how JSON-RPC works. You can find details in the specification. A call to a JSON-RPC server simply sends a block of data through a socket. The data is formatted as a JSON structure, and a call consists of an id (so you can sort out the results when they come back), the name of the method to execute on the server, and params, an array of parameters which can itself consist of complex JSON objects. The dumps() call converts the Python structure into JSON, and the encode() and decode() calls are necessary to make the program run in both Python 2 and 3.
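
As a rough illustration of the request format, the following sketch writes two calls out as plain dictionaries and prints the bytes that would travel over the socket. The second call is purely hypothetical: the method name Inventory.Add and its parameter are invented here to show a nested JSON object inside params, and nothing in the echo server implements them. The code assumes Python 3.

```python
import json

# The echo call from the text, written out as a plain dictionary.
echo_call = {"id": 1, "method": "RPCFunc.Echo", "params": ["Cupid's Arrow"]}

# Hypothetical call whose single parameter is itself a nested JSON object.
nested_call = {
    "id": 2,
    "method": "Inventory.Add",
    "params": [{"sku": "A-17", "tags": ["red", "small"]}],
}

for call in (echo_call, nested_call):
    wire = json.dumps(call).encode("utf-8")  # the bytes handed to sendall()
    print(wire.decode("utf-8"))
```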

After the call is made, the result comes back in a buffer with a maximum size of 4096 bytes:

{"id":1,"result":">Cupid's Arrow<","error":null}

Note that the id is the same as the one we sent, so we know it's a response to our query. The result can be a complex JSON object, and a non-null error tells you something has gone wrong.

It felt like a breakthrough when this code worked, but it's an awkward way to make calls. Fortunately, Stephen Day wrote a Python class for making socket-based JSON-RPC calls, and I made some changes so the library will work with both Python 2 and Python 3 (I say Python 2.6 here, but that's only because it is the earliest one I have installed. It might work with earlier versions as well):

    # jsonclient.py
    # Simple JSONRPC client library created to work with Go servers
    # Works with both Python 2.6+ and Python 3
    # Copyright (c) 2011 Stephen Day, Bruce Eckel
    # Distributed under the MIT Open-Source License:
    # http://www.opensource.org/licenses/MIT
    import json, socket, itertools

    class JSONClient(object):

        def __init__(self, addr):
            self.socket = socket.create_connection(addr)
            self.id_counter = itertools.count()

        def __del__(self):
            self.socket.close()

        def call(self, name, *params):
            request = dict(id=next(self.id_counter),
                           params=list(params),
                           method=name)
            self.socket.sendall(json.dumps(request).encode())

            # This must loop if resp is bigger than 4K
            response = self.socket.recv(4096)
            response = json.loads(response.decode())

            if response.get('id') != request.get('id'):
                raise Exception("expected id=%s, received id=%s: %s"
                                %(request.get('id'), response.get('id'),
                                  response.get('error')))

            if response.get('error') is not None:
                raise Exception(response.get('error'))

            return response.get('result')

The constructor creates the socket connection, and makes a counter for the JSON-RPC id using itertools.count(). This creates an iterator that produces successive integers, starting at 0, each time you pass it to next(). The destructor closes the socket.
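
A few lines at the interpreter show how the counter behaves. The variable names are chosen only for this illustration.

```python
import itertools

id_counter = itertools.count()  # starts at 0 unless told otherwise

first = next(id_counter)
second = next(id_counter)
third = next(id_counter)
print(first, second, third)

# A different starting value can be requested explicitly.
from_one = itertools.count(1)
print(next(from_one))
```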

The call to JSON-RPC allows multiple parameters, so JSONClient.call() has a variable argument list, *params. This comes in as a tuple, so list(params) turns it into an array.

The request is created with the keyword-argument form of dict(), which conveniently turns the keyword names into string keys, so the JSON-RPC elements id, params and method are already in the right form. The call to json.dumps() converts the structure into JSON format and returns it as a string; encode() is applied to make it Python 3 compatible.
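
The mechanics described in the last two paragraphs can be checked in isolation. In this sketch, build_request is a hypothetical stand-alone version of the first statement of call(), written for Python 3; it is not part of the library.

```python
import json

def build_request(request_id, name, *params):
    # params arrives as a tuple holding every positional argument after name.
    assert isinstance(params, tuple)
    return dict(id=request_id, params=list(params), method=name)

request = build_request(0, "RPCFunc.Echo", "hello")

# The keyword form of dict() yields ordinary string keys.
same = {"id": 0, "params": ["hello"], "method": "RPCFunc.Echo"}
print(request == same)

# json.dumps() writes lists and tuples alike as JSON arrays.
as_list = json.dumps(["a", "b"])
as_tuple = json.dumps(("a", "b"))
print(as_list, as_tuple)
```

Since the json module serializes a tuple as an array too, the list() conversion mainly makes the intent explicit.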

One unfinished part of this code is the response reception, which is currently limited to the maximum buffer size of 4096. To make it more general, it needs to deal with the case when the response is bigger than that. I poked around a little in Python's socket documentation, but the best way to handle this was not clear (perhaps the code in Python's XML-RPC library would yield a solution).
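
One possible way to lift the 4096-byte limit is to keep reading until the bytes received so far parse as a complete JSON value. The sketch below rests on two assumptions that the text does not guarantee: the server sends exactly one JSON object per call (followed by nothing more than whitespace), and the client waits for each reply before sending the next request. The name recv_json is made up for this example, and the code targets Python 3.

```python
import json
import socket

def recv_json(sock, bufsize=4096):
    """Read from sock until the accumulated bytes hold one complete JSON value."""
    decoder = json.JSONDecoder()
    received = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("socket closed before a full response arrived")
        received += chunk
        try:
            # lstrip() drops a newline left over from an earlier response.
            text = received.decode("utf-8").lstrip()
            value, _ = decoder.raw_decode(text)
        except ValueError:
            # Either the JSON is still incomplete or a multi-byte character
            # was split across two chunks; both cases need more data.
            continue
        return value  # anything after the first value is discarded
```

Under those assumptions, the two lines in call() that read and decode the response could be replaced by response = recv_json(self.socket).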

The call to json.loads() returns a dictionary containing the response information; this allows us to use get() to select the various elements.

By placing the library in your local directory, or installing it in your Python library, it's now possible to write a more elegant program to make calls to the Go server:

    from jsonclient import JSONClient

    rpc = JSONClient(("localhost", 1234))

    for i in range(100):
        print(rpc.call("RPCFunc.Echo", "hello " + str(i)))
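
To try the client without a Go toolchain at hand, a stand-in server can be sketched in Python. This is not a JSON-RPC implementation and not a replacement for the Go server; it only imitates the Echo method over a raw socket, answering each request with one JSON object followed by a newline. All names here (handle_request, EchoHandler, EchoServer, start_echo_server) are invented for the example, and the code assumes Python 3.

```python
import json
import socketserver
import threading

def handle_request(request):
    """Build the reply the Echo method would produce for one request."""
    reply = {"id": request.get("id"), "result": None, "error": None}
    if request.get("method") == "RPCFunc.Echo":
        reply["result"] = ">" + request["params"][0] + "<"
    else:
        reply["error"] = "unknown method: %s" % request.get("method")
    return reply

class EchoHandler(socketserver.BaseRequestHandler):
    """Serves one client connection; self.request is the connected socket."""

    def handle(self):
        decoder = json.JSONDecoder()
        pending = b""
        while True:
            chunk = self.request.recv(4096)
            if not chunk:
                return  # the client closed the connection
            pending += chunk
            # Answer every complete request that has arrived so far.
            while True:
                try:
                    text = pending.decode("utf-8").lstrip()
                    request, end = decoder.raw_decode(text)
                except ValueError:
                    break  # nothing complete yet; go back to recv()
                pending = text[end:].encode("utf-8")
                reply = json.dumps(handle_request(request)) + "\n"
                self.request.sendall(reply.encode("utf-8"))

class EchoServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # one thread per connection

def start_echo_server(host="127.0.0.1", port=0):
    """Start the stand-in server in a background thread and return it."""
    server = EchoServer((host, port), EchoHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server
```

Calling start_echo_server(port=1234) before running the client program should let the loop above run unchanged; with the default port=0 the operating system picks a free port, which is available afterwards as server.server_address.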

This has only been step one towards making the case for a hybrid program. Next, I will create a Go server to solve an actual parallel programming problem.

Coming from a background in C/C++, I find Go to be a real breath of fresh air. At this point, I think it would be a far better choice than C++ for doing systems programming because it will be much more productive and it solves problems that would be notably more difficult in C++. This is not to say that I think C++ was a mistake -- on the contrary, I think it was inevitable. At the time, we were deeply mired in the C mindset, slightly above assembly language and convinced that any language construct that generated significant code or overhead was impractical. Things like garbage collection or language support for parallelism were downright ridiculous and no one took them seriously. C++ took the first baby steps necessary to drag us into this larger world, and Stroustrup made the right choices in making C++ comprehensible to the C programmer, and able to compile C. We needed that at the time.

We've had many lessons since then. Things like garbage collection and exception handling and virtual machines, which used to be crazy talk, are now accepted without question. The complexity of C++ (even more complexity has been added in the new C++), and the resulting impact on productivity, is no longer justified. All the hoops that the C++ programmer had to jump through in order to use a C-compatible language make no sense anymore -- they're just a waste of time and effort. Now, Go makes much more sense for the class of problems that C++ was originally intended to solve.

About the Blogger

Bruce Eckel (www.BruceEckel.com) provides development assistance in Python with user interfaces in Flex. He is the author of Thinking in Java (Prentice-Hall, 1998, 2nd Edition, 2000, 3rd Edition, 2003, 4th Edition, 2005), the Hands-On Java Seminar CD ROM (available on the Web site), Thinking in C++ (PH 1995; 2nd edition 2000, Volume 2 with Chuck Allison, 2003), C++ Inside & Out (Osborne/McGraw-Hill 1993), among others. He's given hundreds of presentations throughout the world, published over 150 articles in numerous magazines, was a founding member of the ANSI/ISO C++ committee and speaks regularly at conferences.

This weblog entry is Copyright © 2011 Bruce Eckel. All rights reserved.