What's happening in Go tip (2013-08-15)

This is the first part of an ongoing series on changes and developments in Go tip, i.e. the development branch of Go.

In this series I intend to look at noteworthy recent CLs¹ and proposals and comment on their significance and impact. For CLs, I will concentrate on those that have either already been committed or, if they were significant, been rejected. For proposals I will look at their current state and summarize the discussions where possible.

¹: CL is short for Change List and is, in essence, what GitHub calls a pull request. Go maintains a linear history of changes, and every change needs to be submitted as a CL and peer reviewed before it can be committed to the repository.

The format

I intend to post weekly updates covering the changes that happened during that week. Close to the release of a new version of Go there will naturally be few or no interesting changes. In that case I might either pause posting updates, or post “Nothing to see this time” updates.

Since this is the first post in the series, and because of unlucky timing, it has a somewhat special status. For one, there have been a lot of changes between Go 1.1 and tip, and catching up on all of them will require more than one article. “Luckily”, Go is nearing its feature freeze (September 1), and a lot of the Go team will be on summer vacation until the middle of September. This means there will be little (accepted) change, allowing me to spend some more time talking about past changes.

Disclaimer

There are some important things to note before diving into the article.

The first is that I am not part of the core Go team and as such my interpretations might sometimes be wrong. I am trying my best to be as accurate as possible, but if I do mess up, let me know and I will correct any errors.

Second, just because something got committed doesn’t mean that it will actually be part of the next release of Go. It might be reverted for various reasons, or it might be intended for the release after that. Reverted changes will be covered in future articles, and I will try my best to annotate changes that are not intended for the next release.

What’s happening

In the first article in the series I will be looking at:

C++ support in cgo (kind of)

Function pointers in cgo

Improved marshaling through the addition of new interfaces

Preemptive scheduling

cgo

C++ support in cgo, kind of

Relevant CLs: CL 8248043

CL 8248043 adds support to the cgo tool to include C++ files in the build process. It adds the CXXFiles field to the internal representation of packages and support for the CPPFLAGS and CXXFLAGS variables.

Now, this doesn’t mean that you can directly use C++ classes and methods in Go. After all, these are still hard to map to the concepts of Go, and name mangling in particular wouldn’t make your life easy. You will still have to write C wrappers for the C++ APIs and bind to those wrappers. What this change does allow, however, is compiling those wrappers during the build process, instead of being forced to add a precompiled library to the project.

Function pointers in cgo

Relevant CLs: CL 9835047

CL 9835047 adds partial support for function pointers to cgo. While it won’t permit calling functions from function pointers directly, it will allow storing and passing them around, giving them a proper type. Therefore it will be possible to receive function pointers from C, store them in Go, and pass them back to C to let C call them.

Improved marshaling

Go 1.2 will include some very nice improvements to encoding, by introducing a general text marshaler and an improved marshaling interface for encoding/xml.

Generic text and binary marshalers

Relevant CLs: CL 12541051, CL 12703043, CL 12751045, CL 12681044, CL 12705043, CL 12706043

The first part of the improvements is the addition of an encoding package, defining the TextMarshaler / TextUnmarshaler and BinaryMarshaler / BinaryUnmarshaler interfaces.

These two sets of interfaces define generic (un)marshaling of values to/from UTF-8 encoded text and binary form respectively. The idea behind them is that packages like encoding/json, encoding/xml and encoding/gob can all rely on these interfaces instead of forcing the user to implement three different sets of interfaces (or more if new encoding/* packages are added).

One issue that might be familiar to some of you is that encoding a net.IP to JSON would result in a base64 encoded string instead of a human-readable representation of an IP. Encoding it to XML is even more futile, not to mention that neither package could unmarshal the result again.

That is because previously, net.IP would’ve had to implement the JSON Marshaler interface, and XML didn’t provide any way to implement custom marshaling.

Now, by implementing the TextMarshaler and TextUnmarshaler interfaces, a type can have a single machine-parseable textual representation that all encoding packages can use. And if text isn’t an option, there are the BinaryMarshaler and BinaryUnmarshaler interfaces, which right now encoding/gob uses and prefers over the textual ones.
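
To make this a bit more concrete, here is a minimal sketch of a type that implements the text interfaces once and gets both JSON and XML support from that. The Level and Config types and the helper functions are made up for this illustration; only the MarshalText / UnmarshalText method signatures come from the encoding package.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Level is a hypothetical log level, used only for this illustration.
type Level int

const (
	Debug Level = iota
	Info
	Error
)

var levelNames = []string{"debug", "info", "error"}

// MarshalText implements encoding.TextMarshaler.
func (l Level) MarshalText() ([]byte, error) {
	if l < 0 || int(l) >= len(levelNames) {
		return nil, fmt.Errorf("invalid level %d", int(l))
	}
	return []byte(levelNames[l]), nil
}

// UnmarshalText implements encoding.TextUnmarshaler.
func (l *Level) UnmarshalText(text []byte) error {
	for i, name := range levelNames {
		if name == string(text) {
			*l = Level(i)
			return nil
		}
	}
	return fmt.Errorf("unknown level %q", text)
}

// Config is a hypothetical struct that embeds a Level in its encoded form.
type Config struct {
	Name  string `json:"name" xml:"name"`
	Level Level  `json:"level" xml:"level"`
}

// encodeBoth marshals c with two different encoding packages. Neither
// package has its own interface implemented on Level; both fall back to
// MarshalText.
func encodeBoth(c Config) (jsonOut, xmlOut string, err error) {
	j, err := json.Marshal(c)
	if err != nil {
		return "", "", err
	}
	x, err := xml.Marshal(c)
	if err != nil {
		return "", "", err
	}
	return string(j), string(x), nil
}

// decodeJSON goes the other way and relies on UnmarshalText.
func decodeJSON(s string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(s), &c)
	return c, err
}

func main() {
	j, x, err := encodeBoth(Config{Name: "svc", Level: Info})
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(j)
	fmt.Println(x)

	back, err := decodeJSON(j)
	fmt.Println(back.Level == Info, err)
}
```

The level ends up as the string "info" in both outputs, without Level knowing anything about JSON or XML.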

Of course it will still be possible to implement encoding-specific behaviour for e.g. JSON. The generic interfaces will be tried last, after checking that no encoding-specific interfaces have been implemented.
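
A small sketch of that precedence, assuming a made-up Temp type that implements both the JSON-specific interface and the generic text interface: encoding/json should pick MarshalJSON, while encoding/xml, for which Temp implements nothing specific, falls back to MarshalText.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"strconv"
)

// Temp is a hypothetical temperature in degrees Celsius.
type Temp float64

// MarshalText is the generic textual form, e.g. "21.5C".
func (t Temp) MarshalText() ([]byte, error) {
	return []byte(strconv.FormatFloat(float64(t), 'f', 1, 64) + "C"), nil
}

// MarshalJSON is the JSON-specific form: a bare number without the unit.
func (t Temp) MarshalJSON() ([]byte, error) {
	return []byte(strconv.FormatFloat(float64(t), 'f', 1, 64)), nil
}

// encodeTemp marshals t with both packages so the outputs can be compared.
func encodeTemp(t Temp) (jsonOut, xmlOut string, err error) {
	j, err := json.Marshal(t)
	if err != nil {
		return "", "", err
	}
	x, err := xml.Marshal(t)
	if err != nil {
		return "", "", err
	}
	return string(j), string(x), nil
}

func main() {
	j, x, err := encodeTemp(Temp(21.5))
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(j) // JSON uses MarshalJSON
	fmt.Println(x) // XML uses MarshalText
}
```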

All existing encoding packages in the standard library support the new interfaces (JSON, XML, gob) and both net.IP and time.Time implement them.
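
The net.IP case can be sketched as follows. The rawBytes type is a made-up stand-in for what net.IP used to look like to encoding/json: a plain byte slice without any marshaling methods. The helper names are mine as well.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// rawBytes is a hypothetical byte slice type without MarshalText, i.e. what
// encoding/json effectively saw when it was handed a net.IP before.
type rawBytes []byte

// encodeIP returns the JSON for the plain byte slice and for the net.IP.
func encodeIP(s string) (raw, text string, err error) {
	ip := net.ParseIP(s).To4()
	if ip == nil {
		return "", "", fmt.Errorf("not an IPv4 address: %q", s)
	}
	r, err := json.Marshal(rawBytes(ip)) // byte slices are base64 encoded
	if err != nil {
		return "", "", err
	}
	t, err := json.Marshal(ip) // net.IP implements encoding.TextMarshaler
	if err != nil {
		return "", "", err
	}
	return string(r), string(t), nil
}

// decodeIP unmarshals the textual JSON form back into a net.IP.
func decodeIP(js string) (string, error) {
	var ip net.IP
	if err := json.Unmarshal([]byte(js), &ip); err != nil {
		return "", err
	}
	return ip.String(), nil
}

func main() {
	raw, text, err := encodeIP("192.0.2.1")
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(raw)  // "wAACAQ=="
	fmt.Println(text) // "192.0.2.1"

	back, err := decodeIP(text)
	fmt.Println(back, err)
}
```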

If you’re interested, there is also a design document.

xml.Marshaler and xml.Unmarshaler

Relevant CLs: CL 12603044, CL 12556043

The second, very important improvement is the addition of marshaling interfaces to encoding/xml. While previously XML marshaling behaviour could only be influenced via struct tags, it is now possible to implement entirely custom behaviour, both for values that are marshaled as elements (Marshaler and Unmarshaler) and for values that are marshaled as attributes (MarshalerAttr and UnmarshalerAttr).
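
As a rough sketch of the attribute side, here is a made-up Color type that is written as a single "#rrggbb" attribute value instead of three separate fields. Color, Shape and the two helper functions are assumptions of this example; the MarshalXMLAttr and UnmarshalXMLAttr method signatures are the ones encoding/xml looks for.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
)

// Color is a hypothetical RGB colour.
type Color struct{ R, G, B uint8 }

// MarshalXMLAttr implements xml.MarshalerAttr.
func (c Color) MarshalXMLAttr(name xml.Name) (xml.Attr, error) {
	return xml.Attr{Name: name, Value: fmt.Sprintf("#%02x%02x%02x", c.R, c.G, c.B)}, nil
}

// UnmarshalXMLAttr implements xml.UnmarshalerAttr.
func (c *Color) UnmarshalXMLAttr(attr xml.Attr) error {
	if len(attr.Value) != 7 || attr.Value[0] != '#' {
		return fmt.Errorf("invalid colour %q", attr.Value)
	}
	v, err := strconv.ParseUint(attr.Value[1:], 16, 32)
	if err != nil {
		return err
	}
	c.R, c.G, c.B = uint8(v>>16), uint8(v>>8), uint8(v)
	return nil
}

// Shape is a hypothetical element carrying the colour as an attribute.
type Shape struct {
	XMLName xml.Name `xml:"shape"`
	Fill    Color    `xml:"fill,attr"`
}

func marshalShape(c Color) (string, error) {
	out, err := xml.Marshal(Shape{Fill: c})
	return string(out), err
}

func unmarshalShape(s string) (Color, error) {
	var sh Shape
	err := xml.Unmarshal([]byte(s), &sh)
	return sh.Fill, err
}

func main() {
	out, err := marshalShape(Color{255, 0, 128})
	fmt.Println(out, err)

	c, err := unmarshalShape(out)
	fmt.Println(c, err)
}
```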

Since XML is more complex than JSON by having different contexts in which data can appear (tags, attributes, comments) and more complicated parsing and state (escaping, namespaces, …), the design of the XML interfaces differs from the JSON ones in one significant aspect: While the JSON marshaling interfaces deal with byte slices, the XML interfaces give you an Encoder / Decoder value, allowing you to use Go’s parser to parse or emit tokens or tags.

For that purpose, two new methods have been added to xml.Encoder, namely EncodeElement and EncodeToken.
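
Here is a minimal sketch of how a MarshalXML method might combine the two: EncodeToken for the outer start and end tags, EncodeElement for each child. The Tags type, the "tag" child element name and the helper are invented for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Tags is a hypothetical list of labels that should be wrapped in a single
// parent element, with one <tag> child per entry.
type Tags []string

// MarshalXML implements xml.Marshaler by emitting tokens by hand.
func (t Tags) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	// Open the element the encoder picked for this value.
	if err := e.EncodeToken(start); err != nil {
		return err
	}
	for _, tag := range t {
		child := xml.StartElement{Name: xml.Name{Local: "tag"}}
		if err := e.EncodeElement(tag, child); err != nil {
			return err
		}
	}
	// Close it again; the encoder complains about unbalanced elements.
	return e.EncodeToken(start.End())
}

func marshalTags(t Tags) (string, error) {
	out, err := xml.Marshal(t)
	return string(out), err
}

func main() {
	out, err := marshalTags(Tags{"a", "b"})
	fmt.Println(out, err)
}
```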

But if your requirements don’t include complex handling of tokens, you can still use the new interfaces to encapsulate a common pattern: copying your data into another data structure that reflects the desired XML structure and marshaling that. Previously, this step had to be done before the call to xml.Marshal; now it can be hidden away, and the same goes for xml.Unmarshal. This playground snippet demonstrates some of the possibilities, but it’s hard to cover all of them succinctly without real-world examples. And of course you cannot execute that code in the playground because it depends on changes in Go tip.
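
A rough sketch of that copy-and-marshal pattern, with all type and field names invented for the purpose: User is the type the rest of the program works with, userXML mirrors the desired XML layout, and the conversion between the two lives inside MarshalXML and UnmarshalXML.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// User is the hypothetical in-memory representation.
type User struct {
	First, Last string
	Age         int
}

// userXML is the hypothetical shape we want on the wire:
// <user name="First Last"><age>N</age></user>
type userXML struct {
	Name string `xml:"name,attr"`
	Age  int    `xml:"age"`
}

// MarshalXML copies u into the wire structure and marshals that instead.
func (u User) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	start.Name.Local = "user"
	return e.EncodeElement(userXML{Name: u.First + " " + u.Last, Age: u.Age}, start)
}

// UnmarshalXML decodes into the wire structure and copies the result back.
func (u *User) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var w userXML
	if err := d.DecodeElement(&w, &start); err != nil {
		return err
	}
	parts := strings.SplitN(w.Name, " ", 2)
	if len(parts) != 2 {
		return fmt.Errorf("malformed name %q", w.Name)
	}
	u.First, u.Last, u.Age = parts[0], parts[1], w.Age
	return nil
}

func marshalUser(u User) (string, error) {
	out, err := xml.Marshal(u)
	return string(out), err
}

func unmarshalUser(s string) (User, error) {
	var u User
	err := xml.Unmarshal([]byte(s), &u)
	return u, err
}

func main() {
	out, err := marshalUser(User{First: "Ada", Last: "Lovelace", Age: 36})
	fmt.Println(out, err)

	back, err := unmarshalUser(out)
	fmt.Println(back, err)
}
```

Callers only ever see xml.Marshal and xml.Unmarshal on User; the intermediate structure stays an implementation detail.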

There is a design document for these interfaces as well, but be warned that the final implementation differs slightly by separating the Marshaler interface into two interfaces.

Scheduler

Basic preemptive scheduling

Relevant CLs: CL 10264044

One of the oldest issues on the Go project has been the request for preemptive scheduling (Issue 543). Until now, Go’s scheduler has been strictly cooperative, with scheduling happening during specific events (channel operations and locks).

While cooperative scheduling isn’t familiar to many modern-day programmers and might confuse them in some cases, that’s not the main problem. The main problem is that the garbage collector, too, needs to wait for all goroutines to yield before it can do its job.

When the garbage collector wants to run, it prevents all goroutines from being scheduled. A goroutine that is already running, however, will run until it yields. This can lead to long GC pauses, where all cooperative goroutines have already been stopped but one misbehaving goroutine is still running and won’t yield for the foreseeable future. During that period, neither other goroutines nor the GC can run.

Preemptive scheduling solves this problem by, well, preempting goroutines. No longer do they choose when to yield.

So, does that mean that for {} will no longer lock up the scheduler? Not quite. The amount of preemption in Go 1.2 is limited to preemption at function calls.

Whenever a function gets called, its preamble checks whether there is enough stack left or if more needs to be allocated. The scheduler abuses this check by pretending that more stack needs to be allocated and thus forcing the “need more stack” function to run. This function is aware of preemption, realizes that it doesn’t really need more stack and instead preempts the goroutine.
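
The trick is easier to see in a toy model. The following is emphatically not the runtime’s code, just a simulation with made-up names (goroutine, stackGuard, preemptSentinel, moreStack) and made-up numbers, assuming a stack that grows downwards and a prologue that compares the stack pointer against a guard value.

```go
package main

import "fmt"

// preemptSentinel is larger than any stack pointer in this model, so the
// prologue check always fails once it has been installed as the guard.
const preemptSentinel = ^uintptr(0)

const frameSize = 64

// goroutine is a toy stand-in for the runtime's per-goroutine state.
type goroutine struct {
	sp         uintptr // current stack pointer; the stack grows downwards
	stackLimit uintptr // real lower bound of the stack
	stackGuard uintptr // value the function prologue compares sp against
	preempted  int     // how often the goroutine was preempted
	grown      int     // how often the stack really had to grow
}

func newGoroutine() *goroutine {
	return &goroutine{sp: 8192, stackLimit: 1024, stackGuard: 1024}
}

// requestPreempt is what the scheduler does in this model: it does not stop
// the goroutine, it only poisons the guard.
func (g *goroutine) requestPreempt() { g.stackGuard = preemptSentinel }

// call models a function call: prologue check, then the body.
func (g *goroutine) call(fn func()) {
	if g.sp <= g.stackGuard { // "is there enough stack left?"
		g.moreStack()
	}
	g.sp -= frameSize
	fn()
	g.sp += frameSize
}

// moreStack is the "need more stack" path. It notices a poisoned guard and
// preempts instead of growing.
func (g *goroutine) moreStack() {
	if g.stackGuard == preemptSentinel {
		g.stackGuard = g.stackLimit // restore the real guard
		g.preempted++               // a real scheduler would switch goroutines here
		return
	}
	g.sp += 8192 // pretend the stack was moved to a larger allocation
	g.grown++
}

func main() {
	g := newGoroutine()
	g.call(func() {}) // plenty of stack, nothing happens
	g.requestPreempt()
	g.call(func() {}) // the next call is turned into a preemption point
	fmt.Println("preempted:", g.preempted, "grown:", g.grown)
}
```

Note how the goroutine is only interrupted at its next call, which is exactly the limitation described here.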

This does imply that a goroutine that never uses any channel operations, locks, or function calls will still not be preempted, but such code should be rare, and the amount of preemption might well be increased in future versions of Go.
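
To get a feeling for the difference, here is a small experiment, written as a sketch: a goroutine spins on a single processor, and every iteration contains a real, non-inlined function call. The function names are mine, and the go:noinline directive is an assumption about the toolchain in use. With a strictly cooperative scheduler the Sleep in the main goroutine would never return; with preemption at function calls it does. Runtimes that preempt more aggressively (Go 1.14 and later can also interrupt loops without calls) will finish either way.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// work exists only to put a function call, and with it a stack check, into
// the loop below.
//
//go:noinline
func work(n uint64) uint64 { return n + 1 }

// spinAndStop starts a busy goroutine on a single P, lets it run for a
// moment and then asks it to stop. It returns the number of iterations.
func spinAndStop() uint64 {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan uint64)
	go func() {
		var n uint64
		for atomic.LoadInt32(&stop) == 0 {
			n = work(n) // preemption point: function call
		}
		done <- n
	}()

	// The spinner now owns the only processor. We only get to run again
	// if the scheduler takes the processor away from it.
	time.Sleep(20 * time.Millisecond)
	atomic.StoreInt32(&stop, 1)
	return <-done
}

func main() {
	fmt.Println("iterations before being stopped:", spinAndStop())
}
```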

The future

I hope that this article gave you a rough overview of the kind of changes that are happening in Go tip, as well as of the format of this series. Of course there are a lot more changes that happened between Go 1.1 and now, but as I stated in the introduction, because of the upcoming summer slowdown, I am spreading these changes out over several articles. And after all, the idea of this series is to provide digestible summaries of what happened. A huge article with all changes combined would contradict that idea.

Please let me know if/how you liked this article and the idea of the series and whether it is something you want me to continue.

You can also contact me via IRC, email, G+ or Twitter and let me know about specific changes that you want me to cover. To give you an overview of what I have planned for the upcoming articles: I will be talking about shared linking, performance improvements and a new slicing syntax!