Go in Go

Gopherfest, 26 May 2015

Rob Pike, Google

Go in Go

As of the 1.5 release of Go, the entire system is now written in Go. (And a little assembler.)

C is gone.

Side note: gccgo is still going strong. This talk is about the original compiler, gc.

Why was it in C?

Bootstrapping.

(Also Go was not intended primarily as a compiler implementation language.)

Why move the compiler to Go?

Not for validation; we have more pragmatic motives:

Go is easier to write (correctly) than C.

Go is easier to debug than C (even absent a debugger).

Go is the only language you'd need to know; encourages contributions.

Go has better modularity, tooling, testing, profiling, ...

Go makes parallel execution trivial.

Already seeing benefits, and it's early yet.

Design document: golang.org/s/go13compiler

Why move the runtime to Go?

We had our own C compiler just to compile the runtime.

We needed a compiler with the same ABI as Go, such as segmented stacks.

Switching it to Go means we can get rid of the C compiler.

That's more important than converting the compiler to Go.

(All the reasons for moving the compiler apply to the runtime as well.)

Now only one language in the runtime; easier integration, stack management, etc.

As always, simplicity is the overriding consideration.

History

Why do we have our own tool chain at all?

Our own ABI?

Our own file formats?

History, familiarity, and ease of moving forward. And speed.

Many of Go's big changes would be much harder with GCC or LLVM.

news.ycombinator.com/item?id=8817990

Big changes

All made easier by owning the tools and/or moving to Go:

linker rearchitecture

new garbage collector

stack maps

contiguous stacks

write barriers

The last three are all but impossible in C:

C is not type safe; don't always know what's a pointer

aliasing of stack slots caused by optimization

(Gccgo will have segmented stacks and imprecise (stack) collection for a while yet.)

Goroutine stacks

Until 1.2: Stacks were segmented.

1.3: Stacks were contiguous unless executing C code (runtime).

1.4: Stacks made contiguous by restricting C to system stack.

1.5: Stacks made contiguous by eliminating C.

These were each huge steps, made quickly (led by khr@).

Converting the runtime

Mostly done by hand with machine assistance.

Challenge to implement the runtime in a safe language.

Some use of unsafe to deal with pointers as raw bits in the GC, for instance.

But less than you might think.

The translator (next sections) helped for some of the translation.
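To illustrate what "pointers as raw bits" means, here is a minimal sketch. It is not code from the Go runtime; the function lowBits is a hypothetical name chosen for this example. It converts a pointer to an integer with unsafe and inspects its low address bits, the kind of operation a safe language otherwise forbids.

```go
package main

import (
	"fmt"
	"unsafe"
)

// lowBits (hypothetical helper) returns the low n bits of p's address,
// treating the pointer as a plain integer. The conversion to uintptr is
// only possible by going through unsafe.Pointer.
func lowBits(p *uint32, n uint) uintptr {
	addr := uintptr(unsafe.Pointer(p))
	return addr & (uintptr(1)<<n - 1)
}

func main() {
	var word uint32
	// A uint32 is 4-byte aligned, so its two low address bits are zero.
	fmt.Println(lowBits(&word, 2))
}
```

Keeping such conversions inside a few small helpers is one way to limit how much of a program has to step outside the type system.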

Converting the compiler

Why translate it, not write it from scratch? Correctness, testing.

Steps:

Write a custom translator from C to Go.

Run the translator, iterate until success.

Measure success by bit-identical output.

Clean up the code by hand and by machine.

Turn it from C-in-Go to idiomatic Go (still happening).
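The "bit-identical output" check in the steps above can be sketched as a byte-for-byte comparison of what the two compilers emit. This is an illustration only, not the project's actual test harness; firstDiff and the sample byte slices are assumptions made for the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstDiff (hypothetical helper) returns the offset of the first byte at
// which a and b differ, or -1 if the two outputs are bit-identical.
func firstDiff(a, b []byte) int {
	if bytes.Equal(a, b) {
		return -1
	}
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return n // one output is a strict prefix of the other
}

func main() {
	fromC := []byte{0x48, 0x89, 0xe5, 0xc3}  // stand-in for the C compiler's output
	fromGo := []byte{0x48, 0x89, 0xe5, 0xc3} // stand-in for the translated compiler's output
	fmt.Println(firstDiff(fromC, fromGo))    // -1 means identical
}
```

Reporting the first differing offset, rather than only a yes/no answer, makes it easier to find which part of the translation went wrong.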

Translator

First output was C line-by-line translated to (bad!) Go.

Tool to do this written by rsc@ (talked about at GopherCon 2014).

Custom written for this job, not a general C-to-Go translator.

Steps:

Parse C code using new simple C parser (yacc)

Remove or rewrite C-isms such as *p++ as an expression

Walk the C parse tree, print the C code in Go syntax

Compile the output

Run, compare generated code

Repeat

The Yacc grammar was translated by sam-powered hands.

Translator configuration

Aided by hand-written rewrite rules, such as:

this field is a bool

this function returns a bool

Also diff-like rewrites for things such as using the standard library:

    diff {
    -    g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow)
    -    idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32)
    -    if g.Rpo == nil || idom == nil {
    -        Fatal("out of memory")
    -    }
    +    g.Rpo = make([]*Flow, g.Num)
    +    idom = make([]int32, g.Num)
    }

Another example

This one due to semantic difference between the languages.

    diff {
    -    if nreg == 64 {
    -        mask = ^0 // can't rely on C to shift by 64
    -    } else {
    -        mask = (1 << uint(nreg)) - 1
    -    }
    +    mask = (1 << uint(nreg)) - 1
    }
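The rewrite above works because Go defines a shift by the full width of the operand: the result is zero, so subtracting one wraps around to all ones. A small sketch, assuming the mask is a uint64 (the slide does not state its type) and using a hypothetical helper name lowMask:

```go
package main

import "fmt"

// lowMask (hypothetical helper) returns a uint64 with the low nreg bits set.
// In Go, shifting a uint64 left by 64 yields 0, and 0 - 1 wraps to all ones,
// so no special case for nreg == 64 is needed.
func lowMask(nreg int) uint64 {
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	fmt.Printf("%#x\n", lowMask(3))  // 0x7
	fmt.Printf("%#x\n", lowMask(64)) // 0xffffffffffffffff
}
```

In C, shifting by an amount equal to or greater than the operand width is undefined behavior, which is why the original code needed the explicit branch.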

Grind

Once in Go, new tool grind deployed (by rsc@):

parses Go, type checks

records a list of edits to perform: "insert this text at this position"

at end, applies edits to source (hard to edit AST).

Changes guided by profiling and other analysis:

removes dead code

removes gotos

removes unused labels, needless indirections, etc.

moves var declarations nearer to first use

rsc.io/grind
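The "record a list of edits, apply them at the end" approach described above can be sketched as follows. This is not grind's actual code; the edit type and apply function are assumptions invented for this illustration. Edits are byte ranges in the original source, so they can be collected in any order and applied in one pass.

```go
package main

import (
	"fmt"
	"sort"
)

// edit (hypothetical type) replaces src[start:end] with text.
// start == end is a pure insertion; empty text is a deletion.
type edit struct {
	start, end int
	text       string
}

// apply returns src with all edits applied. Offsets refer to the original
// source, and edits must not overlap.
func apply(src string, edits []edit) string {
	sorted := append([]edit(nil), edits...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].start < sorted[j].start
	})
	var out []byte
	pos := 0
	for _, e := range sorted {
		out = append(out, src[pos:e.start]...) // copy untouched text
		out = append(out, e.text...)           // emit the replacement
		pos = e.end
	}
	out = append(out, src[pos:]...)
	return string(out)
}

func main() {
	src := "var x int\nx = 1\n"
	edits := []edit{
		{12, 13, ":="}, // turn the assignment into a declaration at first use
		{0, 10, ""},    // delete the now-redundant "var x int\n"
	}
	fmt.Print(apply(src, edits)) // x := 1
}
```

Working on text offsets rather than rewriting the syntax tree keeps comments and formatting of the untouched code exactly as they were.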

Performance problems

Output from translator was poor Go, and ran about 10X slower.

Most of that slowdown has been recovered.

Problems with C to Go:

C patterns can be poor Go; e.g.: complex for loops

C stack variables never escape; Go compiler isn't as sure

interfaces such as fmt.Stringer vs. C's varargs

no unions in Go, so use structs instead: bloat

variable declarations in wrong place

C compiler didn't free much memory, but Go has a GC. Adds CPU and memory overhead.
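The "no unions, so use structs instead: bloat" item can be made concrete with a small sketch. The value type below is hypothetical, not taken from the compiler: it stands in for a C tagged union whose alternatives each need their own field in Go, so the struct is larger than its largest member.

```go
package main

import (
	"fmt"
	"unsafe"
)

// value (hypothetical type) stands in for a tagged C union of an integer,
// a float, and a string. Go has no unions, so every alternative gets its
// own field and all of them occupy space at once.
type value struct {
	kind uint8
	i    int64
	f    float64
	s    string
}

// sizes returns the size of value and the size of its largest payload field.
func sizes() (whole, largest uintptr) {
	var v value
	whole = unsafe.Sizeof(v)
	for _, n := range []uintptr{unsafe.Sizeof(v.i), unsafe.Sizeof(v.f), unsafe.Sizeof(v.s)} {
		if n > largest {
			largest = n
		}
	}
	return whole, largest
}

func main() {
	whole, largest := sizes()
	fmt.Printf("struct: %d bytes, largest field: %d bytes\n", whole, largest)
}
```

The exact numbers depend on the architecture, but the struct is always larger than a union of the same alternatives would be.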

Performance fixes

Profile! (Never done before!)

move vars closer to first use

split vars into multiple

replace code in the compiler with code in the library: e.g. math/big

use interface or other tricks to combine struct fields

better escape analysis (drchase@).

hand tuning code and data layout

Use tools like grind, gofmt -r and eg for much of this.

Removing interface argument from a debugging print library got 15% overall!

More remains to be done.

Technical benefits

Other benefits of the conversion:

Garbage collection means no more worry about introducing a dangling pointer.

Chance to clean up the back ends.

Unified 386 and amd64 architectures throughout the tool chain.

New architectures are easier to add.

Unified the tools: now one compiler, one assembler, one linker.

Compiler

GOOS=YYY GOARCH=XXX go tool compile

One compiler; no more 6g, 8g etc.

About 50K lines of portable code.

Even the registerizer is portable now; architectures well characterized.

Non-portable: Peepholing, details like registers bound to instructions.

Typically around 10% of the portable LOC.

Assembler

GOOS=YYY GOARCH=XXX go tool asm

New assembler, all in Go, written from scratch by r@.

Clean, idiomatic Go code.

Less than 4000 lines, <10% machine-dependent.

Almost completely compatible with previous yacc and C assemblers.

How is this possible?

shared syntax originating in the Plan 9 assemblers

unified back-end logic (old liblink, now internal/obj)

Linker

GOOS=YYY GOARCH=XXX go tool link

Mostly hand- and machine-translated from C code.

New library, internal/obj, part of original linker, captures details about machines, writes object files.

27000 lines summed across 4 architectures, mostly tables (plus some ugliness).

arm: 4000

arm64: 6000

ppc64: 5000

x86: 7500 (386 and amd64)

Example benefit: one print routine to print any instruction for any architecture.

Bootstrap

With no C compiler, bootstrapping requires a Go compiler.

Therefore need to build or download a working Go installation to build 1.5 from source.

We use Go 1.4+ as the base to build the 1.5+ tool chain. (Newer is OK too.)

Details: golang.org/s/go15bootstrap

Future

Much work still to do, but 1.5 is mostly set.

Future work:

Better escape analysis.

New compiler back end using SSA (much easier in Go than C). Will allow much more optimization.

Generate machine descriptions from PDFs (or maybe XML). Will have a purely machine-generated instruction definition: "Read in PDF, write out an assembler configuration". Already deployed for the disassemblers.

Conclusions

Getting rid of C was a huge advance for the project.

Code is cleaner, testable, profilable, easier to work on.

New unified tool chain reduces code size, increases maintainability.

Flexible tool chain, portability still paramount.