Go: Easy to Read, Hard to Compile
Corner cases when compiling Go

Ian Lance Taylor
Google

Introduction

To really learn a language, write a compiler for it

Compiler bugs imply language complexity

Or, compiler bugs imply differences from C/C++

Sometimes simpler for users is harder for compilers

Fortunately, Go is much simpler to compile than C++ or even C

This talk is based on Go compiler bugs encountered over the years.

Recursive Types

Names in Go packages are defined in the entire package, so Go types can refer to themselves recursively.

    type P *P
    type S []S
    type C chan C
    type M map[int]M

This is not permitted in C/C++, except for the special case of a struct/union/class field which is a pointer/reference.

All Go compiler code that walks over types has to be careful to avoid endless loops.
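One common way to keep such a walk from looping is to remember which type nodes have already been visited. The sketch below is not taken from any real Go compiler: `Type`, `Elem`, `walk`, and `recursivePointer` are hypothetical names for a much simplified type graph, used only to illustrate the idea.

```go
package main

import "fmt"

// Type is a hypothetical, much simplified compiler type node.
type Type struct {
	Name string
	Elem *Type // element type of a pointer, slice, chan, or map; nil otherwise
}

// walk calls visit once for each distinct type reachable from t.
// The seen set is what stops the recursion on a cyclic type.
func walk(t *Type, seen map[*Type]bool, visit func(*Type)) {
	if t == nil || seen[t] {
		return
	}
	seen[t] = true
	visit(t)
	walk(t.Elem, seen, visit)
}

// recursivePointer builds the graph for "type P *P":
// the named type P refers to the pointer type *P, which refers back to P.
func recursivePointer() *Type {
	p := &Type{Name: "P"}
	p.Elem = &Type{Name: "*P", Elem: p}
	return p
}

func main() {
	var names []string
	walk(recursivePointer(), map[*Type]bool{}, func(t *Type) {
		names = append(names, t.Name)
	})
	fmt.Println(names) // each node is reported once, despite the cycle
}
```

Without the `seen` check, `walk` would follow `P`, then `*P`, then `P` again, and never return.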

Recursive Types

What good is a recursive pointer type? It can only be nil or a pointer to itself. That's enough for Peano arithmetic.

    func Val(p *P) int {
        if p == nil {
            return 0
        } else {
            return 1 + Val(*p)
        }
    }

    func Add(a, b *P) *P {
        if b == nil {
            return a
        } else {
            a1 := new(P)
            *a1 = a // a1 == a + 1
            return Add(a1, *b) // a + b == Add(a+1, b-1)
        }
    }
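To see the Peano encoding in action, here is a small self-contained sketch. `Val` and `Add` follow the slide; `FromInt` is a hypothetical helper added here only to build test values, and is not part of the talk.

```go
package main

import "fmt"

type P *P

// Val counts the length of the pointer chain: nil is 0.
func Val(p *P) int {
	if p == nil {
		return 0
	}
	return 1 + Val(*p)
}

// Add moves one link from b to a until b is exhausted.
func Add(a, b *P) *P {
	if b == nil {
		return a
	}
	a1 := new(P)
	*a1 = a // a1 == a + 1
	return Add(a1, *b)
}

// FromInt is a hypothetical helper: it builds a chain of n pointers.
func FromInt(n int) *P {
	var p *P
	for i := 0; i < n; i++ {
		next := new(P)
		*next = p
		p = next
	}
	return p
}

func main() {
	two, three := FromInt(2), FromInt(3)
	fmt.Println(Val(two), Val(three), Val(Add(two, three))) // prints "2 3 5"
}
```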

Recursive Types

Actually, a recursive pointer can have a bit more information: it can have a finalizer.

    package main

    import (
        "fmt"
        "runtime"
    )

    type P *P
    type S []S
    type C chan C
    type M map[int]M

    func Val(p *P) int {
        if p == nil {
            return 0
        } else {
            return 1 + Val(*p)
        }
    }

    func Add(a, b *P) *P {
        if b == nil {
            return a
        } else {
            a1 := new(P)
            *a1 = a // a1 == a + 1
            return Add(a1, *b) // a + b == Add(a+1, b-1)
        }
    }

    func Print(p *P) {
        fmt.Println(Val(p))
    }

    func Allocate() {
        p := new(P)
        *p = new(P)
        **p = new(P)
        runtime.SetFinalizer(p, Print)
        runtime.SetFinalizer(*p, Print)
        runtime.SetFinalizer(**p, Print)
    }

    func main() {
        Allocate()
        for i := 0; i < 5; i++ {
            runtime.GC()
            runtime.Gosched()
        }
    }

Recursive Types

Recursive function types are actually useful: they can implement a state machine.

    type F func(*State) F

    type State int

    func Begin(s *State) F {
        *s = 1
        return Middle
    }

    func Middle(s *State) F {
        *s++
        if *s >= 10 {
            return End
        }
        return Middle
    }

Recursive Types

    package main

    import "fmt"

    type F func(*State) F

    type State int

    func Begin(s *State) F {
        *s = 1
        return Middle
    }

    func Middle(s *State) F {
        *s++
        if *s >= 10 {
            return End
        }
        return Middle
    }

    func End(s *State) F {
        fmt.Println(*s)
        return nil
    }

    func main() {
        var f F = Begin
        var s State
        for f != nil {
            f = f(&s)
        }
    }

Recursive Types

Simple rule: all names at package scope are visible in the entire package.

Complex consequence: the compiler must handle recursive types (and also recursive initializers).
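The same package-scope rule applies to variables and their initializers, as the following minimal sketch shows. The names `total`, `base`, `bonus`, and `limit` are made up for illustration. The compiler has to sort the initializers by dependency and detect cycles among them.

```go
package main

import "fmt"

// total is declared before everything it depends on; the compiler
// orders initialization by dependency, not by position in the source.
var total = base + bonus()

var base = 40

func bonus() int { return limit / 50 }

const limit = 100

// Uncommenting the next two lines makes the program invalid:
// x and y would each depend on the other's initial value.
// var x = y
// var y = x

func main() {
	fmt.Println(total) // prints 42
}
```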

Constants

Go has both typed and untyped constants. They follow the same rules, except that a typed constant must be representable in its type.

This is reasonably clear for integers, less so for floats.

    package main

    import "fmt"

    const C1 = 1e-323
    const C2 = C1 / 100
    const C3 = C2 * 100

    const C4 float64 = C1 / 100
    const C5 = C4 * 100

    func main() {
        fmt.Println(C3, C5)
    }

Constants

Go's floating point variables follow IEEE-754 rules. Constants do not.

    package main

    import "fmt"

    const C1 = 1e+308
    const C2 = C1 * 10
    const C3 = C2 / 10

    var V1 = C1
    var V2 = V1 * 10
    var V3 = V2 / 10

    func main() {
        fmt.Println(C3, V3)
    }

Constants

The special unsafe.Sizeof function returns a constant.

    package main

    import (
        "fmt"
        "unsafe"
    )

    var V1 = 0x01020304
    var V2 [unsafe.Sizeof(V1)]byte

    func main() {
        *(*int)(unsafe.Pointer(&V2)) = V1
        fmt.Println(V2)
    }
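Because the result is a constant, `unsafe.Sizeof` can be used anywhere the language requires one, so the compiler must be able to compute type sizes while it is still evaluating constant expressions. A minimal sketch, with the hypothetical names `Size32` and `buf` and fixed-size types so the values do not depend on the platform:

```go
package main

import (
	"fmt"
	"unsafe"
)

// Size32 is a constant, so it can appear in other constant expressions.
const Size32 = unsafe.Sizeof(int32(0))

// An array length must be a constant; unsafe.Sizeof qualifies.
var buf [2 * unsafe.Sizeof(uint64(0))]byte

func main() {
	fmt.Println(Size32, len(buf)) // prints "4 16"
}
```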

Constants

Simple rule: constants are untyped; they are mathematically exact and do not require type conversions.

Complex consequence: exact floating point behavior depends on the type.
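The simple rule is easiest to appreciate from the user's side. In this small sketch the names `Huge`, `Two`, `f`, `i`, and `small` are illustrative only. One untyped constant is used as both a float and an integer, and an intermediate value that fits in no integer type is still computed exactly.

```go
package main

import "fmt"

// Huge does not fit in any integer type, which is fine for an
// untyped constant as long as it is never stored as is.
const Huge = 1 << 100

const Two = 2

var (
	f     float64 = Two * 1.5 // Two acts as a float here
	i     int64   = Two * 3   // and as an int64 here
	small         = Huge >> 98 // evaluated exactly, then fits in an int
)

func main() {
	fmt.Println(f, i, small) // prints "3 6 4"
}
```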

Name Lookup

Name lookup in a Go compiler is simple compared to many languages. For every name the scope in which to look it up is obvious. This makes parsing Go quite simple.

With one exception. What is the scope for i?

    func main() {
        i := 1
        f := func() T {
            return T{
                i: 1,
            }
        }
        fmt.Println(i, f())
    }

Name Lookup

One possibility.

    package main

    import "fmt"

    func main() {
        i := 1
        f := func() T {
            return T{
                i: 1,
            }
        }
        fmt.Println(i, f())
    }

    type T map[int]int

Name Lookup

Another possibility.

    package main

    import "fmt"

    func main() {
        i := 1
        f := func() T {
            return T{
                i: 1,
            }
        }
        fmt.Println(i, f())
    }

    type T struct{ i int }

Name Lookup

Simple rule: in a struct composite literal you can use field names as keys.

Complex consequence: if you don't know the type of the composite literal, the lookup scope of names used as keys is unclear when parsing.
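One way to observe the ambiguity is to look at what a parser alone can know. The sketch below uses the standard `go/parser` and `go/ast` packages, which parse without type-checking. The source string `src` and the helper `keyNames` are made up for this illustration. The key comes back as a plain identifier, and nothing in the syntax tree says whether it names a variable or a struct field; that decision has to wait until the type of `T` is known.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src never declares T, so the parser cannot know whether T is a
// map type or a struct type.
const src = `package p

func f() T {
	i := 1
	_ = i
	return T{i: 1}
}
`

// keyNames returns the identifiers used as keys in composite literals.
func keyNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		if kv, ok := n.(*ast.KeyValueExpr); ok {
			if id, ok := kv.Key.(*ast.Ident); ok {
				names = append(names, id.Name)
			}
		}
		return true
	})
	return names, nil
}

func main() {
	names, err := keyNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names) // the key is just the identifier i
}
```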

Methods

Any named type can have methods. Any struct type can inherit methods from an embedded field. It follows that you can sometimes call methods on a variable even if it has an unnamed type.

    package main

    import (
        "fmt"
        "os"
    )

    var V = struct {
        name string
        os.FileMode
    }{
        name: "hello.go",
    }

    func main() {
        fmt.Println(V)
    }

Methods

Simple rules: named types can have methods; structs can have embedded fields.

Complex consequence: unnamed types can have methods.

Conclusion

Go is simpler to compile than most languages

There are still complexities for the compiler

Most complexities stem from making Go easier to write