Sunday, September the 18th, marks a month since the Go 1.8 cycle officially opened. I’m passionate about the performance of Go programs, and of the compiler itself. This post is a brief look at the state of play, roughly halfway into the development cycle for Go 1.8 [1].

Note: these results are of course preliminary and represent only a point in time, not the performance of the final Go 1.8 release.

Compile times

Nothing much to report here. Using the methodology from my previous Go 1.7 benchmarks, there is a 3.22%–5.11% improvement in full compile time compared to Go 1.7.

Performance improvements

Intel amd64

Better code generation, along with small improvements to the runtime and standard library, yields modest gains on amd64 [2], but nothing to write home about yet.

name                     old time/op    new time/op    delta
BinaryTree17-4           3.07s ± 2%     3.06s ± 2%     ~        (p=0.661 n=10+9)
Fannkuch11-4             3.23s ± 1%     3.22s ± 0%     -0.43%   (p=0.008 n=9+10)
FmtFprintfEmpty-4        64.4ns ± 0%    61.8ns ± 4%    -4.17%   (p=0.005 n=9+10)
FmtFprintfString-4       162ns ± 0%     162ns ± 0%     ~        (p=0.065 n=10+9)
FmtFprintfInt-4          142ns ± 0%     142ns ± 0%     ~        (p=0.137 n=8+10)
FmtFprintfIntInt-4       220ns ± 0%     217ns ± 0%     -1.18%   (p=0.000 n=9+10)
FmtFprintfPrefixedInt-4  224ns ± 0%     224ns ± 1%     ~        (p=0.206 n=9+9)
FmtFprintfFloat-4        313ns ± 0%     312ns ± 0%     -0.26%   (p=0.001 n=10+9)
FmtManyArgs-4            906ns ± 0%     894ns ± 0%     -1.32%   (p=0.000 n=7+6)
GobDecode-4              8.88ms ± 1%    8.81ms ± 0%    -0.81%   (p=0.003 n=10+10)
GobEncode-4              7.93ms ± 1%    7.88ms ± 0%    -0.66%   (p=0.008 n=9+10)
Gzip-4                   272ms ± 1%     277ms ± 0%     +1.95%   (p=0.000 n=10+9)
Gunzip-4                 47.4ms ± 0%    47.4ms ± 0%    ~        (p=0.720 n=9+10)
HTTPClientServer-4       201µs ± 4%     202µs ± 2%     ~        (p=0.631 n=10+10)
JSONEncode-4             19.3ms ± 0%    19.3ms ± 0%    ~        (p=0.063 n=10+10)
JSONDecode-4             61.0ms ± 0%    61.2ms ± 0%    +0.33%   (p=0.000 n=10+8)
Mandelbrot200-4          5.20ms ± 0%    5.20ms ± 0%    ~        (p=0.475 n=10+7)
GoParse-4                3.95ms ± 1%    3.97ms ± 1%    +0.65%   (p=0.003 n=9+9)
RegexpMatchEasy0_32-4    88.4ns ± 0%    88.7ns ± 0%    +0.34%   (p=0.001 n=10+9)
RegexpMatchEasy0_1K-4    1.14µs ± 0%    1.14µs ± 0%    ~        (p=0.369 n=9+6)
RegexpMatchEasy1_32-4    82.6ns ± 0%    82.0ns ± 0%    -0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    469ns ± 0%     463ns ± 0%     -1.23%   (p=0.000 n=6+9)
RegexpMatchMedium_32-4   138ns ± 1%     136ns ± 0%     -1.38%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4   43.6µs ± 1%    42.0µs ± 0%    -3.74%   (p=0.000 n=9+9)
RegexpMatchHard_32-4     2.25µs ± 1%    2.23µs ± 0%    -0.57%   (p=0.000 n=8+8)
RegexpMatchHard_1K-4     68.8µs ± 0%    68.6µs ± 0%    -0.37%   (p=0.000 n=8+8)
Revcomp-4                477ms ± 1%     472ms ± 0%     -1.03%   (p=0.000 n=8+8)
Template-4               76.1ms ± 0%    76.4ms ± 0%    +0.35%   (p=0.000 n=9+9)
TimeParse-4              367ns ± 0%     366ns ± 0%     -0.16%   (p=0.003 n=10+8)
TimeFormat-4             386ns ± 0%     384ns ± 0%     -0.58%   (p=0.000 n=9+9)

name                     old speed      new speed      delta
GobDecode-4              86.4MB/s ± 1%  87.1MB/s ± 0%  +0.81%   (p=0.003 n=10+10)
GobEncode-4              96.7MB/s ± 1%  97.4MB/s ± 0%  +0.66%   (p=0.007 n=9+10)
Gzip-4                   71.4MB/s ± 1%  70.0MB/s ± 0%  -1.91%   (p=0.000 n=10+9)
Gunzip-4                 409MB/s ± 0%   410MB/s ± 0%   ~        (p=0.703 n=9+10)
JSONEncode-4             101MB/s ± 0%   100MB/s ± 0%   ~        (p=0.084 n=10+10)
JSONDecode-4             31.8MB/s ± 0%  31.7MB/s ± 0%  -0.33%   (p=0.000 n=10+8)
GoParse-4                14.7MB/s ± 1%  14.6MB/s ± 1%  -0.67%   (p=0.002 n=9+9)
RegexpMatchEasy0_32-4    362MB/s ± 0%   361MB/s ± 0%   -0.36%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4    898MB/s ± 0%   898MB/s ± 0%   ~        (p=0.762 n=9+8)
RegexpMatchEasy1_32-4    387MB/s ± 0%   390MB/s ± 0%   +0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    2.18GB/s ± 0%  2.21GB/s ± 0%  +1.20%   (p=0.000 n=9+9)
RegexpMatchMedium_32-4   7.23MB/s ± 1%  7.32MB/s ± 0%  +1.19%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4   23.5MB/s ± 1%  24.4MB/s ± 0%  +3.88%   (p=0.000 n=9+9)
RegexpMatchHard_32-4     14.2MB/s ± 1%  14.3MB/s ± 0%  +0.58%   (p=0.000 n=8+8)
RegexpMatchHard_1K-4     14.9MB/s ± 0%  14.9MB/s ± 0%  +0.34%   (p=0.000 n=8+7)
Revcomp-4                533MB/s ± 1%   539MB/s ± 0%   +1.04%   (p=0.000 n=8+8)
Template-4               25.5MB/s ± 0%  25.4MB/s ± 0%  -0.36%   (p=0.000 n=9+9)

ARM

The major improvement that landed recently in the development branch is the conversion of the remaining architecture backends to use the compiler’s SSA form. This has brought a substantial improvement in generated code for non-Intel architectures, like ARM [3].

name                     old time/op    new time/op     delta
BinaryTree17-4           33.8s ± 1%     27.7s ± 0%      -18.06%  (p=0.000 n=10+10)
Fannkuch11-4             42.0s ± 0%     19.3s ± 0%      -54.10%  (p=0.000 n=10+10)
FmtFprintfEmpty-4        670ns ± 1%     581ns ± 1%      -13.30%  (p=0.000 n=10+10)
FmtFprintfString-4       2.04µs ± 1%    1.65µs ± 0%     -19.09%  (p=0.000 n=10+10)
FmtFprintfInt-4          1.71µs ± 0%    1.21µs ± 0%     -29.39%  (p=0.000 n=10+9)
FmtFprintfIntInt-4       2.69µs ± 1%    1.94µs ± 0%     -27.77%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4  2.70µs ± 0%    1.85µs ± 0%     -31.41%  (p=0.000 n=10+9)
FmtFprintfFloat-4        5.15µs ± 0%    3.65µs ± 0%     -29.01%  (p=0.000 n=9+10)
FmtManyArgs-4            11.3µs ± 0%    8.5µs ± 0%      -24.79%  (p=0.000 n=10+9)
GobDecode-4              112ms ± 0%     77ms ± 1%       -31.04%  (p=0.000 n=9+9)
GobEncode-4              88.5ms ± 1%    77.2ms ± 1%     -12.78%  (p=0.000 n=10+10)
Gzip-4                   4.79s ± 0%     3.34s ± 0%      -30.18%  (p=0.000 n=9+9)
Gunzip-4                 702ms ± 0%     463ms ± 0%      -34.05%  (p=0.000 n=10+10)
HTTPClientServer-4       645µs ± 3%     571µs ± 3%      -11.45%  (p=0.000 n=10+10)
JSONEncode-4             227ms ± 0%     186ms ± 0%      -18.16%  (p=0.000 n=10+10)
JSONDecode-4             845ms ± 0%     618ms ± 0%      -26.81%  (p=0.000 n=10+10)
Mandelbrot200-4          59.3ms ± 0%    40.0ms ± 0%     -32.47%  (p=0.000 n=10+10)
GoParse-4                45.0ms ± 0%    37.0ms ± 0%     -17.68%  (p=0.000 n=9+9)
RegexpMatchEasy0_32-4    974ns ± 0%     878ns ± 0%      -9.81%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4    4.60µs ± 0%    4.48µs ± 0%     -2.57%   (p=0.000 n=10+10)
RegexpMatchEasy1_32-4    1.02µs ± 0%    0.94µs ± 0%     -8.08%   (p=0.000 n=8+10)
RegexpMatchEasy1_1K-4    6.92µs ± 0%    6.08µs ± 0%     -12.10%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4   1.61µs ± 0%    1.27µs ± 0%     -20.98%  (p=0.000 n=9+6)
RegexpMatchMedium_1K-4   447µs ± 0%     317µs ± 0%      -29.05%  (p=0.000 n=10+9)
RegexpMatchHard_32-4     24.9µs ± 0%    18.4µs ± 0%     -25.89%  (p=0.000 n=10+10)
RegexpMatchHard_1K-4     740µs ± 0%     552µs ± 0%      -25.36%  (p=0.000 n=10+10)
Revcomp-4                81.0ms ± 1%    65.2ms ± 0%     -19.53%  (p=0.000 n=9+9)
Template-4               1.17s ± 0%     0.81s ± 0%      -31.28%  (p=0.000 n=9+9)
TimeParse-4              5.52µs ± 0%    3.79µs ± 0%     -31.42%  (p=0.000 n=10+9)
TimeFormat-4             10.6µs ± 0%    8.5µs ± 0%      -19.14%  (p=0.000 n=10+10)

name                     old speed      new speed       delta
GobDecode-4              6.86MB/s ± 0%  9.95MB/s ± 1%   +45.00%  (p=0.000 n=9+9)
GobEncode-4              8.67MB/s ± 1%  9.94MB/s ± 1%   +14.69%  (p=0.000 n=10+10)
Gzip-4                   4.05MB/s ± 0%  5.81MB/s ± 0%   +43.32%  (p=0.000 n=10+9)
Gunzip-4                 27.6MB/s ± 0%  41.9MB/s ± 0%   +51.63%  (p=0.000 n=10+10)
JSONEncode-4             8.53MB/s ± 0%  10.43MB/s ± 0%  +22.20%  (p=0.000 n=10+10)
JSONDecode-4             2.30MB/s ± 0%  3.14MB/s ± 0%   +36.39%  (p=0.000 n=9+10)
GoParse-4                1.29MB/s ± 0%  1.56MB/s ± 0%   +20.93%  (p=0.000 n=9+10)
RegexpMatchEasy0_32-4    32.8MB/s ± 0%  36.4MB/s ± 0%   +10.87%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-4    222MB/s ± 0%   228MB/s ± 0%    +2.64%   (p=0.000 n=10+10)
RegexpMatchEasy1_32-4    31.3MB/s ± 0%  34.0MB/s ± 0%   +8.75%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    148MB/s ± 0%   168MB/s ± 0%    +13.76%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4   620kB/s ± 0%   790kB/s ± 0%    +27.42%  (p=0.000 n=10+8)
RegexpMatchMedium_1K-4   2.29MB/s ± 0%  3.23MB/s ± 0%   +41.05%  (p=0.000 n=10+10)
RegexpMatchHard_32-4     1.29MB/s ± 0%  1.74MB/s ± 0%   +34.88%  (p=0.000 n=9+10)
RegexpMatchHard_1K-4     1.38MB/s ± 0%  1.85MB/s ± 0%   +34.06%  (p=0.000 n=10+10)
Revcomp-4                31.4MB/s ± 1%  39.0MB/s ± 0%   +24.26%  (p=0.000 n=9+9)
Template-4               1.65MB/s ± 0%  2.41MB/s ± 0%   +45.71%  (p=0.000 n=10+9)
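
A note on reading the two halves of the ARM results: the speed deltas look larger than the time/op deltas for the same benchmark because throughput is the reciprocal of time. The sketch below is my own illustration (the helper name speedDelta is hypothetical, not part of any Go tool) of how a time/op delta maps to an approximate speed delta. The reported figures can differ slightly from this arithmetic, since the tables are computed from raw samples rather than from the rounded percentages.

```go
package main

import "fmt"

// speedDelta converts a relative change in time/op (for example -0.3405
// for -34.05%) into the corresponding relative change in throughput.
// Throughput is proportional to 1/time, so the new/old speed ratio is
// 1/(1+timeDelta). Hypothetical helper, for illustration only.
func speedDelta(timeDelta float64) float64 {
	return 1/(1+timeDelta) - 1
}

func main() {
	// time/op deltas copied from the ARM results above.
	rows := []struct {
		name      string
		timeDelta float64
	}{
		{"Gunzip", -0.3405},
		{"GobDecode", -0.3104},
		{"Gzip", -0.3018},
	}
	for _, r := range rows {
		fmt.Printf("%-10s time/op %+.2f%%  speed approx. %+.2f%%\n",
			r.name, r.timeDelta*100, speedDelta(r.timeDelta)*100)
	}
}
```

Put another way, cutting the time per operation by about a third raises throughput by about half, which is why Gunzip shows roughly -34% in the time column and roughly +52% in the speed column.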

Notes: