Bad Go: frivolous Sprintf

Sprintf is not always fast

This is the 3rd in a series of posts about Bad Go - a clickbaity title for posts about Go that I’ve found frustrating because it could be just that little bit better. Better, to my mind, often means more performant, with less impact on GC, without being more complex or harder to read.

The first two posts are about slices of pointers and pointer returns from functions.

This one is about reaching for fmt.Sprintf to convert types to strings. Things like fmt.Sprintf("%d", aNumber) and fmt.Sprintf("%t", aBoolean). Or, even worse, tag := "working:" + fmt.Sprintf("%t", isWorking).

Is this bad Go? Not really. But it annoys me beyond reason. I should really just chill out and get over it. Before I go and do that, let’s go over why it annoys me in excruciating detail.

The reason I don’t like this is not just that it is inefficient. It is needlessly inefficient. The more efficient versions are not more difficult to write or to understand.

Why is it inefficient?

fmt.Sprintf needs to parse its first argument to understand what to do. But in these examples the programmer knows there’s just a single, very simple task required: convert this number to a string, or this boolean to a string. Why not just do that directly? fmt.Sprintf also needs to marshal its variadic arguments into a slice of interface{}. The compiler has improved a lot here, but this can still cause unnecessary allocations. There are simpler functions for the same task that don’t need variadic or interface{} arguments.

But this isn’t all bashing fmt.Sprintf. fmt.Sprintf is great. But when you use it, use it: build the whole string. Don’t build tiny pieces and concatenate them. Concatenating strings causes allocations, and allocations are the bane of the life of anyone trying to squeeze performance out of Go.

What should we do instead? Well, for that first case, fmt.Sprintf("%d", aNumber), we can just replace it with strconv.FormatInt. Let’s write a quick benchmark to compare converting a number with fmt.Sprintf and with strconv.FormatInt.

func BenchmarkSprintfNumber(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = fmt.Sprintf("%d", i)
	}
}

func BenchmarkSprintfStrconvNumber(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = strconv.FormatInt(int64(i), 10)
	}
}

Next we run the benchmarks and feed the results into benchstat.

go test -run ^$ -bench BenchmarkSprintf -count 8 | tee sprint1.txt
benchstat sprint1.txt

name                      time/op
SprintfNumber-8           133ns ± 2%
SprintfStrconvNumber-8    45.1ns ± 2%

name                      alloc/op
SprintfNumber-8           32.0B ± 0%
SprintfStrconvNumber-8    23.0B ± 0%

name                      allocs/op
SprintfNumber-8           2.00 ± 0%
SprintfStrconvNumber-8    0.00

So fmt.Sprintf takes quite a bit longer and allocates more memory. I struggle to believe that the strconv.FormatInt case allocates no memory at all. I think it’s doing just under 1 allocation per operation: it has a lookup table of strings for 0-99, so it doesn’t allocate in those cases (and it’s an easy exercise for the reader to change the benchmark to use numbers of 100 or more to prove this).

The fmt.Sprintf case is already benefitting from a lot of compiler cleverness. The function takes variadic interface{} arguments, so in previous versions of the compiler there would be one allocation for the slice carrying the arguments and another to convert the integer argument to an interface{}. That doesn’t seem to be the case any more, and the difference between these two options used to be starker in earlier versions of Go.

What about fmt.Sprintf("%t", aBoolean)? Well, in this case the difference is pretty stark. fmt.Sprintf has all kinds of work to do, whereas strconv.FormatBool is an extremely simple function. We can see the difference in some simple benchmarks.
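Just how simple? strconv.FormatBool is essentially the following, sketched here as formatBool to avoid clashing with the real one: it returns one of two constant strings, so there is no formatting and no allocation at all.

```go
package main

import "fmt"

// formatBool mirrors what strconv.FormatBool does: pick one of two
// constant strings. No parsing, no allocation.
func formatBool(b bool) string {
	if b {
		return "true"
	}
	return "false"
}

func main() {
	fmt.Println(formatBool(true), formatBool(false)) // true false
}
```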

func BenchmarkBoolSprintf(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = fmt.Sprintf("%t", i&1 == 0)
	}
}

func BenchmarkBoolStrconv(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = strconv.FormatBool(i&1 == 0)
	}
}

Here are the results - strconv.FormatBool wins hands down.

name           time/op
BoolSprintf-8  85.9ns ± 7%
BoolStrconv-8  14.0ns ± 5%

name           alloc/op
BoolSprintf-8  24.0B ± 0%
BoolStrconv-8  16.0B ± 0%

name           allocs/op
BoolSprintf-8  1.00 ± 0%
BoolStrconv-8  0.00

Finally, why do I find tag := "working:" + fmt.Sprintf("%t", isWorking) so particularly annoying? Well, what if we wrote tag := fmt.Sprintf("working:%t", isWorking) instead? That’s using fmt.Sprintf as it is supposed to be used: to build moderately complex strings. Isn’t that just much nicer? And it doesn’t incur an extra allocation concatenating the two strings.

fmt.Sprintf used like that doesn’t make me angry. But for a really hot loop there’s a more efficient way. It’s a bool, so there are only two possible output strings. We could write the following.

var tag string
if isWorking {
	tag = "working:true"
} else {
	tag = "working:false"
}

We can compare these three approaches with some simple benchmarks.

func BenchmarkBoolTagSprintfAdd(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = "working:" + fmt.Sprintf("%t", i&1 == 0)
	}
}

func BenchmarkBoolTagSprintf(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		vals[i] = fmt.Sprintf("working:%t", i&1 == 0)
	}
}

func BenchmarkBoolTagIf(b *testing.B) {
	b.ReportAllocs()
	vals := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		isWorking := i&1 == 0
		if isWorking {
			vals[i] = "working:true"
		} else {
			vals[i] = "working:false"
		}
	}
}

And here are the results (again run 8 times and fed through benchstat).

name                 time/op
BoolTagSprintfAdd-8  133ns ± 3%
BoolTagSprintf-8     106ns ± 2%
BoolTagIf-8          13.4ns ± 5%

name                 alloc/op
BoolTagSprintfAdd-8  40.0B ± 0%
BoolTagSprintf-8     32.0B ± 0%
BoolTagIf-8          16.0B ± 0%

name                 allocs/op
BoolTagSprintfAdd-8  2.00 ± 0%
BoolTagSprintf-8     1.00 ± 0%
BoolTagIf-8          0.00

All these timings are in nanoseconds, so really none of this matters terribly much by itself. But all these inefficiencies eventually add up, so why not use the more efficient versions? They aren’t harder to read or to write. And if your code somehow ends up at the heart of a massive machine learning pipeline, it won’t cause nearly as much grief!