Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.12.

Questions about the performance of the encoding/json package are a recurring topic, and multiple libraries such as easyjson, jsoniter, or ffjson try to address it. But is it really slow? Has it improved?

Evolution of the package

Let’s first look at the performance evolution of the package. I made a small makefile with a benchmark file in order to run it against all versions of Go:

type JSON struct {
	Foo int
	Bar string
	Baz float64
}

func BenchmarkJsonMarshall(b *testing.B) {
	j := JSON{
		Foo: 123,
		Bar: `benchmark`,
		Baz: 123.456,
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = json.Marshal(&j)
	}
}



func BenchmarkJsonUnmarshal(b *testing.B) {
	bytes := `{"foo": 1, "bar": "my string", "baz": 1.123}`
	str := []byte(bytes)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		j := JSON{}
		_ = json.Unmarshal(str, &j)
	}
}

The makefile creates a folder for each version of Go, creates a container based on its Docker image, and runs the benchmark. The results are compared in two ways:

each version vs. the latest version of Go 1.12

each version vs. the next version

The first comparison shows the evolution from a specific version to the latest one, while the second shows which release brought the biggest improvements to the encoding/json package.

Here are the most significant results:

from 1.2.0 to 1.3.0, the time per operation dropped by ~28-35%:

name           old time/op    new time/op    delta
JsonMarshall   1.91µs ± 2%    1.37µs ± 2%    -28.23%
JsonUnmarshal  2.70µs ± 2%    1.75µs ± 3%    -35.18%

from 1.6.0 to 1.7.0, the time per operation dropped by ~28-40%:

name             old time/op    new time/op    delta
JsonMarshall-4   1.24µs ± 1%    0.90µs ± 2%    -27.65%
JsonUnmarshal-4  1.52µs ± 3%    0.91µs ± 2%    -40.05%

from 1.10.0 to 1.11.0, memory allocations dropped by ~26-62%:

name             old alloc/op   new alloc/op   delta
JsonMarshall-4   208B ± 0%      80B ± 0%       -61.54%
JsonUnmarshal-4  496B ± 0%      368B ± 0%      -25.81%

from 1.11.0 to 1.12.0, the time per operation dropped by ~7-15%:

name             old time/op    new time/op    delta
JsonMarshall-4   670ns ± 6%     569ns ± 2%     -15.09%
JsonUnmarshal-4  800ns ± 1%     747ns ± 1%     -6.58%

The full report is available on GitHub for Marshall and Unmarshall.

If we compare 1.2.0 with 1.12.0, performance has improved significantly:

name           old time/op    new time/op    delta
JsonMarshall   1.72µs ± 2%    0.52µs ± 2%    -69.68%
JsonUnmarshal  2.72µs ± 2%    0.85µs ± 5%    -68.70%

name           old alloc/op   new alloc/op   delta
JsonMarshall   188B ± 0%      48B ± 0%       -74.47%
JsonUnmarshal  519B ± 0%      368B ± 0%      -29.09%

The benchmark was done with a simple struct. The deltas could be different with other values to encode/decode, such as a map, an array, or a bigger struct.

Dive into the code

The best way to understand why the package is perceived as slow is to dive into the code. Here is the flow of the Marshal method in Go 1.12:

Marshal operation

Now that we know the flow, let’s compare the code of versions 1.10 and 1.12, since we have seen there was a huge improvement in memory usage during the Marshal process. The first modification we see is related to the first step of the flow, when the encoder is retrieved from the cache:

A sync.Pool has been added here in order to share encoders and reduce the number of allocations. The method newEncodeState() already existed in 1.10 but was not used. To confirm the impact, we can just backport this piece of code into Go 1.10 and check the new result:

name           old alloc/op   new alloc/op   delta
CodeMarshal-4  4.59MB ± 0%    1.98MB ± 0%    -56.92%

To run this benchmark against the Go repository, go to the folder of the library and run:

go test encoding/json -bench=BenchmarkCodeMarshal -benchmem -count=10 -run=^$

As we can see, the impact of the sync package is huge, and it should be considered in your own projects when you intensively allocate the same struct.
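The same pooling pattern is easy to reuse outside the standard library. Here is a minimal sketch (the buffer pool and the encode helper are illustrative, not part of encoding/json):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool plays the role of encoding/json's encodeState cache:
// buffers are reused across calls instead of being reallocated each time.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// encode borrows a buffer from the pool and returns it when done.
func encode(s string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // a pooled buffer may still hold data from a previous use
	buf.WriteString(`"` + s + `"`)
	return buf.String()
}

func main() {
	fmt.Println(encode("benchmark"))
}
```

Under heavy load, the pool amortizes allocations across goroutines; the Reset call is essential since a recycled buffer keeps its previous content.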

Regarding the Unmarshal method, here is its flow in Go 1.12:

Unmarshal operation

Each flow is pretty well optimized with a cache strategy (thanks to the sync package), and we can see that the reflection and the iteration over each field are the bottleneck of the package.

Alternatives and performances

There are many alternatives in the Go community. ffjson, one of them, generates static MarshalJSON and UnmarshalJSON functions that are called through a similar API: ffjson.Marshal and ffjson.Unmarshal. The generated methods look like this:

func (j *JSONFF) MarshalJSON() ([]byte, error) {
	var buf fflib.Buffer
	if j == nil {
		buf.WriteString("null")
		return buf.Bytes(), nil
	}
	err := j.MarshalJSONBuf(&buf)
	if err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// MarshalJSONBuf marshal buff to json - template
func (j *JSONFF) MarshalJSONBuf(buf fflib.EncodingBuffer) error {
	if j == nil {
		buf.WriteString("null")
		return nil
	}
	var err error
	var obj []byte
	_ = obj
	_ = err
	buf.WriteString(`{"Foo":`)
	fflib.FormatBits2(buf, uint64(j.Foo), 10, j.Foo < 0)
	buf.WriteString(`,"Bar":`)
	fflib.WriteJsonString(buf, string(j.Bar))
	buf.WriteString(`,"Baz":`)
	fflib.AppendFloat(buf, float64(j.Baz), 'g', -1, 64)
	buf.WriteByte('}')
	return nil
}

Let’s now compare the benchmark between the standard library and ffjson (with usage of ffjson.Pool()):

standard lib:

name             time/op
JsonMarshall-4   500ns ± 2%
JsonUnmarshal-4  677ns ± 2%

name             alloc/op
JsonMarshall-4   48.0B ± 0%
JsonUnmarshal-4  320B ± 0%

ffjson:

name               time/op
JsonMarshallFF-4   538ns ± 1%
JsonUnmarshalFF-4  827ns ± 3%

name               alloc/op
JsonMarshallFF-4   176B ± 0%
JsonUnmarshalFF-4  448B ± 0%

For both marshaling and unmarshaling, the native library turns out to be more efficient.

Regarding the higher memory usage, running the compiler with go run -gcflags="-m" shows that some variables are allocated on the heap:

:46:19: buf escapes to heap

:48:23: buf escapes to heap

:27:26: &buf escapes to heap

:22:6: moved to heap: buf
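The escape can be reproduced in a few lines: returning the buffer's backing bytes forces the buffer itself onto the heap. A contrived example (not ffjson's actual code):

```go
package main

import (
	"bytes"
	"fmt"
)

// marshal returns the buffer's backing array, so the compiler cannot
// keep buf on the stack: `go build -gcflags="-m"` reports
// "moved to heap: buf" for this function.
func marshal() []byte {
	var buf bytes.Buffer
	buf.WriteString(`{"Foo":123}`)
	return buf.Bytes()
}

func main() {
	fmt.Println(string(marshal()))
}
```

Each call therefore pays for a fresh heap allocation, which is what the alloc/op column reflects.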

Let’s have a look at another one: easyjson. It uses the same strategy. Here is the benchmark:

standard lib:

name             time/op
JsonMarshall-4   500ns ± 2%
JsonUnmarshal-4  677ns ± 2%

name             alloc/op
JsonMarshall-4   48.0B ± 0%
JsonUnmarshal-4  320B ± 0%

easyjson:

name               time/op
JsonMarshallEJ-4   349ns ± 1%
JsonUnmarshalEJ-4  341ns ± 5%

name               alloc/op
JsonMarshallEJ-4   240B ± 0%
JsonUnmarshalEJ-4  256B ± 0%

This time, easyjson is much faster: ~30% for marshaling and almost twice as fast for unmarshaling. Everything makes sense if we look at the easyjson.Marshal method provided by the library:

func Marshal(v Marshaler) ([]byte, error) {
	w := jwriter.Writer{}
	v.MarshalEasyJSON(&w)
	return w.BuildBytes()
}

The method MarshalEasyJSON is generated by the library in order to produce the JSON:

func easyjson42239ddeEncode(out *jwriter.Writer, in JSON) {
	out.RawByte('{')
	first := true
	_ = first
	{
		const prefix string = ",\"Foo\":"
		if first {
			first = false
			out.RawString(prefix[1:])
		} else {
			out.RawString(prefix)
		}
		out.Int(int(in.Foo))
	}
	{
		const prefix string = ",\"Bar\":"
		if first {
			first = false
			out.RawString(prefix[1:])
		} else {
			out.RawString(prefix)
		}
		out.String(string(in.Bar))
	}
	{
		const prefix string = ",\"Baz\":"
		if first {
			first = false
			out.RawString(prefix[1:])
		} else {
			out.RawString(prefix)
		}
		out.Float64(float64(in.Baz))
	}
	out.RawByte('}')
}

func (v JSON) MarshalEasyJSON(w *jwriter.Writer) {
	easyjson42239ddeEncode(w, v)
}

As we can see, there is no reflection at all; the flow is pretty straightforward. The library also provides compatibility with the native JSON library:

func (v JSON) MarshalJSON() ([]byte, error) {
	w := jwriter.Writer{}
	easyjson42239ddeEncodeGithubComMyCRMTeamEncodingJsonEasyjson(&w, v)
	return w.Buffer.BuildBytes(), w.Error
}

However, the performance here will be worse than the native library's, since the whole native flow is still applied and only this small part of the code runs during marshaling.
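To make the contrast with reflection concrete, here is a hand-written encoder in the spirit of the generated code, using only the standard library (a sketch; easyjson's jwriter does considerably more, such as error handling and streaming):

```go
package main

import (
	"fmt"
	"strconv"
)

type JSON struct {
	Foo int
	Bar string
	Baz float64
}

// appendJSON hard-codes the field names and types, just as generated
// code does: no reflection, only direct appends into a byte slice.
func appendJSON(buf []byte, j JSON) []byte {
	buf = append(buf, `{"Foo":`...)
	buf = strconv.AppendInt(buf, int64(j.Foo), 10)
	buf = append(buf, `,"Bar":`...)
	buf = strconv.AppendQuote(buf, j.Bar)
	buf = append(buf, `,"Baz":`...)
	buf = strconv.AppendFloat(buf, j.Baz, 'g', -1, 64)
	return append(buf, '}')
}

func main() {
	fmt.Println(string(appendJSON(nil, JSON{Foo: 123, Bar: "benchmark", Baz: 123.456})))
}
```

Since the field layout is fixed at generation time, the encoder is a straight line of buffer appends, which is why these libraries beat the reflection-based approach.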

Conclusion

Even though much effort has gone into the standard library, it can never be as fast as a library that skips reflection by generating the JSON encoding code ahead of time. The downsides are that you have to maintain this code generation and remain dependent on an external library.

Before making any decision about switching from the standard library, you should measure how JSON marshaling/unmarshaling impacts your application and whether a performance gain could drastically improve the performance of your whole application. If it represents only a small percentage, it is probably not worth it: the standard library is now efficient enough in most cases.