February 04, 2019 at 06:21 Tags Go

Recently I've gotten into answering Go questions on StackOverflow, and one pattern I noticed is the many repetitive questions about JSON processing. The goal of this post is to collect a "cookbook" of JSON processing code and examples; think of it as a vastly expanded version of the JSON gobyexample page. It's a living document - I will update it once in a while when I find new patterns/problems people ask about.

The code samples here should be reasonably self-contained; if you want actual code, full go run-able files are available here.

Some background on JSON

The acronym JSON stands for JavaScript Object Notation. It originally came into existence as a serialization format for JS, and can actually be considered a subset of JS. That said, it's no longer considered good style to pass JSON to JS's eval(); in fact, some valid JSON is not valid JS. Newer editions of the ECMAScript standard provide JSON.stringify and JSON.parse for serialization and deserialization, respectively.

These days JSON is a very popular language-independent serialization format, with simple syntax that's described here. In brief, JSON has values that can be either strings, numbers, null, true/false boolean constants, arrays or objects. Arrays are linear, ordered collections of values; in Go they are mapped to slices. Objects are unordered sets of key/value pairs; in Go they are mapped to maps. JSON object keys are strings, and values are arbitrary JSON values. Note that this is a recursive definition - objects can hold other objects, or lists, which themselves can hold other objects, etc. This should be very familiar to programmers coming from dynamic languages like Python, Perl or JavaScript, or from the Lisp family where such nested data structures are common and idiomatic.

Marshaling of basic Go data types

Go uses the term marshaling to refer to the kind of serialization done by converting Go data structures to JSON. Therefore, the json package's main convenience functions are called json.Marshal and json.Unmarshal. These functions are generic, in the sense that they work with interface{} values, and they have the proper runtime logic to figure out which actual type is being (un)marshaled. Here is an example that uses the basic types:

    boolS, _ := json.Marshal(true)
    fmt.Println(string(boolS))
    var b bool
    if err := json.Unmarshal(boolS, &b); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled bool:", b)

    intS, _ := json.Marshal(42)
    fmt.Println(string(intS))
    var i int
    if err := json.Unmarshal(intS, &i); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled int:", i)

    floatS, _ := json.Marshal(3.14159)
    fmt.Println(string(floatS))
    var f float64
    if err := json.Unmarshal(floatS, &f); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled float64:", f)

    stringS, _ := json.Marshal("golang")
    fmt.Println(string(stringS))
    var s string
    if err := json.Unmarshal(stringS, &s); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled string:", s)

This will print:

    true
    unmarshaled bool: true
    42
    unmarshaled int: 42
    3.14159
    unmarshaled float64: 3.14159
    "golang"
    unmarshaled string: golang

Note that while JSON doesn't distinguish between integers and floating-point numbers, Go lets us do this by having the static type information on the objects passed into json.Marshal or json.Unmarshal. For more details on numbers see the "JSON numbers - ints or floats?" section.
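The samples in this section discard the error returned by json.Marshal for brevity. As a minimal sketch of why that error exists, the following shows json.Marshal rejecting a value with no JSON representation; the helper name marshalErr is made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// marshalErr (hypothetical helper) reports the error, if any, that
// json.Marshal returns for v.
func marshalErr(v interface{}) error {
	_, err := json.Marshal(v)
	return err
}

func main() {
	// Channels have no JSON representation, so this yields an error.
	fmt.Println(marshalErr(make(chan int)))
	// Infinity cannot be written as a JSON number, so this fails as well.
	fmt.Println(marshalErr(math.Inf(1)))
	// Ordinary values marshal without error.
	fmt.Println(marshalErr(42))
}
```

In code that marshals values it doesn't fully control, checking this error is usually worth the extra lines.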

Null and nil

Another basic type supported by JSON is null, which is just a way of saying "nothing to see here". In Go, nil pointers are marshaled to null:

    nilS, _ := json.Marshal(nil)
    fmt.Println(string(nilS))

Prints null. Demonstrating unmarshaling of null is a little bit trickier, because it's not clear what to pass to json.Unmarshal. Passing it any uninitialized pointer (nil) results in an error. A trick that works is:

    var p interface{}
    if err := json.Unmarshal(nilS, &p); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled null:", p)

Prints:

    unmarshaled null: <nil>
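Besides an empty interface, a JSON null can also be decoded into a pointer variable, as long as json.Unmarshal is given the address of that variable. The sketch below illustrates this; decodeIntPtr is a hypothetical helper, not something from the json package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeIntPtr (hypothetical helper) unmarshals data into a *int that
// initially points at a non-zero value, and returns the resulting pointer.
func decodeIntPtr(data []byte) (*int, error) {
	n := 7
	p := &n
	// We pass the address of the pointer variable itself, so Unmarshal
	// is able to reset p to nil when it sees a JSON null.
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	return p, nil
}

func main() {
	p, err := decodeIntPtr([]byte("null"))
	if err != nil {
		panic(err)
	}
	fmt.Println("pointer is nil after decoding null:", p == nil)

	p, err = decodeIntPtr([]byte("42"))
	if err != nil {
		panic(err)
	}
	fmt.Println("pointer target after decoding 42:", *p)
}
```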

(Un)marshaling with generic interface{} values

As was briefly mentioned above, the definitions of json.Marshal and json.Unmarshal are generic, in the sense that they use an interface{} for the Go object which is to be encoded or decoded into:

    func Marshal(v interface{}) ([]byte, error)
    func Unmarshal(data []byte, v interface{}) error

The samples above showcased how special logic inside these functions handles basic types that are known at compile time. We can do similar things using interface{} directly. While not overly useful for these basic types, this knowledge will come in handy when we're discussing collection types (JSON representations of slices and maps) later on.

We saw earlier that to encode a boolean into JSON we can do:

    boolS, _ := json.Marshal(true)

Alternatively, we can do:

    var ii interface{}
    ii = true
    boolS, _ := json.Marshal(ii)
    fmt.Println(string(boolS))

As far as json.Marshal is concerned, these two code samples are equivalent because the function really accepts an interface{}. It becomes a bit more interesting when unmarshaling. Where we previously did:

    var b bool
    if err := json.Unmarshal(boolS, &b); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled bool:", b)

We can also do:

    var ib interface{}
    if err := json.Unmarshal(boolS, &ib); err != nil {
        panic(err)
    }

But now what do we do with ib? Its static type is interface{} rather than bool, so we can't treat it as a bool directly. Since we already know it should contain a bool, we can do a type assertion:

    b := ib.(bool)
    fmt.Println("unmarshaled bool:", b)

If we don't know what type to expect here exactly, we'll likely need a type switch:

    switch v := ib.(type) {
    case bool:
        fmt.Println("it's a bool:", v)
    case float64:
        fmt.Println("it's a float:", v)
    // other possible types enumerated...
    default:
        panic("can't figure out the type")
    }
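The "other possible types" in such a type switch form a small, fixed set when decoding into an interface{}: nil, bool, float64, string, []interface{} and map[string]interface{}. Here is a short sketch that enumerates them; describe and decodeGeneric are illustrative names chosen for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe (hypothetical helper) names the Go type that generic decoding
// produced for a JSON value.
func describe(v interface{}) string {
	switch v.(type) {
	case nil:
		return "nil"
	case bool:
		return "bool"
	case float64:
		return "float64"
	case string:
		return "string"
	case []interface{}:
		return "[]interface{}"
	case map[string]interface{}:
		return "map[string]interface{}"
	default:
		return "unexpected"
	}
}

// decodeGeneric (hypothetical helper) unmarshals data into an empty interface.
func decodeGeneric(data string) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal([]byte(data), &v)
	return v, err
}

func main() {
	samples := []string{`null`, `true`, `1.5`, `"go"`, `[1, "a"]`, `{"k": null}`}
	for _, s := range samples {
		v, err := decodeGeneric(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-12s -> %s\n", s, describe(v))
	}
}
```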

JSON numbers - ints or floats?

One common issue with decoding JSON is distinguishing between different numeric types. Unlike Go, which has separate types for integers and floats, JavaScript only has floats; this fact is reflected in JSON. The JSON standard doesn't acknowledge the distinction between the two, and treats both as "numbers", though it's clear from the specification that the more general type (floating point) is inferred.

As long as we know the type of field to decode statically, everything will work fine. As the very first example in this post demonstrates, when we pass a pointer to int to Unmarshal, it will know to parse properly into an integer. But what happens when we don't know the type at compile time? When using generic interface{} decoding, floating point is always chosen. Consider this:

    func main() {
        s := []byte("1234")
        var inum interface{}
        if err := json.Unmarshal(s, &inum); err != nil {
            panic(err)
        }
        switch v := inum.(type) {
        case int:
            fmt.Println("it's an int:", v)
        case float64:
            fmt.Println("it's a float:", v)
        // other possible types enumerated...
        default:
            panic("can't figure out the type")
        }
    }

It will print:

    it's a float: 1234

This is not a bug; it's the logical thing for Unmarshal to do, given that it doesn't know what type to expect. 1234 looks like an integer, but it might as well be a float with the decimal point omitted. Unmarshal has to decode the most general type based on the JSON specification.

If this is a real issue, one way to work around it is to use the alternative json.Decoder API for unmarshaling. This API is slightly different from json.Unmarshal; it's designed to parse JSON streams, which could result from reading over HTTP, for example. Here's the same code using Decoder:

    d := json.NewDecoder(bytes.NewReader(s))
    var ii interface{}
    if err := d.Decode(&ii); err != nil {
        panic(err)
    }
    switch v := ii.(type) {
    case int:
        fmt.Println("it's an int:", v)
    case float64:
        fmt.Println("it's a float:", v)
    // other possible types enumerated...
    default:
        panic("can't figure out the type")
    }

It gives a similar result to using Unmarshal. However, here's the twist. Decoder has an option to not parse numbers into concrete types. Instead, a number can be left unparsed as a json.Number, which is just a string used to represent number literals. This is accomplished by calling Decoder.UseNumber(), as follows:

    d := json.NewDecoder(bytes.NewReader(s))
    var ii interface{}
    d.UseNumber() // <-- UseNumber
    if err := d.Decode(&ii); err != nil {
        panic(err)
    }
    switch v := ii.(type) {
    case int:
        fmt.Println("it's an int:", v)
    case float64:
        fmt.Println("it's a float:", v)
    case json.Number:
        fmt.Println("it's a string:", v)
    // other possible types enumerated...
    default:
        panic("can't figure out the type")
    }

Now this will print:

    it's a string: 1234

And we're free to parse the string as we wish, for example with strconv.Atoi.

You may think this is unnecessary - can't we just convert the float64 read by Unmarshal into an int? Things are not so simple, however. Floating point numbers have limited representation accuracy, and for big integers we may get wrong results. We might even want to marshal arbitrary precision integers (big.Int), and these also have to be properly parsed to not lose precision.
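To make the precision concern concrete, here is a small sketch that decodes the integer 9007199254740993 (2^53 + 1, which a float64 cannot represent exactly) in both ways. The helper names viaFloat and viaNumber are invented for this example; json.Number's Int64 method is part of the json package.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// viaFloat (hypothetical helper) decodes a JSON number generically, which
// yields a float64, and then converts it to int64.
func viaFloat(data []byte) (int64, error) {
	var v interface{}
	if err := json.Unmarshal(data, &v); err != nil {
		return 0, err
	}
	f, ok := v.(float64)
	if !ok {
		return 0, fmt.Errorf("not a number: %T", v)
	}
	return int64(f), nil
}

// viaNumber (hypothetical helper) decodes the same bytes with UseNumber and
// parses the untouched literal as an int64.
func viaNumber(data []byte) (int64, error) {
	d := json.NewDecoder(bytes.NewReader(data))
	d.UseNumber()
	var v interface{}
	if err := d.Decode(&v); err != nil {
		return 0, err
	}
	n, ok := v.(json.Number)
	if !ok {
		return 0, fmt.Errorf("not a number: %T", v)
	}
	return n.Int64()
}

func main() {
	data := []byte("9007199254740993") // 2^53 + 1
	a, err := viaFloat(data)
	if err != nil {
		panic(err)
	}
	b, err := viaNumber(data)
	if err != nil {
		panic(err)
	}
	fmt.Println("via float64:    ", a)
	fmt.Println("via json.Number:", b)
}
```

The float64 route silently rounds to a neighboring value, while the json.Number route preserves the literal.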

JSON arrays and Go slices

Go slices are encoded to JSON arrays, and vice-versa:

    sS, _ := json.Marshal([]string{"broccoli", "almonds", "banana"})
    fmt.Println(string(sS))
    var s []string
    if err := json.Unmarshal(sS, &s); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled []string:", s)

Prints:

    ["broccoli","almonds","banana"]
    unmarshaled []string: [broccoli almonds banana]

When we unmarshaled the JSON bytes above, we used a []string since we knew all the elements of the JSON array are strings. But what happens when JSON array elements have different types? While in Go the types of all slice elements have to be uniform, the same is not true in JSON (due to its JavaScript roots). Let's try this:

    variedEncodedSlice := []byte(`["broccoli", true]`)
    var s2 []string
    if err := json.Unmarshal(variedEncodedSlice, &s2); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled []string:", s2)

This code panics:

    panic: json: cannot unmarshal bool into Go value of type string

The error message makes sense: we gave json.Unmarshal a []string to unmarshal into, but the JSON bytes contain a bool. We can't unmarshal into a []bool, for a similar reason. So what is there to do?

This is where generic JSON comes in again. If we don't know - ahead of time - the types of elements contained in a JSON array we're unmarshaling, we have to fall back to generic interface{}s and type switches:

    var iis []interface{}
    if err := json.Unmarshal(variedEncodedSlice, &iis); err != nil {
        panic(err)
    }
    fmt.Println("unmarshalled slice of length:", len(iis))
    for i, e := range iis {
        fmt.Printf("decoding element %d\n", i)
        switch v := e.(type) {
        case bool:
            fmt.Println(" it's a bool:", v)
        case string:
            fmt.Println(" it's a string:", v)
        // other possible types enumerated...
        default:
            panic("can't figure out the type")
        }
    }

This outputs:

    unmarshalled slice of length: 2
    decoding element 0
     it's a string: broccoli
    decoding element 1
     it's a bool: true

JSON objects and Go maps

Go maps are encoded to JSON objects, and vice-versa:

    mS, _ := json.Marshal(map[string]bool{"almonds": false, "cashews": true})
    fmt.Println(string(mS))
    var m map[string]bool
    if err := json.Unmarshal(mS, &m); err != nil {
        panic(err)
    }
    fmt.Println("unmarshaled map[string]bool:", m)

Prints:

    {"almonds":false,"cashews":true}
    unmarshaled map[string]bool: map[almonds:false cashews:true]

Similarly to the case of slices, this works well as long as the types of JSON elements are known ahead of time. If these types are not known, or values can have one of several types, we'll need to use generic capabilities, by unmarshaling into an interface{} and following up with type assertions or switches.

We've seen how to do these in the slice example, so let's play with something slightly different now. JSON objects are sometimes nested, and we don't even know at compile-time what their level of nesting is. Consider this sample JSON:

    {
      "foo": true,
      "bar": false,
      "baz": {
        "next": true,
        "prev": {
          "fizz": true,
          "buzz": false
        },
        "top": false
      }
    }

Say we want to find the key "fizz" in it, and to see what it maps to. How can we do that?

Let's think this through. First, it's obvious that we'll have to use interface{} unmarshaling, because the values of keys in each object can have different types (some are booleans, some are nested objects). Second, since JSON is a tree structure, a recursive approach is natural. Here's a function that will do that:

    // findNested looks for a key named s in map m. If values in m map to other
    // maps, findNested looks into them recursively. Returns true if found, and
    // the value found.
    func findNested(m map[string]interface{}, s string) (bool, interface{}) {
        // Try to find key s at this level
        for k, v := range m {
            if k == s {
                return true, v
            }
        }
        // Not found on this level, so try to find it nested
        for _, v := range m {
            nm, ok := v.(map[string]interface{})
            if ok {
                found, val := findNested(nm, s)
                if found {
                    return found, val
                }
            }
        }
        // Not found recursively
        return false, nil
    }

And here's how we can use it:

    bb := []byte(`
    {
      "foo": true,
      "bar": false,
      "baz": {
        "next": true,
        "prev": {
          "fizz": true,
          "buzz": false
        },
        "top": false
      }
    }`)

    var ii interface{}
    if err := json.Unmarshal(bb, &ii); err != nil {
        panic(err)
    }
    mi := ii.(map[string]interface{})

    ok, fizzVal := findNested(mi, "fizz")
    if ok {
        fmt.Println("found fizz! value is", fizzVal)
    }

Golang structs as JSON objects

Nested slices and maps are great, but in Go it's idiomatic to assign more semantics to structured data with structs. Go's json package supports marshaling structs into JSON objects and vice versa. Here's a simple example:

    type Food struct {
        Id             int
        Name           string
        FatPerServ     float64
        ProteinPerServ float64
        CarbPerServ    float64
    }

    func main() {
        f := Food{200403, "Broccoli", 0.3, 2.5, 3.5}
        fS, _ := json.MarshalIndent(f, "", " ")
        fmt.Println(string(fS))

        var fD Food
        if err := json.Unmarshal(fS, &fD); err != nil {
            panic(err)
        }
        fmt.Println("unmarshaled Food:", fD)
    }

It will print:

    {
     "Id": 200403,
     "Name": "Broccoli",
     "FatPerServ": 0.3,
     "ProteinPerServ": 2.5,
     "CarbPerServ": 3.5
    }
    unmarshaled Food: {200403 Broccoli 0.3 2.5 3.5}

Here we're using the MarshalIndent function that indents JSON output for easier visual scanning. You'll notice that the JSON objects have their key names set from the struct field names automatically. This is nice for low-effort dumping, but isn't always satisfactory in real applications. We often don't control the format of the JSON data we're consuming, so we may get something like:

    {
     "id": 200403,
     "name": "Broccoli",
     "fat_per_serv": 0.3,
     "protein_per_serv": 2.5,
     "carb_per_serv": 3.5
    }

If we call json.Unmarshal for data like this, it will expect a struct with similarly named fields; but these names aren't idiomatic in Go, and moreover they start with a lowercase letter, so the fields would be unexported and the json package would ignore them. We have a problem - we either sacrifice the style of our Go code, or have to enforce schemas on JSON we don't necessarily control.

The solution is to use custom field tags, which is an esoteric feature of Go that was designed specifically for such use cases. Here's a complete example:

    type Food struct {
        Id             int     `json:"id"` // <--- Field tags
        Name           string  `json:"name"`
        FatPerServ     float64 `json:"fat_per_serv"`
        ProteinPerServ float64 `json:"protein_per_serv"`
        CarbPerServ    float64 `json:"carb_per_serv"`
    }

    func main() {
        f := Food{200403, "Broccoli", 0.3, 2.5, 3.5}
        fS, _ := json.MarshalIndent(f, "", " ")
        fmt.Println(string(fS))

        var fD Food
        if err := json.Unmarshal(fS, &fD); err != nil {
            panic(err)
        }
        fmt.Println("unmarshaled Food:", fD)
    }

It prints:

    {
     "id": 200403,
     "name": "Broccoli",
     "fat_per_serv": 0.3,
     "protein_per_serv": 2.5,
     "carb_per_serv": 3.5
    }
    unmarshaled Food: {200403 Broccoli 0.3 2.5 3.5}

Field tags let us map between the Go internal view of the struct's fields and its external materialization as a JSON object.

These techniques work just as well with nested structs. Check out these code samples: sample 1, sample 2.
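As a rough illustration of field tags on nested structs (this is not one of the linked samples; the types Nutrition and FoodItem and the helper functions are made up here), a struct-valued field simply becomes a nested JSON object:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Nutrition and FoodItem are hypothetical types for this sketch.
type Nutrition struct {
	Fat     float64 `json:"fat"`
	Protein float64 `json:"protein"`
}

type FoodItem struct {
	Id        int       `json:"id"`
	Name      string    `json:"name"`
	Nutrition Nutrition `json:"nutrition"` // encoded as a nested object
}

// encodeItem marshals a FoodItem to a compact JSON string.
func encodeItem(item FoodItem) (string, error) {
	b, err := json.Marshal(item)
	return string(b), err
}

// decodeItem unmarshals a JSON string into a FoodItem.
func decodeItem(s string) (FoodItem, error) {
	var item FoodItem
	err := json.Unmarshal([]byte(s), &item)
	return item, err
}

func main() {
	s, err := encodeItem(FoodItem{200403, "Broccoli", Nutrition{0.3, 2.5}})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)

	item, err := decodeItem(s)
	if err != nil {
		panic(err)
	}
	fmt.Println("unmarshaled FoodItem:", item)
}
```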

Partial encoding and decoding of structs

It's common for JSON data to omit some fields that are then assumed to not exist or take on their default values. Think about passing many options, where the full list of options is so large that a lot of time and bandwidth would be wasted to transfer them all fully; usually we only want to modify a small number of options for every given call.

The json package supports this with partial unmarshaling. Here's an example using the untagged Food struct shown above:

    func main() {
        bb := []byte(`
        {
          "Name": "Broccoli",
          "FatPerServ": 0.3,
          "ProteinPerServ": 2.5,
          "CarbPerServ": 3.5
        }`)

        var fD Food
        if err := json.Unmarshal(bb, &fD); err != nil {
            panic(err)
        }
        fmt.Println("unmarshaled Food:", fD)
    }

Note that the JSON string doesn't have the Id field populated. The result will be:

    unmarshaled Food: {0 Broccoli 0.3 2.5 3.5}

The unmarshaling is successful, and fD.Id is left at the default value for its type (0 for numbers, empty string for strings, etc). The opposite situation, where the JSON contains keys that have no matching struct field, is also silently accepted by default; that behavior can be turned into an error via the Decoder.DisallowUnknownFields method when using the json.Decoder API.

For a similar effect during marshaling, we can use the special "omitempty" field tag; it tells the json package to not emit a struct field if it has the default value for its type. Here's an example:

    type Food struct {
        Id             int     `json:"id,omitempty"`
        Name           string  `json:"name"`
        FatPerServ     float64 `json:"fat_per_serv"`
        ProteinPerServ float64 `json:"protein_per_serv"`
        CarbPerServ    float64 `json:"carb_per_serv"`
    }

    func main() {
        f := Food{0, "Broccoli", 0.3, 2.5, 3.5}
        fS, _ := json.MarshalIndent(f, "", " ")
        fmt.Println(string(fS))
    }

This prints:

    {
     "name": "Broccoli",
     "fat_per_serv": 0.3,
     "protein_per_serv": 2.5,
     "carb_per_serv": 3.5
    }

Note how the id field is left out of the JSON, because it was given the empty value 0.

We can also tell the encoder to omit certain fields. For example, we may have a struct with a field that should be kept private to the application and not sent over the wire. Even when the field has a non-empty value, we want it out of the serialized JSON string. We can do this with the json:"-" field tag:

    type Account struct {
        Name     string
        Password string `json:"-"`
        Balance  float64
    }

    func main() {
        joe := Account{Name: "Joe", Password: "123456", Balance: 102.4}
        s, _ := json.Marshal(joe)
        fmt.Println(string(s))
    }

This prints the account without its password:

    {"Name":"Joe","Balance":102.4}
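Here is a brief sketch of what Decoder.DisallowUnknownFields does in practice: keys missing from the JSON are still fine, but keys that match no struct field become an error. The Item type and the strictDecode helper are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Item is a hypothetical struct used only for this illustration.
type Item struct {
	Name  string
	Price float64
}

// strictDecode (hypothetical helper) decodes s into an Item, rejecting JSON
// keys that do not correspond to any field of Item.
func strictDecode(s string) (Item, error) {
	var it Item
	dec := json.NewDecoder(strings.NewReader(s))
	dec.DisallowUnknownFields()
	err := dec.Decode(&it)
	return it, err
}

func main() {
	// A missing key is still accepted: Price keeps its zero value.
	it, err := strictDecode(`{"Name": "Broccoli"}`)
	fmt.Println(it, err)

	// A key with no matching field is now reported as an error.
	_, err = strictDecode(`{"Name": "Broccoli", "Colour": "green"}`)
	fmt.Println(err)
}
```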

Delayed parsing with json.RawMessage

Sometimes the data you need to parse is not on the top level of the JSON string, and/or you'd like to ignore a lot of the JSON contents, focusing just on the piece you need to parse. Consider this JSON string:

    {
      "event": {"name": "joe", "url": "event://101"},
      "otherstuff": 15.2,
      "anotherstuff": 100
    }

And suppose we're interested only in the event key, as we already have a structure to parse it into:

    type Event struct {
        Name string `json:"name"`
        Url  string `json:"url"`
    }

How do we do that? The json package relies on static typing quite a bit, unless we go full generic with interface{}. But in that case, we may need to convert large maps into large structs manually, which is undesirable.

The solution is json.RawMessage, which exists for this purpose. It tells the json package to not parse some parts of the input and leave them as raw bytes (a json.RawMessage is a []byte), which we can then parse again later. Here's a complete solution to the issue discussed above:

    type Event struct {
        Name string `json:"name"`
        Url  string `json:"url"`
    }

    func main() {
        bb := []byte(`
        {
          "event": {"name": "joe", "url": "event://101"},
          "otherstuff": 15.2,
          "anotherstuff": 100
        }`)

        var m map[string]json.RawMessage
        if err := json.Unmarshal(bb, &m); err != nil {
            panic(err)
        }

        if eventRaw, ok := m["event"]; ok {
            var event Event
            if err := json.Unmarshal(eventRaw, &event); err != nil {
                panic(err)
            }
            fmt.Println("Parsed Event:", event)
        } else {
            fmt.Println("Can't find 'event' key in JSON")
        }
    }
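json.RawMessage can also be used as the type of a struct field, which is handy when one key tells us how to interpret another. The following is only a sketch under assumed names: the Envelope and Metric types, the "kind"/"payload" keys and the decodeEnvelope helper are all invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Event struct {
	Name string `json:"name"`
	Url  string `json:"url"`
}

// Metric and Envelope are hypothetical types for this sketch.
type Metric struct {
	Value float64 `json:"value"`
}

type Envelope struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"` // left unparsed on the first pass
}

// decodeEnvelope (hypothetical helper) parses the envelope first, then
// decodes the raw payload into the type selected by the "kind" key.
func decodeEnvelope(data []byte) (interface{}, error) {
	var env Envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, err
	}
	switch env.Kind {
	case "event":
		var e Event
		err := json.Unmarshal(env.Payload, &e)
		return e, err
	case "metric":
		var m Metric
		err := json.Unmarshal(env.Payload, &m)
		return m, err
	default:
		return nil, fmt.Errorf("unknown kind %q", env.Kind)
	}
}

func main() {
	v, err := decodeEnvelope([]byte(
		`{"kind": "event", "payload": {"name": "joe", "url": "event://101"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %v\n", v, v)
}
```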

JSON and pointer/reference types

The json package has special handling for pointer and reference types. Consider this sample structure:

    type NamePtr struct {
        Id   int
        Name *string
    }

We can marshal it as follows:

    name := "Sam"
    np := NamePtr{101, &name}
    npS, _ := json.Marshal(np)
    fmt.Println(string(npS))

This will print:

    {"Id":101,"Name":"Sam"}

This works because json.Marshal does the right thing here - it "sees through" the pointer to string and emits the string itself as the "Name" field. It works in reverse as well:

    var npD NamePtr
    if err := json.Unmarshal(npS, &npD); err != nil {
        panic(err)
    }
    fmt.Println(npD.Id, *npD.Name)

Note that when we create npD, its Name field is initialized with the default value - a nil for pointers. json.Unmarshal allocates an actual value and sets the pointer to its address when unmarshaling. If Name is not present in the JSON string being decoded, the pointer will be left as nil.

The same applies to other reference types, like slices:

    type BoolAndVals struct {
        Fresh bool
        Vals  []float64
    }

    func main() {
        bb := []byte(`
        {
          "Fresh": true,
          "Vals": [1.2, 3.24, 18.99]
        }`)

        var bvD BoolAndVals
        if err := json.Unmarshal(bb, &bvD); err != nil {
            panic(err)
        }
        fmt.Println(bvD)
    }

This will print:

    {true [1.2 3.24 18.99]}

When we declare the variable bvD, its Vals field is an unallocated slice, but json.Unmarshal will allocate it for us if the Vals field is present in the decoded JSON object.

This behavior is very useful for being able to multiplex several struct types in a single container, implementing a sum type. Here's a complete example:

    type RequestBodyFoo struct {
        Name    string
        Balance float64
    }

    type RequestBodyBar struct {
        Id  int
        Ref int
    }

    type Request struct {
        Foo *RequestBodyFoo
        Bar *RequestBodyBar
    }

    func (r *Request) Show() {
        if r.Foo != nil {
            fmt.Println("Request has Foo:", *r.Foo)
        }
        if r.Bar != nil {
            fmt.Println("Request has Bar:", *r.Bar)
        }
    }

    func main() {
        bb := []byte(`
        {
          "Foo": {"Name": "joe", "balance": 4591.25}
        }
        `)

        var req1 Request
        if err := json.Unmarshal(bb, &req1); err != nil {
            panic(err)
        }
        req1.Show()

        bb = []byte(`
        {
          "Bar": {"Id": 128992, "Ref": 801472}
        }
        `)

        var req2 Request
        if err := json.Unmarshal(bb, &req2); err != nil {
            panic(err)
        }
        req2.Show()
    }

This prints:

    Request has Foo: {joe 4591.25}
    Request has Bar: {128992 801472}
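The fact that a pointer stays nil when its key is absent can also be used to tell "not provided" apart from "provided as zero". A minimal sketch, assuming a made-up Settings type and retriesOrDefault helper:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Settings is a hypothetical options struct; Retries is a pointer so that
// an absent key can be distinguished from an explicit 0.
type Settings struct {
	Retries *int `json:"retries"`
}

// retriesOrDefault (hypothetical helper) returns the decoded retries value,
// or def when the key is absent (or null) in the JSON.
func retriesOrDefault(data []byte, def int) (int, error) {
	var s Settings
	if err := json.Unmarshal(data, &s); err != nil {
		return 0, err
	}
	if s.Retries == nil {
		return def, nil
	}
	return *s.Retries, nil
}

func main() {
	for _, in := range []string{`{}`, `{"retries": 0}`, `{"retries": 5}`} {
		n, err := retriesOrDefault([]byte(in), 3)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-16s -> retries=%d\n", in, n)
	}
}
```

With a plain int field, the first two inputs would be indistinguishable after decoding.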

Parsing JSON streams with Decoder

We've briefly seen json.Decoder before because it has some extended functionality that json.Unmarshal lacks. Now let's see how to use it more idiomatically, for parsing JSON streams.

json.Unmarshal is a convenience function. It takes a full byte slice and parses its contents. But JSON data often arrives from some streaming medium like a socket, and it's occasionally useful to parse it in a more fine-grained manner. For example, the stream may contain a sequence of JSON values and we want to parse each value as soon as it arrives - json.Unmarshal would require us to consume the whole stream before parsing it.

Here's an example that accomplishes this. It showcases using the More and Token methods for fine-grained parsing of a stream:

    func main() {
        const s = `
        [
          {"almonds": false},
          {"cashews": true},
          {"walnuts": false}
        ]`
        dec := json.NewDecoder(strings.NewReader(s))

        t, err := dec.Token()
        if err != nil {
            panic(err)
        }
        if t != json.Delim('[') {
            panic("Expected '[' delimiter")
        }

        for dec.More() {
            var m map[string]bool
            err := dec.Decode(&m)
            if err != nil {
                panic(err)
            }
            fmt.Println("decoded", m)
        }

        t, err = dec.Token()
        if err != nil {
            panic(err)
        }
        if t != json.Delim(']') {
            panic("Expected ']' delimiter")
        }
    }

Since json.NewDecoder accepts any io.Reader, it's very versatile. Many Go types implement the io.Reader interface, so decoders can be hooked up to sockets, files or even something like a reader that decompresses data using the zip package.
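When the stream is not wrapped in an enclosing array but is simply one JSON value after another (for example newline-delimited JSON), calling Decode in a loop until it returns io.EOF is enough. The following sketch assumes such an input; decodeStream and decodeString are illustrative helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// decodeStream (hypothetical helper) reads consecutive JSON objects from r
// until the stream ends.
func decodeStream(r io.Reader) ([]map[string]bool, error) {
	dec := json.NewDecoder(r)
	var out []map[string]bool
	for {
		var m map[string]bool
		err := dec.Decode(&m)
		if err == io.EOF {
			// Clean end of stream: no more values.
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, m)
	}
}

// decodeString is a convenience wrapper for in-memory input.
func decodeString(s string) ([]map[string]bool, error) {
	return decodeStream(strings.NewReader(s))
}

func main() {
	ms, err := decodeString("{\"almonds\": false}\n{\"cashews\": true}\n{\"walnuts\": false}\n")
	if err != nil {
		panic(err)
	}
	for _, m := range ms {
		fmt.Println("decoded", m)
	}
}
```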

Encoding JSON streams with Encoder

So far we've done marshaling with the json.Marshal function, which takes an object and produces a slice of bytes. The json package also offers a lower-level, more flexible API called Encoder, which can encode objects into a stream, without having to materialize a temporary []byte buffer.

In Go, streams can be conveniently hooked together using interfaces provided by the io package, such as io.Writer. Here's a complete example that emits objects in JSON while also encoding them with base64. Follow the comments in the code:

    package main

    import (
        "encoding/base64"
        "encoding/json"
        "os"
    )

    func main() {
        // Create an io.Writer that encodes bytes written into it as base64 and emits
        // them to stdout.
        base64writer := base64.NewEncoder(base64.StdEncoding, os.Stdout)

        // Create a new JSON encoder, hooking up its output to base64writer.
        je := json.NewEncoder(base64writer)
        je.Encode(map[string]float64{"foo": 4.12, "pi": 3.14159})

        // Flush any partially encoded blocks left in the base64 encoder.
        base64writer.Close()
    }
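One Encoder can write several values to the same stream; each call to Encode emits one JSON value followed by a newline. Below is a small sketch that also checks the error returned by Encode; encodeAll and encodeToString are hypothetical helpers written for this example.

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"strings"
)

// encodeAll (hypothetical helper) writes each value to w as one line of JSON.
func encodeAll(w io.Writer, values ...interface{}) error {
	enc := json.NewEncoder(w)
	for _, v := range values {
		// Encode appends a newline after every value it writes.
		if err := enc.Encode(v); err != nil {
			return err
		}
	}
	return nil
}

// encodeToString collects the encoder output in memory, which is mostly
// useful for inspecting it.
func encodeToString(values ...interface{}) (string, error) {
	var sb strings.Builder
	err := encodeAll(&sb, values...)
	return sb.String(), err
}

func main() {
	err := encodeAll(os.Stdout, map[string]float64{"pi": 3.14159}, []int{1, 2})
	if err != nil {
		panic(err)
	}
}
```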