While working on an API that was built specifically for mobile clients, I ran into an interesting problem that I couldn’t believe I hadn’t found before. When working on a REST API that deals exclusively in JSON payloads, how do you upload images? Which naturally leads onto the next question, should a JSON API then begin accepting multipart form data? Is that not going to look weird that for every endpoint, we accept JSON payloads, but then for this one we accept a multipart form? Because we will want to be uploading metadata with the image, we are going to have to read out formdata values too. That seems so 2000’s! So let’s see what we can do.

While some of the examples within this post are going to be using .NET Core, I think this really applies to any language that is being used to build a RESTful API. So even if you aren’t a C# guru, read on!

Initial Thoughts

My initial thoughts sort of boiled down to a couple of points.

The API I was working on had been hardcoded in a couple of areas to really be forcing the whole JSON payload thing. Adding in the ability to accept formdata/multipart forms would be a little bit of work (and regression testing).

We had custom JSON serializers for things like decimal rounding that would somehow manually need to be done for form data endpoints if required. We are even using snake_case as property names in the API (dang iOS developers!), which would have to be done differently in the form data post.

And finally, is there any way to just serialize what would have been sent under a multi-part form post, and include it in a JSON payload?

What Else Is Out There?

It really became clear that I didn’t know what I was doing. So like any good developer, I looked to copy. So I took a look at the public API’s of the three social media giants.

Twitter

API Doc : https://developer.twitter.com/en/docs/media/upload-media/api-reference/post-media-upload

Twitter has two different ways to upload files. The first is sort of a “chunked” way, which I assume is because you can upload some pretty large videos these days. And a more simple way for just uploading general images, let’s focus on the latter.

It’s a multi-part form, but returns JSON. Boo.

The very very interesting part about the API however, is that it allows uploading the actual data in two ways. Either you can upload the raw binary data as you typically would in a multipart form post, or you could actually serialise the file as a Base64 encoded string, and send that as a parameter.

Base64 encoding a file was interesting to me because theoretically (And we we will see later, definitely), we can send this string data any way we like. I would say that of all the C# SDKs I looked at, I couldn’t find any actually using this Base64 method, so there weren’t any great examples to go off.

Another interesting point about this API is that you are uploading “media”, and then at a later date attaching that to an actual object (For example a tweet). So if you wanted to tweet out an image, it seems like you would (correct me if I’m wrong) upload an image, get the ID returned, and then create a tweet object that references that media ID. For my use case, I certainly didn’t want to do a two step process like this.

LinkedIn

API Doc : https://developer.linkedin.com/docs/guide/v2/shares/rich-media-shares#upload

LinkedIn was interesting because it’s a pure JSON API. All data POSTs contain JSON payloads, similar to the API I was creating. Wouldn’t you guess it, they use a multipart form data too!

Similar to Twitter, they also have this concept of uploading the file first, and attaching it to where you actually want it to end up second. And I totally get that, it’s just not what I want to do.

Facebook

API Doc : https://developers.facebook.com/docs/graph-api/photo-uploads

Facebook uses a Graph API. So while I wanted to take a look at how they did things, so much of their API is not really relevant in a RESTful world. They do use multi-part forms to upload data, but it’s kinda hard to say how or why that is the case,. Also at this point, I couldn’t get my mind off how Twitter did things!

So Where Does That Leave Us?

Well, in a weird way I think I got what I expected, That multipart forms were well and truly alive. It didn’t seem like there was any great innovation in this area. In some cases, the use of multipart forms didn’t look so brutal because they didn’t need to upload metadata at the same time. Therefore simply sending a file with no attached data didn’t look so out of place in a JSON API. However, I did want to send metadata in the same payload as the image, not have it as a two step process.

Twitter’s use of Base64 encoding intrigued me. It seemed like a pretty good option for sending data across the wire irrespective of how you were formatting the payload. You could send a Base64 string as JSON, XML or Form Data and it would all be handled the same. It’s definitely proof of concept time!

Base64 JSON API POC

What we want to do is just test that we can upload images as a Base64 string, and we don’t have any major issues within a super simple scenario. Note that these examples are in C# .NET Core, but again, if you are using any other language it should be fairly simple to translate these.

First, we need our upload JSON Model. In C# it would be :

public class UploadCustomerImageModel { public string Description { get; set; } public string ImageData { get; set; } }

Not a whole lot to it. Just a description field that can be freetext for a user to describe the image they are upload, and an imagedata field that will hold our Base64 string.

For our controller :

[HttpPost("{customerId}/images")] public FileContentResult UploadCustomerImage(int customerId, [FromBody] UploadCustomerImageModel model) { //Depending on if you want the byte array or a memory stream, you can use the below. var imageDataByteArray = Convert.FromBase64String(model.ImageData); //When creating a stream, you need to reset the position, without it you will see that you always write files with a 0 byte length. var imageDataStream = new MemoryStream(imageDataByteArray); imageDataStream.Position = 0; //Go and do something with the actual data. //_customerImageService.Upload([...]) //For the purpose of the demo, we return a file so we can ensure it was uploaded correctly. //But otherwise you can just return a 204 etc. return File(imageDataByteArray, "image/png"); }

Again, fairly damn simple. We take in the model, then C# has a great way to convert that string into a byte array, or to read it into a memory stream. Also note that as we are just building a proof of concept, I echo out the image data to make sure that it’s been received, read, and output like I expect it would, but not a whole lot else.

Now let’s open up postman, our JSON payload is going to look a bit like :

{ "description" : "Test upload of an image", "imageData" : "/9j[...]" }

I’ve obviously truncated imagedata down here, but a super simple tool to turn an image into a Base64 is something like this website here. I would also note that when you send your payload, it should be without the data:image/jpeg;base64, prefix that you sometimes see with online tools that convert images to strings.

Hit send in Postman and :

Great! So my image upload worked and the picture of my cat was echo’d back to me! At this point I was actually kinda surprised that it could be that easy.

Something that became very evident while doing this though, was that the payload size was much larger than the original image. In my case, the image itself is 109KB, but the Base64 version was 149KB. So about 136% of the original image. In having a quick search around, it seems expected that a Base64 version of a file would be about 33% bigger than the original. When it comes to larger files, I think less about sending 33% more across the wire, but more the fact of reading the file into memory, then converting it into a huge string, and then writing that out… It could cause a few issues. But for a few basic images, I’m comfortable with a 33% increase.

I will also note that there was a few code snippets around for using BSON or Protobuf to do the same thing, and may actually cut down the payload size substantially. The mentality would be the same, a JSON payload with a “stringify’d” file.

Cleaning Up Our Code Using JSON Converters

One thing that I didn’t like in our POC was that we are using a string that almost certainly will be converted to a byte array every single time. The great thing about using a JSON library such as JSON.net in C#, is that how the client sees the model and how our backend code sees the model doesn’t necessarily have to be the exact same. So let’s see if we can turn that string into a byte array on an API POST automagically.

First we need to create a “Custom JSON Converter” class. That code looks like :

public class Base64FileJsonConverter : JsonConverter { public override bool CanConvert(Type objectType) { return objectType == typeof(string); } public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer) { return Convert.FromBase64String(reader.Value as string); } //Because we are never writing out as Base64, we don't need this. public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) { throw new NotImplementedException(); } }

Fairly simple, all we are doing is taking a value and converting it from a string into a byte array. Also note that we are only worried about reading JSON payloads here, we don’t care about writing as we never write out our image as Base64 (yet).

Next, we had back to our model and we apply the custom JSON Converter.

public class UploadCustomerImageModel { public string Description { get; set; } [JsonConverter(typeof(Base64FileJsonConverter))] public byte[] ImageData { get; set; } }

Note we also change the “type” of our ImageData field to a byte array rather than a string. So even though our postman test will still send a string, by the time it actually gets to us, it will be a byte array.

We will also need to modify our Controller code too :

[HttpPost("{customerId}/images")] public FileContentResult UploadCustomerImage(int customerId, [FromBody] UploadCustomerImageModel model) { //Depending on if you want the byte array or a memory stream, you can use the below. //THIS IS NO LONGER NEEDED AS OUR MODEL NOW HAS A BYTE ARRAY //var imageDataByteArray = Convert.FromBase64String(model.ImageData); //When creating a stream, you need to reset the position, without it you will see that you always write files with a 0 byte length. var imageDataStream = new MemoryStream(model.ImageData); imageDataStream.Position = 0; //Go and do something with the actual data. //_customerImageService.Upload([...]) //For the purpose of the demo, we return a file so we can ensure it was uploaded correctly. //But otherwise you can just return a 204 etc. return File(model.ImageData, "image/png"); }

So it becomes even simpler. We no longer need to bother handling the Base64 encoded string anymore, the JSON converter will handle it for us.

And that’s it! Sending the exact same payload will still work and we have one less piece of plumbing to do if we decide to add more endpoints to accept file uploads. Now you are probably thinking “Yeah but if I add in a new endpoint with a model, I still need to remember to add the JsonConverter attribute”, which is true. But at the same time, it means if in the future you decide to swap to BSON instead of Base64, you aren’t going to have to go to a tonne of places and work out how you are handling the incoming strings, it’s all in one handy place.