Let’s imagine that you are developing an Android app that communicates with your REST backend via Retrofit, which has already become an industry standard. You are also trying to stay on the bleeding edge of the technology stack, so you are using Kotlin and RxJava in your project. But there is one issue: your backend developer doesn’t care about his clients (meaning he doesn’t care about you), so he considers it OK to respond to a GET request with a 30 MB JSON. And it looks like you’ll have to deal with it.

The first solution, and I believe the best one, is to make your backend developer implement pagination, or maybe to ask him for access to the repo with the backend source code so you can implement it yourself. And of course we shouldn’t forget about GraphQL.

However, I wouldn’t be writing this if I didn’t know that it’s still possible to solve this issue on a mobile client without getting an OutOfMemoryError.

Let’s start! First I went to my GitHub account and created a new repository to hold my large JSON: https://github.com/thenixan-blogposts/json-streaming-data. GitHub will be our REST service: we will query one-large-file.json via a GET request, imagining that we are actually querying a server and that there is a developer responsible for such a huuuuge response.

Then I’ll open up IntelliJ IDEA. (You can use Android Studio, but for this example I’ve decided to build a plain JVM application that is launched from the CLI on my desktop; at least it is faster to debug and test.)

I start by adding the required libraries: Retrofit and RxJava.

build.gradle file of the project

Then we need to declare the classes for the items of the JSON we are requesting:

If you’re familiar with Retrofit, you’ll notice nothing special here except for the @SerializedName annotation on each field. Yep, I’m a perfectionist.
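The original listing is not reproduced here, so the following is only a minimal sketch of what such model classes could look like. The field names (`id`, `name`), the wrapper class name `Data`, and the top-level `data` key are assumptions made for illustration, not the actual schema of one-large-file.json.

```kotlin
import com.google.gson.Gson
import com.google.gson.annotations.SerializedName

// Hypothetical item shape: the real fields of one-large-file.json may differ.
data class DataItem(
    @SerializedName("id") val id: Long,
    @SerializedName("name") val name: String
)

// Hypothetical wrapper: assumes the array sits under a top-level "data" key.
class Data(
    @SerializedName("data") val data: Array<DataItem>
)

fun main() {
    // A tiny inline document stands in for the large file.
    val json = """{"data":[{"id":1,"name":"first"},{"id":2,"name":"second"}]}"""
    val parsed = Gson().fromJson(json, Data::class.java)
    println("Size is: ${parsed.data.size}")
}
```

With explicit @SerializedName annotations the Kotlin property names can be renamed later without breaking the mapping to the JSON keys.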

Benchmarking function: it measures the memory and time that the function f needs to run.
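A rough sketch of such a helper, using only the JVM standard library, might look like the code below. The names `benchmark` and `BenchmarkResult` are hypothetical, and the memory figure is only an approximation: `Runtime.gc()` is a hint, and the garbage collector may run at any point during the measurement.

```kotlin
// Hypothetical holder for the value produced by f plus the measurements.
data class BenchmarkResult<T>(val value: T, val elapsedMillis: Long, val usedBytes: Long)

fun <T> benchmark(f: () -> T): BenchmarkResult<T> {
    val runtime = Runtime.getRuntime()
    runtime.gc() // only a hint; the JVM is free to ignore it
    val memoryBefore = runtime.totalMemory() - runtime.freeMemory()
    val startNanos = System.nanoTime()

    val value = f()

    val elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000
    val memoryAfter = runtime.totalMemory() - runtime.freeMemory()
    // The difference can even be negative if a collection happened while f ran.
    return BenchmarkResult(value, elapsedMillis, memoryAfter - memoryBefore)
}

fun main() {
    // Stand-in workload: build a list of 30 000 strings.
    val result = benchmark { (1..30_000).map { it.toString() } }
    println("Size is: ${result.value.size}")
    println("Elapsed time: ${result.elapsedMillis}")
    println("Used memory: ${result.usedBytes}")
}
```

Because the result of f is still referenced when the second memory reading is taken, the difference roughly reflects what has to stay in memory after parsing.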

And the service itself — still nothing special.
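For readers without the original listing, here is a minimal sketch of a Retrofit service with the RxJava 2 call adapter and the Gson converter. The names `DataService`, `loadData` and `createService`, the model fields, and the base URL passed by the caller are all assumptions; only the file name one-large-file.json comes from the text above.

```kotlin
import com.google.gson.annotations.SerializedName
import io.reactivex.Single
import retrofit2.Retrofit
import retrofit2.adapter.rxjava2.RxJava2CallAdapterFactory
import retrofit2.converter.gson.GsonConverterFactory
import retrofit2.http.GET

// Hypothetical models, repeated so the sketch stands on its own.
data class DataItem(@SerializedName("id") val id: Long, @SerializedName("name") val name: String)
class Data(@SerializedName("data") val data: Array<DataItem>)

interface DataService {
    // Relative path, resolved against the base URL given to Retrofit.
    @GET("one-large-file.json")
    fun loadData(): Single<Data>
}

// baseUrl is supplied by the caller and must end with a slash.
fun createService(baseUrl: String): DataService =
    Retrofit.Builder()
        .baseUrl(baseUrl)
        .addConverterFactory(GsonConverterFactory.create())
        .addCallAdapterFactory(RxJava2CallAdapterFactory.create())
        .build()
        .create(DataService::class.java)
```

Calling `createService(...).loadData()` only builds the Single; the request itself is made when something subscribes to it, for example through `blockingGet()` in a CLI application.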

Let’s test it. We get the following results:

Common usage of retrofit

Size is: 30000

Elapsed time: 1813

Used memory: 2829232

For a 22 MB JSON file containing an array of 30 000 items, we need almost two seconds to load and parse it, and almost 3 MB to hold it all in memory.

Let’s optimize it!

The whole idea of the optimization is not to convert our JSON into an Array. Instead, we can make Retrofit return an Observable<DataItem> that emits every item of the array on the fly, without holding everything in memory.

So our call would be something like this:

To do this we must provide a TypeAdapter that knows how to handle Observables. We can simply supply a suitable one via the @JsonAdapter annotation:

And a reader:
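The annotated field and its adapter could be sketched roughly as follows. This is not the original listing: the class names `StreamedData` and `ObservableItemsAdapter`, the model fields, and the choice of a `ReplaySubject` as the backing subject are assumptions made for illustration.

```kotlin
import com.google.gson.Gson
import com.google.gson.TypeAdapter
import com.google.gson.annotations.JsonAdapter
import com.google.gson.annotations.SerializedName
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonWriter
import io.reactivex.Observable
import io.reactivex.subjects.ReplaySubject

// Hypothetical item shape.
data class DataItem(@SerializedName("id") val id: Long, @SerializedName("name") val name: String)

// Hypothetical wrapper: the "data" array is exposed as an Observable instead of an Array.
class StreamedData(
    @SerializedName("data")
    @JsonAdapter(ObservableItemsAdapter::class)
    val data: Observable<DataItem>
)

class ObservableItemsAdapter : TypeAdapter<Observable<DataItem>>() {
    private val gson = Gson()

    override fun write(out: JsonWriter, value: Observable<DataItem>?) {
        throw UnsupportedOperationException("Serialization is not needed in this sketch")
    }

    override fun read(reader: JsonReader): Observable<DataItem> {
        // Assumption: a ReplaySubject buffers items until someone subscribes.
        val subject = ReplaySubject.create<DataItem>()
        reader.beginArray()
        while (reader.hasNext()) {
            // Parse one element at a time straight from the token stream.
            subject.onNext(gson.fromJson<DataItem>(reader, DataItem::class.java))
        }
        reader.endArray()
        subject.onComplete()
        return subject
    }
}

fun main() {
    val json = """{"data":[{"id":1,"name":"first"},{"id":2,"name":"second"}]}"""
    val parsed = Gson().fromJson(json, StreamedData::class.java)
    val items = parsed.data.toList().blockingGet()
    println("Size is: ${items.size}")
}
```

In this sketch the adapter reads the array element by element, but the subject still buffers every parsed item until a subscriber arrives.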

Size is: 30000

Elapsed time: 988

Used memory: 118656

We’ve dropped from almost 3 MB to about 120 KB and nearly halved the elapsed time. To access all the items of the array, we simply call flatMapObservable { it.data } on the Single returned by Retrofit.

But there’s one thing I still don’t like about this implementation: we’re actually still holding all the array items in the Subject, which fills up with items during parsing until we subscribe to it. Ideally I want the stream to be read and the objects emitted by the underlying Observable at the same time as they are consumed in the subscribe call. For instance, we could start inserting items into the database as soon as we get the first one, while the tail of the array is still being parsed.

For a few days I tried to come up with a pretty solution for this task, but all I came up with in the end was:

Note that service.loadDataWithoutAnyParsingAtAll() returns a Single<ResponseBody> and all the parsing happens in the flatMapObservable block.
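The parsing that happens inside flatMapObservable can be sketched with Gson’s streaming JsonReader, as below. This is an illustration rather than the original code: the function name `streamItems`, the model fields, and the assumption that the array lives under a top-level `data` key are all hypothetical. The function takes a plain `java.io.Reader` so it can be tried without a network; with Retrofit, `ResponseBody.charStream()` provides such a reader.

```kotlin
import com.google.gson.Gson
import com.google.gson.stream.JsonReader
import io.reactivex.Observable
import java.io.Reader
import java.io.StringReader

// Hypothetical item shape.
data class DataItem(val id: Long, val name: String)

// Emits each array element as soon as it is parsed; nothing is buffered here.
fun streamItems(source: Reader, arrayField: String = "data"): Observable<DataItem> =
    Observable.create<DataItem> { emitter ->
        val gson = Gson()
        try {
            JsonReader(source).use { reader ->
                reader.beginObject()
                while (reader.hasNext() && !emitter.isDisposed) {
                    if (reader.nextName() == arrayField) {
                        reader.beginArray()
                        while (reader.hasNext() && !emitter.isDisposed) {
                            emitter.onNext(gson.fromJson<DataItem>(reader, DataItem::class.java))
                        }
                        // Only close the array if we were not interrupted mid-way.
                        if (!emitter.isDisposed) reader.endArray()
                    } else {
                        reader.skipValue() // ignore fields we do not care about
                    }
                }
            }
            emitter.onComplete()
        } catch (e: Exception) {
            emitter.tryOnError(e)
        }
    }

fun main() {
    val json = """{"total":2,"data":[{"id":1,"name":"first"},{"id":2,"name":"second"}]}"""
    streamItems(StringReader(json)).subscribe { item -> println(item) }
}
```

With Retrofit this would be wired up along the lines of `service.loadDataWithoutAnyParsingAtAll().flatMapObservable { body -> streamItems(body.charStream()) }`, so each item reaches the subscriber while the rest of the response is still being read.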

It turned out that the last solution was not the fastest one, but it consumed only 15 KB of memory.

Conclusion

To sum up, you’ve got three ways to implement parsing of large JSONs:

Parse it to the Array: pretty much straightforward and easy to read, but the slowest one, and it consumes a lot of memory.

Provide a custom TypeAdapter to make Retrofit able to parse arrays to Observables: the fastest one, but it still requires memory to run.

Make Retrofit return a ResponseBody and then flatMap it to an Observable with the items of the array using GSON streaming parsing: not the fastest, but memory usage is dramatically low.

Here’s the link with the buildable project that contains all examples of big JSON parsing.