Update 07/04/2018: added more clarifications to highlight the purpose of this test and added more details in the conclusion.

Update 30/03/2018: the RxJava test has been updated to use Schedulers.computation() and all the tests for both RxJava and coroutines have been executed again. Thanks for all the comments that helped improve the comparison between the different implementations.

I was curious to evaluate Kotlin coroutines and RxJava in terms of performance, so I decided to create some simple tests. This post is mainly about Android, but it probably applies to other platforms where Kotlin and RxJava are used as well.

In a typical Android app we usually perform some operations repeatedly with RxJava, so I wanted to compare the speed, CPU and memory usage of the same operations implemented with both Kotlin coroutines and RxJava.

This is an initial performance test, not a full benchmark. Before adopting a new tool that is going to be heavily used throughout the code, it’s important to understand whether it will affect the general performance of the app. The very simple test presented here is only meant to give a feeling for whether adopting Kotlin coroutines instead of RxJava is going to have a positive or negative impact on app performance.

The main question that some people are asking themselves is: should I replace RxJava with Kotlin coroutines in Android?

Short answer: you should really consider replacing RxJava with Kotlin coroutines in most cases, especially in Android. RxJava might still be useful in a limited number of cases (which you might never actually encounter in a typical app unless you’re processing real streams), and in those cases you can mix both RxJava and coroutines anyway.

Simple reasons:

Kotlin coroutines provide much more flexibility than plain reactive programming (the code can be written in an easy to understand sequential way whenever reactive programming is not needed, but can still be written in a reactive programming style when needed by making use of the Kotlin operators on collections)

Kotlin provides a very rich set of operators on collections that will look similar to what you have with the RxJava operators (most of the time, in Android, we deal with collections instead of streams)

Kotlin coroutines can interact with RxJava when needed (this can be done following a simple pattern in many cases or as described in the guide to reactive streams with coroutines)

Kotlin coroutines are very lightweight and efficient (as you’ll see later in these simple tests, the amount of memory used by RxJava is generally higher compared to coroutines and this leads to a slower app given the higher CPU usage for the garbage collection of all the objects generated by RxJava; this also translates into higher battery consumption of course)

if you end up never using RxJava because everything can be done with Kotlin coroutines, then you can remove the dependency on the RxJava library (one less dependency for your app)
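
To make the point about collection operators more concrete, here is a minimal sketch that uses only the Kotlin standard library. The `User` type and the `activeUserNames` function are hypothetical names introduced for illustration; they are not part of the test app.

```kotlin
// Hypothetical model, introduced only for this illustration.
data class User(val name: String, val active: Boolean)

// A chain of standard-library collection operators. It reads much like an
// RxJava chain of filter/map/sorted operators, but it runs eagerly on a List.
fun activeUserNames(users: List<User>): List<String> =
    users
        .filter { it.active }   // keep only the active users
        .map { it.name }        // transform each remaining item
        .sorted()               // sort the final result alphabetically
```

Calling `activeUserNames` with a list of users returns the sorted names of the active ones, with no stream or subscription involved.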

The test case

This is my test setup:

Kotlin version: 1.2.31

Kotlin coroutines library version: 0.22.5

RxJava version: 2.1.11

RxAndroid version: 2.0.2

Test device: Samsung Galaxy S6 (Android 7.0, 3 GB RAM)

This is the sequence of steps that I’m going to test:

1. execute an operation asynchronously on a background thread

2. execute an operation on the Android main UI thread when the asynchronous one has finished

The sequence of steps is repeated many times.

The reason for these simple steps is that, in an Android app, we typically have to execute some operations on a background thread (e.g. for remote API calls) and then we need to handle the results in the main UI thread to interact with the UI.
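
The two steps can be sketched without coroutines or RxJava, using plain JDK executors. This is not the code used in the tests: the function name `runTwoStepSequence`, the pool size and the thread name `main-like` are assumptions made for illustration, and the single-threaded executor only stands in for the Android main UI thread.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors

// Repeats the two-step sequence `iterations` times and returns, for each
// iteration, the name of the thread that handled the result plus the result.
fun runTwoStepSequence(iterations: Int): List<String> {
    // Assumed pool size; stands in for the background threads.
    val background = Executors.newFixedThreadPool(4)
    // A single thread that stands in for the Android main UI thread.
    val mainLike = Executors.newSingleThreadExecutor { r -> Thread(r, "main-like") }
    val handled = CopyOnWriteArrayList<String>()
    val done = CountDownLatch(iterations)

    repeat(iterations) { i ->
        background.execute {
            val result = i * 2                      // step 1: background work
            mainLike.execute {                      // step 2: handle the result
                handled.add("${Thread.currentThread().name}:$result")
                done.countDown()
            }
        }
    }

    done.await()            // wait until every iteration has been handled
    background.shutdown()
    mainLike.shutdown()
    return handled
}
```

Every entry in the returned list starts with `main-like:`, which shows that step 2 always runs on the single stand-in thread, whichever pool thread ran step 1.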

Note that here I’m comparing a mature tool like RxJava with an earlier stage one like Kotlin coroutines. This means that the latter will probably still have room for improvement before reaching a mature stage.

The source code of these tests can be found on GitHub.

The code snippets of the test case

These are the code snippets of the same test case implemented with Kotlin coroutines and RxJava.

Here stubAsyncFunc() is just a function that performs a simple operation and is always executed on a background thread, while checkTestEnd() is a utility method that just checks if all the iterations of the test have completed.

In all implementations, stubAsyncFunc() is executed in parallel on multiple background threads.
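
The real implementations of these two helpers are in the GitHub repository and are not reproduced here. As a rough idea of what such helpers could look like, here is a hypothetical sketch: the bodies, the return types, the counter and the value of `TEST_ITERATIONS_COUNT` are all assumptions.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Assumed value; the post runs the test with 100, 1000 and 10000 iterations.
val TEST_ITERATIONS_COUNT = 1000

// Counts the completed iterations; atomic so it is safe to update from any thread.
val completedIterations = AtomicInteger(0)

// Hypothetical stand-in: a trivial piece of work to run on a background thread.
fun stubAsyncFunc(): Int = (1..10).sum()

// Hypothetical stand-in: records one completed iteration and returns true
// only when the last expected iteration has completed.
fun checkTestEnd(testName: String): Boolean {
    val completed = completedIterations.incrementAndGet()
    if (completed == TEST_ITERATIONS_COUNT) {
        println("$testName: all $TEST_ITERATIONS_COUNT iterations completed")
        return true
    }
    return false
}
```

In this sketch the helpers do not choose a thread themselves; the caller decides where each one runs, as the snippets in this section do.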

Kotlin coroutines version: first approach

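    // One new coroutine per test iteration, started on the main UI thread.
    for (i in 1..TEST_ITERATIONS_COUNT) {
        launch(UI) {
            // Run the stub on the background pool and suspend until it completes.
            async(CommonPool) { stubAsyncFunc() }.await()
            // Back on the main UI thread.
            checkTestEnd(testName)
        }
    }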

This first approach with coroutines starts a new coroutine for each test iteration. stubAsyncFunc() is executed on a background thread, then checkTestEnd() is executed on the main UI thread when stubAsyncFunc() has finished its execution.

Kotlin coroutines version: second approach

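    val testArray = IntArray(TEST_ITERATIONS_COUNT)

    // A single coroutine on the main UI thread drives all the iterations.
    launch(UI) {
        testArray
            // Start one background task per item.
            .map { async(CommonPool) { stubAsyncFunc() } }
            // Await each task in order, then check on the main UI thread.
            .map {
                it.await()
                checkTestEnd(testName)
            }
    }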

This second approach with coroutines starts just one coroutine. For each item in testArray (the number of items is equal to the number of test iterations), stubAsyncFunc() is executed on a background thread, then checkTestEnd() is executed on the main UI thread when stubAsyncFunc() has finished its execution. As in the first coroutines approach, checkTestEnd() is executed once for each completed execution of stubAsyncFunc().

This second approach with coroutines shows how the same result can be achieved by launching a single coroutine at the beginning instead of multiple coroutines like in the first approach. The approach you’re going to have in your app depends on the structure of your code and your use cases, but the final result is the same for both approaches.
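
The snippets above use the experimental 0.22.5 API. As a hedged sketch of how the same two approaches could look on a plain JVM with a stable kotlinx.coroutines 1.x release (an assumption, since it is not the version tested in this post), here `runBlocking` stands in for the main UI thread and `Dispatchers.Default` for the background pool. The names `stubWork`, `firstApproach` and `secondApproach` are hypothetical.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for stubAsyncFunc().
fun stubWork(): Int = (1..10).sum()

// First approach: one coroutine per iteration.
// Returns the number of iterations that completed.
fun firstApproach(iterations: Int): Int = runBlocking {
    val completed = AtomicInteger(0)
    val jobs = (1..iterations).map {
        launch {                                             // runs on the runBlocking thread
            async(Dispatchers.Default) { stubWork() }.await() // background work
            completed.incrementAndGet()                      // back on the runBlocking thread
        }
    }
    jobs.forEach { it.join() }
    completed.get()
}

// Second approach: a single coroutine starts all the background work,
// then awaits each result in order.
fun secondApproach(iterations: Int): Int = runBlocking {
    var completed = 0
    (1..iterations)
        .map { async(Dispatchers.Default) { stubWork() } }
        .forEach {
            it.await()
            completed++
        }
    completed
}
```

Both functions return the number of completed iterations, so calling either with 100 returns 100. On Android, the coroutine would run on the main dispatcher instead of inside `runBlocking`.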

Note that the map operator here is an operator on collections provided out of the box by Kotlin and has basically the same meaning as the one provided by RxJava on streams.

RxJava version

    val subscribeScheduler = Schedulers.computation()
    val observeScheduler = AndroidSchedulers.mainThread()

    for (i in 1..TEST_ITERATIONS_COUNT) {
        Observable.fromCallable { stubAsyncFunc() }
            .subscribeOn(subscribeScheduler)    // run the stub on a computation thread
            .observeOn(observeScheduler)        // deliver the result on the main UI thread
            .subscribe { checkTestEnd(testName) }
    }

Here, like in the previous cases with coroutines, we execute stubAsyncFunc() asynchronously on a background thread and then, each time it completes, we execute checkTestEnd() on the main UI thread.

Note that a new Observable is created for each test iteration because this is the typical approach that developers use in an Android app. For example, while making a remote API call, we usually have a new Observable (or Single) for each call, we process and transform the result with one or more RxJava operators and then we subscribe to the stream to get the final result that we can use in our main UI thread to update the UI.

The test results

Let’s now take a look at the test results. Each sequence of test steps has been executed for a specific number of test iterations.

For each test result, we see these values:

base mem (the amount of memory used by the test app before starting the test)

max mem (the maximum amount of memory reached during the test execution)

delta (the difference between max mem and base mem)

max CPU (the maximum amount of CPU usage reached during the test execution)

time (the time, in seconds, taken to complete the total iterations of the test)

The Tests column shows if the results are related to the first coroutines approach (Coroutines 1), the second coroutines approach (Coroutines 2) or RxJava.

These tests have been executed while observing the results with the Android Profiler in Android Studio 3.0.1.

Note: the base memory that you’ll see in the following tests looks high because, as reported by the Android Profiler, almost all of it is classified as Graphics, so it’s not memory allocated by the app itself. The app is really simple with a very simple UI, but the test device has a very high resolution screen (1440 x 2560 pixels) and that is likely to be the reason for such a high base memory value while the app is in foreground.

Before executing the full test iterations for each approach, the garbage collector has been manually triggered multiple times in the Android Profiler to make sure that the used memory was going back to the original base value as much as possible. The test phone has been set to flight mode to avoid network interactions. The tests have been repeated multiple times to find the most common pattern in terms of execution time, memory and CPU usage (to make sure that any outlier execution was not taken into account because it’s not representative of the most common case).
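
The tests in this post were observed manually in the Android Profiler. For readers who want to repeat the timing part in code, one simple way to keep outlier runs from dominating is to take the median of several runs. This is a swapped-in technique, not the procedure used for the results below, and the function names are hypothetical.

```kotlin
import kotlin.system.measureNanoTime

// Median of the collected samples; a single outlier run does not move it much.
fun median(samples: List<Long>): Double {
    require(samples.isNotEmpty()) { "at least one sample is needed" }
    val sorted = samples.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) {
        sorted[mid].toDouble()
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

// Runs `block` `runs` times and returns the median duration in nanoseconds.
fun medianTimeNanos(runs: Int, block: () -> Unit): Double =
    median(List(runs) { measureNanoTime(block) })
```

For example, `medianTimeNanos(5) { /* code under test */ }` reports the middle of five timings rather than the best or the worst one.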

10000 test iterations

In this case, RxJava uses a higher amount of memory because, for every test iteration, it creates many new objects. The time taken to complete all the iterations is considerably longer as well with RxJava due to the creation of all the new objects. Note that if the total memory on the test device were lower (3 GB for my test device, but many lower-end phones definitely have less), the memory limit for a single app would be lower as well and the time taken to complete the test would be higher, because the garbage collector would keep running whenever the limit is reached to try to free up some memory for the new test iterations.

The memory used by both coroutines versions of the test is lower and the time taken to complete all the iterations is much lower. The CPU usage is lower as well given that there’s no need to generate many new objects in memory.

1000 test iterations

Given the lower number of iterations compared to the 10000 iterations test, in this case the amount of memory is not that big for RxJava, but it is still much higher compared to the coroutines. The high maximum CPU usage for RxJava is probably due to a spike at the beginning to generate all the new RxJava objects for each test iteration.

100 test iterations

Given the limited number of iterations, the maximum CPU usage becomes similar in all the cases and the same is true for the memory. The execution times are difficult to compare given the very fast execution in all the cases. The 100 test iterations case, like the other cases with more iterations, has been repeated multiple times to find the most common patterns for the different implementations. Even so, the second coroutines approach is always faster. In a typical app, while using coroutines, you’ll likely end up with a mix of the first and the second coroutines approaches depending on the structure of your code.

Conclusion

The amount of memory used is an important factor in an app, especially on devices with limited total memory.

Higher memory usage translates into higher CPU usage as well because creating objects is an expensive operation and the garbage collector needs to run more frequently to get rid of all the objects when they’re not needed anymore.

Higher CPU usage translates into higher battery consumption and less smooth UI for the user.

A very common pattern implemented by developers in modern Android apps makes use of RxJava to execute remote API calls to REST services (through Retrofit) asynchronously. The retrieved objects are then processed through RxJava and the final results are passed back to the main UI thread by subscribing to the RxJava streams. This means that for each API call, new objects are generated by RxJava in memory with all the consequences already described above. Given that this usage of RxJava never handles real streams (just collections) and can easily be implemented with Kotlin coroutines and processed with the operators on collections provided by the Kotlin standard library or with plain imperative code, there’s no need to use RxJava and the app can be easily made more efficient.

Note that for each operator that we add to an RxJava chain, a new Observable object is created in memory. The same applies each time we add an operator on collections provided by the Kotlin standard library (a new collection instance is created to provide the input for the next operator in the chain). Whenever chaining operators is not really necessary, plain imperative code can avoid creating additional objects in memory. Kotlin coroutines make it easy to switch between imperative code and chains of operators on collections depending on our needs.
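
As a small illustration of this trade-off, the following sketch computes the same value twice, once with chained collection operators and once with a plain loop. The function names are hypothetical and only the Kotlin standard library is used.

```kotlin
// Chained operators: `filter` and `map` each allocate a new intermediate List.
fun sumOfEvenSquaresChained(numbers: List<Int>): Int =
    numbers
        .filter { it % 2 == 0 }   // allocates a List with the even numbers
        .map { it * it }          // allocates another List with the squares
        .sum()

// Plain imperative loop: same result, no intermediate collections.
fun sumOfEvenSquaresLoop(numbers: List<Int>): Int {
    var sum = 0
    for (n in numbers) {
        if (n % 2 == 0) sum += n * n
    }
    return sum
}
```

Both functions return the same value for the same input; the difference is only in how many temporary objects are created along the way.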

RxJava is still a great (and probably the perfect) tool whenever we need to process real streams.

Kotlin coroutines provide additional benefits by allowing us to remove all the callbacks and thus avoid the callback hell that we might still experience by using RxJava in typical Android apps, depending on the structure of our code (for more information on this topic, take a look at Playing with Kotlin in Android: coroutines and how to get rid of the callback hell).

I would be very happy to evaluate more complete performance tests on this topic so if you find an interesting one or end up implementing it yourself, please add a reference to it in the comments to this post.