October 10th 2018, our team releases a new version of our React Native app. We're happy and proud to deliver new features to our users.

But oh the horror! A few hours after the release, we're suddenly receiving an increased number of Android crashes.

Our crash reporting tool, Sentry is on fire!!

All the new errors are of the type JSApplicationIllegalArgumentException Error while updating property 'left' in shadow node of type: RCTView" .

In React Native, this can usually happen if you set a property with a wrong type. But how come we didn't catch this error when testing the app? Our new releases are well tested by every member of the team on a panel of multiple devices.

Also the errors seem quite random, they seem to happen for any combination of property and type of shadow node. These are the first 3 that were occurring for instance:

Error while updating property 'paddingTop' in shadow node of type: RCTView

Error while updating property 'height' in shadow node of type: RCTImageView

Error while updating property 'fill' of a view managed by: RNSVGPath

It also seems to happen on any device and any Android version, based on the Sentry report

A majority of Android 8.0.0 crashes but this is consistent with our user base

Let's reproduce!

So the first step when you want to fix a bug is reproducing, right? Fortunately, thanks to our Sentry logs, we know what the users are doing before triggering the crash.

Soooo let's see here...

Welp, for the vast majority of crashes, users are just opening the app and kaboum, the crash happens.

Okay, let's try to replicate then. With the store app installed on 6 different Android devices, let's try to reproduce by opening and quitting the app a few times. No luck crashing the app! All the more impossible to reproduce locally in dev mode.

All right, this seems pointless. The crashes seem quite random anyway. The crash rate is raising at about 10%. Seems like you basically have 1 chance out of 10 to have the app crash on you when starting it.

Analyzing the stacktrace

To be able to reproduce this crash, let's try to understand it, where it comes from, how it thinks...

Well the thing is, as mentioned before, we have a few different errors. And they all have similar yet not entirely similar stacktraces.

Ok then, let's take the first one and analyse:

java.lang.ArrayIndexOutOfBoundsException: length=10; index=-1 at android.support.v4.util.Pools$SimplePool.release(Pools.java:116) at com.facebook.react.bridge.DynamicFromMap.recycle(DynamicFromMap.java:40) at com.facebook.react.uimanager.LayoutShadowNode.setHeight(LayoutShadowNode.java:168) at java.lang.reflect.Method.invoke(Method.java) ... java.lang.reflect.InvocationTargetException: null at java.lang.reflect.Method.invoke(Method.java) ... com.facebook.react.bridge.JSApplicationIllegalArgumentException: Error while updating property 'height' in shadow node of type: RNSVGSvgView at com.facebook.react.uimanager.ViewManagersPropertyCache$PropSetter.updateShadowNodeProp(ViewManagersPropertyCache.java:113) ...

Ok this is where it's happening then, in android/support/v4/util/Pools.java .

Hmm, we're pretty deep into the Android support library. At this point, I'm afraid it would take too much time investigating. I want to find the quickest way to fix the issue for my users, then I'll take more time to investigate the root cause behind the bug.

Let's find another way

Another way to find the root cause of the bug is to check the new changes we made in this release. Especially changes that would affect native Android code. 2 hypotheses emerge:

we upgraded Native Navigation, a navigation solution that used native fragments on Android for each screen

we upgraded react-native-svg. There were a few exceptions related to SVG components but some had no relation with it, so maybe not a good bet

At this point since we cannot reproduce the bug, our best bet is:

downgrading one of the 2 libs roll out to 10% users, since it's trivial to do so with the Play Store Check back in a few users if the crash happens in the new release. That way we validate or invalidate our hypothesis.



But how to choose which library to downgrade?

One solution to decide is playing "head or tail", but do we really want to resolve to that?

Getting to the bottom of this

Ok, let's dig deeper in the stacktrace we had previously picked, to see if we can determine which library to select.

/** * Simple (non-synchronized) pool of objects. * * @param The pooled type. */ public static class SimplePool implements Pool { private final Object[] mPool; private int mPoolSize; ... @Override public boolean release(T instance) { if (isInPool(instance)) { throw new IllegalStateException("Already in the pool!"); } if (mPoolSize < mPool.length) { mPool[mPoolSize] = instance; mPoolSize++; return true; } return false; }

This above was where the crash was occurring. The error was java.lang.ArrayIndexOutOfBoundsException: length=10; index=-1 meaning mPool is an array of size 10, but mPoolSize=-1 .

All right, how can get there with mPoolSize=-1 ? In addition to the recycle method above, the only place where mPoolSize is modified is in the acquire method of the SimplePool class:

public T acquire() { if (mPoolSize > 0) { final int lastPooledIndex = mPoolSize - 1; T instance = (T) mPool[lastPooledIndex]; mPool[lastPooledIndex] = null; mPoolSize--; return instance; } return null; }

So the only way mPoolSize can be -1 is if hitting the line mPoolSize-- while mPoolSize=0 . But how is it possible with the condition above mPoolSize > 0 ?

Let's put a breakpoint in Android Studio and check what's going on when we start the app. I mean, there was an if condition, this piece of code could not malfunction!

And finally, revelation!

See, DynamicFromMap has a static reference to a SimplePool

private static final Pools.SimplePool<DynamicFromMap> sPool = new Pools.SimplePool<>(10);

After a few dozen hits of the play button, with carefully placed breakpoints, we can see that SimplePool.acquire and SimplePool.release are called on the mqt_native_modules threads by React Native to manage style properties of React component (below the width of a component)

But they are also called on the main thread!

Above we can see that it is used to update a fill prop on the main thread, typically for a react-native-svg component! Indeed, it turns out react-native-svg was using DynamicFromMap only since version 7 of the lib to improve the performance of native svg animations.

Aaaand the function can actually be called from 2 threads but DynamicFromMap is not using SimplePool in a thread safe manner. "Thread safe", you say? For those of you who don't know the term, I'll explain in the next part. If already know what this means, feel free to skip until "In our case, here's what could happen"

Thread safe, a little theory

So JavaScript is single threaded, so JavaScript developers usually have no need to deal with thread safety.

Java, on the other hand, supports the concept of concurrent, or multi-threaded, programs. Several threads can run within a single program and can potentially access a common data structure, which can lead to unexpected results.

Let's take a simple example, in the image below, both Thread A and Thread B both:

read the integer into their memory

increment its value

return it

Thread B could potentially access the data value before Thread A was done updating it. We expected two separate increments, to result in a final value of 19 . What we potentially get instead, was 18 . Such a scenario, where the final state of data depends on the relative order of thread operations, is called a race condition. The problem with race conditions is that they don't necessarily happen all the time. It's possible in the case above that Thread B had more work to do before incrementing the value, giving enough time to Thread A to update the value. That explains the randomness and the impossibility to reproduce our crash.

A data structure is said to be thread safe, if operations can be done concurrently by many threads, without the risk of race conditions.

When one thread is performing read on a particular data element, no other thread should be allowed to modify or delete this element (this is called atomicity). In our earlier example, if the update cycles were atomic, the race condition would have been avoided. Thread B would have waited for Thread A to complete its increment, and then it would have gone ahead.

In our case, here's what could happen:

Since DynamicFromMap is holding a static reference to the SimplePool , multiple DynamicFromMap called from different threads can concurrently call the acquire method in SimplePool .

In the image above, Thread A is calling the method, evaluating the condition to be true, but hasn't yet decreased the value of mPoolSize (which is shared with Thread B) while Thread B is concurrently calling the method and already evaluating the condition which is then true also. Then each separate call would subsequently decrease the value of mPoolSize and that's how you get an impossible value.

Fixing it

Looking into how to fix it, we found a PR on react-native that was not merged making it thread safe.

We then deployed a patched version of react native rolling it out to our users. The crash was finally fixed, hooray!

Finally, thanks to the help of Janic Duplessis (React Native core contributor), and Mikael Sand ( react-native-svg maintainer), the fix will be included in the next 0.57 minor for React Native

This bug took us some effort to fix but was a great opportunity to dig deeper into react-native and react-native-svg . A good debugger and a few well placed breakpoints go a long way. Hope you learned something useful as well!

[EDIT 02/04/2019] For French readers: Check out my talk on the subject @ParisJS! 😎