A Core Data Story

It’s a common refrain that Core Data is slow, buggy, or needlessly complex. I’ll be the first to admit that it’s not a perfect framework. However, far too often it is used as a scapegoat for whatever issues a developer is facing. Core Data is complex, but so is data management. Attempts to simplify it often don’t scale. With a little more effort, Core Data does scale.

Core Data has made great strides in recent years, with additions such as batch updates and deletions, asynchronous fetches, unique constraints, etc. I highly recommend the What’s New in Core Data talks at every WWDC.

Case Study

My first task at my previous job was to turn the prototype app into a “ready for App Store” product. But there was a problem. After using the app daily for a few months, the founder had accrued lots of data. Logging in from a fresh install took over 2 minutes to fully synchronize. Users get impatient even with 2 seconds; this took over 120 grueling seconds.

After a bit of investigation, it turned out that all data synchronization, from the HTTP requests to response parsing and saving in Core Data, was handled by a single library.

I ended up replacing the library with AFNetworking for HTTP requests and custom code for the Core Data layer. How long did it take to log in and sync? Less than 1 second.

Success! 100x improvement! Why the huge difference? To understand how I was able to get such a huge speedup, let’s look at what each approach was doing.

Library’s Approach

1. Parse each object, one at a time, from the JSON response
2. Run a fetch request to see if this object exists
3. Create or update
4. Save
5. Repeat from step 2 until done

For each object, we were doing a fetch, a create/update, and a save. As our dataset grew to hundreds and then thousands of objects, we were executing an ever-increasing number of fetches and saves.

Custom Code

1. Parse out all Object IDs from the JSON response
2. Run 1 batch request to fetch all our needed objects
3. Create a dictionary of [Object ID: Core Data object]
4. Iterate through the JSON response. For each object, grab the pre-fetched object from our dictionary, then create or update.
5. Save

Now we’re only executing 1 fetch for all objects, then a create/update per object, and a final save at the end. As our dataset grew, we were still only executing 1 fetch.
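The batch approach above can be sketched roughly as follows. This is an illustration in Swift, not the Objective-C code the app actually shipped. The `Item` entity, its `remoteID` and `name` attributes, the payload shape, and the helper names `makeInMemoryContext` and `upsert` are all assumptions made up for the example. The model is built in code and backed by an in-memory store so the sketch needs no model file.

```swift
import CoreData

// Hypothetical model: one "Item" entity with string attributes "remoteID" and "name".
func makeInMemoryContext() throws -> NSManagedObjectContext {
    let remoteID = NSAttributeDescription()
    remoteID.name = "remoteID"
    remoteID.attributeType = .stringAttributeType

    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType

    let entity = NSEntityDescription()
    entity.name = "Item"
    entity.properties = [remoteID, name]

    let model = NSManagedObjectModel()
    model.entities = [entity]

    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    context.persistentStoreCoordinator = coordinator
    return context
}

// Assumed payload shape: an array of ["id": ..., "name": ...] dictionaries.
func upsert(_ payload: [[String: String]], into context: NSManagedObjectContext) throws {
    var failure: Error?
    context.performAndWait {
        do {
            // 1. Collect every ID in the response.
            let ids = payload.compactMap { $0["id"] }

            // 2. One fetch for all matching objects, however many there are.
            let request = NSFetchRequest<NSManagedObject>(entityName: "Item")
            request.predicate = NSPredicate(format: "remoteID IN %@", ids)
            let existing = try context.fetch(request)

            // 3. Index the fetched objects by ID.
            var byID: [String: NSManagedObject] = [:]
            for object in existing {
                if let id = object.value(forKey: "remoteID") as? String {
                    byID[id] = object
                }
            }

            // 4. Update the pre-fetched object, or create one if it was not found.
            for json in payload {
                guard let id = json["id"] else { continue }
                let object = byID[id]
                    ?? NSEntityDescription.insertNewObject(forEntityName: "Item", into: context)
                object.setValue(id, forKey: "remoteID")
                object.setValue(json["name"], forKey: "name")
                byID[id] = object  // also covers an ID repeated within one payload
            }

            // 5. A single save at the end.
            try context.save()
        } catch {
            failure = error
        }
    }
    if let error = failure { throw error }
}

let context = try makeInMemoryContext()
try upsert([["id": "ABC123", "name": "first"]], into: context)
try upsert([["id": "ABC123", "name": "renamed"],
            ["id": "XYZ789", "name": "second"]], into: context)
```

The second call finds ABC123 through the single `IN` fetch and updates it rather than inserting a duplicate, so the store ends up with two `Item` objects.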

Sure, this is easier said than done. To get away with batch fetching our objects ahead of time, we have to ensure that mapping operations do not execute in parallel. This prevents, for example, 2 separate operations receiving data about object ABC123, not finding it, and therefore creating it. To enforce this, we ensure all data mapping occurs on a serial NSOperationQueue.
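A minimal sketch of the serial-queue idea, in Swift with `OperationQueue` (the Swift name for NSOperationQueue). `FakeStore` and the sample responses are hypothetical stand-ins invented for the example. A real implementation would do its Core Data work inside each operation instead.

```swift
import Foundation

// Hypothetical stand-in for the persistent store. The check-then-create below is
// not atomic, which is exactly the step that must not run in parallel.
// @unchecked Sendable is tolerable only because every mutation happens on the
// serial queue, and reads happen after the queue has drained.
final class FakeStore: @unchecked Sendable {
    private(set) var createdIDs: [String] = []

    func createIfMissing(_ id: String) {
        if !createdIDs.contains(id) {
            createdIDs.append(id)
        }
    }
}

let store = FakeStore()

// A limit of one concurrent operation makes the queue serial:
// a mapping operation starts only after the previous one has finished.
let mappingQueue = OperationQueue()
mappingQueue.name = "data-mapping"
mappingQueue.maxConcurrentOperationCount = 1

// Two responses that both mention object ABC123.
let responses = [["ABC123"], ["ABC123", "XYZ789"]]
for ids in responses {
    mappingQueue.addOperation {
        for id in ids {
            store.createIfMissing(id)
        }
    }
}

mappingQueue.waitUntilAllOperationsAreFinished()
print(store.createdIDs.sorted())
```

Because the two operations never overlap, whichever runs second sees that ABC123 already exists and does not create it again. Without the limit of one, both could check at the same moment and both create it.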

The payoffs were enormous. Since we knew exactly how our data was structured, and how our mapping operations took place, we were able to make assumptions and custom-tailor our code to be as optimized as possible. As an added bonus, the entire team gained insight into what our code was doing and got to learn how to use Core Data directly. New hires who came on later didn’t have to learn a new library, since they already knew Core Data.

The Moral of the Story

I’m not claiming that 3rd party libraries are bad and should be avoided. There is a lot of great work in the Cocoa open source community, and nearly all of my apps use at least one project from GitHub. I’m just recommending that you should understand what the library is doing or what it is replacing.

As a rule of thumb, I’d recommend not using a generic 3rd party library until you have tried Cocoa’s alternative. I don’t mean go implement your own QR code scanner or image processing library. Rather, before you jump into something like AFNetworking or Alamofire (both great libraries that I sometimes still use), try writing some requests with NSURLSession.

Bonus — Swift!

This application was written entirely in Objective-C. However, since much of the data mapping operation involved transforming data from JSON to dictionaries, I was curious whether the Swift compiler would yield further improvements.

Swift 1 performed slightly slower than Objective-C. Swift 2.0 performed just as fast as Obj-C. However, the Swift code took a sizeable performance penalty from bridging between Obj-C and Swift. I haven’t tried this test with newer versions of Swift, but I’d be willing to bet it would be faster!