In my conversations with developers, a common theme I hear is that “Core Data is hard” or “Core Data is buggy” or “I could never get it to work right and gave up on it”.

I’ve spent a lot of time using Core Data and thought I’d share my “Laws of Core Data”. These are a set of rules I’ve developed over time on how to use Core Data in such a way that it is almost entirely painless. When I follow these rules, I almost never have any problems using it.

Do not use Core Data as if it were a database

It’s common to hear developers talk about and treat Core Data as if it were a database. They see that it’s powered by SQLite, and think it’s functionally equivalent. It is not.

Core Data is an “object graph and persistence framework”, which is basically like a fancy kind of object-relational mapping. That means it is a whole bunch of code to help you maintain a graph (i.e., a “network” of related pieces of data with a defined organization) of objects and then persist them in some fashion. It does not necessarily mean you have tables with rows of data. It does not necessarily mean that you have the ability to join across data types. It does not necessarily mean that it’s even stored as a file on your disk.

Some things that Core Data can do beyond most databases:

make sure bidirectional relationships are properly hooked up

use custom data validation rules

use custom data migration logic

store specific attributes outside the primary store location

serialize custom attribute types as Data

index content using Spotlight

perform automatic schema and data migration

Having these abilities means you can have Core Data take care of a lot more logic for you than if you were using a traditional database.

Do not use Core Data as if it were a SQLite wrapper

This is very much related to the first law, but is a bit more specific, and it has to do with how Core Data persists data. It is exceptionally rare to find a Core Data implementation that does not use SQLite as the persistence layer, but it does happen.

Out of the box, Core Data natively supports four different ways to “persist” data:

As a SQLite file

As an XML file

As a binary file

As an in-memory representation

In addition to these, Core Data also allows you to create your own persistence mechanism by subclassing either NSAtomicStore or NSIncrementalStore. So, if you wanted, you could make Core Data save things to a git repository, or to CloudKit, or to MySQL or PostgreSQL, or to your own custom backend…

Several years ago I created a framework to access the stackoverflow.com API, and networking was done via a custom Core Data store that translated Core Data requests into API calls. It was weird, but it worked.

Core Data does not have to be just SQLite. In fact, modeling your schema as if it were SQLite (or some other RDBMS variant) is a sure sign you’re “doing it wrong”. Setting up custom things like artificial foreign keys or join tables is almost never necessary and is almost always wrong.
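As a rough illustration of how little the rest of the stack cares about the store type, here is a minimal sketch that loads an in-memory store instead of a SQLite file. The "Note" entity, the "Sketch" container name, and the makeSketchModel() helper are hypothetical and exist only for this example; the model is built in code so that no model file is needed.

```swift
import CoreData

// Hypothetical one-entity model, built in code so the sketch needs no model file.
func makeSketchModel() -> NSManagedObjectModel {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let entity = NSEntityDescription()
    entity.name = "Note"
    entity.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [entity]
    return model
}

// The store type is a per-store setting; the model and the contexts stay the same.
let description = NSPersistentStoreDescription()
description.type = NSInMemoryStoreType   // a description defaults to NSSQLiteStoreType

let container = NSPersistentContainer(name: "Sketch", managedObjectModel: makeSketchModel())
container.persistentStoreDescriptions = [description]
container.loadPersistentStores { _, error in
    if let error = error { fatalError("Store failed to load: \(error)") }
}

// Ask the coordinator which kinds of stores it ended up with.
let loadedTypes = container.persistentStoreCoordinator.persistentStores.map { $0.type }
print(loadedTypes)
```

Swapping the type constant is the only change needed to move between the built-in stores, which is one reason a schema modeled around SQLite specifics tends to be a mistake.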

Your NSManagedObjectContext is your “stack”

Typically one of the first things that developers do when creating a Core Data stack is to create a “DataStack” object that encapsulates loading up the model, creating the store coordinator, and then creating the main NSManagedObjectContext. That “stack” object then gets passed around as your “Core Data manager” object by which you get the context you need. iOS 10.0 and macOS 10.12 added the concept of an NSPersistentContainer, which does a lot of this for you.

Having a single object to load up your model and everything is great. But you don’t need to pass it around. It’s usually passed around in order to have easy access to making a new context or accessing the model. That is all unnecessary.

If you do decide to pass Core Data objects around your app, then all you need is the NSManagedObjectContext (“MOC”). Your MOC has an NSPersistentStoreCoordinator (“PSC”) property, which itself has an NSManagedObjectModel (“MOM”, aka the schema). So from a single MOC, you can get any information you need about your schema, where things are being saved, what format they’re being saved in, the configuration for the persistent stores, etc.

If you decide you need to create a new, one-off MOC, it’s easy to do so with your existing MOC:

    let existingContext: NSManagedObjectContext = ...
    let newContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    newContext.persistentStoreCoordinator = existingContext.persistentStoreCoordinator
    // that's it

You don’t need to pass around a “stack” object. (Creating new contexts like this isn’t ideal, because of another law further down.)
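To make the MOC → PSC → MOM chain concrete, the following sketch walks from a context to its coordinator and then reads the entity names and store types. It assumes a throwaway in-memory stack; the "Note" entity and the makeInMemoryContainer() helper are hypothetical names used only here.

```swift
import CoreData

// Hypothetical helper: a throwaway in-memory stack with a single "Note" entity.
func makeInMemoryContainer() -> NSPersistentContainer {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let entity = NSEntityDescription()
    entity.name = "Note"
    entity.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [entity]

    let description = NSPersistentStoreDescription()
    description.type = NSInMemoryStoreType

    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    container.persistentStoreDescriptions = [description]
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Store failed to load: \(error)") }
    }
    return container
}

// Pretend this context is the only Core Data object we were handed.
let existingContext = makeInMemoryContainer().viewContext

// MOC -> PSC -> MOM: everything else is reachable from the context.
let coordinator = existingContext.persistentStoreCoordinator
let entityNames = coordinator?.managedObjectModel.entities.compactMap { $0.name } ?? []
let storeTypes = coordinator?.persistentStores.map { $0.type } ?? []
print(entityNames, storeTypes)

// A one-off context that shares the same coordinator.
let newContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
newContext.persistentStoreCoordinator = coordinator
```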

Never ever ever ever ever use an NSManagedObject outside its context’s queue

This law is the leading source of bugs when it comes to Core Data. Offhand I’d guess that more than 90% of the pain developers experience with Core Data is because of this.

Core Data tries to be efficient; it typically doesn’t like to load up more data than you need, which means there are times when you ask it for data (like an object property) and it doesn’t have it handy. When this happens, it has to go load the data from its store (which might not even be a local file on disk!) before it can respond to you. This is called “faulting”. The marker value kept internally by a managed object is a “fault”, and the process of “fulfilling” the fault (i.e., retrieving the data) is “faulting”.

Here’s the thing: Core Data has to be safe. It has to synchronize these faulting calls with other accesses of the persistent store, and it has to do it in a way that isn’t going to interfere with other calls to fault in data. The way it does that is by expecting that all calls to fault in data happen safely inside one of its queues.

Every managed object “belongs” to a particular MOC (more on this in a minute), and every MOC has a DispatchQueue that it uses to synchronize its internal logic about loading data from its persistentStoreCoordinator. If you use an NSManagedObject from outside the MOC’s queue, then the calls to fault in data are not properly synchronized and protected, which means you’re susceptible to race conditions.

So, if you have an NSManagedObject, the only safe place to use it is from inside a call to perform or performAndWait on its MOC, like so:

    let object: NSManagedObject = ...
    var propertyValue: PropertyType!
    // managedObjectContext is optional, hence the "?"
    object.managedObjectContext?.performAndWait {
        propertyValue = object.property
    }
    ...

Using your own DispatchQueue or one of the global queues is insufficient. The managed object has to be accessed from the queue that is controlled by the MOC, and the way to do that is with the perform and performAndWait methods.

There is one special case to this, and that is dealing with managed objects that belong to a MOC whose queue is the “main” queue. The DispatchQueue.main queue is bound to the main thread of your app, and so if you’re on the main thread and have a main-thread object, you can “safely” not use perform calls because you are already inside the context’s queue.

The only managed object property that is safe to use outside of a queue or pass between queues/threads is the object’s objectID: this is a Core Data-provided identifier unique for that particular object. You can access this property from anywhere, and it is the only way to “transfer” a managed object from one context to another:

    let objectInContextA: NSManagedObject = ...
    let objectID = objectInContextA.objectID

    let contextB: NSManagedObjectContext = ...
    contextB.perform {
        let objectInContextB = contextB.object(with: objectID)
        // objectInContextB is now a separate *instance* from the original object,
        // but both are backed by the same data in the persistent store
    }

I will add here that it is really unfortunate we have to care about this. It’s not hard to imagine a world where managed objects deal with this sort of stuff automatically. However, this is what happens when we’re dealing with a framework that is over 14 years old and is based on another framework (EOF) that is 24 years old. The problem of “binary compatibility” is a blog post for another day.
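Here is a small end-to-end sketch of that hand-off, assuming a throwaway in-memory stack. The "Note" entity, its "name" attribute, and the makeInMemoryContainer() helper are hypothetical. An object is created and saved on a background context, and only its objectID crosses over to the main-queue context.

```swift
import CoreData

// Hypothetical helper: a throwaway in-memory stack with a single "Note" entity.
func makeInMemoryContainer() -> NSPersistentContainer {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let entity = NSEntityDescription()
    entity.name = "Note"
    entity.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [entity]

    let description = NSPersistentStoreDescription()
    description.type = NSInMemoryStoreType

    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    container.persistentStoreDescriptions = [description]
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Store failed to load: \(error)") }
    }
    return container
}

let container = makeInMemoryContainer()
let backgroundContext = container.newBackgroundContext()
let mainContext = container.viewContext

// Create and save on the background context's own queue.
var savedID: NSManagedObjectID?
backgroundContext.performAndWait {
    let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: backgroundContext)
    note.setValue("hello", forKey: "name")
    do {
        try backgroundContext.save()
        savedID = note.objectID   // read after saving, so the ID is permanent
    } catch {
        print("Save failed: \(error)")
    }
}

// Only the objectID crosses contexts; the main context builds its own instance.
var nameSeenOnMain: String?
if let id = savedID {
    mainContext.performAndWait {
        let noteOnMain = mainContext.object(with: id)
        nameSeenOnMain = noteOnMain.value(forKey: "name") as? String
    }
}
print(nameSeenOnMain ?? "nothing")
```

The managed object itself never leaves the queue it was created on; each context touches only its own instance, inside its own perform call.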

Do not use NSManagedObject as if it were an NSObject

This is a generalization of the previous law. Because of the weirdness around faulting and queue access, it’s my opinion that NSManagedObject shouldn’t actually be a subclass of NSObject. When we see an NSObject in our code, we have assumptions about how it works with regard to memory management, multi-threaded access, and behavior. NSManagedObject breaks enough of these rules that it probably shouldn’t be an NSObject, but should be its own root class.

So, forget that it’s an NSObject. It doesn’t really behave like one, and you shouldn’t use it as if it were.

You usually don’t need parent-child contexts

One of the more esoteric features of Core Data is the ability to have relationships between contexts: you can have a MOC that is not actually backed by the NSPersistentStoreCoordinator, but is instead backed by another MOC. This has some really interesting implications, but in general: you don’t need this.

The ability to have a child MOC is neat in some corner cases. Let’s review the core functionality of MOCs in order to understand those cases:

A MOC loads objects from its source (a PSC or another MOC)

A MOC saves objects to its source (a PSC or another MOC)

A MOC enforces graph integrity when it saves

Those are really the core pieces. So, you would want a child MOC if:

You only want to load objects that are already loaded in another MOC

You want to save objects to another MOC, but not necessarily the PSC

You want to enforce graph integrity without persisting the objects

As you can see, when you deal with child contexts, you’re really dealing with transient (non-persisted) objects. You’re fundamentally changing loading and saving behavior.

The times when you need this are pretty rare. You would typically want this for something like a complex sub-graph creation flow, where along each step of the flow you need to enforce relationship integrity, but don’t want to actually save it to the persistent store until the flow is complete. And if the flow is cancelled, you don’t want any of it to be saved at all. You could do that by doing all the flow steps in a child context and saving it up to the parent context when the flow completes, or simply discarding the child context if the user aborts.

They’re kind of like transactions in normal database systems. You can start importing or editing a bunch of data, and if something goes wrong or is cancelled, you can roll back the changes.

Parent/child contexts are usually advocated for something like “load some data in the background, and then saving it pushes it to the main queue context”. That can work, but it does mean that in order to persist your data, you actually have to save two contexts instead of just one, because calling save() on a context only pushes the data up one level. For a child context, the data only goes to the parent context, not all the way up to the PSC. In my opinion, using a child context like this is unnecessarily complicated.

For general, non-transactional usage, I think it’s better to have two contexts (one for the main thread, one for the background) that both link directly to the PSC. Importation of data is done on the background context, and when it saves, the main queue context listens for the NSManagedObjectContextDidSave notification and merges in the changes with the mergeChanges(fromContextDidSave:) method to update its internally-held objects. Even that step might be unnecessary if the context has automaticallyMergesChangesFromParent set to true.
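The “save only goes up one level” behavior is easy to see in a small experiment. The sketch below assumes a throwaway in-memory stack; the "Note" entity and the makeInMemoryContainer() and storedNoteCount() helpers are hypothetical. It saves a child context and then checks what the parent and the persistent store each know about.

```swift
import CoreData

// Hypothetical helper: a throwaway in-memory stack with a single "Note" entity.
func makeInMemoryContainer() -> NSPersistentContainer {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let entity = NSEntityDescription()
    entity.name = "Note"
    entity.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [entity]

    let description = NSPersistentStoreDescription()
    description.type = NSInMemoryStoreType

    let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
    container.persistentStoreDescriptions = [description]
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Store failed to load: \(error)") }
    }
    return container
}

let container = makeInMemoryContainer()
let parentContext = container.viewContext

// A child context is backed by another context instead of the coordinator.
let childContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
childContext.parent = parentContext

// Hypothetical probe: counts what has reached the store, via an independent context.
func storedNoteCount() -> Int {
    let probe = container.newBackgroundContext()
    var count = 0
    probe.performAndWait {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
        count = (try? probe.count(for: request)) ?? 0
    }
    return count
}

childContext.performAndWait {
    let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: childContext)
    note.setValue("draft", forKey: "name")
    try? childContext.save()   // pushes the change to parentContext only
}
let pendingInParent = parentContext.insertedObjects.count
let storedAfterChildSave = storedNoteCount()

// Only saving the parent sends the data on to the persistent store.
parentContext.performAndWait { try? parentContext.save() }
let storedAfterParentSave = storedNoteCount()
print(pendingInParent, storedAfterChildSave, storedAfterParentSave)
```

If the child context were simply thrown away before its save() call, neither the parent nor the store would ever see the draft, which is the transaction-like behavior described above.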

Keep your main queue context read-only

If you’re building an app that is reading information out of Core Data, displaying it to the user, and allowing minimal edits, then in my experience it’s best to keep the main queue context a “read-only” context.

Having strict rules around which contexts are readable vs. writable makes it much easier to reason about when parts of your UI should be reloaded: commands to update the UI come from a single direction (from your model towards your UI).

If you allow mutation of stored information, then that can be encapsulated as a sort of “request for mutation”, sent off to the controller for this part of your model, and executed there. Routing mutations through that controller, rather than performing them on a Core Data object directly, makes it easier to debug where changes are coming from (the data import step? editing in the UI? something else?), because you have a single point of entry.

If you follow the next law as well, then this law becomes very simple to enforce.

Use an abstraction layer

This is more along the lines of “general good advice” than anything specific to Core Data, but here it is:

It’s generally a smart thing to hide the fact that you’re using Core Data from the rest of your app. This isn’t because you’re “ashamed” of it and need to obscure it (😉), but is more because Core Data objects carry a decent amount of baggage with them that the rest of your app shouldn’t have to know about (see earlier point about how objects bring along the entire stack).

When you pass managed objects or contexts around your app, the temptation to just reach inside an object and pull out the PSC or the MOM or whatever and use it becomes too high. Don’t do that. Avoid violating the Law of Demeter and have a proper controller object that you can ask for what you need.

You could hide a managed object behind a protocol, but that also makes it easy to forget the law about queue usage. In my opinion, you should keep the details of graph integrity and persistence to a confined part of your app, and data should only get out via custom-purpose struct values (or something like them).

As a rudimentary example of what this might look like, you could do something like this:

    protocol ManagedObjectInitializable {
        init(managedObject: NSManagedObject)
    }

    class ModelController {
        func fetchObjects<T>(completion: @escaping (Array<T>) -> Void) where T: ManagedObjectInitializable {
            ...
        }
    }

    struct Person: ManagedObjectInitializable {
        let firstName: String
        let lastName: String
        ...
    }

There are many different ways you could abstract out the details of Core Data, each with their pros and cons. But hiding Core Data like this from the rest of your app is a huge step along the road to proper encapsulation and “need-to-know” information hiding.
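For the curious, here is one way the rudimentary example above might be filled in. This is only a sketch under several assumptions that are not part of the original example: the static entityName requirement, the addPerson method, the in-memory store, and the "Person" entity with two string attributes are all hypothetical choices made to keep it self-contained.

```swift
import CoreData

protocol ManagedObjectInitializable {
    static var entityName: String { get }   // assumption: lets the controller pick the entity
    init(managedObject: NSManagedObject)
}

struct Person: ManagedObjectInitializable {
    static let entityName = "Person"
    let firstName: String
    let lastName: String

    init(managedObject: NSManagedObject) {
        firstName = managedObject.value(forKey: "firstName") as? String ?? ""
        lastName = managedObject.value(forKey: "lastName") as? String ?? ""
    }
}

final class ModelController {
    private let container: NSPersistentContainer

    init() {
        // Hypothetical model built in code; a real app would load its model file.
        func stringAttribute(_ name: String) -> NSAttributeDescription {
            let attribute = NSAttributeDescription()
            attribute.name = name
            attribute.attributeType = .stringAttributeType
            attribute.isOptional = true
            return attribute
        }
        let entity = NSEntityDescription()
        entity.name = Person.entityName
        entity.properties = [stringAttribute("firstName"), stringAttribute("lastName")]
        let model = NSManagedObjectModel()
        model.entities = [entity]

        let description = NSPersistentStoreDescription()
        description.type = NSInMemoryStoreType
        container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
        container.persistentStoreDescriptions = [description]
        container.loadPersistentStores { _, error in
            if let error = error { fatalError("Store failed to load: \(error)") }
        }
    }

    // Mutations enter through the controller and run on a background context.
    func addPerson(firstName: String, lastName: String) {
        let context = container.newBackgroundContext()
        context.performAndWait {
            let object = NSEntityDescription.insertNewObject(forEntityName: Person.entityName, into: context)
            object.setValue(firstName, forKey: "firstName")
            object.setValue(lastName, forKey: "lastName")
            try? context.save()
        }
    }

    // Managed objects are converted to plain values while still on the context's queue.
    func fetchObjects<T: ManagedObjectInitializable>(completion: @escaping ([T]) -> Void) {
        let context = container.newBackgroundContext()
        context.perform {
            let request = NSFetchRequest<NSManagedObject>(entityName: T.entityName)
            let objects = (try? context.fetch(request)) ?? []
            completion(objects.map { T(managedObject: $0) })
        }
    }
}

let controller = ModelController()
controller.addPerson(firstName: "Ada", lastName: "Lovelace")

// The semaphore is only here so that a command-line script waits for the callback.
let done = DispatchSemaphore(value: 0)
var people: [Person] = []
controller.fetchObjects { (result: [Person]) in
    people = result
    done.signal()
}
done.wait()
print(people.map { "\($0.firstName) \($0.lastName)" })
```

Because Person is a plain struct, callers can use it on any queue without thinking about faulting, and nothing outside ModelController ever sees an NSManagedObject or a context.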