We are releasing a new open-source library: a read-only in-memory database designed for managing master data.

GitHub — Cysharp/MasterMemory

We built it based on our game development experience to date, focusing on three points: low memory usage (an in-memory database has to be careful about how much memory it consumes); fast database loading (a slow build has a major impact on a game’s startup time); and fast search (queries cost about the same as a dictionary lookup). The benchmark below shows the results.

MasterMemory, SQLite, LiteDB, and RocksDB run in-process; only Memcached is measured as communication with a separate process on the same machine.

It is 4700 times faster than SQLite (file reading model), and a query performs zero allocations. The saved file is also extremely small. Because it is small enough, when there is an update the whole file can simply be placed on a CDN and replaced, which keeps operations easy.

In addition to Unity, it can be used in the same way on the server side and in .NET Core applications. To reflect server usage, we also compared it against communicating with Memcached on the same machine, and, naturally, MasterMemory was overwhelmingly faster.

By limiting the scope to “all in-memory, read-only”, the internal structure becomes simple, which makes it possible to eliminate a lot of processing, such as generating intermediate objects for queries, transmission, data deserialization, combining indexes with the actual data, and locking. Furthermore, even compared with a simple in-process hash table (Dictionary), it uses less memory, builds faster, and has better performance characteristics, while offering many functions such as range search, closest-value search, multiple keys, and secondary indexes.

PalDB, developed by LinkedIn, is built on a similar concept (an embeddable write-once key-value store), but its implementation and performance characteristics are completely different.

Completely typed

By generating C# code from C# definitions, every operation performed on the database is completely typed.

Relying on code generated ahead of time improves usability as an API and also improves the performance characteristics.

The drawback is the hassle of always having to generate code. However, a typed API is far more convenient for things like code completion, and without code generation you would lose type safety and presumably have to write more code by hand, so we made code generation a prerequisite. Additionally, in the case of Unity, we use MessagePack for C# to store the internal data, and it needs its own serializer code generation anyway, so part of the thinking is, “If you are adding one generator anyway, why not add one more?”

The search methods are flexible: for non-unique keys you get back a collection representing a range, called RangeView<T>, and it is also possible to use keys made of multiple values, to search by range, and to search for the closest value.
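To make the idea of range and closest-value searches concrete, here is a minimal, language-neutral sketch in Python. It is not MasterMemory's generated C# API: the names `SortedTable`, `RangeView`, `find_range`, and `find_closest` are hypothetical, and the sketch simply assumes that rows are sorted once by a key at build time and then queried by binary search.

```python
from collections.abc import Sequence


class RangeView(Sequence):
    """Read-only window over rows[start:stop]; it copies nothing."""

    def __init__(self, rows, start, stop):
        self._rows, self._start, self._stop = rows, start, max(start, stop)

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._rows[self._start + i]


class SortedTable:
    """Hypothetical table: sorted once at build time, then binary-searched."""

    def __init__(self, rows, key):
        self._key = key
        self._rows = tuple(sorted(rows, key=key))

    def _bound(self, k, inclusive):
        # First index whose key is > k (inclusive=True) or >= k (inclusive=False).
        lo, hi = 0, len(self._rows)
        while lo < hi:
            mid = (lo + hi) // 2
            row_key = self._key(self._rows[mid])
            if row_key < k or (inclusive and row_key == k):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def find_range(self, low, high):
        """All rows with low <= key <= high, returned as a view."""
        return RangeView(self._rows, self._bound(low, False), self._bound(high, True))

    def find_closest(self, k):
        """The row whose key is nearest to k, or None for an empty table."""
        i = self._bound(k, False)
        candidates = self._rows[max(i - 1, 0):i + 1]
        return min(candidates, key=lambda r: abs(self._key(r) - k), default=None)


# Rows are (id, age, name); the age key is not unique.
people = [(4, 27, "Dov"), (2, 19, "Bram"), (1, 13, "Aria"), (5, 42, "Esme"), (3, 19, "Cleo")]
by_age = SortedTable(people, key=lambda p: p[1])
teens = by_age.find_range(13, 19)    # view over the rows aged 13 to 19
nearest = by_age.find_closest(30)    # the row aged 27 is nearest to 30
```

Because the rows are already sorted, a range result only needs a start and an end position, which is why a view can be returned without building a new collection.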

Since every available index is exposed in typed form, you only need to follow code completion, and there is no way to write an invalid query.

As for joins, because queries are extremely fast, there is no problem using the tables from LINQ to Objects in the same way you would use a Dictionary.

Minimum memory usage and automatic string interning

The data objects (T) that are actually held in memory are allocated on the heap when the database is built. Beyond that, nothing is stored other than the collection of the actual data (T[]) (if a secondary index is not used). In other words, the memory used in that case is the theoretical minimum.

Also, since strings are automatically interned when the database is built, if the data contains non-normalized (repeated) strings, it is overwhelmingly more memory efficient than other methods.

For instance, suppose the data contains five records that all have the name “goblin”. Loaded naively, memory then holds five separate string instances of “goblin”; in other words, it uses as much memory as “goblingoblingoblingoblingoblin.” However, since we know the strings are identical, we only need one string and references to it. Interning is the technique that achieves this: once strings are interned, even if “goblin” appears 10,000 times, only one copy is kept in memory.
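The effect of interning at build time can be sketched as follows. This is a conceptual Python illustration only, not how MasterMemory or MessagePack for C# implements it; the function name `build_with_interning` and the sample rows are hypothetical.

```python
def build_with_interning(records):
    """Copy records, replacing every string with one shared instance per value."""
    pool = {}  # string value -> the single instance kept for that value

    def canonical(value):
        if isinstance(value, str):
            # Reuse the first instance seen for this value; otherwise register it.
            return pool.setdefault(value, value)
        return value

    return [tuple(canonical(v) for v in record) for record in records]


# Five rows whose names are equal but are built separately at runtime,
# much as they would be when decoded one by one from a file.
raw = [(record_id, "".join(["gob", "lin"])) for record_id in range(5)]

table = build_with_interning(raw)

# After interning, every row refers to the same string object.
distinct_after = len({id(name) for _, name in table})
```

The pool is only needed while the table is being built. Afterwards the rows hold references to the shared strings, so the cost of a repeated value is one reference per row rather than one full copy per row.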

This is an extremely aggressive approach, but once MasterMemory has built a database, the data stays unchanged for as long as the application runs, so under that assumption automatic interning causes no problems.

For the actual implementation, we take advantage of the flexible customization capabilities of MessagePack for C# and hook the deserialization process, which is when the database is constructed. Naturally, automatic interning can be disabled.

Immutable Updates

Although the database itself is read-only, when an update arrives separately it is possible to construct a new database from the differences.

For instance, suppose you are implementing real-time communication in a game (with a server framework such as MagicOnion, developed by our company) and the master data is updated. You may want to update the global master data while a game in progress keeps the data it started with, without consulting the globally stored database every time. Because each database built by an incremental update is a separate instance, you can keep several versions alive at once simply by holding references to them.
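The pattern of building a new database from a difference while older references stay valid can be sketched like this. It is a conceptual Python illustration under our own assumptions, not MasterMemory's API; the class `Snapshot` and its method `with_changes` are hypothetical names.

```python
from types import MappingProxyType


class Snapshot:
    """Hypothetical read-only table keyed by the first field of each row."""

    def __init__(self, rows):
        # MappingProxyType gives a read-only view, so a snapshot cannot be edited.
        self._rows = MappingProxyType({row[0]: row for row in rows})

    def find(self, row_id):
        return self._rows.get(row_id)

    def __len__(self):
        return len(self._rows)

    def with_changes(self, upserts=(), removed_ids=()):
        """Build a NEW snapshot from this one plus a difference."""
        merged = dict(self._rows)
        for row_id in removed_ids:
            merged.pop(row_id, None)
        for row in upserts:
            merged[row[0]] = row
        return Snapshot(merged.values())


# Rows are (id, name, hit_points).
v1 = Snapshot([(1, "goblin", 10), (2, "orc", 25)])

# A game in progress keeps a reference to the version it started with.
session_db = v1

# A master data update arrives: build a new global version from the difference.
v2 = v1.with_changes(upserts=[(2, "orc", 30), (3, "dragon", 99)], removed_ids=[1])

old_orc = session_db.find(2)   # still the row from v1
new_orc = v2.find(2)           # the updated row in v2
```

Nothing in `v1` is modified when `v2` is built, so the game in progress and the updated global state can be read side by side without locking.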

Conclusion

We hope you will give it a try, since it is designed to work both in client applications (Unity) and in server applications (.NET Core). It can be used on either side alone, but using it on both broadens your architectural options, so we hope you will consider that as well.