UnifiedMap: How it works?

Eclipse Collections comes with it’s own implementations of List, Set and Map. It also has additional data structures like Multimap, Bag and an entire Primitive Collections hierarchy. In this blog we will take a look under the hood of UnifiedMap.

UnifiedMap

UnifiedMap is the Map implementation of Eclipse Collections and is implemented very differently than a JDK HashMap.

java.util.HashMap is backed by a table of Entry objects. The Entry implementation in HashMap has hashcode, key, value, next as members, HashMap essentially caches the hashcode of keys.

UnifiedMap schematic

A UnifiedMap is backed by an array where keys and values occupy alternate slots in the array : protected transient Object[] table . The flattened array structure stores only the keys and values there by creating a leaner Map implementation with reduced memory footprint. Added benefit of having keys and values in consecutive slots is improved cache locality.

Collisions in the main array are handled by putting a special object called CHAINED_KEY in key slot. The value slot of CHAINED_KEY (slot after CHAINED_KEY slot) contains another Object[] array similar to main array called CHAINED_OBJECT with keys and values in alternate slots.

Look ups in UnifiedMap use a standard hashcode index algorithm to find the location of a key in the array. If a key is not a CHAINED_KEY then the next slot contains the value. If a key is a CHAINED_KEY then the CHAINED_OBJECT is evaluated linearly to find the required key and the next slot in CHAINED_OBJECT contains the value.

Since UnifiedMap does not cache the hashcode, for each look up, hashcode needs to be computed. So, the performance of UnifiedMap is directly dependent on the hashcode implementation of the key.

Below are a few memory and performance comparisons between JDK 1.8 HashMap and Eclipse Collections 9.0.0 UnifiedMap.

Memory Footprint (lower number the better)

Memory Comparison HashMap<Integer, Integer> vs UnifiedMap<Integer, Integer>

Memory Comparison HashMap<String, String> vs UnifiedMap<String, String>

Performance Tests (higher number the better)

Source code for memory tests and performance tests is available on GitHub.

Summary: Eclipse Collections UnifiedMap has ~50% smaller memory footprint compared to JDK HashMap, but that comes with a minor performance implication.

Eclipse Collections is open for contributions. If you like it star it.