A few days ago, ##java happened to discuss sets, bit patterns, and things like that, and I mentioned EnumSet and that I find it useful. The rest of the gang wanted to know how it actually measures up, so this is a short evaluation of how EnumSet stacks up for some operations. We are going to look at memory usage and CPU performance.

EnumSet classes

There are two different versions of EnumSet:

* RegularEnumSet, used when the enum has 64 or fewer values

* JumboEnumSet, used when the enum has more than 64 values

Looking at the code, it is easy to see that RegularEnumSet stores the bit pattern in a single long and that JumboEnumSet uses a long[]. This of course means that a JumboEnumSet is quite a bit more expensive, both in memory usage and in CPU usage (at least one extra level of memory access).
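To make the single-long idea concrete, here is a minimal sketch of the same technique done by hand: each constant's ordinal() selects one bit in a long. The Color enum and the BitPatternSketch class are hypothetical names made up for this illustration, and this is not the JDK's source, only the general idea behind it.

```java
import java.util.EnumSet;
import java.util.Set;

class BitPatternSketch {
    // Hypothetical enum, used only for this illustration.
    enum Color { RED, GREEN, BLUE }

    // Set the bit that corresponds to the constant's ordinal.
    static long add(long bits, Color c) {
        return bits | (1L << c.ordinal());
    }

    // Test whether the constant's bit is set.
    static boolean contains(long bits, Color c) {
        return (bits & (1L << c.ordinal())) != 0;
    }

    public static void main(String[] args) {
        long bits = 0L;
        bits = add(bits, Color.RED);
        bits = add(bits, Color.BLUE);
        System.out.println(contains(bits, Color.RED));   // true
        System.out.println(contains(bits, Color.GREEN)); // false

        // EnumSet offers the same membership test behind the Set interface.
        Set<Color> set = EnumSet.of(Color.RED, Color.BLUE);
        System.out.println(set.contains(Color.GREEN));   // false
    }
}
```

A single long has 64 bits, which is why an enum with more than 64 constants needs an array of longs instead.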

Memory usage

I created a little program that just holds one million Sets with a few values in each of them.

Note: the enumproject.zip was built by your editor, not your author; any problems with it are the fault of dreamreal and not ernimril. The project is mostly for source reference and not for actually running the benchmark.

    List<Set<Token>> tokens = new ArrayList<> ();
    for (int i = 0; i < 1_000_000; i++) {
        Set<Token> s = new HashSet<> ();
        s.add (Token.LF);
        s.add (Token.CR);
        s.add (Token.CRLF);
        tokens.add (s);
    }

Heap memory usage for this program was about 250 MB according to JVisualVM.

Changing new HashSet<> (); to EnumSet.noneOf (Token.class); brings heap memory usage down to about 70 MB.

Using SmallEnum instead of Token, the HashSet version still uses about 250 MB, but the EnumSet version drops to 39 MB. I find it quite nice to save that much memory.
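The two variants differ only in how each Set is created, which can be sketched with a Supplier so the rest of the loop stays the same. The nested Token enum here is a three-constant stand-in (the real Token is presumably much larger, since it is the jumbo case), and SetFactorySketch and build are hypothetical names for this illustration. It does not measure memory; attach a profiler such as JVisualVM for that.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

class SetFactorySketch {
    // Stand-in for the article's Token enum; only the three constants used here.
    enum Token { LF, CR, CRLF }

    // Builds `count` sets with the same three values, using the given factory.
    static List<Set<Token>> build(int count, Supplier<Set<Token>> factory) {
        List<Set<Token>> tokens = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Set<Token> s = factory.get();
            s.add(Token.LF);
            s.add(Token.CR);
            s.add(Token.CRLF);
            tokens.add(s);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<Set<Token>> hashSets = build(1_000, HashSet::new);
        List<Set<Token>> enumSets = build(1_000, () -> EnumSet.noneOf(Token.class));
        // Same contents either way; only the backing representation differs.
        System.out.println(hashSets.get(0).equals(enumSets.get(0)));
    }
}
```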

CPU performance

I constructed two simple tests, shown below, that call a few methods on a Set that is either an EnumSet or a HashSet, depending on the run. The enums have a few Sets that contain different subsets of the enum constants, and the isX methods only do return xSet.contains(this);

    @Benchmark
    public void testRegular() throws InterruptedException {
        SmallEnum s = SmallEnum.A;
        boolean isA = s.isA ();
        boolean isB = s.isB ();
        boolean isC = s.isC ();
        boolean res = isA | isB | isC;
    }

    @Benchmark
    public void testJumbo() throws InterruptedException {
        Token t = Token.WHITESPACE;
        boolean isWhitespace = t.isWhitespace ();
        boolean isIdentifier = t.isIdentifier ();
        boolean isKeyword = t.isKeyword ();
        boolean isLiteral = t.isLiteral ();
        boolean isSeparator = t.isSeparator ();
        boolean isOperator = t.isOperator ();
        boolean isBitOrShiftOperator = t.isBitOrShiftOperator ();
        boolean res = isWhitespace | isIdentifier | isKeyword | isLiteral |
            isSeparator | isOperator | isBitOrShiftOperator;
    }
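The enums themselves are not shown here, so the following is only a guess at what the SmallEnum side might look like, based on the description of the isX methods. The constants, the contents of each set, and the wrapper class IsMethodsSketch are all assumptions made for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

class IsMethodsSketch {
    // Hypothetical reconstruction; the constants and set contents are assumptions.
    enum SmallEnum {
        A, B, C, D;

        // Static fields are initialized after the constants, so they may refer to them.
        private static final Set<SmallEnum> A_SET = EnumSet.of(A, D);
        private static final Set<SmallEnum> B_SET = EnumSet.of(B);
        private static final Set<SmallEnum> C_SET = EnumSet.of(C, D);

        public boolean isA() { return A_SET.contains(this); }
        public boolean isB() { return B_SET.contains(this); }
        public boolean isC() { return C_SET.contains(this); }
    }

    public static void main(String[] args) {
        SmallEnum s = SmallEnum.A;
        System.out.println(s.isA() | s.isB() | s.isC());
    }
}
```

For the HashSet run, the same fields could be built with new HashSet<>(EnumSet.of(...)) so that only the Set implementation changes between runs.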

I ran the benchmarks with JMH to find out how fast this is.

Using HashSet:

    Benchmark                      Mode  Cnt          Score          Error  Units
    EnumSetBenchmark.testJumbo    thrpt   20   46787074.985 ± 2373288.078  ops/s
    EnumSetBenchmark.testRegular  thrpt   20  124474882.016 ± 2165015.166  ops/s

Using EnumSet:

    Benchmark                      Mode  Cnt          Score         Error  Units
    EnumSetBenchmark.testJumbo    thrpt   20  112456096.790 ± 320582.588  ops/s
    EnumSetBenchmark.testRegular  thrpt   20  563668720.636 ± 594323.541  ops/s

This is of course quite a silly test, and one can argue that it does not do much that is useful, but it still gives a good indication that the performance gains are there. For this kind of operation, using EnumSet is about 2.4 times faster for jumbo enums and about 4.5 times faster for small (regular) enums.
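The speedup figures are simply the EnumSet throughput divided by the HashSet throughput for each benchmark. A small sketch of that arithmetic, using the scores from the two result listings (SpeedupSketch and speedup are hypothetical names):

```java
import java.util.Locale;

class SpeedupSketch {
    // Ratio of EnumSet throughput to HashSet throughput (both in ops/s).
    static double speedup(double enumSetOps, double hashSetOps) {
        return enumSetOps / hashSetOps;
    }

    public static void main(String[] args) {
        double jumbo = speedup(112_456_096.790, 46_787_074.985);
        double regular = speedup(563_668_720.636, 124_474_882.016);
        // Prints: jumbo: 2.4x, regular: 4.5x
        System.out.printf(Locale.ROOT, "jumbo: %.1fx, regular: %.1fx%n", jumbo, regular);
    }
}
```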

I do not claim that your code will see the same speedup, but it might be worth checking out.

Final thoughts