Once again, we can construct a simple experiment. Both deduplication and interning are trivially implementable with HashMap and ConcurrentHashMap, which gives us a very nice JMH benchmark:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
public class StringIntern {

    // Number of distinct Strings pushed through each interner
    @Param({"1", "100", "10000", "1000000"})
    private int size;

    private StringInterner str;
    private CHMInterner chm;
    private HMInterner hm;

    @Setup
    public void setup() {
        str = new StringInterner();
        chm = new CHMInterner();
        hm = new HMInterner();
    }

    // Delegates to the JVM-native String table
    public static class StringInterner {
        public String intern(String s) {
            return s.intern();
        }
    }

    @Benchmark
    public void intern(Blackhole bh) {
        for (int c = 0; c < size; c++) {
            bh.consume(str.intern("String" + c));
        }
    }

    // Interner backed by ConcurrentHashMap
    public static class CHMInterner {
        private final Map<String, String> map;

        public CHMInterner() {
            map = new ConcurrentHashMap<>();
        }

        public String intern(String s) {
            String exist = map.putIfAbsent(s, s);
            return (exist == null) ? s : exist;
        }
    }

    @Benchmark
    public void chm(Blackhole bh) {
        for (int c = 0; c < size; c++) {
            bh.consume(chm.intern("String" + c));
        }
    }

    // Interner backed by plain HashMap (not thread-safe)
    public static class HMInterner {
        private final Map<String, String> map;

        public HMInterner() {
            map = new HashMap<>();
        }

        public String intern(String s) {
            String exist = map.putIfAbsent(s, s);
            return (exist == null) ? s : exist;
        }
    }

    @Benchmark
    public void hm(Blackhole bh) {
        for (int c = 0; c < size; c++) {
            bh.consume(hm.intern("String" + c));
        }
    }
}

The test tries to intern lots of Strings, but the actual interning happens only on the first walk through the loop; after that, we are only checking the Strings against the existing mappings. The size parameter controls the number of Strings we intern, thus limiting the String table size we are dealing with. This is the usual case with interners like that.
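To make the "first walk inserts, later walks only look up" behavior concrete, here is a minimal standalone sketch. It is not part of the benchmark; the class name InternerSketch and the size() helper are made up for illustration, and the interner body mirrors CHMInterner from the benchmark.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class InternerSketch {

    // Same shape as CHMInterner in the benchmark; size() is added for illustration only.
    static final class CHMInterner {
        private final Map<String, String> map = new ConcurrentHashMap<>();

        String intern(String s) {
            String exist = map.putIfAbsent(s, s);
            return (exist == null) ? s : exist;
        }

        int size() {
            return map.size();
        }
    }

    // Returns true if a second walk over the same values hands back
    // the instances that were stored during the first walk.
    static boolean secondWalkHitsCanonical(CHMInterner interner, int size) {
        // First walk: every String is new, so each call inserts a mapping.
        String[] canonical = new String[size];
        for (int c = 0; c < size; c++) {
            canonical[c] = interner.intern("String" + c);
        }

        // Second walk: "String" + c creates a fresh but equal String,
        // and the interner returns the previously stored instance instead.
        boolean allSame = true;
        for (int c = 0; c < size; c++) {
            String fresh = "String" + c;
            allSame &= (fresh != canonical[c]) && (interner.intern(fresh) == canonical[c]);
        }
        return allSame;
    }

    public static void main(String[] args) {
        CHMInterner interner = new CHMInterner();
        boolean ok = secondWalkHitsCanonical(interner, 100);
        System.out.println("table size = " + interner.size());
        System.out.println("second walk returned canonical instances: " + ok);
    }
}
```

The table stops growing once every distinct value has been seen, which is why the size parameter bounds the table size in the benchmark.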

Running the benchmark with JDK 8u131:

Benchmark             (size)  Mode  Cnt       Score       Error  Units
StringIntern.chm           1  avgt   25       0.038 ±     0.001  us/op
StringIntern.chm         100  avgt   25       4.030 ±     0.013  us/op
StringIntern.chm       10000  avgt   25     516.483 ±     3.638  us/op
StringIntern.chm     1000000  avgt   25   93588.623 ±  4838.265  us/op
StringIntern.hm            1  avgt   25       0.028 ±     0.001  us/op
StringIntern.hm          100  avgt   25       2.982 ±     0.073  us/op
StringIntern.hm        10000  avgt   25     422.782 ±     1.960  us/op
StringIntern.hm      1000000  avgt   25   81194.779 ±  4905.934  us/op
StringIntern.intern        1  avgt   25       0.089 ±     0.001  us/op
StringIntern.intern      100  avgt   25       9.324 ±     0.096  us/op
StringIntern.intern    10000  avgt   25    1196.700 ±   141.915  us/op
StringIntern.intern  1000000  avgt   25  650243.474 ± 36680.057  us/op

Oops, what gives? String.intern() is significantly slower! The answer lies somewhere in the native implementation ("native" does not equal "better", folks), which is clearly visible with perf record -g:

-    6.63%     0.00%  java     [unknown]     [k] 0x00000006f8000041
   - 0x6f8000041
      - 6.41% 0x7faedd1ee354
         - 6.41% 0x7faedd170426
            - JVM_InternString
               - 5.82% StringTable::intern
                  - 4.85% StringTable::intern
                       0.39% java_lang_String::equals
                       0.19% Monitor::lock
                     + 0.00% StringTable::basic_add
                  - 0.97% java_lang_String::as_unicode_string
                       resource_allocate_bytes
                 0.19% JNIHandleBlock::allocate_handle
                 0.19% JNIHandles::make_local

While the JNI transition costs quite a bit by itself, we seem to spend quite some time in the StringTable implementation. Poking around it, you will eventually discover -XX:+PrintStringTableStatistics, which will print something like:

StringTable statistics:
Number of buckets       :     60013 =    480104 bytes, avg   8.000
Number of entries       :   1002714 =  24065136 bytes, avg  24.000
Number of literals      :   1002714 =  64192616 bytes, avg  64.019
Total footprint         :           =  88737856 bytes
Average bucket size     :    16.708  ; <---- !!!!!!

16 elements per bucket in a chained hash table speaks "overload, overload, overload". What is worse, that string table is not resizable (although there was experimental work to make it resizable, that was shot down for "reasons"). It might be alleviated by setting a larger -XX:StringTableSize, for example to 10M:

Benchmark             (size)  Mode  Cnt       Score       Error  Units

# Default, copied from above
StringIntern.chm           1  avgt   25       0.038 ±     0.001  us/op
StringIntern.chm         100  avgt   25       4.030 ±     0.013  us/op
StringIntern.chm       10000  avgt   25     516.483 ±     3.638  us/op
StringIntern.chm     1000000  avgt   25   93588.623 ±  4838.265  us/op

# Default, copied from above
StringIntern.intern        1  avgt   25       0.089 ±     0.001  us/op
StringIntern.intern      100  avgt   25       9.324 ±     0.096  us/op
StringIntern.intern    10000  avgt   25    1196.700 ±   141.915  us/op
StringIntern.intern  1000000  avgt   25  650243.474 ± 36680.057  us/op

# StringTableSize = 10M
StringIntern.intern        1  avgt    5       0.097 ±     0.041  us/op
StringIntern.intern      100  avgt    5      10.174 ±     5.026  us/op
StringIntern.intern    10000  avgt    5    1152.387 ±   558.044  us/op
StringIntern.intern  1000000  avgt    5  130862.190 ± 61200.783  us/op
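The improvement at size = 1000000 is consistent with simple load-factor arithmetic on the statistics printed earlier: the average chain length in a chained hash table is entries divided by buckets. Here is a small sketch of that calculation. The class name BucketLoad is made up for illustration, and the 10M bucket count is taken as exactly 10,000,000, which is an assumption about how the flag value was passed; reading it as 10 * 1024 * 1024 gives a very similar result.

```java
public class BucketLoad {

    // Average chain length of a chained hash table: entries / buckets.
    static double averageBucketSize(long entries, long buckets) {
        return (double) entries / buckets;
    }

    public static void main(String[] args) {
        long entries = 1_002_714L;         // "Number of entries" from the statistics
        long defaultBuckets = 60_013L;     // "Number of buckets" from the statistics
        long largerBuckets = 10_000_000L;  // assumption: what StringTableSize = 10M amounts to

        System.out.println("default table: " + averageBucketSize(entries, defaultBuckets));
        System.out.println("10M table:     " + averageBucketSize(entries, largerBuckets));
    }
}
```

With the default table, each lookup walks a chain of about 16.7 entries on average; with roughly ten million buckets that drops to about 0.1, so most lookups touch at most one entry.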