A lot of time has passed since we started working on the custom allocation framework with Paul Pedriana. The core of the solution (the FastAllocBase class, bug #20422) landed in trunk half a year ago.

After that check-in we started working on the class hierarchy in JavaScriptCore, because every class that is instantiated with operator new needs to inherit from FastAllocBase. Now, half a year later, almost every class in JavaScriptCore that needs it inherits from FastAllocBase.
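To make the inheritance requirement concrete, here is a minimal sketch of the idea, not the actual WebKit code. The names AllocBaseSketch, sketchMalloc, sketchFree and Node are hypothetical, and the hooks simply wrap std::malloc and std::free where a real port would call into its custom allocator:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters so the sketch can show which path an allocation took.
static std::size_t g_allocations = 0;
static std::size_t g_deallocations = 0;

// Hypothetical allocator hooks; a real port would call its custom allocator here.
inline void* sketchMalloc(std::size_t size)
{
    void* p = std::malloc(size);
    if (!p)
        throw std::bad_alloc();
    ++g_allocations;
    return p;
}

inline void sketchFree(void* p)
{
    if (!p)
        return;
    ++g_deallocations;
    std::free(p);
}

// Simplified stand-in for a FastAllocBase-like class (assumption, not the real one):
// class-level operator new/delete route heap allocations of derived classes
// through the hooks above.
class AllocBaseSketch {
public:
    void* operator new(std::size_t size) { return sketchMalloc(size); }
    void operator delete(void* p) { sketchFree(p); }
    void* operator new[](std::size_t size) { return sketchMalloc(size); }
    void operator delete[](void* p) { sketchFree(p); }
};

// A class created with `new` only has to inherit from the base.
class Node : public AllocBaseSketch {
public:
    explicit Node(int value) : m_value(value) { }
    int value() const { return m_value; }
private:
    int m_value;
};

// Allocates and frees one Node; returns how many allocations went through the hooks.
inline std::size_t allocateOneNode()
{
    std::size_t before = g_allocations;
    Node* node = new Node(42);
    assert(node->value() == 42);
    delete node;
    return g_allocations - before;
}
```

A class that does not inherit from the base keeps using the global operator new, which is why each class instantiated with new has to be touched individually.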

These changes made it possible to enable TCmalloc on the Qt-related WebKit ports.

Results for x86

Let's see the speed results of QtWebKit on x86-Linux (with JIT) in the following table:

QtWebKit x86-Linux   System malloc   TCmalloc    Improvement
SunSpider            774 ms          743 ms      ~4.0% faster
V8                   3560 ms         3492 ms     ~2.0% faster
WindScorpion         19524 ms        17875 ms    ~8.4% faster

SunSpider: 774ms -> 743ms (-31ms, ~4.0% faster)

V8: 3560ms -> 3492ms (-68ms, ~2.0% faster)

WindScorpion: 19524ms -> 17875ms (-1649ms, ~8.4% faster)

Results for ARM

We do benchmarking on ARM hardware as well. The effect of enabling TCmalloc on ARM (with JIT) is as follows:

QtWebKit ARM-Linux   System malloc   TCmalloc    Improvement
SunSpider            10967 ms        10480 ms    ~4.4% faster
V8                   24172 ms        22788 ms    ~5.7% faster
WindScorpion         281195 ms       269435 ms   ~4.1% faster

SunSpider: 10967ms -> 10480ms (-487ms, ~4.4% faster)

V8: 24172ms -> 22788ms (-1384ms, ~5.7% faster)

WindScorpion: 281195ms -> 269435ms (-11760ms, ~4.1% faster)
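For reference, the improvement figures are the relative drop in run time compared to the system malloc run. A small sketch of that calculation follows; the helper name improvementPercent is hypothetical and only illustrates the arithmetic:

```cpp
#include <cassert>

// Relative improvement in percent when the run time drops
// from baselineMs (system malloc) to improvedMs (TCmalloc).
inline double improvementPercent(double baselineMs, double improvedMs)
{
    assert(baselineMs > 0.0);
    return (baselineMs - improvedMs) / baselineMs * 100.0;
}
```

For example, the ARM SunSpider run gives improvementPercent(10967, 10480), which is (10967 - 10480) / 10967 * 100, or roughly 4.4.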

Future

The integration of the custom allocation framework into WebCore is still in progress, so I cannot show performance results for WebKit as a whole yet.

After all...

As the charts show, enabling TCmalloc on the Qt port of WebKit gave us a measurable performance improvement. However, there is always another side of the coin... I'll talk about the memory costs in another post. :-)