Judy, as described at http://judy.sourceforge.net/, is a C library that implements a sparse dynamic array.

In C, a Judy array is declared simply as a null pointer.

A Judy array consumes memory only when it is populated, yet can grow to take advantage of all available memory if desired.

Not many PHP developers are aware of this library, which is available as a C extension (PECL) for PHP: http://php.net/manual/en/intro.judy.php

I want to give you a quick list of the pros and cons of using Judy arrays in your application, along with a brief benchmark comparison against the more common array implementations available in PHP.

Array Implementations Tested

Array()

ArrayObject()

SplFixedArray()

Judy()
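
For reference, here is a minimal sketch of how each of the four implementations can be constructed. The helper name makeContainers() is hypothetical and is not taken from the benchmarking framework. Judy is only created when the PECL extension is installed, and Judy::INT_TO_MIXED is just one of the array types the extension offers, picked here as an assumption.

```php
<?php
// Hypothetical helper: build one container per implementation tested in this post.
function makeContainers(int $size): array
{
    $containers = [
        'Array'         => [],
        'ArrayObject'   => new ArrayObject(),
        'SplFixedArray' => new SplFixedArray($size), // size is given up front
    ];

    // Judy comes from a PECL extension, so only use it when it is available.
    if (class_exists('Judy')) {
        // INT_TO_MIXED maps integer keys to arbitrary PHP values.
        $containers['Judy'] = new Judy(Judy::INT_TO_MIXED);
    }

    return $containers;
}

foreach (makeContainers(3) as $name => $container) {
    // Every implementation accepts the same square-bracket syntax.
    $container[0] = 'first';
    echo $name, ': ', $container[0], PHP_EOL;
}
```

Because all of them support square-bracket access, the same test body can be run against each one without changes.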

What I Analyzed

Execution time and memory required to create 100 instances of each implementation.

Execution time, peak memory, and allocated memory during insertion, iteration, and removal of 10000 items within each implementation.
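
These measurements can be taken with PHP's built-in microtime(), memory_get_usage() and memory_get_peak_usage() functions. The sketch below is only an illustration, not necessarily how the benchmarking framework does it, and the helper name measure() is hypothetical.

```php
<?php
// Hypothetical helper: run $work once and report wall-clock time and memory.
function measure(callable $work): array
{
    $memoryBefore = memory_get_usage();
    $start = microtime(true);

    // Hold on to the result so the memory it occupies is still counted below.
    $result = $work();

    $seconds = microtime(true) - $start;

    return [
        'seconds'   => $seconds,
        'allocated' => memory_get_usage() - $memoryBefore,
        'peak'      => memory_get_peak_usage(), // high-water mark for the whole process
        'result'    => $result,
    ];
}

// Example: append 10000 items to a plain array.
$stats = measure(function () {
    $items = [];
    for ($i = 0; $i < 10000; $i++) {
        $items[] = $i;
    }
    return $items;
});

printf(
    "%.4fs, %d bytes allocated, %d bytes peak\n",
    $stats['seconds'],
    $stats['allocated'],
    $stats['peak']
);
```

Note that the peak value covers the whole process, so it is safest to measure each implementation in a separate run.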

Benchmarking Framework

As with my testing of common NoSQL databases, I wrote a simple benchmarking framework that you can use to run your own tests that mimic your application.

You can find it here: https://github.com/AlekseyKorzun/benchmarker-array-php

Creation of 100 Instances

As you can see, when you create 100 instances the difference in performance is negligible: execution time and memory utilization are pretty much identical across implementations.

100 is a pretty generous number (in my opinion) unless you deal with a collection-driven application.

If that’s the case, as the number of instances goes up the Array() implementation will perform slightly better.

Appending 10000 Items

Speed-wise, Array() walks away from every other implementation, which is not surprising. Its peak memory usage is slightly higher than the rest, and its memory consumption is almost identical to that of ArrayObject().

ArrayObject(), SplFixedArray() and Judy() all finish execution at the 0.06 mark.

But when it comes to memory, Judy() is the winner in this benchmark, leading SplFixedArray() by roughly 80K on peak usage and about 78K on utilization.

Removing 10000 Items

During removal of items we see results identical to our append test. The only difference is that SplFixedArray() seems to be 0.1s faster than ArrayObject() and Judy().

Iteration Over 10000 Items

The iteration test produced the same results as the removal test. Execution-wise, Array() won hands down, and the memory utilization trophy goes to Judy().

Conclusion

As you can see, on a smaller scale of things (when dealing with 10000 items) the difference between the approaches is not that great.

But it’s very clear that Array() is fast and to the point when it comes to storing data, and that Judy() uses less memory than the memory-conscious SplFixedArray().



Based on my benchmarks I can point out the following things:

As your data grows, Judy() arrays store it more efficiently. No question about that.

Iteration and data manipulation will always be faster in the Array() implementation.

SplFixedArray() will iterate/manipulate large data sets faster than Judy().

Unlike with SplFixedArray(), you don’t have to set an initial array size with Judy(), which might be a plus for some developers.

When storing data, unless you are using features that ArrayObject() has to offer, it’s better to stick to a simple Array() implementation.
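
To illustrate the point about initial size, here is a small sketch of what happens when you write past the end of a SplFixedArray() compared to the other implementations. The Judy part only runs when the PECL extension is installed, and the Judy::INT_TO_MIXED type is my assumption for this example.

```php
<?php
$fixed = new SplFixedArray(2);
$fixed[0] = 'a';
$fixed[1] = 'b';

try {
    $fixed[2] = 'c'; // index is outside the declared size
} catch (RuntimeException $e) {
    echo 'SplFixedArray: ', $e->getMessage(), PHP_EOL;
    $fixed->setSize(3); // grow explicitly, then retry
    $fixed[2] = 'c';
}

$plain = [];
$plain[2] = 'c'; // a plain array accepts any key without sizing

if (class_exists('Judy')) {
    $judy = new Judy(Judy::INT_TO_MIXED);
    $judy[1000000] = 'c'; // sparse key, no size declared
}

echo 'Fixed array size is now ', $fixed->getSize(), PHP_EOL;
```

With SplFixedArray() the resize is an explicit step in your code, while Array() and Judy() grow on their own.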

If you are not dealing with tons of data, you may still optimize as you wish, but the gains will be minimal unless you are serving tons of requests per second and really need to squeeze more performance out of your application.

As always, I recommend forking my benchmarking application and extending it to use your own data to get an idea of how much performance you might gain from switching over.

You can also switch one of your web servers to use Judy() instead of X() and observe how it responds to the change via resource monitoring over the next few days.

I hope you enjoyed reading this post!

P.S.

You can view the spreadsheet of benchmark data here: https://docs.google.com/spreadsheet/pub?key=0AhePUdRMAppIdC0tZWNELTRsbElpZF81V28wTnpWaEE&output=html