Node-TimSort: Fast Sorting for Node.js

UPDATE: Benchmark results of Node-TimSort with Node.js v4.1.1 available here.

TimSort is an adaptive and stable sorting algorithm based on merging. It requires fewer than n log(n) comparisons when run on partially sorted arrays, uses O(n) memory, and still runs in O(n log n) time in the worst case, for example on random arrays.
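To make "partially sorted" a bit more concrete, here is a minimal sketch that counts the ascending runs in an array. It is only an illustration and not code from the timsort module: the helper name countRuns is hypothetical, and real TimSort also detects strictly descending runs (which it reverses) and enforces a minimum run length. The fewer runs an array contains, the less merging work is left to do.

```javascript
// Count maximal non-descending runs in an array of numbers.
// Simplified illustration only: this is not how the timsort module
// is implemented, and countRuns is a hypothetical helper name.
function countRuns(arr) {
  if (arr.length === 0) return 0;
  var runs = 1;
  for (var i = 1; i < arr.length; i++) {
    // A new run starts every time the sequence goes down.
    if (arr[i] < arr[i - 1]) runs++;
  }
  return runs;
}

console.log(countRuns([1, 2, 3, 4, 5, 6])); // 1 (already sorted)
console.log(countRuns([1, 2, 6, 3, 4, 5])); // 2 (one element out of place)
console.log(countRuns([5, 1, 4, 2, 3, 0])); // 4
```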

I implemented TimSort for Node.js in the timsort module, which is available on Github and npm. The implementation is based on the original TimSort developed by Tim Peters for Python’s lists (code here). TimSort has also been adopted by Java, starting from version 7.

Performance

To study the performance of the implementation I wrote a benchmark, which is available on Github at benchmark/index.js. It compares the timsort module against the default array.sort method when numerically sorting different types of integer arrays (as described here):

Random array

Descending array

Ascending array

Ascending array with 3 random exchanges

Ascending array with 10 random numbers in the end

Array of equal elements

Random Array with many duplicates

Random Array with some duplicates
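As a rough idea of how inputs like these can be produced, here is a minimal sketch of a few generators. All function names are hypothetical and this is not the code in benchmark/index.js; the arrays with many or some duplicates can be approximated by calling random with a small or a moderate max.

```javascript
// Hypothetical generators for the array types listed above.
// These are NOT the functions used in benchmark/index.js.

function ascending(n) {
  var arr = new Array(n);
  for (var i = 0; i < n; i++) arr[i] = i;
  return arr;
}

function descending(n) {
  return ascending(n).reverse();
}

// Random integers in [0, max). A small max yields many duplicates.
function random(n, max) {
  var arr = new Array(n);
  for (var i = 0; i < n; i++) arr[i] = Math.floor(Math.random() * max);
  return arr;
}

// Ascending array with a few randomly chosen pairs swapped.
function ascendingWithExchanges(n, exchanges) {
  var arr = ascending(n);
  for (var k = 0; k < exchanges; k++) {
    var i = Math.floor(Math.random() * n);
    var j = Math.floor(Math.random() * n);
    var tmp = arr[i];
    arr[i] = arr[j];
    arr[j] = tmp;
  }
  return arr;
}

// Ascending array whose last `tail` elements are replaced by random numbers.
function ascendingWithRandomEnd(n, tail) {
  var arr = ascending(n);
  for (var i = Math.max(0, n - tail); i < n; i++) {
    arr[i] = Math.floor(Math.random() * n);
  }
  return arr;
}

function allEqual(n, value) {
  var arr = new Array(n);
  for (var i = 0; i < n; i++) arr[i] = value;
  return arr;
}
```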

For each array type, the sort is repeated several times at different array sizes, and the average execution time is printed. I ran the benchmark on Node v0.12.7 (both pre-compiled and compiled from source; the results are very similar) and obtained the following values:

Array Type                Length   TimSort.sort (ns)   array.sort (ns)   Speedup
Random                        10                2374              4256      1.79
                             100               12709             45903      3.61
                            1000              134876            479581      3.56
                           10000             1724563           6485514      3.76
Descending                    10                1637              2869      1.75
                             100                2631             21267      8.08
                            1000                9330            352918     37.83
                           10000               74009           5114658     69.11
Ascending                     10                1654              1751      1.06
                             100                2596             20159      7.77
                            1000                8253            340309     41.23
                           10000               60613           5045549     83.24
Ascending + 3 Rand Exc        10                1815              1981      1.09
                             100                4126             20564      4.98
                            1000               11490            342398     29.80
                           10000               85632           5062110     59.11
Ascending + 10 Rand End       10                2001              2410      1.20
                             100                6106             23537      3.85
                            1000               17195            337073     19.60
                           10000               99977           4868866     48.70
Equal Elements                10                1581              1710      1.08
                             100                2492              4562      1.83
                            1000                7337             31360      4.27
                           10000               50090            311882      6.23
Many Repetitions              10                1966              2415      1.23
                             100               15115             25965      1.72
                            1000              182287            372412      2.04
                           10000             2382618           5317724      2.23
Some Repetitions              10                1994              2549      1.28
                             100               14432             25101      1.74
                            1000              181708            364835      2
                           10000             2351346           5149683      2.19

TimSort.sort is faster than array.sort on all of the tested array types. In general, the more ordered the array is, the better TimSort.sort performs with respect to array.sort (more than 80 times faster on already sorted arrays of 10000 elements). Also, the bigger the array, the more we benefit from using the timsort module.

These data depend heavily on the Node.js version and on the machine on which the benchmark is run. I strongly encourage you to clone the repository and run the benchmark on your own setup with:

npm run benchmark
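For a quick number without cloning the repository, a minimal timing sketch along the same lines (repeat the sort on fresh copies and average) could look like the following. The helper averageSortTime is hypothetical and is not taken from benchmark/index.js; it times the built-in array.sort here so that it runs without any dependency.

```javascript
// Average time (in nanoseconds) taken by sortFn over `repetitions` runs.
// Hypothetical helper, not the code used in benchmark/index.js.
function averageSortTime(sortFn, source, repetitions) {
  var total = 0;
  for (var r = 0; r < repetitions; r++) {
    var copy = source.slice();           // sort a fresh copy every time
    var start = process.hrtime();
    sortFn(copy);
    var elapsed = process.hrtime(start); // [seconds, nanoseconds]
    total += elapsed[0] * 1e9 + elapsed[1];
  }
  return total / repetitions;
}

function numberCompare(a, b) {
  return a - b;
}

// Random input of 1000 integers.
var input = [];
for (var i = 0; i < 1000; i++) {
  input.push(Math.floor(Math.random() * 1000));
}

var builtin = averageSortTime(function (arr) {
  arr.sort(numberCompare);
}, input, 100);
console.log('array.sort: ' + Math.round(builtin) + ' ns');

// With the timsort module installed, the same helper can time it:
//   var TimSort = require('timsort');
//   averageSortTime(function (arr) { TimSort.sort(arr, numberCompare); }, input, 100);
```

Micro-benchmarks like this are noisy, so treat the output as an indication rather than a measurement.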

Please also note that:

This benchmark is far from exhaustive. Several cases are not considered, and the results must be taken as partial.

Inlining is surely playing an active role in the timsort module’s good performance.

A more accurate comparison of the algorithms would require implementing array.sort in pure JavaScript and counting element comparisons.

array.sort will probably still be faster at lexicographically sorting arrays of numbers. In this case, the timsort module inefficiently converts values to strings inside the compare function and then compares the strings. array.sort, instead, uses a smarter and faster lexicographic comparison of numbers (I will try to do something similar soon).
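Counting element comparisons, as suggested in the notes above, can be sketched by wrapping the compare function. The helper countingCompare is hypothetical and is not part of the timsort module; the same wrapped function can be passed to any sort that accepts a compare function.

```javascript
// Wrap a compare function so that every call is counted.
// Hypothetical helper, not part of the timsort module.
function countingCompare(compare) {
  var counter = { count: 0 };
  counter.compare = function (a, b) {
    counter.count++;
    return compare(a, b);
  };
  return counter;
}

function numberCompare(a, b) {
  return a - b;
}

var counter = countingCompare(numberCompare);
var sorted = [5, 3, 8, 1, 9, 2].sort(counter.compare);

console.log(sorted);        // [ 1, 2, 3, 5, 8, 9 ]
console.log(counter.count); // number of comparisons; depends on the engine
```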

Stability

TimSort is stable, which means that equal items maintain their relative order after sorting. Stability is a desirable property for a sorting algorithm. Consider the following array of items, each with a height and a weight.

[
  { height: 100, weight: 80 },
  { height: 90, weight: 90 },
  { height: 70, weight: 95 },
  { height: 100, weight: 100 },
  { height: 80, weight: 110 },
  { height: 110, weight: 115 },
  { height: 100, weight: 120 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 100, weight: 135 },
  { height: 75, weight: 140 },
  { height: 70, weight: 140 }
]

The items are already sorted by weight. Sorting the array by height with the timsort module results in the following array:

[
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 70, weight: 140 },
  { height: 75, weight: 140 },
  { height: 80, weight: 110 },
  { height: 90, weight: 90 },
  { height: 100, weight: 80 },
  { height: 100, weight: 100 },
  { height: 100, weight: 120 },
  { height: 100, weight: 135 },
  { height: 110, weight: 115 }
]

Items with the same height are still sorted by weight, which means they preserved their relative order.

array.sort, instead, is not guaranteed to be stable. In Node v0.12.7, sorting the previous array by height with array.sort results in:

[
  { height: 70, weight: 140 },
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 75, weight: 140 },
  { height: 80, weight: 110 },
  { height: 90, weight: 90 },
  { height: 100, weight: 100 },
  { height: 100, weight: 80 },
  { height: 100, weight: 135 },
  { height: 100, weight: 120 },
  { height: 110, weight: 115 }
]

As you can see, the sort did not preserve the weight ordering for items with the same height.
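The check done by eye above can also be written as a small helper. This is a sketch with a hypothetical function name, keepsSecondaryOrder, and it only makes sense when the input was already sorted by the secondary field (weight here) before sorting by the key (height). The two sample arrays are the first four items of the two results shown above.

```javascript
// Returns true when items sharing the same `key` value appear in
// non-decreasing order of `secondary`. Hypothetical helper; it assumes
// the input was sorted by `secondary` before being sorted by `key`.
function keepsSecondaryOrder(items, key, secondary) {
  var last = {};
  for (var i = 0; i < items.length; i++) {
    var k = items[i][key];
    var s = items[i][secondary];
    if (Object.prototype.hasOwnProperty.call(last, k) && s < last[k]) {
      return false;
    }
    last[k] = s;
  }
  return true;
}

// First four items of the timsort result.
var stableResult = [
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 },
  { height: 70, weight: 140 }
];

// First four items of the array.sort result on Node v0.12.7.
var unstableResult = [
  { height: 70, weight: 140 },
  { height: 70, weight: 95 },
  { height: 70, weight: 125 },
  { height: 70, weight: 130 }
];

console.log(keepsSecondaryOrder(stableResult, 'height', 'weight'));   // true
console.log(keepsSecondaryOrder(unstableResult, 'height', 'weight')); // false
```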

Usage

To use the module in your project, install the package:

npm install --save timsort

And use it:

var TimSort = require('timsort');

var arr = [...];
TimSort.sort(arr);

Like array.sort(), the timsort module sorts according to lexicographical order by default. You can also provide your own compare function (to sort any kind of object):

function numberCompare(a, b) {
  return a - b;
}

var arr = [...];
var TimSort = require('timsort');
TimSort.sort(arr, numberCompare);
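The difference between the default lexicographical order and a numeric compare function is easy to miss with numbers. The sketch below uses the built-in array.sort so that it runs without installing anything. The function lexicographicCompare is a hypothetical illustration of comparing values as strings and is not the module's actual code.

```javascript
var values = [10, 9, 1, 100, 25];

// Default order: values are compared as strings.
var byDefault = values.slice().sort();
console.log(byDefault); // [ 1, 10, 100, 25, 9 ]

// Hypothetical compare function that makes the string comparison explicit.
function lexicographicCompare(a, b) {
  var x = String(a);
  var y = String(b);
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}
var explicit = values.slice().sort(lexicographicCompare);
console.log(explicit); // [ 1, 10, 100, 25, 9 ]

// Numeric order needs a compare function.
function numberCompare(a, b) {
  return a - b;
}
var numeric = values.slice().sort(numberCompare);
console.log(numeric); // [ 1, 9, 10, 25, 100 ]
```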

You can also sort only a specific subrange of the array: