Researchers from CWI fixed a bug in the widely used object-oriented programming language Java in February 2015. They found an error in a broadly applied sorting algorithm, TimSort, which could crash programs. The bug had already been known from 2013 but was never correctly resolved. When researcher Stijn de Gouw from the CWI research group Formal Methods attempted to prove the correctness of TimSort, he encountered the bug that could threaten the security. He filed a bug report with an improved version, which has now been accepted. This version of TimSort is used by Android.

Java is used for server software, Internet-based banking services and, for instance, in computer games like Minecraft. The programming language is broadly used because it provides a lot of support in the form of libraries. Developers don’t have to invent a function to sort data, for instance, since they can simply get it from the library support. The sorting algorithm TimSort is part of the java.util.Arrays and java.util.Collections libraries. It is named after its creator, Tim Peters, who designed it in 2002 for the Python programming language, where it is now the default sorting algorithm. The sorting function is often used, for example in the analysis of data. De Gouw discovered that a previous fix of the error was wrong. The bug causes programs to crash when used on a large input that is sorted in a specific way.

For his research Stijn de Gouw used KeY, a state-of-the-art open source verification tool, to prove the mechanical correctness of TimSort. Then he designed a correction, maintaining performance. Frank de Boer, head of the Formal Methods group says: "So far, it was one of the hardest correctness proofs ever of an existing Java library. It required more than two million rules of inference and thousands of manual steps.” He adds: "With such an important language like Java, it is important that software does not crash. This result illustrates the importance of formal methods for the society." The study was co-funded by the EU project Envisage. Software is one of the strategic themes of the Centrum Wiskunde & Informatica (CWI) in Amsterdam, where the research was conducted.

The full analysis can be found on http://www.envisage-project.eu/timsort-specification-and-verification/ .

More information

- the bug report, 'TimSort fails with ArrayIndexOutOfBoundsException on worst case long arrays'

- the EU project Envisage

- CWI Formal Methods group

- the used verification tool KeY from Richard Bubel and Reiner Hähnle

Update

- an Envisage blogpost on this item by Stijn de Gouw was released on 24 February. It had already 70,000 hits on the first day.

- Google programming language Go also uses TimSort

- top researchers in Formal Methods and developers worldwide refer to this research, such as Moshe Vardi and Peter O’Hearn