A rift in the NTP world


The failure of the Network Time Protocol (NTP) project could be catastrophic. However, what few have noticed is that the attempts to prevent that catastrophe may have created entirely new challenges.

NTP is an Internet Engineering Task Force (IETF) standard, handled by its NTP working group. As Tom Yates described it in an LWN article:

[NTP] quietly and without much fuss performs the critical Internet function of knowing the correct time. Using it, a computer with imperfect communications links may join a distributed community of servers, each of which is either directly attached to a reliable clock, or is trying to best synchronize its clock to one or more better-synchronized members of the community.

First designed in 1985 by David L. Mills, the protocol has been coordinated in recent years by the Network Time Foundation. Today, the foundation develops a number of related standards, including Ntimed, PTPd, Linux PTPd, RADClock, and the General Timestamp API. For most of this time, the primary manager of the project has been Harlan Stenn, who has volunteered thousands of hours at the cost of his own consulting business while his NTP work is only intermittently funded.

Several years ago, the project's inadequate funding became known in the media and Stenn received partial funding from the Linux Foundation's Core Infrastructure Initiative, which was started after the discovery of how the minimal resources of the OpenSSL project left systems exposed to the Heartbleed vulnerability. Searching for additional funding, Stenn contacted the Internet Civil Engineering Institute (ICEI) and began working with two of its representatives, Eric S. Raymond and Susan Sons.

However, the collaboration did not go smoothly. According to Stenn, Raymond contributed one patch and had several others rejected, but Stenn's ideas and Raymond's and Sons's were out of sync. "I spent a lot of time trying to work with Susan Sons," Stenn said in a phone interview. "Then all of a sudden I heard they have this great plan to rescue NTP. I wasn't happy with their attitude and approach, because there's a difference between rescuing and offering assistance. [Their plan was] to rescue something, quote unquote, fix it up, and turn it over to a maintenance team." Besides the fact that this plan would eliminate Stenn's role, he considered it impractical because the issue is not merely maintenance, but also continued development of the protocol. The efforts to collaborate finally collapsed when Raymond and Sons created a fork they called Network Time Protocol Secure (NTPsec).

Today, the NTP Foundation lists four main contributors, one of whom is on sabbatical, and acknowledges the contributions of 33 in all. In addition, another seven work on related projects. By contrast, NTPsec lists seven contributors, including Sons. Although NTPsec began by using the NTP code, today neither NTP nor NTPsec shares code or patches with the other.

Both projects would probably more or less agree on the general outline of events given above. Yet it is difficult to be sure, since both Sons and Mark Atwood, NTPsec's Project Manager pro-tem, ignored requests for an interview. However, the details of the two projects' claims could hardly be farther apart. The two projects differ on the scale and cause of NTP's current problems and the approach that should be taken to address those problems.

The NTPsec version

Sons has publicly described the NTPsec interpretation several times, including in a presentation at OSCON and in a podcast interview with Mac Slocum of O'Reilly. In the podcast, Sons depicted NTP as a faltering project run by out-of-touch developers. According to Sons, the build system was on one server whose root password had been lost. Moreover, "the standard of the code was over sixteen years out of date in terms of C coding standards" and could not be fully assessed by modern tools. "We couldn't even guarantee reproducible results across different systems," she added.

Sons also claimed that "security patches weren't being circulated in a timely manner," taking "months to years" for release. Meanwhile, "security patches were being circulated secretly and leaked," although she did not explain how. Instead she offered an anecdote about a group of script-kiddies who knew that NTP was useful for denial of service attacks while remaining unaware of its function.

However, Sons was most concerned about the aging group of developers who maintain low-level Internet software (including NTP) in general. Most of them, she said, "are older than my father.... [and] are not always up to date on the latest techniques and security issues." Many are burning out from trying to maintain critical code while working full-time jobs, and Sons suggested that they "should be retired."

Faced with such chaos, Sons said, she soon realized that "the Internet is going to fall down if I don't fix this." When efforts to gain acceptance for her plans from Stenn and other NTP developers failed, Sons and Raymond started NTPsec, placing the revised code in a Git repository rather than the Bitkeeper one used by the NTP Foundation, rewriting NTP scripts in Python rather than various other languages to make attracting new developers easier, and actively promoting the project in order to attract volunteers.

In her OSCON presentation she listed several accomplishments (Sons refers to the original NTP project as "NTP Classic"):

Due to a reduction in code of over 2/3 (from 227kLOC to 74kLOC), NTPsec was immune to over 50% of NTP Classic vulns BEFORE discovery in the last year.

NTPsec patches security vulnerabilities, on average, within less than 12 hours after discovery. Note that publication is sometimes slowed to coordinate with NTP Classic releases.

NTPsec's vulnerability response has pressured NTP Classic to speed up their response from months-to-years to days-to-weeks upon threats of funders pulling out.

[...] NTPsec is poised to replace NTP Classic in the coming year in installations around the world.

Sons's perspective on her involvement is summarized by the title of her OSCON presentation: "Saving Time." She has since become president of ICEI; she described herself in the presentation as having "moved on" and is no longer involved with NTPsec on a daily basis.

Meanwhile, a web search shows that media coverage of events accepts Sons's account while rarely attempting to hear NTP's side of the story. Cory Doctorow repeated the NTPsec version, and so did Brady Dale of the Observer, while Steven J. Vaughan-Nichols recommended NTPsec over NTP. The security site UpGuard was equally unquestioning, while CircleID, a site specializing in Internet infrastructure, only revised its coverage after complaints from representatives of NTP. In public, the NTPsec version of events has become the official one.

The NTP side

NTPsec depicted NTP as being in a state of total disorder. However, in communications with me, Stenn offered a radically different story. In Stenn's version of events, NTPsec, far from being the savior of the Internet, has misplaced priorities and its contributors lack the necessary experience to develop the protocol and keep it secure.

Stenn denied many of Sons's statements outright. For example, asked about Sons's story about losing the root password, he dismissed it as "a complete fabrication." Similarly, in response to her remarks about older tools and reproducible results across different systems, Stenn responded: "We build on many dozens of different versions of different operating systems, on a wide variety of hardware architectures [...] If there was a significant problem, why hasn't somebody reported it to us?"

Asked about how current the code is, Stenn stated that "the code has been and continues to be written to compile and run on currently available and currently used systems." Stenn conceded that some code only builds on older machines, yet pointed out that many old machines are still running. "If hardware is still in use, from our point of view there is an actual benefit to doing what we can to make sure folks can build the latest code on older machines."

As for security patches, Stenn acknowledged that NTP currently lacks the funding for a much-needed replacement of Autokey, the code that authenticates NTP servers. However, he noted that NTP released five major patches in 2016, and claimed that it was up to date as of the end of November 2016. He added, "I have no idea what she's talking about [in regard to] secret circulation of patches or leaked patches."

Moreover, Stenn questioned the accomplishments listed in Sons's presentation. In particular, the reduction of NTPsec's code base, even allowing for the relative compactness of code written in Python, becomes less impressive in light of Stenn's explanation that NTP is "the only reference implementation for NTP, and that means we have to provide complete functionality." Stenn claimed that NTPsec has "removed lots of stuff that has zero reported bugs in them, like sntp, the ntpsnmd code, and various refclocks." Although a less than complete implementation might have its uses, Stenn claimed that NTPsec has gone too far in removing code, and that its bug repairs have sometimes been at the cost of reduced functionality.

In general, Stenn wondered if, after only a couple of years' work, NTPsec contributors have the experience necessary to work with the code. His own understanding of the protocol has changed several times during his decades of work, and he warned that "if you don't understand how everything works and where it fits into place, when things get busy, horrible things can happen." The NTPsec story frequently spoke of free-software ideals such as openness, transparency, and a welcoming environment to all contributors, "but this isn't a democratic process. It's a scientific process, and this isn't somebody's turn to go ahead and take theirs at the wheel driving the bus."

Still, the NTPsec fork has caused some changes in the NTP project. After NTPsec began, the foundation felt the need to commission regular financial audits, and to continue code audits that were begun in 2006.

"Creative destruction ('let's see what happens if we throw something into the works') is a horrible way to provide core Internet structure," Stenn concluded.

One step forward, two steps back?

For outsiders, which version of events is closer to the truth is difficult to assess. Probably few are competent to judge. However, assigning blame is beside the point.

What is of concern is that acceptance of the two implementations of the NTP protocol has been based largely on the most appealing story, and not on the quality of the code. NTPsec's constant analogy to the need to support OpenSSL evokes an immediate concerned response from free-software supporters, but, if Stenn is correct in his assertions, the situations of NTP and OpenSSL are not usefully comparable.

In particular, having two separate projects may be no more than a duplication of effort. Although having competing projects can sometimes benefit free software, in this case, having two warring projects risks diluting the already limited resources and support being contributed to put the protocol on a reliable footing.

Despite all the efforts of both projects, the possibility remains that the dangers to the protocol are as great today as they were before anyone attempted to address them. Already, where once only Stenn was looking for support, now Raymond is in a somewhat similar position, as NTPsec has lost its Core Infrastructure Initiative funding as of September 2016. It is all too easy to imagine the struggle for survival growing worse for everyone.

[Update: As noted in the comments, it was the scripts that were rewritten in Python for NTPsec.]

