mnordhoff: What if the monitoring code was aware of leap seconds? Something like: …

For NTP clients, not handing out these servers in the last day before the leap is likely too late. Many ntpd instances will have been running for weeks or months already.

SNTP clients don’t care about the leap second, but it’d potentially protect them from talking to a confused server just after the leap second (until the monitoring system has kicked those out anyway).

Now, I’m not sure what the overlap was between “didn’t announce the leap second” and “were off by a second after the event” (it’s in the data, though, if anyone is up for trying to figure that out). I’m sure it’s less than 100%.
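For anyone who does want to dig into the data, the overlap is just set arithmetic. Here is a minimal sketch, assuming you have already extracted two sets of server identifiers from the monitoring data; the function name, the dictionary keys, and the example addresses are all made up for illustration.

```python
def leap_overlap(no_announce, off_after):
    """Summarize how two sets of server IDs overlap.

    no_announce: servers that did not announce the leap second
    off_after:   servers that were off by a second after the event
    Both inputs are assumed to be extracted from the monitoring data already.
    """
    both = no_announce & off_after
    return {
        "both": len(both),
        "no_announce_only": len(no_announce - off_after),
        "off_after_only": len(off_after - no_announce),
        # Share of the "off after" servers that also failed to announce.
        "share_of_off_after": len(both) / len(off_after) if off_after else 0.0,
    }


# Made-up example using documentation addresses, not real pool data.
example = leap_overlap(
    {"192.0.2.1", "192.0.2.2", "192.0.2.3"},
    {"192.0.2.2", "192.0.2.3", "192.0.2.4"},
)
```

If `share_of_off_after` came out well below 1, that would suggest the leap announcement alone is not a reliable predictor of which servers end up wrong.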

The downside to implementing something like that is that the pool system now has to get all this correct, beyond just having the right time. It wouldn’t be a lot of code to implement, but it doesn’t “feel good”. It’s something that’s hard to test and has to work – or the whole system would go kaboom[1].
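To give a sense of scale, here is a rough sketch of the easy part: reading the Leap Indicator from a raw NTP response. This is not the proposal quoted above and not how the pool monitor works; `announces_expected_leap` is a hypothetical helper. The hard part, knowing reliably what `expected` should be at any given moment, is exactly what the sketch leaves out.

```python
# Leap Indicator values from the NTP header (top two bits of the first byte).
LI_NO_WARNING = 0
LI_INSERT = 1          # last minute of the day has 61 seconds
LI_DELETE = 2          # last minute of the day has 59 seconds
LI_UNSYNCHRONIZED = 3  # clock not synchronized


def leap_indicator(packet: bytes) -> int:
    """Return the 2-bit Leap Indicator from a raw NTP packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    return packet[0] >> 6


def announces_expected_leap(packet: bytes, expected: int) -> bool:
    """Hypothetical monitor check: does the server announce what we expect?

    The caller has to supply `expected` correctly, which is the part
    that is hard to test and has to work.
    """
    return leap_indicator(packet) == expected
```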

Since it won’t help NTP clients (as opposed to SNTP clients), it doesn’t seem worth the complexity.

This leap second I was around, had time and a good internet connection, but generally I optimize for “will work without me sitting there paying attention”. The NTP Pool doesn’t exactly have a NOC or 24/7 staff. Or staff.

I think there are other things that are more likely to be beneficial, in particular things that are actionable weeks or months in advance. For example, we could track the refid/upstream of each server, learn which ones mess up the leap second, and warn operators not to use those as upstream servers. Maybe.
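A minimal sketch of what the refid idea could look like, assuming the monitor keeps the raw response packets and a per-server "was off after the leap" flag. The function names, the `min_servers` threshold, and the shape of `observations` are assumptions for illustration, not anything the pool implements.

```python
import socket
from collections import Counter


def upstream_of(packet: bytes):
    """Return (stratum, refid) from a raw NTP packet.

    For stratum 2+ servers synced over IPv4 the refid is the upstream's
    address. For stratum 0/1 it is a short ASCII code such as "GPS".
    Caveat: with an IPv6 upstream the refid is a hash of the address,
    so the dotted quad returned here does not identify a real host.
    """
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    stratum = packet[1]
    raw = packet[12:16]
    if stratum <= 1:
        return stratum, raw.rstrip(b"\x00").decode("ascii", "replace")
    return stratum, socket.inet_ntoa(raw)


def suspect_upstreams(observations, min_servers=2):
    """Count, per refid, how many servers using it were off after the leap.

    observations: iterable of (refid, was_off) pairs, one per pool server.
    Only refids with at least `min_servers` bad servers are returned.
    """
    bad = Counter(refid for refid, was_off in observations if was_off)
    return {refid: n for refid, n in bad.items() if n >= min_servers}
```

The threshold is there because a single bad server says little about its upstream; several bad servers sharing one refid is a stronger hint.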

[1] (Which I suppose is the story of leap seconds).