Speaking of NTP...

Matthew_Huff · July 13, 2015, 1:17pm

We have 5 NTP servers: 2 x stratum 1 rubidium oscillator time servers with GPS sync, and 3 servers running NTP 4.2.6p5-3 synced to external Internet-based stratum 1 NTP servers. We monitor our NTP environment closely, and over the last 10+ years all of our NTP servers have normally been synced within +/- 2 msec. Starting last Friday, we began seeing some remote NTP servers with a GPS reference consistently offset by 10 msec.

Anyone else seeing this?

_Stephane_Bortzmeyer · July 13, 2015, 1:26pm

I have no idea, but I just wanted to remind people that, for a few
months, RIPE Atlas probes have been able to do NTP queries (see the
"Measurement Result Format" page in the RIPE Atlas docs), so it may be
a cool way to monitor NTP servers from many points.

Rafael_Possamai1 · July 16, 2015, 3:53pm

Depending on exactly how you have these servers configured in relation to
one another, small variations from a single source can be amplified down
the line.

https://en.wikipedia.org/wiki/Propagation_of_uncertainty
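
To put a rough number on that, here is a minimal sketch of how per-hop errors could combine along a chain of servers. It assumes the errors are independent (so they add in quadrature) and compares that with the worst case where they all line up. The function names and the per-hop figures are hypothetical, chosen only for illustration.

```python
import math


def combined_uncertainty(errors_ms):
    """Root-sum-square of independent per-hop errors, in milliseconds."""
    return math.sqrt(sum(e * e for e in errors_ms))


def worst_case_uncertainty(errors_ms):
    """Upper bound if every per-hop error points the same way."""
    return sum(abs(e) for e in errors_ms)


# Hypothetical chain: three hops, each contributing about 2 ms of error.
hops = [2.0, 2.0, 2.0]
print(f"independent errors: {combined_uncertainty(hops):.2f} ms")
print(f"worst case:         {worst_case_uncertainty(hops):.2f} ms")
```

Either way, the error seen at the end of the chain is larger than what any single hop contributes.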

Tony_Hain · July 16, 2015, 7:24pm

I have had a consistent 10ms offset on a set of servers for the last 5 years. After extensive one-way tracing, it turns out there is a 20ms asymmetry "within" the Seattle Westin colo between HE & Comcast, causing all the IPv6 peers appearing over the HE tunnel to be 10ms offset from everything else. There may be other instances of indirect peering causing a static asymmetric path delay, and NTP will report that as an offset of half of the difference.
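
For anyone who wants to see where the "half of the difference" comes from, here is a small sketch using the standard NTP on-wire offset and delay formulas over the four timestamps of one exchange. The function name, the simulated clocks, and the 30 ms / 10 ms path delays are assumptions for illustration, not measurements from the paths described above.

```python
def ntp_exchange(true_offset_ms, fwd_delay_ms, ret_delay_ms, processing_ms=1.0):
    """Simulate one NTP request/response and return (offset, delay) in ms.

    true_offset_ms: how far the server clock is ahead of the client clock.
    fwd_delay_ms:   one-way delay client -> server.
    ret_delay_ms:   one-way delay server -> client.
    """
    t1 = 0.0                                  # client transmit (client clock)
    t2 = t1 + fwd_delay_ms + true_offset_ms   # server receive (server clock)
    t3 = t2 + processing_ms                   # server transmit (server clock)
    t4 = t3 - true_offset_ms + ret_delay_ms   # client receive (client clock)

    # NTP cannot observe the one-way delays, so it assumes they are equal.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Both clocks are perfectly in sync (true offset 0), but the path is
# 20 ms longer in one direction than the other.
offset, delay = ntp_exchange(0.0, fwd_delay_ms=30.0, ret_delay_ms=10.0)
print(f"asymmetric path: offset={offset} ms, round-trip delay={delay} ms")

# Same round-trip time with a symmetric path.
offset, delay = ntp_exchange(0.0, fwd_delay_ms=20.0, ret_delay_ms=20.0)
print(f"symmetric path:  offset={offset} ms, round-trip delay={delay} ms")
```

Working through the algebra, the reported offset is the true offset plus (forward delay - return delay) / 2, so a 20 ms asymmetry appears as a 10 ms offset even though the round-trip delay looks the same in both cases.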

Tony

Matthew_Huff · July 16, 2015, 7:43pm

Thanks. We have always had a few outliers, but we have never had a large number of external NTP servers show a consistent offset, and never one as big as 10ms. Something changed last Friday, probably at some peering point, that caused the issue. Maybe an asymmetric path got created to route around some outage. Maybe some MPLS circuit got introduced into the mix that hides the underlying path/latency. Glad to know someone else has seen something like this.

Our 3 NTP servers that sync from external sources have at least 5 upstream stratum 1 servers and are peered to each other. They have settled on a sense of time that is good within +/- 1 msec of our stratum 1 clocks, so all is good, but it was a strange occurrence after we had been good for so long. Each of our servers is a client of our 2 x stratum 1 servers and 3 x stratum 2 NTP servers. They all look good now.