NTP question

If you have one such installation, then you really do not care about the
"accuracy" of the time. However if you have multiple such installations
then you want them all to have the same time (if you will be comparing logs
between them, for example). At some point it becomes "cheaper" to spend
thousands of dollars per site to have a single Stratum 0 timesource (for
example, the GPS system) at each site (and thus comparable time stamps)
than it is to pay someone to go through the rigamarole of computing offsets
and slew rates between sites to be able to do accurate comparison. And if
you communicate any of that info to outsiders then being able to say "my
log timestamps are accurate to +/- 10 nanoseconds so it must be you who is
farked up" (and be able to prove it) has immense value.

If your network is air gapped from the Internet then sure. If it's not, you
can run NTP against a reasonably reliable set of time sources (not random
picks from Pool) and be able to say, "my log timestamps are accurate to +/-
10 milliseconds so it must be you who is farked up." While my milliseconds
loses the pecking order contest, it's just as good for practical purposes
and a whole lot less expensive.

It's not clear to me that there's anything *wrong* with using the pool,
especially if you're using our 'pool' directive in your config file.

That directive will bring up ~10 associations and continuously evaluate
their quality, throwing out the poor performers and soliciting new
servers of currently-good quality to replace them.

This goes to "have _enough_ good-quality servers, and monitor your ntpd".

What about GPS, GLONASS, Galileo, etc.?

Hi Keith,

If your network is air gapped from the Internet then sure. If it's
not, you can run NTP against a reasonably reliable set of time
sources (not random picks from Pool) and be able to say, "my log
timestamps are accurate to +/- 10 milliseconds so it must be you who
is farked up." While my milliseconds loses the pecking order contest,
it's just as good for practical purposes and a whole lot less
expensive.

You mean something like this, which is relatively easy to achieve:

==============================================================================
offset -0.000009, frequency -0.823, time_const 30, watchdog 238
synchronised to NTP server (192.5.41.40) at stratum 2
   time correct to within 12 ms
   polling server every 1024 s

     remote           refid      st t when poll reach   delay   offset  jitter

+clock.sjc.he.ne .CDMA.           1 u  287 1024  377   64.313    0.337   0.867
-tock.usnogps.na .IRIG.           1 u    5 1024  377  103.080   -2.097   0.316
-tick.usnogps.na .IRIG.           1 u  806 1024  377  103.053   -2.328   0.363
+india.colorado. .NIST.           1 u  270 1024  377   41.214   -0.159   0.113
+time-b-b.nist.g .NIST.           1 u  984 1024  377   42.609    0.200   0.045
+time-c-b.nist.g .NIST.           1 u  180 1024  377   42.563    0.201   0.064
+time-a-b.nist.g .NIST.           1 u  163 1024  377   42.639    0.137   0.032
*192.5.41.40     .PTP.            1 u  235 1024  377   12.756   -0.388  12.479
-192.5.41.41     .IRIG.           1 u  312 1024  377   13.575   -1.172   2.425
 LOCAL(0)        .LOCL.          10 l    -   64    0    0.000    0.000   0.000
------------------------------------------------------------------------------
pll offset: -8.474e-06 s
pll frequency: -0.823 ppm
maximum error: 0.123149 s
estimated error: 0.000122 s
status: 2001 pll nano
pll time constant: 10
precision: 1e-09 s
frequency tolerance: 500 ppm

That all looks great except for the LOCAL clock at S10. In the event
you lose connectivity to the outside, this system will jump from S2 to
S10. Depending on the setup of your other systems, groups of them will
go sailing off in their own directions.

http://support.ntp.org/bin/view/Support/OrphanMode is the better solution.

If you cannot do that for some reason, please see the "Dual Time
Servers" case at
http://support.ntp.org/bin/view/Support/UndisciplinedLocalClock .
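
For anyone who wants to automate that kind of check, here is a minimal Python sketch that parses peer lines in the layout of the billboard above and flags a configured local clock, or a selected peer whose offset exceeds a threshold. The names `Peer`, `parse_peers` and `check`, and the 10 ms default, are illustrative assumptions; none of this is part of ntpd or ntpq.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    tally: str        # '*', '+', '-', '#', or ' ' when no tally code was found
    remote: str
    refid: str
    stratum: int
    offset_ms: float
    jitter_ms: float


def parse_peers(text):
    """Parse peer lines with the ten whitespace-separated columns shown above."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip separators, blank lines, and the column header.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        name = fields[0]
        # Assumption: only these tally codes are handled; 'x', 'o' and '.'
        # are ignored here because they could also start a hostname.
        tally = name[0] if name[0] in "*+-#" else " "
        peers.append(Peer(
            tally=tally,
            remote=name.lstrip("*+-#"),
            refid=fields[1],
            stratum=int(fields[2]),
            offset_ms=float(fields[8]),
            jitter_ms=float(fields[9]),
        ))
    return peers


def check(peers, max_offset_ms=10.0):
    """Return human-readable warnings for the parsed peers."""
    problems = []
    for p in peers:
        if p.refid == ".LOCL.":
            problems.append(f"{p.remote}: undisciplined local clock is configured")
        elif p.tally in ("*", "+") and abs(p.offset_ms) > max_offset_ms:
            problems.append(f"{p.remote}: offset {p.offset_ms} ms exceeds {max_offset_ms} ms")
    return problems


SAMPLE = """\
*192.5.41.40 .PTP. 1 u 235 1024 377 12.756 -0.388 12.479
-192.5.41.41 .IRIG. 1 u 312 1024 377 13.575 -1.172 2.425
LOCAL(0) .LOCL. 10 l - 64 0 0.000 0.000 0.000
"""

for warning in check(parse_peers(SAMPLE)):
    print(warning)
```

Run against the three sample lines, the only warning is for `LOCAL(0)`, which is the same objection raised above.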

Harlan,

Why? The GPS NTP Server is Stratum-1. If it fails computer clocks will freewheel for hours or days before losing significant time, during which period you can simply order a replacement unit. If that isn’t fast enough, buy two $300 boxes. The “consensus” issue is moot, since a GPS server gets a consensus of clock time from the GPS satellite constellation.

The “enough NTP peers” you speak of are simply not necessary.

-mel via cell

Yo Mel!

I guess you slept through GPS Week Roll Over day last April 6th?

Some GPS receivers went nuts, others did not. Many 777s and 787s were
grounded that weekend for software updates to their expensive Honeywell GPS.
I'll spare you the many more examples that happened.

Not nice when your clock rolls back to 1999, or forward to 2035.

RGDS
GARY

You might be right about the GPS server. It depends on how your $300
box behaves if it loses the GPS signal.

The consensus issue isn't about the number of satellites the GPS
receiver sees, it's about the number of time sources your NTP servers see.

H
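
To illustrate why the number of sources matters, here is a simplified interval-intersection sketch in the spirit of Marzullo's algorithm. It is not ntpd's actual selection code. The function name and the sample intervals (offset plus or minus an error bound, in ms) are made up for illustration.

```python
def best_agreement(intervals):
    """Find the range covered by the largest number of source intervals.

    Each interval is (low, high): a source's offset minus and plus its
    error bound. Returns (count, low, high) for the best-supported range.
    """
    edges = []
    for low, high in intervals:
        edges.append((low, -1))   # -1 marks the start of an interval
        edges.append((high, +1))  # +1 marks the end of an interval
    # Tuples sort by value first, so at equal values starts precede ends.
    edges.sort()

    best = count = 0
    best_low = best_high = None
    for i, (value, kind) in enumerate(edges):
        count -= kind  # a start raises the overlap count, an end lowers it
        if count > best:
            best = count
            best_low = value
            best_high = edges[i + 1][0]
    return best, best_low, best_high


# Three sources that roughly agree, plus one that is 50 ms off.
four_sources = [(-2.0, 2.0), (-1.0, 3.0), (-3.0, 1.0), (48.0, 52.0)]
print(best_agreement(four_sources))

# Two sources that disagree: neither has more support than the other.
two_sources = [(-2.0, 2.0), (48.0, 52.0)]
print(best_agreement(two_sources))
```

With four sources, three of them overlap and the outlier is outvoted. With two, the best overlap count is 1, so there is no way to tell which source is wrong.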

Yo Gary!

Not only did I not sleep through it, I was one of the engineers who verified that every GPS clock source in a very large aviation support network didn’t have this bug.

I’m also an FAA licensed A&P mechanic, and have worked for airlines in fleet maintenance. Air carriers have extremely thorough systems reviews, by law, through the Airworthiness Directive program, which started identifying 2019 GPS rollover vulnerabilities in ... 2009! Nobody was surprised. If any GPS systems “went nuts”, it was through the incompetence and negligence of their owners.

-mel

I can tell you how the GPS server behaves when it loses its signal: it stops giving out verified time and lapses into Stratum-“goners” mode. But today’s RTC chips don’t start losing seconds-per-day when they are free running. Typically they might lose ten seconds per week on cheap systems. That’s of little concern if you have two GPS clocks.

But wait. What if the GPS constellation goes down? THEN we have bigger problems :-)

It’s possible to over-think the clock problem, just as it’s possible to overthink RAID storage protection. Sometimes a manual restore from backup is just fine.

-mel
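
As a rough sanity check on drift figures like these, the conversion between parts per million and accumulated seconds is simple arithmetic. The helper names below are hypothetical, and the sketch assumes a constant frequency error, which real oscillators only approximate.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY


def drift_seconds(ppm, elapsed_seconds):
    """Seconds gained or lost after free-running with a constant ppm error."""
    return ppm * 1e-6 * elapsed_seconds


def ppm_from_drift(drift_s, elapsed_seconds):
    """Frequency error in ppm implied by an observed drift over an interval."""
    return drift_s / elapsed_seconds * 1e6


# "Ten seconds per week" expressed as a frequency error.
print(round(ppm_from_drift(10, SECONDS_PER_WEEK), 1), "ppm")

# Worst case allowed by a 500 ppm frequency tolerance, over one day.
print(round(drift_seconds(500, SECONDS_PER_DAY), 1), "s/day")
```

Ten seconds per week works out to roughly 16.5 ppm, well inside the 500 ppm tolerance shown in the earlier kernel output, which would allow about 43 seconds per day.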

Yo Mel!

How many GPS owners happen to have $30,000 GPS simulators to check
their $300 GPS/NTP servers? Some of mine did, most did not.

Seems to me the negligence is in the GPS manufacturer that failed to
notify their customers.

To be fair, Avidyne and Telit did notify their customers, but not with
a fix or enough lead time to swap out the units.

RGDS
GARY

Gary, Gary, Gary,

You don’t need a $30,000 GPS simulator to verify if a GPS product in your inventory has the rollover bug. You simply ask the supplier to certify that they don’t have the rollover bug. They use their _$100,000_ GPS simulator if needed, but usually it’s done with a trivial code review.

If the supplier can’t provide such a certification, then they are no longer a supplier. This tends to persuade them to certify.

If you as an air carrier (or any other critical GPS consumer) fail to ask for such a certification in time to field a replacement, that’s your fault.

You might not be aware, but zero US air carriers had any unplanned downtime from the GPS rollover. I can’t say the same thing for certain Asian air carriers :-)

-mel via cell

I’m talking about _my_ GPS server. I have no idea what you’ve cobbled up :-)

-mel

For those wondering what a GPS certification letter for the rollover bug looks like, here’s Garmin’s. Note the phrase “for many years, Garmin has anticipated and prepared for this event…”:

What is the GPS Week Number Rollover (WNRO)?

The GPS system is world renowned for its ability to provide accurate and reliable positioning and timing information worldwide. The GPS satellites transmit to users the date and time accurate to nanoseconds. However, back in 1980, when the GPS system first began to keep track of time, the date and time was represented by a counter that could only count forward to a maximum of 1024 weeks, or about 19.7 years. After 1024 weeks had elapsed, this counter “rolled over” to zero, and GPS time started counting forward again. This first rollover occurred in August of 1999. The second rollover will occur on April 6, 2019.

Is My Device Affected?

For many years, Garmin has anticipated and prepared for this event. Regardless, Garmin has been performing exhaustive testing of current and legacy devices to determine if they will be affected by the GPS week number rollover. Our testing shows the vast majority of Garmin GPS devices will handle the WNRO without issues.

What is the Effect of a GPS Week Number Rollover Issue?

For GPS devices that are affected, after the rollover occurs, an incorrect date and time will be displayed. This incorrect time will also be used to timestamp track logs, compute sunrise and sunset, and other functions that rely upon the correct date and time. However, the positioning accuracy will not be affected. The device will continue to deliver the same positioning performance as before the rollover.

-mel
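
The rollover arithmetic in the letter above is easy to reproduce. The sketch below computes the rollover dates from the GPS epoch and shows one common way a receiver can resolve the 10-bit week number: pin it to a reference date such as a firmware build date. The function names and the pivot approach are illustrative assumptions, not any particular vendor's implementation. Dates are on the GPS time scale, which was 18 seconds ahead of UTC in 2019, so the second rollover falls just before midnight UTC on April 6.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)   # start of GPS week 0 (GPS time scale)
WEEKS_PER_ERA = 1024               # the broadcast week counter is 10 bits wide


def rollover_dates(count):
    """Dates on which the 10-bit week counter wraps back to zero."""
    return [GPS_EPOCH + timedelta(weeks=WEEKS_PER_ERA * n)
            for n in range(1, count + 1)]


def resolve_week(truncated_week, not_before):
    """Expand a 10-bit week number to a full week count.

    Picks the first week at or after `not_before` (for example a firmware
    build date) whose low 10 bits match. This stays correct for 1024 weeks
    past the reference date, then fails in the same way.
    """
    pivot_week = (not_before - GPS_EPOCH).days // 7
    era_start = pivot_week - (pivot_week % WEEKS_PER_ERA)
    full_week = era_start + truncated_week
    if full_week < pivot_week:
        full_week += WEEKS_PER_ERA
    return full_week


def week_to_date(full_week):
    return GPS_EPOCH + timedelta(weeks=full_week)


print(rollover_dates(2))

# A receiver hard-wired to the 1999-2019 era sees week 0 after the rollover
# and lands back in 1999.
print(week_to_date(WEEKS_PER_ERA + 0))

# The same week 0, resolved against a hypothetical 2015 build date.
print(week_to_date(resolve_week(0, datetime(2015, 1, 1))))
```

The hard-wired case is the "rolls back to 1999" failure mentioned earlier in the thread. The pivot version gets 2019 right but only postpones the problem.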

Why don’t data centers provide a GPS signal along with power and air conditioning?
Installing a distribution amplifier for 1.5 GHz is not rocket science.

(Or an Ethernet with IEEE1588 precise time, but that is probably asking too much.)

Grüße, Carsten

I’d like to give a plug for Symmetricom products like the Time Provider 1100. I used these in my previous life at a half dozen sites.
They function as NTP servers and peer with each other over a network. In addition (and most important to me) they provided BITS clocks to our optical gear and PBXs. Very reliable and you could waste all sorts of money by equipping them with 1 or 2 oscillators, rubidium if you liked. The antenna needed a clear view of the sky and we mounted these at roof level to avoid lightning. They were heated to avoid icing.
Good stuff, never had an issue with rollovers, software was upgradable.

Did the vendor ever ship an actual software upgrade?

It continues to surprise me that there is still hardware being sold that doesn't even support IPv6.

For our PCI-DSS audit, the rationale for at least -one- local source, instead of depending on pool.ntp.org, was “backhoe fade”.
It was worth the $135 for an NTP source using GPS. The cable run up the elevator shaft for the antenna works without needing OSHPD permits.

We are very happy with the result.

/Wm

Passes the backhoe test, but might have an issue with the Die Hard Elevator Shaft Fight Scene checks.

:-)

If your data center is suffering from both backhoe fade and a Die Hard Fight Scene,
the *real* question is whether you're going to care about NTP when the Halon dumps
and the emergency power interlock shuts down all your hardware...

In other words, you got bigger problems. :-)