leap second outage

Frank_Bulk1 · July 1, 2015, 3:12am

We experienced our first leap second outage -- our SHE (super head end) is
using (old) Motorola encoders and we lost those video channels. They
restarted all those encoders to restore service.

Frank

Netfortius · July 1, 2015, 3:30am

This was supposed to have happened @midnight UTC, right? Meaning that we
are past that event. Under which scenarios should people be concerned about
midnight local time? Lots of confusing messages flying all over...

Josh_Luthman · July 1, 2015, 3:32am

That is my understanding as well. The event was about 3.5 hours ago.

Josh Luthman
Office: 937-552-2340
Direct: 937-552-2343
1100 Wayne St
Suite 1337
Troy, OH 45373

Nicholas_Suan1 · July 1, 2015, 3:42am

Correct, the leap second gets inserted at midnight UTC.

"Leap seconds can be introduced in UTC at the end of the months of December
or June, depending on the evolution of UT1-TAI. Bulletin C is mailed every
six months, either to announce a time step in UTC or to confirm that there
will be no time step at the next possible date."

ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat
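
For anyone unsure how midnight UTC maps onto local clocks, here is a minimal Python sketch. The fixed offsets are an assumption (daylight time in effect on that date) rather than a real time zone database lookup, and `local_times` is just an illustrative helper name. Python's `datetime` cannot represent 23:59:60, so the sketch converts the first instant after the leap second instead.

```python
from datetime import datetime, timedelta, timezone

# datetime cannot represent 23:59:60, so use the first instant after the
# leap second: 2015-07-01 00:00:00 UTC.
AFTER_LEAP = datetime(2015, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# Assumed fixed offsets (daylight time in effect), not a tz database lookup.
ZONES = {
    "US Central (CDT)": timezone(timedelta(hours=-5)),
    "US Eastern (EDT)": timezone(timedelta(hours=-4)),
    "Paris (CEST)": timezone(timedelta(hours=2)),
}


def local_times(instant):
    """Render one UTC instant as wall-clock time in each assumed zone."""
    return {
        name: instant.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S")
        for name, tz in ZONES.items()
    }


for name, stamp in local_times(AFTER_LEAP).items():
    print(f"{name}: {stamp}")
```

The leap second is a single instant worldwide, so once midnight UTC has passed there is nothing further to wait for at local midnight.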

Joe · July 1, 2015, 4:14am

A leap sec causing issues. For about 40 years now, there have been
these leap seconds to no real issue. All of these are "go-forwards"
and even MS AD (I believe) treats them as a little bump (nothing to see
here, move along). So unless you have a really tight VPN (non-standard
conforming), I'd hope that nothing has happened, and if it did, chances
are it's either coincidence or intentional.
I certainly hope I am around to collect on the
https://en.wikipedia.org/wiki/Year_2038_problem for retirement.
I think we've all seen the "big to do" regarding Y2K to know better.
Maybe I am wrong, but...

Just my 2¢s
-Joe

Harlan_Stenn · July 1, 2015, 4:47am

Joe writes:

> A leap sec causing issues. For about 40 years now, there have been
> these leap seconds to no real issue. All of these are "go-forwards"

No, they're all "go-backwards" events. That's no big deal to things
that don't care about monotonic time, or to folks who aren't in
violation of something if their timestamps are off by a second.
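
To make the "go-backwards" point concrete, here is a small Python sketch. The readings are simulated, not captured from a real machine; they assume a system that inserts the leap second by stepping the POSIX clock back one second, so that 23:59:59 is repeated. `find_backward_steps` and `timed` are hypothetical helper names.

```python
import time


def find_backward_steps(readings):
    """Return the indices where a wall-clock reading is earlier than the one before it."""
    return [i for i in range(1, len(readings)) if readings[i] < readings[i - 1]]


# Simulated wall-clock samples taken every 0.5 s of real time around
# 2015-07-01 00:00:00 UTC (POSIX 1435708800). Assumption: the clock is
# stepped back by one second, so 1435708799.x shows up twice.
SIMULATED = [
    1435708799.0,
    1435708799.5,
    1435708799.0,  # clock stepped back here
    1435708799.5,
    1435708800.0,
]

print(find_backward_steps(SIMULATED))


def timed(fn):
    """Measure how long fn() takes using the monotonic clock, which never runs backwards."""
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start
```

Code that subtracts two wall-clock readings across the step gets a negative or too-short interval; code that uses `time.monotonic()` for durations does not see the step at all.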

What I'm about to say may not be as stupid as it sounds: The problems
here aren't problems for cases where it's not a problem. It is a
problem where it *is* a problem.

It's a case where one person's signal is another person's noise.

H

Jean-Francois_Mezei · July 1, 2015, 5:08am

In fairness, systems should be used to NTP making adjustments to the
system clock of a second or less.

However, in systems that expect tightly synchronized clocks, they would
want all the nodes to make the NTP adjustment at the same time.

Mikael_Abrahamsson · July 1, 2015, 5:38am

This is both an operating system and application problem.

http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time
http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time

This is similar to the jiffy counter wrapping: since this doesn't happen
that often, it's not commonly tested for. A good way to test is to start
the jiffy counter so it wraps after 10 minutes of uptime. That way you'll
run into any bugs quickly. Either we should abolish the leap second or we
should make leap second adjustments (back and forth) on a monthly basis to
exercise the code.

This is a hard sell though...
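
A rough Python sketch of the jiffy counter idea, simulating a 32-bit tick counter. The tick rate `HZ = 1000` and the helper names `initial_counter` and `time_after` are assumptions for illustration; the comparison mimics the usual signed-difference trick rather than any particular kernel's code.

```python
MASK = 0xFFFFFFFF  # simulate a 32-bit unsigned counter
HZ = 1000          # assumed tick rate (ticks per second)


def initial_counter(seconds_until_wrap, hz=HZ):
    """Start value that makes the counter wrap after the given uptime."""
    return (-seconds_until_wrap * hz) & MASK


def time_after(a, b):
    """Wrap-safe 'a is later than b': the 32-bit difference b - a is negative."""
    return ((b - a) & MASK) >= 0x80000000


start = initial_counter(600)             # wraps after 10 minutes of uptime
now = (start + 500 * HZ) & MASK          # 500 s of uptime, before the wrap
deadline = (start + 700 * HZ) & MASK     # 700 s of uptime, after the wrap

print("naive says deadline passed:    ", now > deadline)
print("wrap-safe says deadline passed:", time_after(now, deadline))
```

With the counter starting near its maximum, the naive `now > deadline` comparison gives the wrong answer within minutes of boot instead of after weeks of uptime, which is the whole point of the trick.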

Harlan_Stenn · July 1, 2015, 5:59am

Mikael Abrahamsson writes:

> This is similar to the jiffycounter wrapping, since this doesn't happen
> that often, it's not commonly tested for. Good way is to start the jiffy
> counter so it wraps after 10 minutes of uptime. That way you'll run into
> any bugs quickly. Either we should abolish the leap second or we should
> make leap second adjustments (back and forth) on a monthly basis to
> exercise the code.
>
> This is a hard sell though...

and it's perversely interesting. It would even be tolerable when the
difference between UTC and UT1 is such that the insertions and deletions
maintain the +/- .9 s difference. There would even be enough time to
warn folks about this.

H

Colin_Johnston · July 1, 2015, 6:44am

oracle linux did this
Jul 1 02:01:29 oraclelinux ntpd[600]: 0.0.0.0 061c 0c clock_step -1.006445 s
Jul 1 02:01:29 oraclelinux ntpd[600]: 0.0.0.0 0615 05 clock_sync
Jul 1 02:01:29 oraclelinux systemd: Time has been changed
Jul 1 02:01:30 oraclelinux ntpd[600]: 0.0.0.0 c618 08 no_sys_peer
all seemed fine after this

Sophos UTM did this
2015:07:01-00:59:59 cloudsophosvm kernel: [653957.707421] Clock: inserting leap second 23:59:60 UTC
all seemed fine after this
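
If you want to check your own logs for the same events, a minimal Python sketch follows. It only matches the two message shapes shown above; other ntpd or kernel versions may log differently, and `clock_steps` and `leap_insertions` are hypothetical helper names.

```python
import re

# Matches the ntpd "clock_step <offset> s" message shown above.
STEP_RE = re.compile(r"clock_step\s+([+-]?\d+(?:\.\d+)?)\s+s")

LOG = """\
Jul 1 02:01:29 oraclelinux ntpd[600]: 0.0.0.0 061c 0c clock_step -1.006445 s
Jul 1 02:01:29 oraclelinux ntpd[600]: 0.0.0.0 0615 05 clock_sync
Jul 1 02:01:29 oraclelinux systemd: Time has been changed
Jul 1 02:01:30 oraclelinux ntpd[600]: 0.0.0.0 c618 08 no_sys_peer
2015:07:01-00:59:59 cloudsophosvm kernel: [653957.707421] Clock: inserting leap second 23:59:60 UTC
"""


def clock_steps(lines):
    """Return every clock step offset (in seconds) found in the log lines."""
    return [float(m.group(1)) for line in lines if (m := STEP_RE.search(line))]


def leap_insertions(lines):
    """Return the lines where the kernel reports inserting a leap second."""
    return [line for line in lines if "inserting leap second" in line]


lines = LOG.splitlines()
print(clock_steps(lines))
print(len(leap_insertions(lines)))
```

A step of roughly -1 s shortly after midnight UTC suggests ntpd corrected the clock after the fact, while the kernel message indicates the leap second was inserted at the scheduled moment.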

Colin

Frank_Bulk1 · July 1, 2015, 12:05pm

Yes, happened at 7 pm Central (0:00 UTC).

James_Hess · July 1, 2015, 12:42pm

> quickly. Either we should abolish the leap second or we should make leap
> second adjustments (back and forth) on a monthly basis to exercise the code.

See.... maybe there should some day be building codes for
commercially marketed software that provide minimum independent
formal testing to be done by licensed independent testers, including
leap seconds and such.

The leap second issues are possibly rare and intermittent, therefore,
having a few per month is not necessarily giving adequate exposure
to code paths that may go wrong during an insert/del event.

There's never been a negative leap second, only insertions, but how
deletions are implemented might expose new bugs, since there hasn't
been one before. And you can only have one leap per 24 hours,
positive or negative, pick one.

& Shouldn't this kind of 'exercise' be done during the QA process
before releasing new system software, rather than mucking with clock
accuracy?

There is a recent article with some Leap Second 'stress testing' code:
How to clear the Leap Second Insertion flag after it has been received? - Red Hat Customer Portal

Test methods are readily available; there ought to be
little legitimate excuse for anyone writing serious software that has
long-running processes or threads not to include evaluation for
possible leap second issues and other possible clock-related issues
such as clock stepping, DST, and Year 2038 in their standard smoke
tests....
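
As a rough idea of what such smoke tests could look like, here is a minimal Python sketch. `SteppingClock` and `elapsed` are hypothetical names: the first is a fake wall clock that replays preset readings, the second stands in for whatever code under test measures an interval from two wall-clock reads.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def last_32bit_second():
    """Last instant representable as a signed 32-bit count of seconds since the epoch."""
    return EPOCH + timedelta(seconds=INT32_MAX)


def fits_in_int32(timestamp):
    """True if the timestamp can be stored in a signed 32-bit field."""
    try:
        struct.pack("<i", timestamp)
        return True
    except struct.error:
        return False


class SteppingClock:
    """Fake wall clock that replays preset readings, e.g. a backward step."""

    def __init__(self, readings):
        self._readings = iter(readings)

    def __call__(self):
        return next(self._readings)


def elapsed(clock):
    """Hypothetical code under test: interval between two reads, clamped at zero."""
    start = clock()
    end = clock()
    return max(0.0, end - start)


print(last_32bit_second().isoformat())
print(fits_in_int32(INT32_MAX), fits_in_int32(INT32_MAX + 1))
print(elapsed(SteppingClock([1435708799.9, 1435708799.1])))
```

Injecting the clock as a parameter lets a test replay a backward step or a far-future date without touching the real system time.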

Frank_Bulk1 · July 1, 2015, 8:52pm

And just 12.5% of them required TLC. =)

Harlan_Stenn · July 2, 2015, 12:41am

Jimmy Hess writes:

>> quickly. Either we should abolish the leap second or we should make leap
>> second adjustments (back and forth) on a monthly basis to exercise the code.
>
> See.... maybe there should some day be building codes for
> commercially marketed software that provide minimum independent
> formal testing to be done by licensed independent testers, including
> leap seconds and such.

And NTF's Certification and Compliance programs are going to do this.
At least as soon as NTF has the resources to get this moving.

> The leap second issues are possibly rare and intermittent, therefore,
> having a few per month is not necessarily giving adequate exposure
> to code paths that may go wrong during an insert/del event.

If they happened every six months that would be often enough, but
the earth hasn't slowed down that much yet. There will be enough times
that we could insert or delete one every month and still have |UTC-UT1|
be under .9 seconds.

If it was announced that "starting in 6 months' time we'll be inserting
or deleting a leap second every month or so", that would give folks enough
time to prep for it, and I'm pretty confident that the leap second would
soon become a non-event.

> There's never been a negative leap second, only insertions, but how
> deletions are implemented might expose new bugs, since there hasn't
> been one before, And you can only have one leap per 24 hours,
> positive or minus, pick one.

Yup.

> & Shouldn't this kind of 'exercise' be done during the QA process
> before releasing new system software, rather than mucking with clock
> accuracy?

Leap second handling is a "mechanism" question. Which one to choose is
a "policy" question. IMO, a vendor should provide adequate mechanism.
The customer should get to choose policy.

> There is a recent article with some Leap Second 'stress testing' code:
> How to clear the Leap Second Insertion flag after it has been received? - Red Hat Customer Portal
>
> Readily available test methods are available, there ought to be
> little legitimate excuse for anyone writing serious software that has
> long-running processes or threads not to include evaluation for
> possible leap second issues and other possible clock-related issues
> such as clock stepping, DST, and Year 2038 in their standard smoke
> tests....

Yes. And even so, testing these things takes time and equipment.