Recent NTP pool traffic increase

Yo All!

Someone on NANOG was reporting on the new NTP mystery. He suggested
doing a dump similar to this:

# tcpdump -nvvi eth0 port 123 |grep "Originator - Transmit Timestamp:"

And I do indeed get odd results. Some on my local network...

This is from a chronyd host to an ntpsec host. I monitor them both
continuously and both seem to be keeping good time.

17:36:11.369329 IP (tos 0x0, ttl 64, id 21405, offset 0, flags [DF], proto UDP (17), length 76)
    204.17.205.7.50937 > 204.17.205.27.123: [udp sum ok] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision 32
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp: 0.000000000
          Originator Timestamp: 3691013707.207257069 (2016/12/17 17:35:07)
          Receive Timestamp: 276521666.321684728 (2044/11/11 10:02:42)
          Transmit Timestamp: 3684123061.899235956 (2016/09/29 00:31:01)
            Originator - Receive Timestamp: +880475255.114427658
            Originator - Transmit Timestamp: -6890645.308021113

That 'Receive Timestamp' is strange.
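
For anyone who wants to check the dates tcpdump prints, here is a minimal Python sketch of decoding the 32-bit seconds field. It assumes the RFC 4330 convention that a value with the top bit clear belongs to the era starting in February 2036; the name `ntp_to_utc` is made up for illustration. Results are in UTC, while the dump above appears to be printed in a local zone eight hours behind UTC.

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ntp_to_utc(ntp_seconds: int) -> datetime:
    """Decode the 32-bit seconds part of an NTP timestamp to UTC.

    Assumption (RFC 4330 convention): a value with the top bit clear
    is taken to be in the era that begins in 2036, not the one from 1900.
    """
    if ntp_seconds < 2**31:
        ntp_seconds += 2**32
    return UNIX_EPOCH + timedelta(seconds=ntp_seconds - NTP_UNIX_OFFSET)


# Originator Timestamp from the first dump: a plausible current time.
print(ntp_to_utc(3691013707))  # 2016-12-18 01:35:07+00:00
# Receive Timestamp from the first dump: decodes to a date in 2044.
print(ntp_to_utc(276521666))   # 2044-11-11 18:02:42+00:00
```

Under that assumption the strange values are simply seconds counts that do not correspond to any real clock reading, which is why they land decades in the future.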

Here is another one from the same chronyd host, to another ntpsec host:

17:36:23.395415 IP (tos 0x0, ttl 64, id 3599, offset 0, flags [DF], proto UDP (17), length 76)
    204.17.205.7.33551 > 204.17.205.1.123: [udp sum ok] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision 32
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp: 0.000000000
          Originator Timestamp: 3691013718.824150890 (2016/12/17 17:35:18)
          Receive Timestamp: 1779216017.648483479 (2092/06/24 18:08:33)
          Transmit Timestamp: 1405803137.064633429 (2080/08/24 20:20:33)
            Originator - Receive Timestamp: -1911797701.175667410
            Originator - Transmit Timestamp: +2009756714.240482539

Note that both the 'Receive Timestamp' and the 'Transmit Timestamp' are strange.

All three hosts have GPS for local time.

Here is one from a laptop, running chrony, that has no GPS:

17:36:52.643814 IP (tos 0x0, ttl 64, id 24624, offset 0, flags [DF], proto UDP (17), length 76)
    204.17.205.21.41485 > 204.17.205.8.123: [udp sum ok] NTPv4, length 48
        Client, Leap indicator: (0), Stratum 0 (unspecified), poll 6 (64s), precision 32
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp: 0.000000000
          Originator Timestamp: 3691013747.797479298 (2016/12/17 17:35:47)
          Receive Timestamp: 317494016.811980062 (2046/02/28 15:15:12)
          Transmit Timestamp: 127487236.597620268 (2040/02/21 11:35:32)
            Originator - Receive Timestamp: +921447565.014500764
            Originator - Transmit Timestamp: +731440784.800140969

I have only seen this oddity from chronyd hosts...

RGDS
GARY

Yo All!

To follow up on my own post, so this can be promptly laid to rest.

After some discussion at NTPsec, it seems that chronyd takes a lot
of 'creative license' with RFC 5905 (NTPv4). But it is not malicious,
just 'odd', and not new.

So, nothing to see here, back to the hunt for the real cause of the new
NTP traffic.

RGDS
GARY

I also have a similar experience with an increased load.

I'm running a pretty basic Linode VPS and I had to fine-tune a few things in order to deal with the increased traffic. I can clearly see that around the 14th-15th my traffic increases to 3-4 times the usual amount.

I did a quick dump and in 60 seconds I was hit by slightly over 190K IPs.

http://i.imgur.com/mygYINk.png
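
A rough way to get that kind of count from a text dump is sketched below in Python. It assumes one-line `tcpdump -n` output of the form `IP src.port > dst.port:`; the helper name `count_sources` and the sample addresses (documentation ranges) are made up for illustration and are not from the capture above.

```python
import re
from collections import Counter

# Matches "IP <src-ip>.<src-port> > " in one-line tcpdump -n output.
SRC_RE = re.compile(r"\bIP (\d{1,3}(?:\.\d{1,3}){3})\.\d+ > ")


def count_sources(lines):
    """Return a Counter mapping each source IPv4 address to its packet count."""
    counts = Counter()
    for line in lines:
        match = SRC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines, shaped like tcpdump -n output for port 123.
sample = [
    "12:00:00.000001 IP 192.0.2.1.40000 > 203.0.113.10.123: NTPv4, Client, length 48",
    "12:00:00.000002 IP 198.51.100.7.123 > 203.0.113.10.123: NTPv3, Client, length 48",
    "12:00:01.000003 IP 192.0.2.1.40001 > 203.0.113.10.123: NTPv4, Client, length 48",
]
counts = count_sources(sample)
print(len(counts), "unique sources")
print(counts.most_common(1))
```

The sample also counts the server's own replies as sources if they are in the dump, so a real run would want to filter on the destination address first.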

Weird stuff

Laurent

My WAG is that the one plus updated firmware on that day and they baked in
the pool.

Complete WAG, but time and distributed sources including wireless networks

I have just noticed that many customers using TP-Link routers are reporting issues with their internet connection.
Analyzing the traffic, I noticed that the TP-Link devices seem to be requesting NTP excessively from these IP addresses, without trying any others:

  > 192.5.41.40.123: NTPv3, Client, length 48
  > 192.5.41.41.123: NTPv3, Client, length 48
  > 133.100.9.2.123: NTPv3, Client, length 48

I'm asking a customer to take a photo of the device so I can get the model and revision, and I'm checking whether other customers are abusing the same servers as well.

Very sorry! Update: it seems the customer, whose English is even worse than mine (hehe), was not precise about the router model when he reported the issue.

I have just noticed that many customers using specific router models are reporting issues with their internet connection.
Analyzing the traffic, I noticed that these routers seem to be requesting NTP excessively from these IP addresses, without trying any others:

  > 192.5.41.40.123: NTPv3, Client, length 48
  > 192.5.41.41.123: NTPv3, Client, length 48
  > 133.100.9.2.123: NTPv3, Client, length 48

I'm asking a customer to take a photo of the device so I can get the model and revision, and I'm checking whether other customers are abusing the same servers as well.
There is definitely a pattern: all of them use just these 3 hardcoded servers. The problem is that many customers change the MAC address of their router, so I cannot reliably
identify the vendor from the MAC prefix (OUI).
He sent me 2 photos: one is an LB-Link (MAC vendor lookup for 20:f4:1b says Shenzhen Bilian electronic CO.,LTD), the other is a Tenda (c8:3a:35 is Tenda).
If necessary, I can investigate further.
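
A minimal Python sketch of that prefix lookup follows. The two vendor prefixes are the ones quoted in the post; the function names and the example MAC addresses are made up, and a real check would use the full IEEE registry rather than a two-entry table.

```python
# Vendor prefixes quoted in the post; everything else is illustrative.
KNOWN_OUIS = {
    "20:f4:1b": "Shenzhen Bilian electronic CO.,LTD",
    "c8:3a:35": "Tenda",
}


def oui(mac: str) -> str:
    """Return the first three octets of a MAC, lower-case and colon-separated."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def vendor(mac: str) -> str:
    """Look up the vendor for a MAC, or 'unknown' if the prefix is not listed."""
    return KNOWN_OUIS.get(oui(mac), "unknown")


def is_locally_administered(mac: str) -> bool:
    """True if the locally-administered bit (0x02 of the first octet) is set."""
    return bool(int(oui(mac)[:2], 16) & 0x02)


print(vendor("C8-3A-35-12-34-56"))                   # Tenda
print(vendor("20:f4:1b:00:00:01"))                   # Shenzhen Bilian electronic CO.,LTD
print(is_locally_administered("02:00:00:00:00:01"))  # True
```

The locally-administered bit only flags addresses that were made up from scratch. A router cloning the MAC of a real PC carries that PC's vendor prefix, so the lookup would name the wrong vendor without any warning.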

From a source network point of view we see devices come online and hit ~35 unique NTP servers within a few seconds.

I'll try to see if I can track down what type of devices they are.

I found devices doing lookups for all of these at the same time {0,0.uk,0.us,asia,europe,north-america,south-america,oceania,africa,europe}.pool.ntp.org and then it proceeds to use everything returned, which explains why everyone is seeing an increase.
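
As a back-of-the-envelope check, the brace list above can be expanded in a few lines of Python. The number of addresses returned per lookup is an assumption here (it is not stated in the thread), and `pool_names` is a made-up helper name.

```python
# Prefixes exactly as listed above ("europe" appears twice in the original).
PREFIXES = ["0", "0.uk", "0.us", "asia", "europe", "north-america",
            "south-america", "oceania", "africa", "europe"]

# Assumption, not from the thread: each lookup returns this many addresses.
ADDRS_PER_LOOKUP = 4


def pool_names(prefixes):
    """Expand prefixes to pool hostnames, dropping duplicates but keeping order."""
    return list(dict.fromkeys(f"{p}.pool.ntp.org" for p in prefixes))


names = pool_names(PREFIXES)
print(len(names), "distinct hostnames")
# Upper bound on servers contacted, if no address repeats across zones.
print(len(names) * ADDRS_PER_LOOKUP, "servers at most")
```

With nine distinct names and the assumed four addresses each, the upper bound is 36, which is in line with the roughly 35 unique servers per device reported earlier in the thread.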

I'm not sure whether this issue is relevant to the topic under discussion. Tenda routers have been on the market here for a while, and I think I only noticed the issue now
because the NTP servers they use, supposedly for a health check, went down (or the NTP operators blocked the ISPs I support because of such routers).

After checking numerous users, I believe Tenda hardcoded those NTP IPs. What makes it worse is that in Lebanon there are short electricity cutoffs several times per day, for example at 6 pm,
and the majority of users' routers will reboot and reconnect, so it will look like a countrywide spike in NTP traffic.

I also watched DNS responses for these NTP IPs for 10 minutes: none of the thousands of users tried to resolve any name to them over any DNS server, so I conclude they are hardcoded somewhere in the firmware.

Here is the traffic of a Tenda router after reconnecting (not a full power cycle, as I don't have it in my hands). As you can see, there are no DNS resolution attempts:

20:15:59.305739 PPPoE [ses 0x1483] CHAP, Success (0x03), id 1, Msg S=XXXXXX M=Authentication succeeded
20:15:59.306100 PPPoE [ses 0x1483] IPCP, Conf-Request (0x01), id 1, length 12
20:15:59.317840 PPPoE [ses 0x1483] IPCP, Conf-Request (0x01), id 1, length 24
20:15:59.317841 PPPoE [ses 0x1483] IPCP, Conf-Ack (0x02), id 1, length 12
20:15:59.317867 PPPoE [ses 0x1483] IPCP, Conf-Nack (0x03), id 1, length 18
20:15:59.325253 PPPoE [ses 0x1483] IPCP, Conf-Request (0x01), id 2, length 24
20:15:59.325273 PPPoE [ses 0x1483] IPCP, Conf-Ack (0x02), id 2, length 24
20:15:59.335589 PPPoE [ses 0x1483] IP 172.17.49.245.123 > 133.100.9.2.123: NTPv3, Client, length 48
20:15:59.335588 PPPoE [ses 0x1483] IP 172.17.49.245.123 > 192.5.41.41.123: NTPv3, Client, length 48
20:15:59.335588 PPPoE [ses 0x1483] IP 172.17.49.245.123 > 192.5.41.40.123: NTPv3, Client, length 48

Here is an example of Tenda traffic when it is unable to reach the destination: it repeats the request every 10 seconds, endlessly. My guess is that they are using NTP to show
the status of the internet connection.
So those NTP servers are now getting quite a significant DDoS this way.

19:57:52.162863 IP (tos 0x0, ttl 64, id 38515, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 192.5.41.40.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
19:57:52.163277 IP (tos 0x0, ttl 64, id 38516, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 192.5.41.41.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
19:57:52.164435 IP (tos 0x0, ttl 64, id 38517, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 133.100.9.2.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177063.000000000 (2016/12/19 22:57:43)
19:58:02.164781 IP (tos 0x0, ttl 64, id 38518, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 192.5.41.40.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)
19:58:02.164884 IP (tos 0x0, ttl 64, id 38519, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 192.5.41.41.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)
19:58:02.165061 IP (tos 0x0, ttl 64, id 38520, offset 0, flags [none], proto UDP (17), length 76)
     172.16.31.67.123 > 133.100.9.2.123: [udp sum ok] NTPv3, length 48
  Client, Leap indicator: (0), Stratum 0 (unspecified), poll 0 (1s), precision 0
  Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
    Reference Timestamp: 0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp: 0.000000000
    Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)
      Originator - Receive Timestamp: 0.000000000
      Originator - Transmit Timestamp: 3691177073.000000000 (2016/12/19 22:57:53)

Am I the only one who read that and started wondering if some engineer writing
CPE code read a recommendation someplace to "query 3-5 different servers" and
managed to miss the "-"?

Quoting David <opendak@shaw.ca>:

I found devices doing lookups for all of these at the same time {0,0.uk,0.us,asia,europe,north-america,south-america,oceania,africa,europe}.pool.ntp.org and then it proceeds to use everything returned, which explains why everyone is seeing an increase.

I'm very interested to find out what devices these are. This would explain why places like New Zealand are getting massive amounts of NTP traffic from North America.

Thanks, David. That perfectly matches the list of servers used by
older versions of the ios-ntp library[1][2], which would point toward
some iPhone app being the source of the traffic.

[1] https://github.com/jbenet/ios-ntp/blob/d5eade6a99041094f12f0c976dd4aaeed37e0564/ios-ntp-rez/ntp.hosts
[2] https://github.com/jbenet/ios-ntp/blob/5cc3b6e437a6422dcee9dec9da5183e283eff9f2/ios-ntp-lib/NetworkClock.m#L122

That would make sense - I see a lot of iCloud related lookups from these hosts as well.

Also, app.snapchat.com generally seems to follow just after the NTP pool DNS lookups. I don't have an iPhone to test that though.

Thanks,

the new Mario app perhaps? :)

Quoting David <opendak@shaw.ca>:

replying off list.

If anything comes from this, I'd love to hear about it. As a student in the field, this is the kind of stuff I live for! ;)

Pretty awesome to see the chain of events after seeing a post on the [pool] list!

Laurent

We - at Snap - were forwarded this thread just a few hours ago and are
investigating. Please email me should you still be looking for a contact
for Snapchat.

Thank you,
Jad

https://news.ntppool.org/2016/12/load/

https://community.ntppool.org/t/recent-ntp-pool-traffic-increase/18

<https://en.wikipedia.org/wiki/NTP_server_misuse_and_abuse#Notable_cases>