Recent NTP pool traffic increase

Jose_Gerardo_Perales · December 15, 2016, 10:45pm

Hi,

We've recently experienced an increase in NTP queries to NTP Pool Project (pool.ntp.org) servers. One theory is that a service provider's NTP infrastructure failed approximately 2 days ago and traffic is now being redirected to servers belonging to the NTP Pool Project.

Does anyone from the service provider community have any comments?

Gerardo Perales

Blake_Hudson · December 15, 2016, 11:00pm

I would think if a service provider failed, the stats would bear that out. For example, if one of the top ISPs in the world was forwarding requests, then you would likely see an increase in the number of queries generated from IP addresses registered to that organization. A similar effect could occur if a large ISP recently started distributing NTP servers as part of their DHCP options when they had not previously. If historical query data is not available, the current data could be used to make an educated guess and follow up on the likely data trails as currently visible.

I would also not rule out the possibility that Netgear, D-Link, T-Mobile, or some other vendor or distributor of access gear pushed out a firmware update that enabled NTP where it was previously disabled, or otherwise changed a device's NTP settings or behavior.

--Blake

Dan_Drown · December 15, 2016, 11:07pm

Quoting Jose Gerardo Perales Soto <gerardo.perales@axtel.com.mx>:

> We've recently experienced an increase in NTP queries to NTP Pool Project (pool.ntp.org) servers. One theory is that a service provider's NTP infrastructure failed approximately 2 days ago and traffic is now being redirected to servers belonging to the NTP Pool Project.

> Does anyone from the service provider community have any comments?

To add some more numbers to this, I'm seeing 4x the usual NTP traffic to my server in pool.ntp.org, starting Dec 13.

Top source ASNs by % of NTP traffic seen by my server (I don't have pre-Dec 13 traffic by ASN handy)

sprint 4.0%
verizon-wireless 3.4%
tmobile 2.9%
att-wireless 2.8%
comcast 2.1%
orange 1.8%
sky 1.6%
twc 1.0%
att 1.0%
swisscom 0.9%
saudinet 0.8%
virgin 0.6%
opaltelecom 0.5%
qwest 0.5%
eli 0.2%
verizon 0.2%

Possibly related is the new iOS release. Does the new iOS generate more NTP traffic? Can anyone measure that?

Joel_Jaeggli · December 15, 2016, 11:13pm

iOS uses time.apple.com, which is widely available.

Allan_Liska · December 15, 2016, 11:20pm

I manage two NTP servers in the pool, one in the US and the other in
EMEA. FWIW The US server has seen a spike in traffic, but I have not
seen a similar spike on the EMEA server.

allan

On 12/15/2016 at 5:46 PM, "Jose Gerardo Perales Soto" wrote:

> Hi,
>
> We've recently experienced an increase in NTP queries to NTP Pool
> Project (pool.ntp.org) servers. One theory is that a service
> provider's NTP infrastructure failed approximately 2 days ago and
> traffic is now being redirected to servers belonging to the NTP
> Pool Project.
>
> Does anyone from the service provider community have any comments?
>
> Gerardo Perales

Kraig_Beahn1 · December 16, 2016, 2:01am

How much of a traffic increase?

Dobbins_Roland · December 16, 2016, 2:50am

Do you have flow telemetry, which provides a lot more information than basic pps/bps stats?

Are you seeing normal timesync queries, or lots of mode 6/mode 7 admin command attempts?

Dan_Drown · December 16, 2016, 3:09am

Quoting Roland Dobbins <rdobbins@arbor.net>:

> Do you have flow telemetry, which provides a lot more information than basic pps/bps stats?

Sources are pretty widely spread out among cell networks/home internet, seem to be mostly US based. I'm not seeing a large amount of traffic per single IP or single subnet. This seems more like "someone pushed out bad firmware" rather than something malicious.
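
One way to check the "no single IP or subnet dominates" observation is to bucket source addresses by prefix and look at each bucket's share of queries. This is only a minimal Python sketch, assuming the source IPs have already been exported from flow data as strings. The function name `top_subnets`, the /24 bucket size, and the sample addresses are illustrative and are not taken from the measurements in this thread.

```python
from collections import Counter
import ipaddress


def top_subnets(source_ips, prefix_len=24, n=5):
    """Count queries per IPv4 subnet and return the n busiest as (subnet, share)."""
    counts = Counter(
        # strict=False lets us pass a host address and get its enclosing network
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in source_ips
    )
    total = sum(counts.values())
    return [(str(net), hits / total) for net, hits in counts.most_common(n)]


# Made-up sample using documentation address ranges (RFC 5737), one entry per query.
sample = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]
for subnet, share in top_subnets(sample):
    print(f"{subnet:18} {share:.1%}")
```

If the largest share stays small even at short prefix lengths, that is consistent with a widely distributed population of clients rather than a handful of heavy sources.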

> Are you seeing normal timesync queries, or lots of mode 6/mode 7 admin command attempts?

SNTP Client timesync queries make up 91.3% of the traffic to my server.

The following NTP settings are the most popular (47% of all traffic to my server):

stratum=0, poll=4, precision=-6, root delay=1, root dispersion=1, reference timestamp=0, originator timestamp=0,
receive timestamp=0
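
For anyone who wants to look for the same pattern in their own captures, here is a rough Python sketch that decodes the fixed 48-byte NTP header and tests for the fields listed above. The helper names `parse_ntp_header` and `matches_reported_pattern` are made up for illustration. The sketch assumes the payload has already been extracted from the UDP datagram, and it does not match on root delay or root dispersion because the units of the "1" values quoted above are not stated.

```python
import struct

# Flags, stratum, poll, precision, root delay, root dispersion, reference ID,
# then four 64-bit timestamps. Network byte order, 48 bytes in total.
NTP_HEADER = struct.Struct("!BBbbIII4Q")


def parse_ntp_header(payload: bytes) -> dict:
    """Decode the fixed NTP header fields from a UDP payload."""
    (flags, stratum, poll, precision, root_delay, root_dispersion,
     ref_id, ref_ts, orig_ts, recv_ts, xmit_ts) = NTP_HEADER.unpack(payload[:48])
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,          # 3 = client
        "stratum": stratum,
        "poll": poll,                 # signed log2 seconds
        "precision": precision,       # signed log2 seconds
        "root_delay_raw": root_delay,
        "root_dispersion_raw": root_dispersion,
        "reference_ts": ref_ts,
        "originate_ts": orig_ts,
        "receive_ts": recv_ts,
        "transmit_ts": xmit_ts,
    }


def matches_reported_pattern(h: dict) -> bool:
    """True for client queries with the stratum/poll/precision/zero-timestamp combination."""
    return (h["mode"] == 3 and h["stratum"] == 0 and h["poll"] == 4
            and h["precision"] == -6 and h["reference_ts"] == 0
            and h["originate_ts"] == 0 and h["receive_ts"] == 0)


# Synthetic query built for illustration only (version 4, mode 3).
sample = NTP_HEADER.pack(0x23, 0, 4, -6, 0x10000, 0x10000, 0, 0, 0, 0, 0xDEADBEEF00000000)
print(matches_reported_pattern(parse_ntp_header(sample)))
```

Counting matches against total client queries would give a figure comparable to the 47% above, subject to the caveat about the unmatched fields.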

Dobbins_Roland · December 16, 2016, 3:16am

Everything old is new again . . .

<pages.cs.wisc.edu/~plonka/netgear-sntp/>

Dobbins_Roland · December 16, 2016, 3:17am

<http://pages.cs.wisc.edu/~plonka/netgear-sntp/>

Dobbins_Roland · December 16, 2016, 4:19am

Over on nznog, Cameron Bradley posited that this may be related to a TR-069/-064 Mirai variant, which makes use of a 'SetNTPServers' exploit. Perhaps one of them is actually setting timeservers? This SANS writeup details the SOAP strings:

<https://isc.sans.edu/forums/diary/Port+7547+SOAP+Remote+Code+Execution+Attack+Against+DSL+Modems/21759>

Dobbins_Roland · December 16, 2016, 9:40am

Looking at the source IP distribution, does a significant proportion of the larger query base seem to originate out-of-region?

Dobbins_Roland · December 16, 2016, 9:44am

And do they appear to be mostly broadband access networks, or . . . ?

aott01 · December 16, 2016, 5:27pm

Hi,

> Looking at the source IP distribution, does a significant proportion
> of the larger query base seem to originate out-of-region?

> And do they appear to be mostly broadband access networks, or . . . ?

Datapoints are via nfsen (nflow/sflow collection) from a US west coast
network lab that has "three" NTP pool servers: one IPv4-only server set to
25 Mbps, and one dual-stack server whose IPv4 and IPv6 addresses are both
set to 100 Mbps at the NTP pool registration site.

Over the last 3 days, P95 traffic has been about 4 times what it was before,
and the increase is on IPv4 on the server that has IPv4 and IPv6. IPv6 traffic
is in line with what it used to be, with no large increase.

The server with higher bandwidth and IPv4+IPv6 is seeing a large increase
on IPv4, from single hosts that seem to be in broadband networks and a certain
site's crawler that is hosted on AWS. The latter almost looks like someone
hardcoded a config instead of relying on the pool's DNS.

The top talker appears to be abusing something in the protocol. The timestamps
below do not look real, and I will contact Verizon/FiOS.

tcpdump -nvvi hme0 port 123 and host 98.113.213.d|grep "Originator - Transmit Timestamp:"
            Originator - Transmit Timestamp: 2123062516.816546608 (1967/04/12 11:35:16)
            Originator - Transmit Timestamp: 862276608.564645656 (1927/04/30 01:16:48)
            Originator - Transmit Timestamp: 3399899220.431115995 (2007/09/27 16:27:00)
            Originator - Transmit Timestamp: 140873162.935483905 (1904/06/19 11:26:02)
            Originator - Transmit Timestamp: 1878223676.912769495 (1959/07/09 16:47:56)
            Originator - Transmit Timestamp: 2713286246.929585296 (1985/12/24 18:37:26)
            Originator - Transmit Timestamp: 3219464534.831489402 (2002/01/08 07:42:14)
            Originator - Transmit Timestamp: 2210689093.339715993 (1970/01/20 16:18:13)
            Originator - Transmit Timestamp: 3899283084.650125848 (2023/07/25 14:11:24)
[...]
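
To see why these values look wrong, it helps to convert them by hand. NTP timestamps count seconds from 1900-01-01 00:00:00 UTC, so a client with a sane clock in December 2016 would be sending values around 3.69 billion. Here is a small Python sketch of the conversion. The function name `ntp_seconds_to_utc` is illustrative, and it treats the integer part of each printed value as seconds since the NTP epoch, which is how tcpdump rendered the dates above.

```python
from datetime import datetime, timedelta, timezone

# NTP era 0 starts here; Unix time starts 2208988800 seconds later.
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_seconds_to_utc(seconds: int) -> datetime:
    """Convert whole seconds since the NTP epoch to an aware UTC datetime."""
    return NTP_EPOCH + timedelta(seconds=seconds)


# Integer parts of a few values from the capture above.
captured = [2123062516, 862276608, 3399899220, 140873162, 3899283084]
for secs in captured:
    print(secs, ntp_seconds_to_utc(secs).strftime("%Y/%m/%d %H:%M:%S"))
```

Adding a timedelta to the 1900 epoch avoids negative Unix timestamps, which some platforms refuse to convert.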

nfdump -M /var/nfsen/profiles-data/live/dmz208_0201:br1 -T -R 2016/12/13/nfcapd.201612131630:2016/12/16/nfcapd.201612161630 -n 10 -s record/bytes -A proto,srcip,dstport -6 "dst ip j.k.l.235 and proto udp"
Aggregated flows 51346
Top 10 flows ordered by bytes:
Date first seen            Duration  Proto  Src IP Addr    Dst Pt  Packets   Bytes    bps  Bpp  Flows
2016-12-13 16:31:22.608  259394.340  UDP    98.113.213.d      123   12.3 M   1.1 G  34107   90   3000
2016-12-13 16:50:31.649  253960.650  UDP    54.236.1.d        123   126976  11.4 M    359   90     31
2016-12-13 17:43:29.760  255090.188  UDP    54.236.1.d        123   114688  10.3 M    323   90     28
2016-12-13 20:23:39.198  211054.259  UDP    54.236.1.d        123    90112   8.1 M    307   90     22
2016-12-13 22:29:12.265  218623.774  UDP    204.177.184.d     123    61440   5.5 M    202   90     15
2016-12-14 04:12:44.389  102634.717  UDP    162.243.191.d     123    61440   5.5 M    431   90     15
2016-12-13 22:10:33.226  223641.048  UDP    198.199.99.d      123    53248   4.8 M    171   90     13
2016-12-13 21:31:18.841  194915.427  UDP    220.253.150.d     123    53248   4.8 M    196   90     13
2016-12-13 20:01:40.452  242771.757  UDP    troublemaker      123    49152   4.4 M    145   90     12
2016-12-14 05:21:20.634  208902.664  UDP    54.236.1.d        123    40960   3.7 M    141   90     10
Summary: total flows: 60396, total bytes: 21023451720, total packets: 233586118, avg bps: 648125, avg pps: 900, avg bpp: 90
Time window: 1970-01-01 00:00:01 - 2016-12-16 16:34:54
Total flows processed: 29676807, Blocks skipped: 0, Bytes read: 1662858132
Sys: 7.730s flows/second: 3839128.8 Wall: 7.722s flows/second: 3842810.0

Note: "troublemaker" is a host on the internal network that has a known issue
with NTP time keeping, it originates a lot of packets and steps a lot.
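
As a rough back-of-the-envelope check on the nfdump figures above, the per-source query rate and the average packet size can be recomputed from the printed totals. This Python sketch uses the rounded "12.3 M" packet count as printed, so the result is approximate. The helper `rate` is just an illustrative name.

```python
def rate(count: float, seconds: float) -> float:
    """Average events per second over an interval."""
    return count / seconds


# Figures copied from the nfdump output above.
top_packets = 12.3e6         # "12.3 M" packets from the top source, as printed
top_duration = 259394.340    # seconds covered by that flow record
total_bytes = 21023451720    # from the Summary line
total_packets = 233586118    # from the Summary line

top_pps = rate(top_packets, top_duration)
avg_bpp = total_bytes / total_packets

print(f"top talker: about {top_pps:.0f} queries/s from a single address")
print(f"average bytes per packet: {avg_bpp:.1f}")
```

A single address sustaining tens of queries per second for three days is far outside what a normal SNTP client would generate, which supports following up with the source network.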

Reply to me directly if you want more details.

-andreas

Ask_Bjorn_Hansen · December 19, 2016, 2:26am

Hi Jose,

It’s more widespread than a particular service provider, so it seems more likely it’s a software update for some “IoT” device or similar.

The increase in DNS queries was on the “non-vendor” names, so it’s difficult to know who it is without being on a local network with one of the bad devices.

The increase in DNS queries is much smaller than the increase in NTP queries that are being seen, so it’s not just more clients, but badly behaving ones.

https://status.ntppool.org/incidents/vps6y4mm0m69

If you have NTP servers that can be added to the pool, it’d be greatly appreciated.

http://www.pool.ntp.org/join.html

Ask