Windows 2008/2012 arp timeout process

Greetings Nanog,

I apologize in advance if this should be directed towards a server/systems discussion list, but I've noticed what I think are issues with the way Windows 2008/2012 handles ARP. I started noticing high ARP process utilization on some of our 6500s running Sup720s, and after capturing the packets being punted to the CPU I found quite a few repeat sources. After digging into those sources, it looks like Windows 2008/2012 systems are sending ARP refresh requests quite frequently.

According to this article ( http://support.microsoft.com/kb/949589 ), if the neighbor entry is in use for the IP it should not go stale. Specifically:

"If the entry is in the "Reachable" state, Windows Vista TCP/IP hosts do not send ARP requests to the network. Therefore, Windows Vista TCP/IP hosts use the information in the cache. If an entry is not used, and it stays in the "Reachable" state for longer than its "Reachable Time" value, the entry changes to the "Stale" state. If an entry is in the "Stale" state, the Windows Vista TCP/IP host must send an ARP request to reach that destination."
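
Read literally, the quoted passage describes a small state machine. Below is a minimal sketch of that reading only; the class, field names, and the 30-second value are hypothetical and chosen for illustration, and nothing here is taken from the actual Windows implementation. It assumes that any use of the entry counts as "used" and that every ARP request is answered.

```python
from dataclasses import dataclass

REACHABLE = "Reachable"
STALE = "Stale"


@dataclass
class NeighborEntry:
    """Toy model of one neighbor cache entry (hypothetical, not Windows code)."""
    reachable_time: float    # seconds; illustrative value supplied by the caller
    state: str = REACHABLE
    last_used: float = 0.0   # time the entry was last used to send a packet

    def tick(self, now: float) -> None:
        # An unused entry that outlives its Reachable Time becomes Stale.
        if self.state == REACHABLE and now - self.last_used > self.reachable_time:
            self.state = STALE

    def send(self, now: float) -> bool:
        """Use the entry to send a packet; return True if an ARP request is needed."""
        self.tick(now)
        needs_arp = self.state == STALE
        # Assumption: the ARP reply arrives, so the entry is Reachable again.
        self.state = REACHABLE
        self.last_used = now
        return needs_arp


# One packet per second for two minutes, with an assumed 30 s Reachable Time.
busy = NeighborEntry(reachable_time=30.0)
arp_needed = [busy.send(float(t)) for t in range(1, 121)]
print(any(arp_needed))
```

Under this reading, an entry that is used every second never leaves the "Reachable" state, so a host that is continuously pinging its gateway would not be expected to re-ARP.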

I know the article says Windows Vista, but its "Applies to" section lists the other OSes.

I've replicated this in my lab (server pinging its own gateway while capturing traffic), and I am seeing the same issue:

222 10:05:18.462720 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
223 10:05:18.464759 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
1886 10:06:31.962218 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
1887 10:06:31.963004 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
3348 10:07:23.461682 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
3349 10:07:23.471003 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
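
The gaps between the three requests above can be read straight off the timestamps. A minimal sketch, assuming the capture is pasted in as text in the same column layout (packet number, time, source, destination, protocol, info); the function name is made up for illustration:

```python
from datetime import datetime

capture = """\
222 10:05:18.462720 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
223 10:05:18.464759 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
1886 10:06:31.962218 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
1887 10:06:31.963004 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
3348 10:07:23.461682 Dell_a6:dc:52 All-HSRP-routers_0a ARP Who has 10.36.0.1? Tell 10.36.0.31
3349 10:07:23.471003 All-HSRP-routers_0a Dell_a6:dc:52 ARP 10.36.0.1 is at 00:00:0c:07:ac:0a
"""


def request_times(text):
    """Return the timestamps of the ARP request ("Who has") lines."""
    times = []
    for line in text.splitlines():
        if "Who has" in line:
            # The second whitespace-separated field is the time of day.
            stamp = line.split()[1]
            times.append(datetime.strptime(stamp, "%H:%M:%S.%f"))
    return times


times = request_times(capture)
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
for gap in gaps:
    print(f"{gap:.1f} s since previous ARP request")
```

For this capture the gaps come out to roughly 73.5 and 51.5 seconds, while the server was pinging the gateway the whole time.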

I've tried this on various devices, and the only place I don't see this behavior is on wireless interfaces.

I'm more of a Linux guy, and when I run the same tests there I see the behavior described in this article (which is what I would expect): http://linux-ip.net/html/ether-arp.html . Specifically:

"Entries in the ARP cache are periodically and automatically verified unless continually used."

Has anyone run into this issue before? Have a fix? Can you point me to any documentation or other distros that I should ask?

TIA,
James

Hi James,

Is your windows client seeing traffic from the 6500 with the real (Burned
in) MAC address of your 6500? If so it may be re-arping to find out which
of the MAC addresses is the 'right' one to use, the real MAC or the HSRP
MAC.

My memory is fuzzy, but I think I've seen issues like that before. Sorry,
it's been a while so I can't remember anything more specific.

-Marcel

No, but to rule out any layer 2 traffic that could affect the issue, one of my colleagues performed host-to-guest testing in a VM, and we are seeing the same issue.

14:28:30.420589 00:1c:42:d7:92:84 > 00:1c:42:00:00:08, ethertype ARP (0x0806), length 42: Request who-has 10.211.55.2 (00:1c:42:00:00:08) tell 10.211.55.3, length 28
14:28:30.420684 00:1c:42:00:00:08 > 00:1c:42:d7:92:84, ethertype ARP (0x0806), length 60: Reply 10.211.55.2 is-at 00:1c:42:00:00:08, length 46
14:29:03.421388 00:1c:42:d7:92:84 > 00:1c:42:00:00:08, ethertype ARP (0x0806), length 42: Request who-has 10.211.55.2 (00:1c:42:00:00:08) tell 10.211.55.3, length 28
14:29:03.421505 00:1c:42:00:00:08 > 00:1c:42:d7:92:84, ethertype ARP (0x0806), length 60: Reply 10.211.55.2 is-at 00:1c:42:00:00:08, length 46
14:29:36.423363 00:1c:42:d7:92:84 > 00:1c:42:00:00:08, ethertype ARP (0x0806), length 42: Request who-has 10.211.55.2 (00:1c:42:00:00:08) tell 10.211.55.3, length 28
14:29:36.423463 00:1c:42:00:00:08 > 00:1c:42:d7:92:84, ethertype ARP (0x0806), length 60: Reply 10.211.55.2 is-at 00:1c:42:00:00:08, length 46
14:30:09.424479 00:1c:42:d7:92:84 > 00:1c:42:00:00:08, ethertype ARP (0x0806), length 42: Request who-has 10.211.55.2 (00:1c:42:00:00:08) tell 10.211.55.3, length 28

The "real" traffic was just pings between the host and the VM. A raw capture was performed, and the only MAC addresses in use were the ones listed above.
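
To put a number on how regular the requests in the VM capture are, here is a minimal sketch that uses only the timestamps of the four "Request who-has" lines, copied by hand from the tcpdump output above; the variable names are illustrative:

```python
from datetime import datetime
from statistics import mean

# Timestamps of the four "Request who-has" lines in the tcpdump output.
request_stamps = [
    "14:28:30.420589",
    "14:29:03.421388",
    "14:29:36.423363",
    "14:30:09.424479",
]

times = [datetime.strptime(stamp, "%H:%M:%S.%f") for stamp in request_stamps]
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]

# Spread = difference between the longest and shortest gap.
spread = max(gaps) - min(gaps)
print(f"mean gap {mean(gaps):.3f} s, spread {spread * 1000:.3f} ms")
```

The gaps are all about 33 seconds and differ from each other by roughly a millisecond, which is far more regular than the intervals seen against the HSRP gateway.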