black listing of web traffic

Hi list

I have a problem that I can't seem to find a solution to yet. My student
network is being NATted out and anyone who's on that network has trouble
accessing random websites.
For example, going to www.apple.com or www.facebook.com would work great,
but store.apple.com would either not load or take forever to open up.

I had that problem last week and thought I had tracked it down to the NAT IP
being blacklisted on one of the spam blacklists. Even though that IP is
not used for outbound mail, that somehow seemed to affect it. Changing it to a
different one seemed to solve the problem, and I got the original address off
the list in the meantime. I changed it back and everything was well, until
today.
Same symptoms, but now I don't see us listed anywhere.
The best description of the symptoms seems to be that that IP is rate
limited or something.
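
For anyone wanting to script the "am I listed anywhere" check: DNS-based blacklists are queried by reversing the IPv4 octets and appending the list's zone. The sketch below assumes zen.spamhaus.org as the zone and uses the documentation address 192.0.2.1 in place of the real NAT IP. The function names are made up for illustration. Note that these lists are aimed at mail, so a clean result says nothing definite about web access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an A record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name, or the lookup itself failed: treated as "not listed".
        return False
    return True


if __name__ == "__main__":
    # 192.0.2.1 is a placeholder, not the address from this thread.
    print(dnsbl_query_name("192.0.2.1", "zen.spamhaus.org"))
```

A failed lookup and a clean result look the same here, so a resolver problem would also show up as "not listed".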

Anyone seen that? Are there any blacklists for web access?

PS. I checked everything under my control and I don't see a bottleneck
anywhere or anything like an IPS working up or something....

I know that Cisco either is integrating or has integrated the IronPort
reputation service into their IPS devices, so maybe a check on www.senderbase.org
could help.

Chris Campbell

Other than the Spamhaus DROP list, I've never heard of blacklisting being applied to IP routing. Were some of your IPs somehow on their DROP list?

http://www.spamhaus.org/drop/

The RBL was originally distributed via BGP.

Tony.

True...and I was a subscriber, so I should have remembered that...but it was roughly a decade ago and in that form dead most of that time. Irrelevant to this guy's current issue.

Can't find my IP on any of the black lists. Don't have any proxies. Sites
that behave poorly are consistent. That is to say that facebook.com,
apple.com would always come up without an issue, but cnn.com,
forever21.com (I know, don't ask, students), and
store.apple.com would consistently take forever to come up.

Just wanted to check if rate-limiting web clients is a common practice
nowadays in the industry. If it's not, it's probably an unlikely cause of my
troubles...

Thanks,
Andrey

Other things you might want to check out include whether your NAT
gateway is well-behaved in the presence of PMTU discovery, TCP
timestamps, and ECN. The web sites your students are having trouble
with may share some property that, correctly or not, is interacting
poorly with your NAT implementation.
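
To make the PMTU failure mode concrete: a sender that sets DF relies on an ICMP "fragmentation needed" message to learn that a link on the path has a smaller MTU. If that ICMP message is filtered anywhere on the way back, large segments stall while small ones still get through. The toy model below is only an illustration under those assumptions. The function name, the link MTUs and the retry count are invented, and it does not model any particular NAT device.

```python
def deliver(packet_size, link_mtus, icmp_reaches_sender, max_tries=5):
    """Simulate a sender with DF set pushing one segment across a path.

    Returns the packet size that finally got through, or None if the
    transfer stalls (a PMTU black hole).
    """
    size = packet_size
    for _ in range(max_tries):
        # First link on the path that is too small for the current size.
        bottleneck = next((mtu for mtu in link_mtus if size > mtu), None)
        if bottleneck is None:
            return size  # fits every link, delivered
        if not icmp_reaches_sender:
            continue  # sender never learns the MTU and resends the same size
        size = bottleneck  # sender shrinks to the MTU reported in the ICMP
    return None


if __name__ == "__main__":
    path = [1500, 1400, 1500]  # hypothetical path with one narrow link
    print(deliver(1500, path, icmp_reaches_sender=True))
    print(deliver(1500, path, icmp_reaches_sender=False))
    print(deliver(500, path, icmp_reaches_sender=False))
```

In this model a site that only ever sends small responses looks healthy, while one that sends full-size segments hangs, which is one way a problem can be consistent per site.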

(I remain astonished at the number of "big name" web sites out there
that send out their content with the DF bit set, then drop the
"fragmentation required" ICMP packets they get back on the floor.)

Jim Shankland

It could be that the problem sites have some form of load balancer that has an issue keeping state on multiple sessions from the same IP.

You mentioned that changing the source IP fixed it. Is this a temporary fix that breaks after several users access the sites from the new IP?

Thx to all the folks replying off the list.

The more I troubleshoot, the more I'm convinced that it's not the sites that
are doing rate-limiting. I went to the website of one of my previous employers
(a small company). The chances of them having a fancy reverse proxy with some
sort of blacklist filtering are slim to none, yet their site barely opens
up as well.

It must be something that my firewall device is doing (it is what is
doing the NATting), or I don't know what else. I'm working with my firewall
guy since f/w is his domain and I have no clue about that firewall
vendor (Palo Alto).

Thanks all for the suggestions. I'll keep digging.

Could it be a DNS issue? Perhaps some sites try to reverse-resolve your IP address and others don't?

By changing my outbound IP address to a different one (I suspect effectively
resetting sessions) the problem was solved. So, after that I set it back to
the original source NAT. And the sites open up just fine still. It really
behaves like a NAT table exhaustion, but the firewall only reports 13000
sessions in progress for all the NAT addresses on that firewall. I'm
thinking memory leak or something. We only put that device in place this
winter break and this is the second time this is happening. Last time was
about 2-3 weeks ago.

Seems to be fixed for now and the f/w dude is opening a ticket with the f/w
vendor.

Thanks to all,
The problem seems to be fixed by changing the NAT IP to something else and
then back.

It does seem much like NAT exhaustion even though the f/w claims only 13K
sessions for two dynamic NATs and about 20 static ones.
What I don't get is why there is consistency in which sites open. Why does
facebook open every time while store.apple.com barely opens every time?
I'd say that if it were NAT exhaustion, they would all behave the same way,
meaning open, then not open, then open again.
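
A quick back-of-the-envelope check on the numbers above supports the doubt about plain port exhaustion. The sketch assumes each dynamic NAT address can hand out source ports 1024 through 65535, which is an assumption and not something read from the firewall, and it pessimistically counts all 13K sessions against the two dynamic addresses.

```python
# Assumed, not taken from the firewall: usable source ports per NAT address.
PORTS_PER_NAT_IP = 65535 - 1024 + 1  # ports 1024..65535 inclusive


def port_pool_utilization(sessions, dynamic_nat_ips, ports_per_ip=PORTS_PER_NAT_IP):
    """Fraction of the translated-port pool in use, one port per session."""
    return sessions / (dynamic_nat_ips * ports_per_ip)


if __name__ == "__main__":
    used = port_pool_utilization(sessions=13_000, dynamic_nat_ips=2)
    print(f"{used:.1%} of the assumed port pool in use")
```

Under these assumptions the pool is roughly a tenth full, and many NAT implementations can reuse a port for different destinations, so the real headroom is probably larger. That points away from simple exhaustion and toward something odd in how the table is managed.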

It is solved for the time being.
Again, thanks to all.

That's not surprising behaviour on a Palo Alto unit. They are still
very young in the market and my colleagues have had issues with NAT
and proxy ARP in the recent past.

Chris Campbell

This sounds like possibly a hash table with a spectacularly poor hash function,
causing most of your entries to be in only a few hash buckets. You hit one
of the 497 buckets that has 0 or 1 or 3 entries, it works great. You hit one
of 3 buckets that has 4,000+ entries in it, things suck. (You Linux geeks
can quit smirking - Linux had a very similar issue in its networking stack
not so long ago).
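
A toy simulation shows how lopsided this can get. Everything here is invented for illustration: 500 buckets, 13,000 random sessions, and a deliberately bad hash that looks only at the destination port. It is not a claim about what any vendor's code actually does.

```python
import random
import zlib
from collections import Counter

BUCKETS = 500
SESSIONS = 13_000


def poor_hash(session):
    """Hypothetical bad hash: looks only at the destination port."""
    _src_ip, _src_port, _dst_ip, dst_port = session
    return dst_port % BUCKETS


def better_hash(session):
    """Mixes the whole 4-tuple through CRC32 before reducing."""
    return zlib.crc32(repr(session).encode()) % BUCKETS


def make_sessions(n, seed=0):
    """Random (src_ip, src_port, dst_ip, dst_port) tuples, web ports only."""
    rng = random.Random(seed)
    return [
        (rng.randrange(2**32), rng.randrange(1024, 65536),
         rng.randrange(2**32), rng.choice((80, 443)))
        for _ in range(n)
    ]


def bucket_sizes(sessions, hash_fn):
    """Count how many sessions land in each bucket."""
    return Counter(hash_fn(s) for s in sessions)


if __name__ == "__main__":
    sessions = make_sessions(SESSIONS)
    for fn in (poor_hash, better_hash):
        sizes = bucket_sizes(sessions, fn)
        print(fn.__name__, "buckets used:", len(sizes),
              "largest bucket:", max(sizes.values()))
```

With the bad hash every session piles into two buckets, so each lookup walks a chain thousands of entries long. With the mixing hash the same sessions spread across the table and chains stay short.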

Never underestimate the ability of vendor engineers to write hilariously
poor code:

http://thedailywtf.com/Articles/Else-where.aspx

You really gotta assume that your firewall code (or any other code, for that
matter) was written by that programmer until proved otherwise.

A few months ago I was involved in a hard-to-troubleshoot intermittent
problem similar to yours. I finally diagnosed a faulty or overloaded
state table somewhere in one of the cheap plastic routers they were
using. All problems ended when I replaced the cheap plastic stuff with
x86 hardware running pf or iptables, I forget exactly which
(irrelevant).

Could it be that you have some ARP poisoning going on? That was my first
thought in the above situation, but Wireshark showed otherwise.
The clue pointing to the state tables: it was mainly SSL/TLS that was getting
expired/dropped.
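
If you want a quick first pass before reaching for Wireshark, one rough heuristic is to look for a MAC address that answers for more than one IP in the ARP table. The sketch below works on (ip, mac) pairs you have already collected. The function name and the sample addresses are made up, and a hit is only a hint, since routers and proxy ARP legitimately produce the same pattern.

```python
from collections import defaultdict


def macs_claiming_multiple_ips(arp_entries):
    """Map each MAC seen for more than one IP to the sorted IPs it claims."""
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_entries:
        ips_by_mac[mac.lower()].add(ip)  # normalise case before comparing
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


if __name__ == "__main__":
    # Placeholder entries using documentation ranges, not real hosts.
    sample = [
        ("192.0.2.1", "00:00:5e:00:53:01"),
        ("192.0.2.10", "00:00:5E:00:53:01"),
        ("192.0.2.20", "00:00:5e:00:53:02"),
    ]
    print(macs_claiming_multiple_ips(sample))
```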

Gord

My guess is that the fault drives some SSL/TLS sessions through some
load balancers mad, but not all :)

Gord

You mentioned this was a student network. Could it be that your students are running BitTorrent clients and your ISP doesn't like that, so they are rate limiting you? This might explain why Facebook loads and store.apple.com doesn't. I do not know much about Facebook's architecture, but I would guess they use a CDN or have their own, so the Facebook traffic would stay entirely in your ISP's network (less need to rate limit), while Apple's traffic may need to go through a peer.

Or could it be that your students are running BitTorrent and exhausting the state tables on your firewall?

Dylan Ebner, Network Engineer
Consulting Radiologists, Ltd.