Blocked port 25?

In the last couple of days, I have received complaints from customers not able to receive email from certain sites. From these sites, I can't connect to our mail server; from other sites, I can. We have tried sending email, and we have also tried telnet on port 25 to the server. I can't seem to find a correlation. There is no firewall on our network. We have an access list to filter port 25, but this server is allowed. Our mail server is also our DNS server. From the sites that I can't connect to our server on port 25, I can query the DNS server using nslookup and get a response.

I tried tcptraceroute from one of the sites where I have a unix account, but it is behind a firewall, and it dies after the first hop. I'm stumped. Any suggestions?

Byron L. Hicks
Network Engineer
NMSU ICT

In the last couple of days, I have received complaints from customers
not able to receive email from certain sites.

  If I understand you correctly, you are saying that these sites are not able
to send mail to you. Assuming that they are diverse sites that don't have
significant similarities, this suggests that the problem is on your end.

From these sites, I
can't connect to our mail server; from other sites, I can.

  I don't understand what this is supposed to mean. It's their mail servers
that are supposed to try to connect to your mail server.

We have tried
sending email, and we have also tried telnet on port 25 to the server.
I can't seem to find a correlation. There is no firewall on our
network. We have an access list to filter port 25, but this server is
allowed. Our mail server is also our DNS server. From the sites that
I can't connect to our server on port 25, I can query the DNS server
using nslookup and get a response.

  This doesn't tell you anything about why their mailservers might not be
able to reach your mailserver.

I tried tcptraceroute from one of the sites where I have a unix
account, but it is behind a firewall, and it dies after the first hop.
I'm stumped. Any suggestions?

  You really haven't given a clear description of the problem. When you say
customers can't receive email from certain sites, I'm assuming this means
people at those sites send email to your customers and the email does not
appear in your customers' inboxes. From this, I would conclude that their
mailservers are not able to (or willing to) send the email to your
mailserver.

  When you say you can't connect to your server on port 25, where exactly are
you trying from? Did you try emailing (or calling) the administrators of
those sites? If you use SPF, are your records valid? Do the senders get any
bounces?

  Your statement of the problem lacks specifics. We can't check your SPF
records. We can't check if those domains have a common provider. So all we
can do is tell you to troubleshoot.

  DS

  If I understand you correctly, you are saying that these sites are not able
to send mail to you. Assuming that they are diverse sites that don't have
significant similarities, this suggests that the problem is on your end.

In theory, I agree. But I'm running out of options in my troubleshooting and I'm looking for some wisdom from some of the experts.

From these sites, I
can't connect to our mail server; from other sites, I can.

  I don't understand what this is supposed to mean. It's their mail servers
that are supposed to try to connect to your mail server.

I understand that. I have unix account access at one of the sites that cannot connect to our mail servers. I have sent test email, and I have tried to telnet to port 25 on the mail server, and the connection times out. I have put a Finisar network analyzer on the ethernet port of our border router, and I don't see the traffic even crossing the router. We have no firewall, and the access-list is right on the router (we are receiving mail from other sites). What else can I look at?

  When you say you can't connect to your server on port 25, where exactly are
you trying from?

I have a unix account on a server at one of the remote sites that cannot send email to NMSU.

Did you try emailing (or calling) the administrators of
those sites?

They just point to me and say "The problem is on your end, fix it." Much like you are saying in this email.

If you use SPF, are your records valid? Do the senders get any
bounces?

We aren't getting bounces from our mail server. Their mail servers are bouncing the messages because the connection to our mail server timed out.

  Your statement of the problem lacks specifics. We can't check your SPF
records. We can't check if those domains have a common provider.

The domains in question do not have a common provider. We are not using SPF.

So all we
can do is tell you to troubleshoot.

I understand that. Let me restate my request: If anyone on nanog cannot send email to nmsu.edu, please send me a tcptraceroute on port 25 to our mail server. I need some forensics to help me diagnose this problem. You will have to reply to me at byronhicks@byronhicks.com to keep the noise level down on the list. Thanks in advance for any help that I will receive.
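For anyone without tcptraceroute handy, a plain TCP connect test on port 25 already separates the interesting cases. The following is only a minimal sketch, not a replacement for a real trace: the function name `probe_tcp_port` and the host `mail.example.edu` mentioned below are hypothetical placeholders, and the 10-second timeout is an arbitrary choice.

```python
import socket


def probe_tcp_port(host, port=25, timeout=10.0):
    """Classify one TCP connection attempt.

    Returns "open" if the handshake completes, "refused" if the far end
    (or a device in the path) actively rejects the connection, "timeout"
    if nothing answers within `timeout` seconds, and "error" for anything
    else (for example a DNS lookup failure).
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No SYN-ACK and no reset: packets are probably being dropped silently.
        return "timeout"
    except ConnectionRefusedError:
        # A reset came back, so something along the path saw the SYN.
        return "refused"
    except OSError:
        return "error"
```

Calling `probe_tcp_port("mail.example.edu")` from an affected site and from an unaffected one gives a quick comparison. A "timeout" is consistent with traffic being dropped somewhere before the destination, which matches the analyzer not seeing the packets at the border router; a "refused" would instead point at a filter or host that is actively rejecting the connection.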

Yo Byron!

I've found a similar problem... From www.traceroute.org I can trace to a
server from some countries and not others... and even some states and
not others... For example, I could traceroute to a specific site in
Atlanta from Baltimore but not from Iowa, I could traceroute from one
provider in Dublin but not another... I could traceroute from France but
not from the UK... The problem seems to be with the Yipes network... not
sure what's causing it though... it's been going on for a while.

I've found a similar problem... From www.traceroute.org I can trace to a
server from some countries and not others... and even some states and
not others...

This is something that was seen when DCEF over so-called parallel
paths was commonly used. It was a result of a hashing algorithm
that chose certain source IP addresses to take one path and
other IP addresses to take another. In that case, the effect
showed up when one person at a site had a problem and another
person did not. Obviously, DCEF alone will not cause problems
because it just sends traffic down different paths; there
was something else causing the two paths to behave differently.
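To make the effect concrete, here is a toy sketch of per-source path selection. It is an illustration only: the XOR-and-modulo hash is an assumption chosen for simplicity and is not the algorithm any particular router uses, the function names are made up, and the addresses come from the documentation ranges.

```python
import ipaddress


def pick_path(src_ip, dst_ip, n_paths=2):
    """Pick a path index from the source/destination address pair.

    Toy hash: XOR the two addresses as integers and reduce modulo the
    number of parallel paths. The same pair always maps to the same path.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_paths


def reachable(src_ip, dst_ip, broken_paths):
    """True if the path chosen for this address pair is not a broken one."""
    return pick_path(src_ip, dst_ip) not in broken_paths


if __name__ == "__main__":
    mail_server = "198.51.100.25"  # hypothetical destination
    for src in ("192.0.2.10", "192.0.2.11"):
        # Path 1 is assumed to be the faulty one in this example.
        print(src, "reachable:", reachable(src, mail_server, broken_paths={1}))
```

With path 1 assumed broken, two neighbouring hosts at the same site get different results for the same destination, and each host gets the same result every time it retries. That is the "one person has a problem and another does not" pattern described above.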

Nowadays some people are using MPLS to load balance traffic
between two LSPs. But if the two routers at the ends of
the alternate LSPs do not have the same view of the network
then you can get this sort of effect.

Presumably you have verified that traffic sourced in one
provider's network behaves differently depending on what
part of the network you source it from? If I were you, I
would isolate the problem to one provider's network and then
work with their NOC to do further troubleshooting.

--Michael Dillon