Yahoo Mail problems? (queue issues in general)

Anyone else seeing Yahoo mail queue up today? Some of their servers respond in about 10 seconds with the HELO banner, while most others take more than 2m. Because of the recent increase in spam, I was looking to reduce the wait time for the initial HELO from 5m to 2m. However, the RFC calls for 5m on the HELO and another 5m for the MAIL command.

Having a process block like that for up to 10m seems a bit excessive to deliver one email (and it's probably a bounce to boot!). What are others doing? This problem seems to be becoming more and more acute.

  ---Mike

> Anyone else seeing Yahoo mail queue up today ? Some of their servers
> respond in about 10secs with the HELO banner, most others take more than
> 2m. Because of the recent increase in SPAM, I was looking to reduce the
> wait time for the initial HELO to 2m from 5m. However, the RFC calls for 5m
> on the HELO and another 5m for the MAIL command.

Do you have a handle on whether the delay is between the first SYN packet and
finally completing the 3-packet handshake, or is it between that and when the
220 banner actually arrives? Or are both phases an issue?

> Having a process block like that for up to 10m seems a bit excessive to
> deliver one email (and its probably a bounce to boot!). What are others
> doing? This problem seems to becoming more and more acute.

What I do is give the *first* attempt to deliver the mail a highly non-compliant
5 second timeout (which is just enough for an initial SYN, 2 retransmits, and a
few hundred ms budget for RTT for a SYN+ACK) for the 3-packet handshake, and
then subsequent retries in the background are given a longer 5-min timeout. (I
gathered some stats for quite some time before deploying that - out of several
million connection attempts, I found fewer than a dozen that took over 5 seconds
and did in fact complete in under 5 minutes.) Once the 3-packet handshake
succeeds, they then get a 5 minute timeout to get the 220 banner out. Probably
not perfect, but it's close enough to keep the queues manageable...

Also, YMMV, so gather your own stats....
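
The two-phase timeout described above (a short limit on the TCP handshake, then a longer one for the 220 banner) could be sketched roughly as follows. This is only an illustration, not anyone's actual MTA code: the function name `fetch_banner` and its parameter names are made up for this example, and it reads only the first line of the greeting.

```python
import socket

def fetch_banner(host, port=25, connect_timeout=5.0, banner_timeout=300.0):
    """Connect with a short handshake timeout, then wait longer for the greeting.

    Hypothetical helper: raises socket.timeout (or another OSError) if either
    phase does not complete in time.
    """
    # Phase 1: the 3-packet handshake, bounded by connect_timeout.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    try:
        # Phase 2: switch to the longer timeout while waiting for the banner.
        # Note: this timeout applies to each recv() call, not to the total wait.
        sock.settimeout(banner_timeout)
        data = b""
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:  # peer closed the connection before finishing the line
                break
            data += chunk
        return data.decode("ascii", errors="replace").rstrip("\r\n")
    finally:
        sock.close()
```

A first delivery attempt would call this with the short `connect_timeout`, and a background retry would pass a larger value, e.g. `fetch_banner(host, connect_timeout=300.0)`.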

> Anyone else seeing Yahoo mail queue up today ? Some of their servers
> respond in about 10secs with the HELO banner, most others take more than
> 2m. Because of the recent increase in SPAM, I was looking to reduce the
> wait time for the initial HELO to 2m from 5m. However, the RFC calls for 5m
> on the HELO and another 5m for the MAIL command.

> Do you have a handle on whether the delay is between the first SYN packet and
> finally completing the 3-packet handshake, or is it between that and when the
> 220 banner actually arrives? Or are both phases an issue?

Both, depending on which A record I get.

Also mixed in are things like

421 mta174.mail.scd.yahoo.com Resources temporarily unavailable. Please try again later.
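
For what it's worth, a 4xx reply such as the 421 above is a transient failure in SMTP, so the message stays in the queue and is retried later, while a 5xx reply is permanent. A minimal sketch of that classification (the function name `classify_reply` and the returned labels are made up for this example):

```python
def classify_reply(line):
    """Classify an SMTP reply line by the first digit of its 3-digit code."""
    try:
        code = int(line[:3])
    except ValueError:
        return "unknown"
    if 200 <= code < 400:
        return "ok"                 # 2xx/3xx: success or intermediate reply
    if 400 <= code < 500:
        return "retry-later"        # 4xx: transient failure, keep it queued
    if 500 <= code < 600:
        return "permanent-failure"  # 5xx: permanent failure, bounce
    return "unknown"
```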

Here is an example of one which took quite a long time to respond to the SYN, and then the HELO banner never came up:

14:03:10.653498 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 74: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 198626121 0> (DF) [tos 0x10] (ttl 64, id 21505, len 60)
14:03:13.649303 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 74: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 198626421 0> (DF) [tos 0x10] (ttl 64, id 21521, len 60)
14:03:16.849310 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 74: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 198626741 0> (DF) [tos 0x10] (ttl 64, id 21531, len 60)
14:03:20.049332 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 60: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460> (DF) [tos 0x10] (ttl 64, id 21536, len 44)
14:03:23.249367 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 60: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460> (DF) [tos 0x10] (ttl 64, id 21543, len 44)
14:03:26.449416 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 60: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460> (DF) [tos 0x10] (ttl 64, id 21547, len 44)
14:03:32.649436 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 60: 205.211.164.51.2013 > 64.156.215.5.25: S [tcp sum ok] 944590797:944590797(0) win 57344 <mss 1460> (DF) [tos 0x10] (ttl 64, id 21576, len 44)
14:03:32.728687 0:90:27:5d:4e:ee 0:1:29:2c:b6:30 0800 60: 64.156.215.5.25 > 205.211.164.51.2013: S [tcp sum ok] 4275443659:4275443659(0) ack 944590798 win 65535 <mss 1460> (ttl 55, id 11594, len 44)
14:03:32.728717 0:1:29:2c:b6:30 0:90:27:5d:4e:ee 0800 60: 205.211.164.51.2013 > 64.156.215.5.25: . [tcp sum ok] 1:1(0) ack 1 win 58400 (DF) [tos 0x10] (ttl 64, id 21579, len 40)

So in the above case, the process just blocks (with sendmail, it does eat a lot of RAM) waiting to hit the HELO timeout.
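
To put a number on the handshake delay in the trace above, the timestamps of the first SYN and of the SYN+ACK can simply be subtracted; it works out to roughly 22 seconds. A small sketch, assuming both timestamps fall on the same day (the helper name `seconds_between` is made up for this example):

```python
from datetime import datetime

def seconds_between(first_ts, second_ts):
    """Return the seconds elapsed between two tcpdump HH:MM:SS.micros stamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(second_ts, fmt) - datetime.strptime(first_ts, fmt)
    return delta.total_seconds()

first_syn = "14:03:10.653498"  # first SYN sent to 64.156.215.5
syn_ack = "14:03:32.728687"    # SYN+ACK finally received

print(seconds_between(first_syn, syn_ack))
```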

> Having a process block like that for up to 10m seems a bit excessive to
> deliver one email (and its probably a bounce to boot!). What are others
> doing? This problem seems to becoming more and more acute.

> What I do is the *first* attemt to deliver the mail has a highly-non-compliant

Yes, this is sort of what I have as well: 9 seconds on the initial connect in my case. That gets the lion's share through. The subsequent deliveries are much more patient. In this day and age, you would think

define(`confTO_HELO', `1m')
define(`confTO_MAIL', `2m')

would be safe....

         ---Mike