Linux shaping packet loss

Hi All,

It would be appreciated if anyone using TC on Linux for shaping could please
help with an intermittent problem on an egress interface.

I'm seeing about ten per cent packet loss across all classes, at seemingly
quiet times and at random parts of the day, using about forty classes and
250Mbps. I've isolated it to the egress HTB qdisc.

Do any TC experts out there have a spare minute? Any thoughts on the
RED qdisc?

Thanks very much,

Chris

I won't say I'm an expert with TC, but any time I see packet loss on an interface I always check the interface itself. 10% packet loss is pretty much what you would get if there was a duplex problem. I always try to hard-set my interfaces on both the Linux machines and the switches.
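A quick way to rule out the physical layer before blaming tc is to check the negotiated link state and the error counters. A minimal sketch (the device name eth0 is an assumption; substitute the actual egress interface):

```shell
# Check what the link actually negotiated and look for link-level errors.
# eth0 is an assumption; use your egress interface.
ethtool eth0                 # look at the Speed, Duplex, Auto-negotiation lines
ip -s link show dev eth0     # RX/TX errors, dropped, overruns, collisions
```

In a classic duplex mismatch the full-duplex side tends to log CRC/FCS errors and runts while the half-duplex side logs late collisions, so non-zero counters here point away from the qdisc.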

Bret

Bret wrote:

I won't say I'm an expert with TC, but any time I see packet loss on an
interface I always check the interface itself. 10% packet loss is
pretty much what you would get if there was a duplex problem. I always
try to hard-set my interfaces on both the Linux machines and the switches.

Used to set everything hard five years ago. Nowadays auto works just
fine most of the time.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

I find there is a lot of hard-coded wisdom that hard-coded speed/duplex is the way to avoid pain.

The last time I saw anybody do a modern survey of switches, routers and hosts, however, it seemed like the early interop problems with autoneg on FE really don't exist today, and on balance there are probably more duplex problems caused by hard-configured ports that are poorly maintained in the heat of battle than there are because autoneg is flaky.

I've also heard people say that whatever you think about autoneg in Fast Ethernet, on Gigabit and 10GE interfaces it's pretty much never the right idea to turn autoneg off.

I am profoundly ignorant of the details of layer-2. It'd be nice to have more than vague rhetoric to guide me when configuring interfaces. What reliable guidance exists for this stuff?

Joe

Thanks, Steinar and everyone for the input. It's good to see the list is
still as friendly as ever.

There are two paths I'm trying to get my head round after someone offlist
helpfully suggested putting cburst and burst on all classes.

My thinking is that any dropped packets on the parent class are a bad thing:

qdisc htb 1: root r2q 10 default 265 direct_packets_stat 448 ver 3.17
Sent 4652558768 bytes 5125175 pkt (dropped 819, overlimits 10048800
requeues 0)
rate 0bit 0pps backlog 0b 28p requeues 0

Until now I've had Rate and Ceil at the same values on all the classes but I
take the point about cburst and burst allowing greater levels of borrowing
so I've halved the Rate for all classes and left the Ceil the same.

I've gone down this route mainly because I really can't risk breaking things
with incorrect cburst and burst values. (If anyone can tell me the ideal
values for, say, a 10Mbps class on an i686 box, I can translate them into
the higher classes; tc seems to work them out as 1600b/8 mpu by default, and
the timing resolution confuses me.)
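One commonly cited rule of thumb (an assumption here, not something stated in the thread) is that burst has to cover at least one timer tick's worth of data at the class rate, plus one MTU, so the class can always dequeue a full packet per tick. A minimal sketch, assuming a 10Mbit class, a HZ=1000 kernel, and hypothetical eth0/1:10 names:

```shell
# Rule-of-thumb burst for an HTB class: one timer tick of data + one MTU.
# RATE_BPS, HZ, MTU and the eth0/1:10 names below are assumptions.
RATE_BPS=10000000                      # 10 Mbit/s class rate
HZ=1000                                # kernel timer frequency (CONFIG_HZ)
MTU=1500
BURST=$(( RATE_BPS / 8 / HZ + MTU ))   # bytes per tick + one full packet
echo "burst=${BURST}b"                 # prints burst=2750b for these values
# then e.g.: tc class change dev eth0 parent 1: classid 1:10 \
#                htb rate 10mbit ceil 10mbit burst ${BURST}b cburst ${BURST}b
```

The formula scales linearly, so a class at ten times the rate would want roughly ten times the per-tick component plus the same single MTU.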

Thanks again,

Chris

The biggest problem with duplex had to do with 100mb.

Cisco (and a lot of other companies) decided in their infinite wisdom
that at 100mb if auto-negotiation fails, to use half duplex as the
default. So if you have both sides at auto, or both sides hard-set it's
all good. But if one side is hard-set and the other is auto, a lot of
times the auto device will come up 100/Half.

These days at 1Gb+ Full-Duplex seems to be the 'default' for
auto-negotiation failures.

Ken Matlock
Network Analyst
Exempla Healthcare
(303) 467-4671
matlockk@exempla.org

I find there is a lot of hard-coded wisdom that hard-coded speed/duplex
is the way to avoid pain.

That was definitely true in the mid-to-late 1990s.

The last time I saw anybody do a modern survey of switches, routers and
hosts, however, it seemed like the early interop problems with autoneg
on FE really don't exist today, and on balance there are probably more
duplex problems caused by hard-configured ports that are poorly
maintained in the heat of battle than there are because autoneg is
flaky.

Yes. The autoneg specification was fixed in 1998 so modern kit should
interoperate properly.

I've also heard people say that whatever you think about autoneg in Fast
Ethernet, on Gigabit and 10GE interfaces it's pretty much never the
right idea to turn autoneg off.

Autoneg is a required part of the gig E specification so you'd only be
causing yourself trouble by turning it off. (I don't know if it'll also
break automatic MDI/MDI-X (crossover) configuration, for an example of
something that's nice to have.)

Tony.

The biggest problem with duplex had to do with 100mb.

Cisco (and a lot of other companies) decided in their infinite wisdom
that at 100mb if auto-negotiation fails, to use half duplex as the
default.

No, that wasn't those companies deciding to do so in their infinite
wisdom. That was those companies deciding to follow the IEEE standard!

Cisco and others may be to blame for a lot of things, but not this one.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From my own experience, turning off auto negotiate can lead to unusual behavior later on that may cause a bit of grief. Our SAN (LHN NSM260) refused flat out to do 802.3ad - giving us duplex errors. It took us around an hour of diagnosing - first we thought it was the switch, then we thought it was the cables we used, etc. Finally it dawned on me that my partner is notorious for hard-coding ports on our own equipment.

Lo and behold, after her swearing up and down that there's no way it's that, we set both ends to auto-negotiate and boom, bonding came up happy as a clam.

Only one port on our entire setup is hard coded - 10BaseT-FD - and that's only because the darn thing refuses to auto-negotiate to full duplex on 10BaseT links. I'm almost positive that a year or two down the line, we're going to forget it is there when we change the link to 100BaseT.

From my own experience, turning off auto negotiate can lead to unusual
behavior later on

I too had this crop up in an unusual manner .. the hardware was HP with
Intel Pro 1000 on one side, and Cisco 65xx on the other. Neither side
saw errors, and (most) everything seemed to work .. however, one java
app that depended on SSL would constantly fail when it tried to retrieve
a file.

Both sides hard coded (didn't matter to what) wouldn't work. When we
upgraded to GigE blades, it still wouldn't work in any hard-coded
configuration, even though the O/S (Win2k3 .. RDP, FTP, etc.) appeared
to work.

Set both sides auto/auto : bam. problem solved.

The app was Sonicwall Email Security (on the off-chance someone else is
fighting that same issue).

Cheers,

Michael Holstein
Cleveland State University

Thankfully it's even more than a "seems to be" - it's written into the IEEE
spec that if duplex negotiation fails then the default is full duplex for
1Gbps, as opposed to HDX for 100Mbps and earlier.

  Scott

Hi All,

It would be appreciated if anyone using TC on Linux for shaping could please
help with an intermittent problem on an egress interface.

Well, it's unbelievable, but almost 5 hours and 11 mails later not even
one of them has mentioned anything other than L2
incompatibilities! And this, IMHO, has nothing to do with Chris's
problem.

I'd really expect more from the guys that make the Internet run...
Anyway... :-)

I'm seeing about ten per cent of packet loss for all classes at seemingly
quiet times and random parts of the day using about forty classes and
250Mbps. I've isolated it to the egress HTB qdisc.

I'd start with a careful revisit of each class and the classifier that
goes with it. I'd pay special attention to using u32/hash classifiers
(filters) rather than iptables.
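A minimal u32 sketch of what this suggests, classifying in tc itself instead of via iptables marks (the eth0 device, the subnet, and the 1:10 classid are assumptions for illustration):

```shell
# Steer traffic for one destination subnet straight into an HTB class
# with a u32 filter; no iptables marking involved.
# eth0, 192.0.2.0/24 and flowid 1:10 are hypothetical examples.
tc filter add dev eth0 parent 1:0 protocol ip prio 1 \
    u32 match ip dst 192.0.2.0/24 flowid 1:10
```

With many classes, the u32 `hashkey` machinery lets these lookups scale better than a long linear filter chain, which matters at 250Mbps.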

You can try to visualize the number of packets in each class
(queued, dropped), and that way you will probably see where the
problem is.
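A simple way to watch those per-class counters (eth0 is an assumption; `-s` adds the Sent/dropped/backlog statistics):

```shell
# Refresh the per-class queue and drop counters every 5 seconds so a
# class that is dropping stands out. eth0 is a hypothetical device name.
watch -n 5 'tc -s class show dev eth0'

# One-off snapshot of every qdisc, including the root HTB counters:
tc -s qdisc show dev eth0
```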

Any TC experts out there have a spare minute please ? Any thoughts on the
RED qdisc ?

As for this, I'd suggest to take a look at:

[1] Random Early Detection (RED)
[2] http://www.opalsoft.net/qos/DS-26.htm
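For what a RED attachment might look like, here is a hedged sketch in the style of the examples at [2], not a tuned recommendation; every value (device, classid, byte thresholds, avpkt) is an assumption for a roughly 10Mbit class:

```shell
# Attach RED under a leaf class so drops start probabilistically before
# the queue fills. min/max/limit are bytes; avpkt is the assumed average
# packet size; burst ~ (2*min + max) / (3*avpkt). All values hypothetical.
tc qdisc add dev eth0 parent 1:10 handle 10: red \
    limit 400000 min 30000 max 90000 avpkt 1000 \
    burst 55 probability 0.02 bandwidth 10mbit
```

Note that RED changes *how* packets are dropped within a class; it won't cure drops caused by an undersized root rate or burst.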

Yes it will break auto MDI/MDI-X.

Silly question, but are you leaving some headroom?

It's been a little while since I've worked with HTB and
the kernel that is in use, but trying to use much
more than 90% of the link capacity caused trouble for me.
In particular I'm referring to the ceil value of the root class.

I also noticed that at higher packet rates (I was doing gigabit in a lab)
that increasing r2q helped me. However I was looking at (UDP) throughput
not packet loss.

Apologies to all on handheld devices. If you're not into BSD or Linux TC
operationally, skip this post. Due to my usual rambling narrative style
for "alternative" troubleshooting I was going to mail this direct to the
OP but I was persuaded AMBJ by a co-conspirator to post this to list in
full.

Hi Chris,

Try setting txqueuelen to 1000 on the interfaces and see if you still
get a lot of packet loss.
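A sketch of checking and changing it (eth0 is an assumption; the older `ifconfig` form works too):

```shell
# Current transmit queue length:
cat /sys/class/net/eth0/tx_queue_len

# Raise it; either form works, eth0 is a hypothetical device name.
ip link set dev eth0 txqueuelen 1000
# or: ifconfig eth0 txqueuelen 1000
```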

Bazy

Yes, good point and well worth a try. Rereading Chris's post about
"250Mbps" and "forty queues", the "egress" could well be bumping the end
of a default fifo line.

If 1000 is too high for your kit try pushing it upwards gradually from
the default of 100 (?) but back off if you get drops or strangeness in
ifconfig output on the egress i/f.

I append grep-ped ifconfig outputs into a file every hour on a cron job
until I'm happy that strangeness doesn't happen; sadly, it never does
while you're watching.
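A hypothetical crontab entry in that spirit (interface name and log path are assumptions):

```shell
# Hourly: append the error/drop counter lines for eth0 to a log, so
# intermittent strangeness is caught even when nobody is watching.
0 * * * * /sbin/ifconfig eth0 | grep -E 'errors|dropped' >> /var/log/ifstats.log
```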

TC problems aren't always about the TC itself, the physical interfaces
are inherently part of the "system", as my long rambling 5am+
up-all-night-over-ssh post about reseating NICs was trying to hint at.

Nice one Bazy

Gord

meh! 6am+insomniac blues

for GigE it's more likely to be 1000 already, so push it up to 10000
in stages - you get the idea.

On Wed, 09 Dec 2009 06:38:31 +0000
gordon b slater <gordslater@ieee.org> wrote:

> Hi Chris,
>
> Try setting txqueuelen to 1000 on the interfaces and see if you
> still get a lot of packet loss.
>

Yes, good point and well worth a try. Rereading Chris's post about
"250Mbps" and "forty queues", the "egress" could well be bumping the
end of a default fifo line.

If 1000 is too high for your kit try pushing it upwards gradually from
the default of 100 (?) but back off if you get drops or strangeness in
ifconfig output on the egress i/f.

The default *is* 1000. From the ifconfig man page:

txqueuelen length

Set the length of the transmit queue of the device. It is useful to
set this to small values for slower devices with a high latency (modem
links, ISDN) to prevent fast bulk transfers from disturbing interactive
traffic like telnet too much.

So you should touch it if and only if you want (supposedly)
finer-grained control over queueing, as the hardware device also does
some reordering before it puts the data on the wire.

Thanks to all that replied.

Trial and error it is ... I'm now waiting (22 hours later) for it to break
again after I changed the priority on the "default" catch-all class. It
lasted five days before.

I'm looking at CBQ but it's not at all friendly relative to HTB.

If I'm forced to go down the proprietary traffic-shaping route, what's good
for really cheap gigabit, redundant, high-throughput (including during
64-byte UDP attack) shapers? Suggestions appreciated.

Chris