RINA - scott whaps at the nanog hornets nest :-)

It's really quiet in here. So, for some Friday fun let me whap at the hornets nest and see what happens... >;-)

http://www.ionary.com/PSOC-MovingBeyondTCP.pdf

Whoever wrote that doesn't know what they're talking about. LISP is
not the IETF's proposed solution (the IETF don't have one, the IRTF do),
and streaming media was seen to be one of the early applications of the
Internet - these types of applications are why TCP was split out of
IP, why UDP was invented, and why UDP has a significantly
different protocol number to TCP.

Arguments about locator/identifier splits aside (which I happen to agree
with), this thing goes off the deep end on page 7 when it starts talking
about peering infrastructure. In fact, pretty much every sentence on that
page is blatantly wrong. :-)

> Whoever wrote that doesn't know what they're talking about. LISP is
> not the IETF's proposed solution (the IETF don't have one, the IRTF do),

Um, I would not agree. The IRTF RRG considered and is documenting a lot of things, but did not
come to any consensus as to which one should be a "proposed solution."

Regards
Marshall

This tired bumblebee concludes that another instance of "Two bypassed
computer scientists who are angry that ISO OSI didn't catch on gripe
about this, and call IP, esp. IPv6, names in an effort to taint it" isn't
enough to warrant anything but a yawn.

More troubling might be http://www.iec62379.org/ and what they (I think
they are ATM advocates of the most bellheaded form) are trying to push
into an ISO standard, including gems like "Research during the decade
leading up to 2010 shows that the connectionless packet switching
paradigm that is inherent in Internet Protocol is unsuitable for an
increasing proportion of the traffic on the Internet." Sic!

Now that is something to bite into.

> Um, I would not agree. The IRTF RRG considered and is documenting a lot of things, but did not
> come to any consensus as to which one should be a "proposed solution."

I probably got a bit keen. I've been reading through the IRTF RRG
"Recommendation for a Routing Architecture" draft which, IIRC, makes a
recommendation to pursue Identifier/Locator Network Protocol rather
than LISP.

Regards,
Mark.

SCTP is a great protocol. It has already been implemented in a number of stacks. With these benefits over that theory, it still hasn't become mainstream yet. People are against change. They don't want to leave v4. They don't want to leave tcp/udp. Technology advances, but people will only change when they have to.

Jack (lost brain cells actually reading that pdf)

I believe SCTP will become more widely used in the mobile device world. You can have several different streams so you can still get an IM, for example, while you are streaming a movie. Eliminating the "head of line" blockage on thin connections is really valuable.

It would be particularly useful where you have different types of traffic from a single destination. File transfer, for example, might be a good application where one might wish to issue interactive commands to move around the directory structure while a large file transfer is taking place.

If you really want to shake a hornet's nest, try getting people to get rid of this idiotic 1500 byte MTU in the "middle of the internet" and try to get everyone to adopt 9000 byte frames as the standard. That change right there would provide a huge performance increase and a load reduction on networks and servers, and with a greater number of native Ethernet end-to-end connections, there is no reason to use 1500 byte MTUs. This is particularly true with modern PMTUD methods (such as with modern Linux kernels ... /proc/sys/net/ipv4/tcp_mtu_probing set to either 1 or 2).
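For anyone wanting to check that setting on a Linux box, here is a minimal sketch. The meanings of 0, 1 and 2 follow the kernel's ip-sysctl documentation as I understand it; the helper name `describe_mtu_probing` is made up for illustration and is not part of any tool.

```python
from pathlib import Path

# Meanings as described in the Linux kernel's ip-sysctl documentation.
MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled, starting from tcp_base_mss",
}

def describe_mtu_probing(path="/proc/sys/net/ipv4/tcp_mtu_probing"):
    """Return (value, description) for the tcp_mtu_probing sysctl.

    The function name and return shape are hypothetical, chosen for this sketch.
    """
    value = int(Path(path).read_text().strip())
    return value, MODES.get(value, "unknown value")

if __name__ == "__main__":
    try:
        print(describe_mtu_probing())
    except OSError as exc:  # not Linux, or /proc not available
        print(f"could not read sysctl: {exc}")
```

The path is a parameter only so the function can be pointed at a test file; on a real host the default is the one named above.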

While the end points should just be what they are, there is no reason for the "middle" portion, the long haul transport part, to be MTU 1500.

http://staff.psc.edu/mathis/MTU/

> If you really want to shake a hornet's nest, try getting people to get rid of this idiotic 1500 byte MTU in the "middle of the internet"

I doubt that 1500 is (still) widely used in our Internet... Might be,
though, that most of us don't go all the way to 9k.

mh

Last week I asked the operator of fairly major public peering points if they supported anything larger than 1500 MTU. The answer was "no".

There's still a metric buttload of SONET interfaces in the core that
won't go above 4470.

So, you might conceivably get 4k MTU at some point in the future, but
it's really, *really* unlikely you'll get to 9k MTU any time in the next
decade.

Matt

Agreed. But even 4470 is better than 1500. 1500 was fine for 10G
ethernet, it is actually pretty silly for GigE and better.

This survey that Dykstra did back in 1999 points out exactly what you
mentioned:

http://sd.wareonearth.com/~phil/jumbo.html

And that was over a decade ago.

There is no reason, in my opinion, for the various peering points to be
a 1500 byte bottleneck in a path that might otherwise be larger.
Increasing that from 1500 to even 3000 or 4500 gives a measurable
performance boost over high latency connections such as from Europe to
APAC or Western US. This is not to mention a reduction in the number of
ACK packets flying back and forth across the Internet and a general
reduction in the number of packets that must be processed for a given
transaction.
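To put rough numbers on the packet-count argument, here is a back-of-the-envelope sketch. It assumes 40 bytes of IPv4 + TCP headers with no options, a 1 GB transfer, and one delayed ACK per two data segments; all three are assumptions picked for illustration, not measurements.

```python
import math

IP_TCP_HEADERS = 40  # IPv4 (20) + TCP (20), no options: an assumption

def packets_for_transfer(payload_bytes, mtu):
    """Data packets needed to carry payload_bytes at a given IP MTU."""
    mss = mtu - IP_TCP_HEADERS
    return math.ceil(payload_bytes / mss)

def summarize(payload_bytes, mtus=(1500, 3000, 4470, 9000)):
    """Return (mtu, data_packets, approx_acks) for each MTU."""
    rows = []
    for mtu in mtus:
        data = packets_for_transfer(payload_bytes, mtu)
        acks = math.ceil(data / 2)  # assumes delayed ACKs, one per two segments
        rows.append((mtu, data, acks))
    return rows

if __name__ == "__main__":
    for mtu, data, acks in summarize(10**9):
        print(f"MTU {mtu:>5}: {data:>7} data packets, ~{acks:>6} ACKs")
```

Under these assumptions a 1 GB transfer needs 684932 data packets at MTU 1500 and 111608 at MTU 9000, so every box in the path handles roughly a sixth of the packets.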

> 1500 was fine for 10G

I meant, of course, 10M ethernet.

There is no reason why we are still using 1500 byte MTUs at exchange points.

From Dykstra's paper (note that this was written in 1999 before wide deployment of GigE):

(quote)

Does GigE have a place in a NAP?

Not if it reduces the available MTU! Network Access Points (NAPs) are at the very "core" of the internet. They are where multiple wide area networks come together. A great deal of internet paths traverse at least one NAP. If NAPs put a limitation on MTU, then all WANs, LANs, and end systems that traverse that NAP are subject to that limitation. There is nothing the end systems could do to lift the performance limit imposed by the NAP's MTU. Because of their critically important place in the internet, NAPs should be doing everything they can to remove performance bottlenecks. They should be among the most permissive nodes in the network as far as the parameter space they make available to network applications.

The economic and bandwidth arguments for GigE NAPs however are compelling. Several NAPs today are based on switched FDDI (100 Mbps, 4 KB MTU) and are running out of steam. An upgrade to OC3 ATM (155 Mbps, 9 KB MTU) is hard to justify since it only provides a 50% increase in bandwidth. And trying to install a switch that could support 50+ ports of OC12 ATM is prohibitively expensive! A 64 port GigE switch however can be had for about $100k and delivers 50% more bandwidth per port at about 1/3 the cost of OC12 ATM. The problem however is 1500 byte frames, but GigE with jumbo frames would permit full FDDI MTU's and only slightly reduce a full Classical IP over ATM MTU (9180 bytes).

A recent example comes from the Pacific Northwest Gigapop in Seattle which is based on a collection of Foundry gigabit ethernet switches. At Supercomputing '99, Microsoft and NCSA demonstrated HDTV over TCP at over 1.2 Gbps from Redmond to Portland. In order to achieve that performance they used 9000 byte packets and thus had to bypass the switches at the NAP! Let's hope that in the future NAPs don't place 1500 byte packet limitations on applications.

(end quote)

Having the exchange point of ethernet connections at >1500 MTU will not in any way adversely impact the traffic on the path. If the end points are already at 1500, this change is completely transparent to them. If the end points are capable of >1500 already, then it would allow the flow to increase its packet sizes and reduce the number of packets flowing through the network and give a huge gain in performance, even in the face of packet loss.
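One way to see why larger packets help even with loss is the well-known Mathis et al. approximation for steady-state TCP throughput, rate <= (MSS / RTT) * C / sqrt(p). The sketch below just evaluates that formula. It is a model, not a measurement, and the RTT and loss rate used in the example are invented values for a long-haul path.

```python
import math

def mathis_bound_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate upper bound on TCP throughput in bits per second.

    rate <= (MSS / RTT) * C / sqrt(p), per the Mathis et al. model,
    with C = sqrt(3/2) as the commonly quoted constant.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)

if __name__ == "__main__":
    rtt, loss = 0.150, 1e-4   # made-up values for a long-haul path
    for mtu in (1500, 4470, 9000):
        mss = mtu - 40        # assumes 40 bytes of IPv4 + TCP headers
        print(f"MTU {mtu}: ~{mathis_bound_bps(mss, rtt, loss) / 1e6:.1f} Mbit/s")
```

In this model the bound scales linearly with MSS, so at the same RTT and loss rate a 9000 byte MTU allows roughly six times the throughput of a 1500 byte MTU.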

Completely agree with you on that point. I'd love to see Equinix, AMSIX, LINX,
DECIX, and the rest of the large exchange points put out statements indicating
their ability to transparently support jumbo frames through their fabrics, or
at least indicate a roadmap and a timeline to when they think they'll be able
to support jumbo frames throughout the switch fabrics.

Matt

It would be absolutely trivial for them to enable jumbo frames, there is
just no demand for them to do so, as supporting Internet wide jumbo
frames (particularly over exchange points) is highly non-scalable in
practice.

It's perfectly safe to have the L2 networks in the middle support the
largest MTU values possible (other than maybe triggering an obscure
Force10 bug or something :P), so they could roll that out today and you
probably wouldn't notice. The real issue is with the L3 networks on
either end of the exchange, since if the L3 routers that are trying to
talk to each other don't agree about their MTU values precisely, packets
are blackholed. There are no real standards for jumbo frames out there,
every vendor (and in many cases particular type/revision of hardware
made by that vendor) supports a slightly different size. There is also
no negotiation protocol of any kind, so the only way to make these two
numbers match precisely is to have the humans on both sides talk to each
other and come up with a commonly supported value.

There are two things that make this practically impossible to support at
scale, even ignoring all of the grief that comes from trying to find a
clueful human to talk to on the other end of your connection to a third
party (which is a huge problem in and of itself):

#1. There is currently no mechanism on any major router to set multiple
MTU values PER NEXTHOP on a multi-point exchange, so to do jumbo frames
over an exchange you would have to pick a single common value that
EVERYONE can support. This also means you can't mix and match jumbo and
non-jumbo participants over the same exchange, you essentially have to
set up an entirely new exchange point (or vlan within the same exchange)
dedicated to the jumbo frame support, and you still have to get a common
value that everyone can support. Ironically many routers (many kinds of
Cisco and Juniper routers at any rate) actually DO support per-nexthop
MTUs in hardware, there is just no mechanism exposed to the end user to
configure those values, let alone auto-negotiate them.

#2. The major vendors can't even agree on how they represent MTU sizes,
so entering the same # into routers from two different vendors can
easily result in incompatible MTUs. For example, on Juniper when you
type "mtu 9192", this is INCLUSIVE of the L2 header, but on Cisco the
opposite is true. So to make a Cisco talk to a Juniper that is
configured 9192, you would have to configure mtu 9178. Except it's not
even that simple, because now if you start adding vlan tagging the L2
header size is growing. If you now configure vlan tagging on the
interface, you've got to make the Cisco side 9174 to match the Juniper's
9192. And if you configure flexible-vlan-tagging so you can support
q-in-q, you've now got to configure the Cisco side for 9170.
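The arithmetic in #2 can be sketched as below. The header sizes (14, 18 and 22 bytes) are inferred from the numbers quoted above, and the function name is made up; real platforms and interface types differ, so treat this as an illustration of the bookkeeping rather than a configuration guide.

```python
# L2 header sizes in bytes, matching the arithmetic in the text above.
L2_OVERHEAD = {
    "untagged": 14,  # plain Ethernet header
    "vlan": 18,      # one 802.1Q tag adds 4 bytes
    "qinq": 22,      # two stacked tags add 8 bytes
}

def cisco_mtu_for(juniper_mtu, encapsulation="untagged"):
    """MTU excluding the L2 header that matches an MTU including it.

    Hypothetical helper: "juniper_mtu" is the value that counts the L2
    header, and the result is the value that does not.
    """
    return juniper_mtu - L2_OVERHEAD[encapsulation]

if __name__ == "__main__":
    for encap in L2_OVERHEAD:
        print(encap, cisco_mtu_for(9192, encap))
```

For 9192 this gives 9178, 9174 and 9170, the three values mentioned above.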

As an operator who DOES fully support 9k+ jumbos on every internal link
in my network, and as many external links as I can find clueful people
to talk to on the other end to negotiate the correct values, let me just
tell you this is a GIANT PAIN IN THE ASS. And we're not even talking
about making sure things actually work right for the end user. Your IGP
may not come up at all if the MTUs are misconfigured, but EBGP certainly
will, even if the two sides are actually off by a few bytes. The maximum
size of a BGP message is 4096 octets, and there is no mechanism to pad a
message and try to detect MTU incompatibility, so what will actually
happen in real life is the end user will try to send a big jumbo frame
through and find that some of their packets are randomly and silently
blackholed. This would be an utter nightmare to support and diagnose.
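A toy model of that failure mode, assuming a deliberately simplified link: the sender never emits anything above its own MTU and the receiver silently drops anything above its MTU. The 40 byte IPv4 + TCP header figure and the function names are assumptions made for this sketch.

```python
BGP_MAX_MESSAGE = 4096   # octets, as stated above
IP_TCP_HEADERS = 40      # assumes IPv4 + TCP without options

def delivered(packet_size, sender_mtu, receiver_mtu):
    """True if an IP packet of this size fits both ends of the link.

    Simplified model: the sender will not emit anything above its own MTU,
    and the receiver silently drops anything above its MTU.
    """
    return packet_size <= sender_mtu and packet_size <= receiver_mtu

def blackholed_sizes(sender_mtu, receiver_mtu):
    """Range of packet sizes the sender emits but the receiver drops."""
    return range(receiver_mtu + 1, sender_mtu + 1)

if __name__ == "__main__":
    a, b = 9178, 9174    # two sides off by a few bytes
    bgp_packet = BGP_MAX_MESSAGE + IP_TCP_HEADERS
    print("largest BGP packet delivered:", delivered(bgp_packet, a, b))
    print("sizes lost from A to B:", list(blackholed_sizes(a, b)))
```

The largest possible BGP packet sails through, so the session stays up, while only the few sizes between the two MTUs disappear. That is why the mismatch is so hard to spot.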

Realistically I don't think you'll ever see even a serious attempt at
jumbo frame support implemented in any kind of scale until there is a
negotiation protocol and some real standards for the mtu size that must
be supported, which is something that no standards body (IEEE, IETF,
etc) has seemed inclined to deal with so far. Of course all of this is
based on the assumption that path mtu discovery will work correctly once
the MTU values ARE correctly configured on the L3 routers, which is a
pretty huge assumption, given all the people who stupidly filter ICMP.
Oh and even if you solved all of those problems, I could trivially DoS
your router with some packets that would overload your ability to
generate ICMP Unreach Needfrag messages for PMTUD, and then all your
jumbo frame end users going through that router would be blackholed as
well.

Great idea in theory, epic disaster in practice, at least given the
mechanisms currently at our disposal. :-)

Yes, in moving from SONET to Ethernet exchange points, we have actually
reduced the potential performance of applications across the network for
no good reason, in many cases.

I agree with the rest, but actually, I've found that Juniper has a manual physical MTU with a separate logical MTU available, while Cisco sets a logical MTU and autocalculates the physical MTU (or perhaps the physical is just hard set to maximum). It depends on the equipment with Cisco, though. L3 and L2 interfaces treat MTU differently, which is especially noticeable when doing q-in-q on default switches without adjusting the MTU. It is also noticeable in the MTU setting methods on a c7600 (L2 vs L3 methods).

In practice, I think you can actually pop the physical MTU on the Juniper much higher than necessary, so long as you set the family-based logical MTUs at the appropriate value.

Jack

I agree, but it is definitely a slow start. I personally like the fact that SCTP actually fixes the host addressing issues and offers other benefits which can be especially useful with v6 and mobile IP.

Jack

>>> I doubt that 1500 is (still) widely used in our Internet... Might be,
>>> though, that most of us don't go all the way to 9k.
>>>
>>> mh
>>
>> Last week I asked the operator of fairly major public peering points if they supported anything larger than 1500 MTU. The answer was "no".
>
> There's still a metric buttload of SONET interfaces in the core that
> won't go above 4470.
>
> So, you might conceivably get 4k MTU at some point in the future, but
> it's really, *really* unlikely you'll get to 9k MTU any time in the next
> decade.

Right, though I'm unsure of "decade" since we're moving off SDH/SONET
quite aggressively.

mh