IP and Optical domains?

Hi,

I was reading the following article:
http://www.lightreading.com/optical/sedona-boasts-multilayer-network-orchestrator/d/d-id/714616

It says that "The IP layer and optical layer are run like two separate
kingdoms," Wellingstein says. "Two separate kings manage the IP and optical
networks. There is barely any resource alignment between them. The result
of this is that the networks are heavily underutilized," or, from an
alternative perspective, "they are heavily over-provisioned."

Can somebody shed more light on what it means to say that the IP and
optical layers are run as independent kingdoms, and why ISPs need to
over-provision?

Thanks, Glen

You have a group that runs fiber+dwdm+sonet(or SDH). You have another group that runs IP. When the IP guys ask "please tell us how the optical network is designed, and can we coordinate how they're built and btw, we want to put DWDM optics in our routers", the answer from the fiber+dwdm+sonet group is "no, but we can help you with transport using our transponders, please just order circuits, just give us addresses for each end and we'll take care of things, don't you worry your little IP engineer brain how things are transported long distance".

I believe this is still the case at a lot of ISPs. Not all, hopefully not even most, but I'm sure there are some.

You have a group that runs fiber+dwdm+sonet(or SDH). You have another
group that runs IP. When the IP guys ask "please tell us how the optical
network is designed, and can we coordinate how they're built and btw, we
want to put DWDM optics in our routers", the answer from the
fiber+dwdm+sonet group is "no, but we can help you with transport using
our transponders, please just order circuits, just give us addresses for
each end and we'll take care of things, don't you worry your little IP
engineer brain how things are transported long distance".

I believe this is still the case at a lot of ISPs. Not all, hopefully not
even most, but I'm sure there are some.

you underestimate the extent of the dogged determination of circuitzilla
to hang on to the fiber with her/his fingernails.

randy

Mikael,

Thanks. I was looking at a technical problem. I say this because you may
not have this problem when both networks are run on equipment from the
same vendor, say Alcatel-Lucent (or Nokia now). What are the technical
problems that force ISPs to over-provision when separate IP and optical
domains are involved? Or rather, let me rephrase my question -- what is
the technical challenge involved in setting up an end-to-end path between
two IP domains that have an optical domain in between?

Thanks, Glen

Mikael,

Thanks. I was looking at a technical problem. I say this because you may
not have this problem when both networks are run on equipment from the
same vendor, say Alcatel-Lucent (or Nokia now).

Even then.

This isn't the first time the industry has tried to collapse Transport +
IP into a single system.

Many of us will remember the days of IPoDWDM. That flopped. Then came
GMPLS, which flopped even more.

That said, the hunt should not stop, and there probably is value for
networks that run both their own Transport + IP infrastructure.

For networks that lease all of their transport, not sure how this will
help as transport providers will not open their networks up to 3rd party
IP networks.

What are the technical problems that force ISPs to over-provision when
separate IP and optical domains are involved? Or rather, let me rephrase
my question -- what is the technical challenge involved in setting up an
end-to-end path between two IP domains that have an optical domain in
between?

It's two different expenses. If routers made good DWDM switches, this
would not be much of a problem, but they don't. So you need two teams
managing two different sets of kit and opex, which is what the industry
has been trying to solve for some time now. How do we collapse both of
these cost centres into one manageable expense, considering that the
primary reason transport networks exist and expand today is to carry IP
traffic?

Mark.

Many of us will remember the days of IPoDWDM. That flopped.

Errrr, it didn't flop at all. I know lots of operators that do this.

For networks that lease all of their transport, not sure how this will help as transport providers will not open their networks up to 3rd party IP networks.

Yeah, that's harder. Doing pure photonic transport is operationally difficult without management integration between optic transport provider and customer. That part hasn't happened.

It's two different expenses. If routers made good DWDM switches, this would not be much of a problem, but they don't. So you need two teams managing two different sets of kit and opex, which is what the industry has been trying to solve for some time now. How do we collapse both of these cost centres into one manageable expense, considering that the primary reason transport networks exist and expand today is to carry IP traffic?

I know operators who have collapsed their "core transport group" to handle Fiber+DWDM+SDH+IP (design/planning/3rd line operations). I know others where the IP and optical teams work very closely together and plan the network together.

If your main business is transporting IP/MPLS, then it is obvious that you need to have the teams work closely together. If your main business is to L2-switch or bit-transport lots of TDM/L2 traffic, then it's less obvious.

Errrr, it didn't flop at all. I know lots of operators that do this.

Not the technology - I meant the goal, i.e., that IPoDWDM would merge the
optical and IP domains, simplify operations, remove the need for
grey-light transponders 100%, make alien wavelengths more accessible,
make GMPLS the unifying protocol between departments, etc.

It failed from that standpoint, but I do know a lot of networks that use
it successfully.

We've received requests for the same from our customers for our
Transport service, but when they do the math on the optics, they just
end up taking a regular EoDWDM port instead.

Yeah, that's harder. Doing pure photonic transport is operationally
difficult without management integration between optic transport
provider and customer. That part hasn't happened.

And this has always been my biggest concern.

Collapsing the optical and IP domains only, then, really appeals to
operators that run their own network end-to-end. This relegates the
opportunity to incumbents or ISPs and content providers that invest in
their own dark fibre.

Then again, the incumbents are a huge market for equipment and software
vendors, so this will go on anyway, and those who lease capacity on a
100% basis will have to find their feet in all the mud.

I know operators who have collapsed their "core transport group" to
handle Fiber+DWDM+SDH+IP (design/planning/3rd line operations). I know
others where the IP and optical teams work very closely together and
plan the network together.

If your main business is transporting IP/MPLS, then it is obvious
that you need to have the teams work closely together. If your main
business is to L2-switch or bit-transport lots of TDM/L2 traffic, then
it's less obvious.

Agree.

I run a team that manages both Transport and IP, so it's easier for us
from this perspective. But several other operators, especially the large
incumbents, have Chinese walls between both teams.

Mark.

Glen Kent wrote:

It says that "The IP layer and optical layer are run like two separate
kingdoms," Wellingstein says. "Two separate kings manage the IP and optical
networks. There is barely any resource alignment between them.

> Can somebody shed more light on what it means to say that the IP and
> optical layers are run as independent kingdoms

The problem is not optical at all, but caused by poor L3 routing
protocols and operational attempts to compensate for them at L2.

That is, with an L3 routing protocol having 1 ms Hello
intervals, all that needs to be done at L2 is to watch for BER/FER
above some threshold.

> and why do ISPs need to over-provision?

To act against failures.

But, if everything is visible at L3, over-provisioned bandwidth
can be used even if there is no failure.

Visible at L3 means that parallel point-to-point links
between a pair of routers have distinct pairs of IP addresses,
and BGP routes should flip only upon failure of all
(or almost all) the links.

A remaining, but minor, inefficiency could be a mismatch of metrics
at L1 and L3; that is, AS_PATH length increases for transit
services are not roughly proportional to the geographic distances
of the transit services.
            Masataka Ohta

The problem is not optical at all, but caused by poor L3 routing
protocols and operational attempts to compensate for them at L2.

Ummh, how so.

Layer 2 transport is required in any scenario. Dark fibre, for example,
would not have any optical kit on it, and can be fired through
router-to-router optics. How is this any different from a routing
perspective?

That is, with an L3 routing protocol having 1 ms Hello
intervals, all that needs to be done at L2 is to watch for BER/FER
above some threshold.

Ummh, BFD works, and this can be used even in grey-light situations
where the router has no DWDM visibility into the link state.

To act against failures.

Or to support growth.

But, if everything is visible at L3, over-provisioned bandwidth
can be used even if there is no failure.

We primarily over-provision to support growth. Resiliency comes as
secondary benefit.

If you are deploying additional bandwidth just for protection, I hope
you're my competitor.

Visible at L3 means that parallel point-to-point links
between a pair of routers have distinct pairs of IP addresses,
and BGP routes should flip only upon failure of all
(or almost all) the links.

iBGP uptime is par for the course.

The main advantage of having parallel links across the same path is to
increase bandwidth (through load balancing). This is an IGP operation.
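
A minimal sketch of that IGP-level load balancing: flows are hashed onto one of the parallel links, keeping each flow on a single link while spreading load across all of them. The link names and 5-tuple values here are made up for illustration:

```python
import hashlib

# Hash a flow's 5-tuple onto one of several parallel links (ECMP-style
# load balancing). A given flow always lands on the same link, so its
# packets are not reordered, while distinct flows spread across links.

links = ["link-a", "link-b", "link-c"]

def pick_link(src_ip, dst_ip, src_port, dst_port, proto):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]

a = pick_link("192.0.2.1", "198.51.100.2", 40000, 443, "tcp")
b = pick_link("192.0.2.1", "198.51.100.2", 40000, 443, "tcp")
print(a == b)  # True: same flow, same link
```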

A remaining, but minor, inefficiency could be a mismatch of metrics
at L1 and L3; that is, AS_PATH length increases for transit
services are not roughly proportional to the geographic distances
of the transit services.

If the circuits are on-net, BGP takes the IGP metric into account when
trying to get to a target NEXT_HOP.
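
A sketch of that tie-break, under the assumption that the candidate paths already tie on the earlier BGP attributes; the addresses and metric values are invented for illustration:

```python
# Among otherwise-equal BGP paths, prefer the one whose NEXT_HOP is
# cheapest to reach according to the IGP ("hot-potato" selection).

paths = [
    {"next_hop": "10.0.0.1", "igp_metric": 300},
    {"next_hop": "10.0.0.2", "igp_metric": 120},
]

best = min(paths, key=lambda p: p["igp_metric"])
print(best["next_hop"])  # 10.0.0.2
```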

AS_PATH length is an inter-domain concept. One has to manage their eBGP
routing using protocol-specific methods to manage latency. This is
where a successful operator out-maneuvers their competition, so I don't
see it as a protocol or transport limitation, per se.

Mark.

So if you have a fiber break, you're not going to have enough overcapacity in your network to remain uncongested until this fiber break is fixed?

That was my point - we will have enough capacity on diverse routes to
handle the outage.

We deploy additional capacity primarily for growth. The resiliency
aspect comes as an added advantage.

The diverse paths are already in place. So it's just about adding more
bandwidth between the paths in an equal manner.
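
The arithmetic behind that approach can be sketched as follows; the capacities and loads are invented numbers, purely to illustrate the headroom check:

```python
# With two equally loaded diverse paths, a fibre cut dumps one path's
# traffic onto the other. The survivor stays uncongested only if it
# can carry both loads.

def survives_single_failure(path_capacity_gbps, per_path_load_gbps):
    return 2 * per_path_load_gbps <= path_capacity_gbps

print(survives_single_failure(100, 40))  # True: 80G fits on the survivor
print(survives_single_failure(100, 60))  # False: 120G would congest it
```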

Mark.

Mark Tinka wrote:

Layer 2 transport is required in any scenario.

Yes, of course, as I wrote:

> all that needs to be done at L2 is to watch for BER/FER
> above some threshold.

I don't deny that L2 exists; though, if L3 protocols were properly
designed, L2 protection would not be required.

> Dark fibre, for example,
> would not have any optical kit on it, and can be fired through
> router-to-router optics.

That's L1, which is also required to exist.

We primarily over-provision to support growth. Resiliency comes as
secondary benefit.

If you are deploying additional bandwidth just for protection, I hope
you're my competitor.

So, you deny the original point of "The result of this is that the
networks are heavily underutilized". OK.

              Masataka Ohta

I don't deny that L2 exists; though, if L3 protocols were properly
designed, L2 protection would not be required.

I'd like to hear your proposals on how Layer 3 protocols can be better
designed to manage transport characteristics.

That's L1, which is also required to exist.

It's Layer 1 and Layer 2. Ethernet is running over those optics, albeit
with no "traditional" optical equipment in between.

So, you deny the original point of "The result of this is that the
networks are heavily underutilized". OK.

Under- or over-utilization means different things to different people.

We upgrade at 50% utilization. Others do it at 70% utilization. Others
do it at 100% utilization. Heck, I know some that do it at 40% utilization.

Since not all operations are the same, I can't tell another person what
I think under- or over-utilization is.

Mark.

Mark Tinka wrote:

I'd like to hear your proposals on how Layer 3 protocols can be better
designed to manage transport characteristics.

By not managing transport characteristics at all except
that links are on or off (or, if you want to guarantee QoS,
a little more than that).

L3 protocols know links are off if L2 operators actively
turn them off, or if the protocols detect the consecutive absence
of L3 Hellos generated frequently enough.

L2 operators turn links off for maintenance, and
BER degradation requires unscheduled maintenance.

That's L1, which is also required to exist.

It's Layer 1 and Layer 2. Ethernet is running over those optics, albeit
with no "traditional" optical equipment in between.

So, no disagreement, here.

So, you deny the original point of "The result of this is that the
networks are heavily underutilized". OK.

We upgrade at 50% utilization. Others do it at 70% utilization. Others
do it at 100% utilization. Heck, I know some that do it at 40% utilization.

I'm afraid "heavily" implies a lot less utilization.

            Masataka Ohta

By not managing transport characteristics at all except
that links are on or off (or, if you want to guarantee QoS,
a little more than that).

But how do Layer 3 protocols manage transport characteristics today?

Unless I misunderstand your statement.

L3 protocols know links are off if L2 operators actively
turn them off, or if the protocols detect the consecutive absence
of L3 Hellos generated frequently enough.

L2 operators turn links off for maintenance, and
BER degradation requires unscheduled maintenance.

Again, this does not seem too removed from what happens already today.

Unless I misunderstand what you are saying.

I'm afraid "heavily" implies a lot less utilization.

I don't disagree with what you imply by "heavily". What I am saying is
"a lot less" or "a lot more" is not a universal measure. It means
different things to different people, as business operations (which
largely drive this kind of thing) differ widely.

Mark.

Mark Tinka wrote:

By not managing transport characteristics at all except
that links are on or off (or, if you want to guarantee QoS,
a little more than that).

But how do Layer 3 protocols manage transport characteristics today?

Today??? You asked "can be better designed", didn't you?

And, don't miss the following assumption:

> L3 Hellos generated frequently enough.

> Again, this does not seem too removed from what happens already today.

The problem, if any, is that doing much more than that
results in a "heavily underutilized" network.

> I don't disagree with what you imply by "heavily".

The implication is not mine.

            Masataka Ohta

Today??? You asked "can be better designed", didn't you?

But IP does not manage transport characteristics. If packets can't get
through, they are dropped. Fairly simple.

IP is not normally privy to the state of the transport layer.
Yes, IPoDWDM means the visibility is there, but really, all it's doing is
cutting off a link just before the thresholds are met, to avoid packet loss.

Yes, BFD does provide IP some awareness, but this is not inherent in IP
itself.

The problem, if any, is that doing much more than that
results in a "heavily underutilized" network.

Sorry, I'm just not getting your angle - could be something getting lost
in translation.

Not sure how frequent Hello messages exchanged by routing protocols
lead to a heavily under-utilized network.

Mark.

Mark Tinka wrote:

IP is not normally privy to the state of the transport layer.
Yes, IPoDWDM means the visibility is there, but really, all it's doing is
cutting off a link just before the thresholds are met, to avoid packet loss.

What? "the visibility is there"?

I think by IPoDWDM you mean something very different from
the usual ways of running IP over something.

Do you have any reference to it?

For my definition of IPoDWDM, see, for example:

  "Standardization of optical packet switching with
  many-wavelength packets"

or my newest paper in HPSR2016.

            Masataka Ohta

What? "the visibility is there"?

I think by IPoDWDM you mean something very different from
the usual ways of running IP over something.

Do you have any reference to it?

I said "visibility" due to what IPoDWDM can offer.

But I also said IP has no real "awareness" about the physical
infrastructure. It just knows it can't send/receive packets anymore.

With IPoDWDM, one could infer that the IP layer will quickly re-route
due to DWDM characteristics (related to fibre conditions). However, in
actual fact, what IP really sees is the link going away, thus
triggering an IGP reconvergence. It does not really know that the
degraded optical signal quality on the fibre was the cause; it just
knows that the link disappeared.

There is no difference if IP is running directly over fibre (in
Ethernet). The difference with IPoDWDM is that the re-routing is done
before the fibre actually loses link, because the line card is
monitoring the optical signal and deciding whether to keep the port up
or not. This is to minimize (or avoid) the packet loss incurred by
reacting only after link failure, which would be the case with generic
IP running directly over fibre (again, in Ethernet).

Whatever the case, IP is not aware of the state of the physical link.
It just sees the link going away.
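
The decision being described can be sketched like this; the threshold value and metric name are hypothetical, since the actual trigger is platform-specific:

```python
# Sketch of pre-emptive shutdown: the line card watches a
# signal-quality metric (e.g. pre-FEC BER) and takes the port down
# before the link hard-fails, so the IGP reconverges with minimal
# loss. IP itself never sees the BER; it only sees the port state.

SHUTDOWN_BER_THRESHOLD = 1e-3  # hypothetical trigger value

def port_should_stay_up(pre_fec_ber):
    return pre_fec_ber < SHUTDOWN_BER_THRESHOLD

print(port_should_stay_up(1e-6))  # True: healthy link stays up
print(port_should_stay_up(5e-3))  # False: degraded link taken down early
```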

But something tells me you know all this already, so...

For my definition of IPoDWDM, see, for example:

    "Standardization of optical packet switching with
    many-wavelength packets"

or my newest paper in HPSR2016.

Interesting.

Do you know of any implementations?

Mark.

The IP and Transport groups are customers of each other. When I need
a wire, I ask the Transport group to deliver a wire. This is pretty
simple division of labor stuff. Transport has the intimate knowledge
of the layer 1 infrastructure and IP has intimate knowledge of
services. Sure, there is information sharing, but I don't need to assign
wavelengths or protection groups or channels. I don't need to know if
I'm getting an OTU or some other lit service (except when I do need to
know). We use clear jargon to order services from each other.
"Please deliver two diverse, unprotected circuits between cilli1 and
cilli2." If I want LACP or spanning-tree, I want OTU or another means
of ensuring L2 tunneling, so I either predefine these requirements
before we start our relationship or I explicitly order it.

When I think of converging IP and Transport, I think of combining the
extraordinary depth of knowledge required by each group's individual
contributors. You just turned your 100k employee into a 175k
employee. On top of that, add that we're all becoming software
developers, and you've got a three-horned unicorn. In the end, I guess
this is the cycle of convergence to distribution and back writ HR.