constant FEC errors juniper mpc10e 400g

Tom_Beecher · April 18, 2024, 6:17pm

FEC is occurring at the PHY, below the PCS.

Even if you’re not sending any traffic, all the ethernet control frame juju is still going back and forth, which FEC may have to correct.

I think (but not 100% sure) that for anything that by spec requires FEC, there is a default RS-FEC type that will be used, which may be able to be changed by the device. Could be fixed though, I honestly cannot remember.

Aaron · April 18, 2024, 6:45pm

Thanks. What “all the ethernet control frame juju” might you be referring to? I don’t recall Ethernet, in and of itself, just sending stuff back and forth. Does anyone know if this FEC stuff I see occurring is actually contained in Ethernet frames? If so, please send a link that shows the Ethernet frame structure as it pertains to this 400G FEC stuff. I’d really like to know the header format, etc.

-Aaron

jako.andras · April 18, 2024, 8:01pm

What "all the ethernet control frame juju" might you be referring
to? I don't recall Ethernet, in and of itself, just sending stuff back and
forth.

I did not read the 100G Ethernet specs, but as far as I remember
FastEthernet (e.g. 100BASE-FX) uses 4B/5B coding on the line, borrowed
from FDDI. Octets of Ethernet frames are encoded to these 5-bit
codewords, and there are valid codewords for other stuff, like idle
symbols transmitted continuously between frames.
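
A minimal sketch of that idea, assuming the standard FDDI/100BASE-X 4B/5B data code-groups and the all-ones idle group. The function names are made up for illustration, the low-nibble-first order is an assumption, and the start/end-of-stream delimiters that real 100BASE-X adds are left out:

```python
# 4B/5B data code-groups (nibble value -> 5-bit group on the line).
DATA_CODES = {
    0x0: "11110", 0x1: "01001", 0x2: "10100", 0x3: "10101",
    0x4: "01010", 0x5: "01011", 0x6: "01110", 0x7: "01111",
    0x8: "10010", 0x9: "10011", 0xA: "10110", 0xB: "10111",
    0xC: "11010", 0xD: "11011", 0xE: "11100", 0xF: "11101",
}
IDLE = "11111"  # not a data group; sent continuously between frames


def encode_octets(octets: bytes) -> list[str]:
    """Encode frame octets into 5-bit groups, low nibble first (assumption)."""
    groups = []
    for octet in octets:
        groups.append(DATA_CODES[octet & 0x0F])
        groups.append(DATA_CODES[octet >> 4])
    return groups


def line_stream(octets: bytes, idle_groups: int = 2) -> list[str]:
    """Surround the encoded frame with idle groups, as the line would carry it."""
    return [IDLE] * idle_groups + encode_octets(octets) + [IDLE] * idle_groups


if __name__ == "__main__":
    print(" ".join(line_stream(b"\x5a")))
```

The point is only that the idle group is a valid codeword on the wire that never maps back to a frame octet, so the link carries symbols even when no frames are being sent.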

Gigabit Ethernet (1000BASE-X) uses 8B/10B code on the line (from Fibre
Channel). In GE there are also special (not frame octet) PCS codewords
used for auto-negotiation, frame bursting, etc.

So I guess these are not frames that you see, but codewords representing
other data, outside Ethernet frames.

András

Charles_Polisher · April 18, 2024, 8:02pm

IEEE Std 802.3™‐2022 Standard for Ethernet
(§65.2.3.2 FEC frame format p.2943)

Also helpful, generally:
ITU-T Recommendation G.975 (2000), Forward Error Correction for Submarine Systems
https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-G.975-200010-I!!PDF-E&type=items

Tom_Beecher · April 18, 2024, 8:07pm

I’m being sloppy with my verbiage, it’s just been a long time since I thought about this in detail, sorry.

The MAC layer hands bits to the Media Independent Interface, which connects the MAC to the PHY. The PHY converts the digital 1/0 into the form required by the media transmission type; the ‘what goes over the wire’ L1 stuff. The method of encoding will always add SOME number of bits as overhead. For example, 64b/66b means that for every 64 bits of data to transmit, 2 bits are added, so 66 actual bits are transmitted. This encoding overhead is what I meant when I said ‘ethernet control frame juju’. This starts getting into the weeds on symbol/baud rates and stuff as well, which I don’t want to do now because I’m even rustier there.
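
As a rough illustration of that overhead, here is a small sketch that turns a MAC data rate and a block code into the rate actually put on the line. The function name is hypothetical, and 10GBASE-R is used as the example only because its 64b/66b numbers are well known:

```python
from fractions import Fraction


def line_rate_bps(data_rate_bps: int, data_bits: int, coded_bits: int) -> Fraction:
    """Rate on the wire when every `data_bits` of data become `coded_bits`."""
    return Fraction(data_rate_bps) * coded_bits / data_bits


if __name__ == "__main__":
    # 10 Gb/s of MAC data through 64b/66b encoding.
    rate = line_rate_bps(10_000_000_000, 64, 66)
    print(f"{float(rate) / 1e9} Gb/s on the line")
```

Exact fractions are used so the result does not pick up floating-point noise before it is printed.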

When FEC is enabled, the number of overhead bits added to the transmission increases. For 400G-FR4, for example, you start with 256b/257b transcoding, then RS-FEC(544,514) is applied. The RS numbers count 10-bit symbols rather than bits: twenty 257-bit blocks fill the 514 data symbols, and 30 parity symbols are added for FEC. Following the example, this means 544 bits are transmitted for every 512 bits of payload data. So, more overhead. Those additional symbols can correct up to 15 corrupted symbols per codeword.
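
The arithmetic can be checked with a short sketch. It assumes the RS(544,514) parameters count 10-bit symbols, as in the IEEE 802.3 400G PCS, and that a Reed-Solomon code corrects half as many symbols as it has parity symbols. The constant names are made up for illustration:

```python
from fractions import Fraction

SYMBOL_BITS = 10      # RS-FEC symbol size (assumption: 10-bit symbols)
N_SYMBOLS = 544       # codeword length in symbols
K_SYMBOLS = 514       # message length in symbols

parity_symbols = N_SYMBOLS - K_SYMBOLS           # symbols added by FEC
correctable_symbols = parity_symbols // 2        # RS corrects (n - k) / 2 symbols

message_bits = K_SYMBOLS * SYMBOL_BITS           # bits handed to the encoder
blocks_257b = message_bits // 257                # 256b/257b blocks per codeword
payload_bits = blocks_257b * 256                 # payload bits inside those blocks
codeword_bits = N_SYMBOLS * SYMBOL_BITS          # bits sent per codeword

expansion = Fraction(codeword_bits, payload_bits)
line_rate_bps = 400_000_000_000 * expansion      # aggregate rate for 400G

if __name__ == "__main__":
    print(f"parity symbols:      {parity_symbols}")
    print(f"correctable symbols: {correctable_symbols}")
    print(f"257b blocks/codeword: {blocks_257b}")
    print(f"expansion:           {expansion}")
    print(f"line rate:           {float(line_rate_bps) / 1e9} Gb/s")
```

The expansion works out to the same 544:512 ratio quoted above, which is why 400G ends up as 425 Gb/s across the lanes.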

All of these overhead bits are added in the PHY on the way out, and removed on the way in. So you’ll never see them on a packet capture unless you’re using something that’s actually grabbing the bits off the wire.

(Pretty sure this is right, anyone please correct me if I munged any of it up.)

Saku_Ytti1 · April 19, 2024, 6:01am

The frames in FEC are idle frames between actual ethernet frames. So
you recall right, without FEC, you won't see this idle traffic.

It's very very good, because now you actually know before putting the
circuit in production, if the circuit works or not.

A lot of people have processes to ping from router to router for some time N,
trying to determine circuit correctness before putting traffic on it,
which looks absolutely childish compared to FEC, both in terms of how
reliable the presumed outcome is and how long it takes to get to that
presumed outcome.
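
A back-of-the-envelope sketch of why, assuming independent bit errors at a fixed bit error ratio. The function names, the 1e-12 BER, and the ping rate and size are illustrative assumptions, not measurements:

```python
def seconds_per_bit_error(ber: float, bit_rate_bps: float) -> float:
    """Average time between bit errors among the bits actually observed."""
    return 1.0 / (ber * bit_rate_bps)


def ping_seconds_per_bit_error(ber: float, packets_per_second: float,
                               packet_bytes: int) -> float:
    """Same, but only the bits carried inside the ping packets are observed."""
    return seconds_per_bit_error(ber, packets_per_second * packet_bytes * 8)


if __name__ == "__main__":
    ber = 1e-12
    # FEC sees every bit on the line, including idles (425 Gb/s for 400G).
    fec = seconds_per_bit_error(ber, 425e9)
    # A ping test at 1000 packets/s of 1500 bytes sees about 12 Mb/s.
    ping = ping_seconds_per_bit_error(ber, 1000, 1500)
    print(f"FEC counters:  one error every {fec:.2f} s")
    print(f"ping test:     one error every {ping / 3600:.1f} h")
```

Under these assumptions the FEC counters get a data point every couple of seconds, while the ping test would wait the better part of a day to see a single damaged packet.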

Mark_Tinka4 · April 19, 2024, 7:51am

FEC is amazing.

At higher data rates (100G and 400G) for long and ultra long haul optical networks, SD-FEC (Soft Decision FEC) carries a higher overhead penalty compared to HD-FEC (Hard Decision FEC), but the net OSNR gain more than compensates for that, and makes it worth it to increase transmission distance without compromising throughput.

Mark.

Saku_Ytti1 · April 19, 2024, 8:08am

FEC is amazing.

At higher data rates (100G and 400G) for long and ultra long haul optical networks, SD-FEC (Soft Decision FEC) carries a higher overhead penalty compared to HD-FEC (Hard Decision FEC), but the net OSNR gain more than compensates for that, and makes it worth it to increase transmission distance without compromising throughput.

Of course there are limits to this, as FEC is hop-by-hop, so in
long-haul you'll know about circuit quality to the transponder, not
end-to-end. That is unlike WAN-PHY or OTN, where you know both.

Technically optical transport could induce FEC errors, if there are
FEC errors on any hop, so consumers of optical networks need not have
access to optical networks to know if it's end-to-end clean. Much like
cut-through switching can induce errors via some symbols to
communicate the CRC errors happened earlier, so the receiver doesn't
have to worry about problems on their end.

Mark_Tinka4 · April 20, 2024, 6:57am

This would only matter on ultra long haul optical spans where the signal would need to be regenerated, where - among many other values - FEC would need to be decoded, corrected and re-applied. SD-FEC already allows for a significant improvement in optical reach for a given modulation. This negates the need for early regeneration, assuming other optical penalties and impairments are satisfactorily compensated for. Of course, what a market defines as long haul or ultra long haul may vary; add to that the variability of regeneration spacing in such scenarios being quite wide, on the order of 600km - 1,000km. Much of this will come down to fibre, ROADM and coherent pluggable quality.

Saku_Ytti1 · April 20, 2024, 11:25am

In most cases, modern optical long haul has a transponder, which
terminates your FEC, because clients offer gray, and you like
something a bit less depressing, like 1570.42nm.

The transponder terminates not just FEC but, to a degree,
autonegotiation as well: an RFI signal, for example, would run only
between you and the transponder. So these connections can be, and
regularly are, provided without proper end-to-end hardware liveness
detection. Even if they were delivered and tested to have proper
end-to-end HW liveness, that may change during operation, so line
faults may or may not be propagated to both ends as an RFI assertion.
Even when they are, the assertion may be delayed to allow optical
protection to engage, which may be undesirable, as it eats into your
convergence budget.

Of course the higher we go in the abstraction, the less likely you are
to get things like HW liveness detection. I don't really see
anyone asking for this in their pseudowire services, even though it's
something that actually can be delivered. In Junos it's a single
config stanza on the interface to assert RFI to the client port if the
pseudowire goes down in the operator network.

Mark_Tinka4 · April 20, 2024, 11:37am

In our market (Africa), for both terrestrial and submarine services, OTN-type circuits are not typically ordered. Network operators are not really interested in receiving the additional link data that OTN or WAN-PHY provides. They truly want to leave the operation of the underlying transport backbone to the transport operator.

The few times we have come across the market asking for OTN is if they want to groom 10x 10G into 1x 100G, for example, to deliver structured services downstream. Even when our market seeks OTN from European backhaul providers to extend submarine access into Europe and Asia-Pac, it is often for structured capacity grooming, and not for OAM benefit. It would be interesting to learn whether other markets in the world still make a preference for OTN in lieu of Ethernet, for the OAM benefit, en masse.

When I worked in Malaysia back in the day (2007 - 2012), WAN-PHY was generally asked for for 10G services, until about 2010; when folk started to choose LAN-PHY. The reason, back then, was to get that extra 1% of pipe bandwidth :-).

Mark.

Saku_Ytti1 · April 20, 2024, 11:39am

Oh I don't think OTN or WAN-PHY have any large deployment future, the
cheapest option is 'good enough' and whatever value you could extract
from OTN or WAN-PHY, will be difficult to capitalise, people usually
don't even capitalise the capabilities they already pay for in the
cheaper technologies.
Of course WAN-PHY is dead post 10GE, a big reason for it to exist was
very old optical systems which simply could not regenerate ethernet
framing, not any features or functional benefits.

Mark_Tinka4 · April 20, 2024, 11:52am

A handful of OEMs still push OTN like it has just been invented, especially those still pushing “IPoDWDM” :-).

Fair point, if you have a highly-meshed metro network with lots of drops to customers across a ring-mesh topology, there might be some value in OTN when delivering such services at low speeds (10G, 25G, 2.5G, 1G). But while the topology is valid, most networks aren’t using high-end optical gear to drop low-speed services nowadays, even though on a per-bit basis it might be cheaper than a 1U IP/MPLS router doing the same job, if all you are considering is traffic and not additional services that want to eat packets.

In our market, we are trending toward a convergence between 10G and 100G orders intersecting for long haul and submarine asks. But pockets of 10G demand still exist in many African countries, and none of them have any WAN-PHY interest of any statistical significance. That said, I don’t expect any subsea cables getting built in the next 3 years and later will have 10G as a product on the SLTE itself… it wouldn’t be worth the spectrum.

Mark.

Mark_Tinka4 · April 20, 2024, 11:56am

And what we find with EU providers is that Ethernet and OTN services are priced similarly. It’s a software toggle on a transponder, but even then, Ethernet still continues to be preferred over OTN.

Mark.

thedCo · April 20, 2024, 12:41pm

LAN PHY dominates in the US too. Requests for WAN PHY were almost exclusively for terrestrial backhaul extending off of legacy subsea systems that still commonly had TDM-framed services. It’s been a couple of years since I’ve been in optical transport directly but these requests were essentially non-existent after 2018 or so. OTN became somewhat more common from 2014 onward as optical system interop improved, but actually was more common in the enterprise space as providers would generally go straight to fiber in most use cases, and with dark fiber opex costs coming down in many markets, I see OTN requests as winnowing here as well.

Dave Cohen
craetdave@gmail.com

Mark_Tinka4 · April 20, 2024, 3:50pm

What really changed the game was coherent detection, which breathed new life into legacy subsea cables that were built on dispersion-managed fibre. Since 2014, when uncompensated (and highly dispersed) fibre became the standard for subsea builds (even for SDM cables), coherent optical systems have been the mainstay. In fact, because linear dispersion can be accurately calculated for the cable span, uncompensated cables are a good thing because the dispersion compensation happens in very advanced coherent DSPs in the optical engine, rather than in the fibre itself.

WAN-PHY did not extend to 40G or 100G, which can explain one of the reasons it lost favour. For 10G, its availability also depended on the type of device, its NOS, line card and/or pluggable at the time, which made it hard to find a standard around this if you built multi-vendor networks or purchased backhaul services from 3rd party providers that had non-standard support for WAN-PHY/OTN/G.709. In other words, LAN-PHY (and plain Ethernet) became the lowest common denominator in the majority of cases for customers.

In 2024, I find that operators care more about bringing the circuit up than using its link properties to trigger monitoring, failover and reconvergence. The simplest way to do that is to ask for plain Ethernet services, particularly for 100G and 400G, but also for 10G. In practice, this has been reasonably reliable in the past 2 - 3 years when procuring 100G backhaul services. So for the most part, users of these services seem to be otherwise happy.

Mark.

Tarko_Tikan · April 20, 2024, 4:19pm

hey,

That said, I don't expect any subsea cables getting built in the next 3 years and later will have 10G as a product on the SLTE itself... it wouldn't be worth the spectrum.

10G wavelengths for new builds died about 10 years ago when coherent 100G became available, submarine or not. Putting 10G into the same system is not really feasible at all.

Mark_Tinka4 · April 20, 2024, 6:31pm

I was referring to 10G services (client-side), not 10G wavelengths (line side).

Mark.

borg · April 20, 2024, 7:36pm

Erm, WAN-PHY did not extend into 40G because there were not many
STM-256 deployments? (Or customers didn't want to pay for those.)

WAN-PHY was designed so people could encapsulate Ethernet frames
right into STM-64. Once the world moved away from SDH/SONET, there was
no more need for WAN-PHY.
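
For a sense of what fitting into STM-64 costs, here is a rough sketch. It assumes the usual figures of 9953.28 Mb/s for the STM-64/OC-192 line and 9584.64 Mb/s for the concatenated payload, with 64b/66b coding applied inside that payload. The constant names are made up for illustration:

```python
from fractions import Fraction

STM64_LINE_MBPS = Fraction(995328, 100)     # 9953.28 Mb/s on the line
STM64_PAYLOAD_MBPS = Fraction(958464, 100)  # 9584.64 Mb/s after SDH/SONET overhead
LAN_PHY_MAC_MBPS = Fraction(10_000)         # 10GBASE-R MAC data rate

# WAN-PHY still 64b/66b-encodes the data it places in the payload.
wan_phy_mac_mbps = STM64_PAYLOAD_MBPS * 64 / 66

if __name__ == "__main__":
    print(f"WAN-PHY MAC rate: {float(wan_phy_mac_mbps):.3f} Mb/s")
    print(f"LAN-PHY MAC rate: {float(LAN_PHY_MAC_MBPS):.3f} Mb/s")
    print(f"ratio:            {float(wan_phy_mac_mbps / LAN_PHY_MAC_MBPS):.4f}")
```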