[Nanog] Cogent Router dropping packets

Joe Greco wrote:
> For those unfamiliar, Cogent has a system where you set up an EBGP peering
> with the Cogent router you're connected to, for the purposes of announcing
> your routes into Cogent. However, these are typically smaller, aggregation
> class routers, and do not handle full tables - so you don't get your routes
> from that router. To get a full table FROM Cogent, you need to set up an
> EBGP multihop session with them, to their nearest full-table router. I
> believe they actually do all their BGP connections in that manner.

Depends on the service you purchase. Fast Ethernet seems to be delivered
as eBGP-multihop (the first hop is just an L3 switch); however, DS-3 is
handled as a single BGP session. I'm not sure if GigE or SONET services
are handled as multihop or not.

GigE is, though perhaps not in all cases (we had a client buying x00Mbps
delivered over GigE, which was definitely multihop).

Probably all depends on what hardware they have at each POP...

In part, I'm sure. There is also a certain benefit to having consistency
throughout your network, and it sometimes struck me that many of the folks
working for Cogent had a bit more than average difficulty dealing with the
unusual situation. This is not meant harshly, btw. Generally I like the
Cogent folks, but they (and their products) have their faults, just as any
of the competition does.

It may also help to remember that there's "legacy" Cogent and then there's
PSI/etc. Perhaps there are some differences as a result.

The more things you can do using the same template, the less difficult it
is to support. On the flip side, the less flexible you are ...

... JG

I do have to say that the PSINet side of Cogent is very good. We use
them in Europe without many issues. I stay far away from the legacy
Cogent network in the US.

Manolo

Joe Greco wrote:

Not sure what you are talking about; Cogent is all AS174... Other
than a few odd routers doing DS3 aggregation, I don't think there is any
old PSINet network online (other than the AS number and IP addresses).
Cogent integrated acquisitions quite quickly (I was an Aleron customer,
and it only took two months from the purchase close for us to move from
AS4200 to 174).

As for the two BGP peer question, they do it anywhere they have
Ethernet distribution, at least as far as I can tell. That being said, we
don't use them anymore since we could not get them to play ball on
pricing at larger commits either (I won't buy Cogent if they don't at
least match the terms of our cheapest large-network transit provider).
:)

John van Oppen
Spectrum Networks LLC
206.973.8302 (Direct)
206.973.8300 (main office)

You still haven't explained the failure modes you've experienced as a
result of Cogent's A/B peer configuration, only fronted.

Inquiring minds would like to know!

Well, it had sounded like I was in the minority and should keep my mouth
shut. But here goes. On several occasions the peer that would advertise
our routes would drop, and with that the peer with the full BGP tables
would drop as well. This happened for months on end. They tried blaming
our 6500, our fiber provider, our IOS version; no conclusive findings
were ever found that it was our problem. After some testing at the
local Cogent office by both Cogent and myself, Cogent decided that they
could "make a product" that would allow us, one, to have only one peer
and, two, to connect directly to the GSR and not through a small Catalyst.
Lo and behold, things worked well for some time after that.

This all happened while we had 3 other providers on the same router
with no issues at all. We moved GBICs, ports, etc. around to make sure it
was not some odd ASIC or throughput issue with the 6500.

Hope this answers the question.

Manolo

Paul Wall wrote:

manolo wrote:

> Well, it had sounded like I was in the minority and should keep my mouth
> shut. But here goes. On several occasions the peer that would advertise
> our routes would drop, and with that the peer with the full BGP tables
> would drop as well.

That doesn't sound like the problem has anything to do with their
multihop-eBGP configuration; it just appears that whatever you were
directly connected to was flaking out. If they had moved you to a
directly connected BGP session and it all worked, that would be one
argument, but you also moved from a junky 3550 or something to the GSR
in the process. I'd argue that if the switch could handle full tables
and you just had a single session, you would probably have experienced
the same issue.

I've run with both direct and multihop with Cogent, and I honestly never
noticed any difference in stability. I hear what you're saying, and I
think you have a valid argument in some respects, but I just think the
BGP problem is a symptom, not a cause.
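
For anyone following along, the "symptom, not a cause" argument can be
sketched as a toy fate-sharing model. This is only an illustration under
stated assumptions, not Cogent's actual topology: the device names
"agg-switch" and "core-router" are hypothetical, and a session is treated
as up only when every device on its path is healthy. It ignores hold
timers, flap damping, and everything else real BGP does.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Session:
    """One eBGP session and the devices its TCP connection must cross."""
    name: str
    path: List[str]  # hypothetical device names, customer side first

    def is_up(self, failed: Set[str]) -> bool:
        # The session survives only if no device on its path has failed.
        return not any(device in failed for device in self.path)


def session_states(sessions: List[Session], failed: Set[str]) -> Dict[str, bool]:
    """Map each session name to True (up) or False (down)."""
    return {s.name: s.is_up(failed) for s in sessions}


# A/B setup: both sessions cross the first-hop switch.
ab_peers = [
    Session("A: announce routes", ["agg-switch"]),
    Session("B: full table (multihop)", ["agg-switch", "core-router"]),
]
# Hypothetical: one session on the same switch, if it could hold full tables.
single_on_switch = [Session("single", ["agg-switch"])]
# One session terminated directly on the core router (the GSR in the thread).
single_on_core = [Session("single", ["core-router"])]

failed = {"agg-switch"}  # the first-hop device flakes out
for label, sessions in [
    ("A/B via switch", ab_peers),
    ("single on switch", single_on_switch),
    ("single on core", single_on_core),
]:
    print(label, session_states(sessions, failed))
```

In this model the A/B pair and the hypothetical single session on the
switch fail together, while the session that bypasses the switch stays
up. That matches the point above: the move to the GSR changed the
first-hop device as well as the session layout, so it cannot show which
of the two was at fault.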