NANOG Digest, Vol 90, Issue 1

Hello Dennis,

I am very happy because somebody is on the same page.

Message: 20
Date: Tue, 30 Jun 2015 14:37:55 -0400
From: Dennis B <infinityape@gmail.com>
To: Roland Dobbins <rdobbins@arbor.net>
Cc: nanog@nanog.org
Subject: Re: GRE performance over the Internet - DDoS cloud mitigation
Message-ID:
        <CAPr+j8J4vs2y8C6AB3FWGhrVF-GLt02inzvxsPs86m2-ChN6eg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Depends on what performance considerations you are trying to address,
technically.

The question is how can we guarantee the GRE/BGP performance (control
traffic) during the time between detection and mitigation?

Exactly

GRE decapsulation?
IE: Hardware vs Software?

Software.

Routing of the Protocol over the internet?
IE: If the inbound path is saturated, what is the availability of the GRE
tunnel?

Yes.

User-experience with GRE packet overhead?
IE: TCP Fragmentation causing PMTUD messages for reassembly?

Not the main concern right now; however, I would like to hear from you on
this point as well.

I've worked at Prolexic for 7 years and now Akamai for 1.4 yrs, post
acquisition.

We are contacting Akamai for the solution, by the way, and we are also
contacting defense.net (now F5), the since-acquired company of Prolexic's
founders :)

Immediately, I can think of multiple scenarios (3) on how to solve any one
of these categories.

Would you like to learn more? lol

Sure, I would love to :)

Message: 23

Date: Tue, 30 Jun 2015 16:32:54 -0400
From: Dennis B <infinityape@gmail.com>
To: Roland Dobbins <rdobbins@arbor.net>
Cc: nanog@nanog.org
Subject: Re: GRE performance over the Internet - DDoS cloud mitigation
Message-ID:
        <CAPr+j8LC7h_LLU+j5kwQcvxwLd8Pd+jwP5W7f62Ph2i7g6ZsTg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Roland,

Agreed, Ramy's scenario was not truly spot on, but his question still
remains: what are the perf implications when a cloud security provider's
time to detect/mitigate is X minutes? How stable can GRE transports and BGP
sessions be when under load?

This is the question.

In my technical opinion, this is a valid argument, which deserves a wide
range of opinions. Specifically, use cases about how to apply defense in
depth logically in the DC vs Hybrid vs Pure Cloud.

Our defense model will be your so-called "in depth logically in the DC";
however, we are protecting our network infrastructure, and we are trying to
reach a wholesale agreement in order to protect our customers accordingly.

One more thing to elaborate: we have our own DDoS mitigation equipment, and
it is located at the edge of the network nearest to the high-capacity
Internet circuits to minimize the local transit cost.

I hope it is clear now.

Thanks,

Ramy

This is not what you were asking about in your original post on this topic - you were talking about BGP sessions inside GRE tunnels, which is not how most (any?) DDoS mitigation services operate, to my knowledge.

GRE is used over the Internet for many different applications, including post-DDoS-mitigation re-injection of legitimate traffic onwards to the server/services under protection. Hardware-based GRE processing is required on both ends for anything other than trivial speeds; in general, the day of software-based Internet routers is long gone, and any organization still running software-based routers on their transit/peering edge is at risk.

DDoS mitigation providers using GRE for re-injection should set the MTU on their mitigation center diversion interfaces to 1476, and MSS-clamping on those same interfaces to 1436, as a matter of course.
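The 1476 and 1436 figures follow from simple header arithmetic. A minimal sketch of that calculation, assuming a 1500-byte path MTU, plain GRE over IPv4 with no key, checksum, or sequence-number options, and IPv4 and TCP headers without options (none of these assumptions are spelled out in the thread; the function names are illustrative only):

```python
# Sketch only: derive the GRE tunnel MTU and the TCP MSS clamp value.
# Assumptions: 1500-byte path MTU, basic 4-byte GRE header (no optional
# key/checksum/sequence fields), 20-byte IPv4 and TCP headers (no options).

PATH_MTU = 1500           # typical Ethernet/Internet path MTU (assumed)
OUTER_IPV4_HEADER = 20    # delivery header added by the tunnel
GRE_BASE_HEADER = 4       # GRE header without optional fields
INNER_IPV4_HEADER = 20    # header of the customer packet inside the tunnel
TCP_HEADER = 20           # TCP header without options


def gre_tunnel_mtu(path_mtu=PATH_MTU):
    """Largest inner packet that fits in one outer packet unfragmented."""
    return path_mtu - OUTER_IPV4_HEADER - GRE_BASE_HEADER


def tcp_mss_clamp(tunnel_mtu):
    """Largest TCP payload that fits in one inner packet of tunnel_mtu bytes."""
    return tunnel_mtu - INNER_IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    mtu = gre_tunnel_mtu()
    print("tunnel MTU:", mtu)                 # 1476
    print("MSS clamp:", tcp_mss_clamp(mtu))   # 1436
```

If GRE options are enabled, or IPsec is layered on top, the overhead grows and both values would need to be lowered accordingly.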

This is not a new model; it has been extant for many years. There are a variety of overlay and transit-focused DDoS mitigation service providers who utilize this model. In your original post on this topic, you also made the assertion that these issues had not been addressed by DDoS mitigation service operators; that assertion is incorrect.

To clarify, I'm referring to GRE processing on routers; hardware processing is pretty much a requirement on routers. Other types of devices can often handle GRE at significantly higher rates than software-based routers.

To Ramy,

Thank you for the acknowledgement. DDoS mitigation service providers,
regardless of whether they are pure cloud, hybrid cloud, or CPE only, all
face these challenges when it comes to DDoS attacks.

Can you restate or rephrase your question for the forum? It seems there is
some confusion, or maybe people didn't grasp it.

My understanding is that the question Ramy asked was about DDoS mitigation
providers and the window covering Time-to-Detect and
Time-to-Start-to-Mitigate: how do businesses protect themselves when attack
traffic is NOT stopped at first? IE: Defense in depth

NOTE: Some DDoS mitigation providers offer Time-to-stop-the-attack SLAs.

It's all moot, though. These types of solutions do not guarantee uptime
during the initial attack start time, PERIOD!

How can anyone guarantee uptime during a 40Gbps attack when, let's say, all
you have are 2 x 1Gbps CAT5E links over multi-homed BGP providers? Having
larger port capacity (IE: 10Gbps ports) only gives you minutes/hours to
react and redirect to a Cloud Provider.

The time to start mitigation (average industry time) is 30 - 45 minutes.
What is happening to your WAN infrastructure when there is 40Gbps of attack
on your doorstep?

Will your 2Gbps worth of aggregated ISP bandwidth keep sessions up? No, it
will get saturated, BGP will flap and any GRE connections or any other
traffic will be lost. This means, even with local CPE mitigation, things
will bounce. This is 1 scenario of 1000s.
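To put rough numbers on that scenario, here is a back-of-the-envelope sketch. It is a deliberately crude model, not a prediction: it assumes the congested links drop packets uniformly at random with no prioritization of control traffic, ignores TCP retransmission of BGP messages, and assumes three keepalive intervals per hold time. The function names and the 1Gbps legitimate-traffic figure are made up for illustration.

```python
# Crude saturation model (assumptions: uniform random drops, no QoS for
# control traffic, independent losses, no TCP retransmission).

def delivered_fraction(link_gbps, attack_gbps, legit_gbps):
    """Fraction of offered traffic that fits through the link."""
    offered = attack_gbps + legit_gbps
    if offered <= link_gbps:
        return 1.0  # link is not saturated
    return link_gbps / offered


def hold_timer_expiry_probability(delivered, keepalives_per_hold=3):
    """Chance that every keepalive within one hold time is dropped."""
    return (1.0 - delivered) ** keepalives_per_hold


if __name__ == "__main__":
    # 2 x 1Gbps links, 40Gbps attack, 1Gbps legitimate traffic (assumed)
    frac = delivered_fraction(link_gbps=2, attack_gbps=40, legit_gbps=1)
    print(f"delivered fraction: {frac:.3f}")
    print(f"P(hold timer expires): {hold_timer_expiry_probability(frac):.3f}")
```

Under these assumptions only about one packet in twenty gets through, which is why both BGP sessions and GRE-carried traffic should be expected to fail until the attack is diverted upstream.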

There are positive security models that you can employ as a stopgap to
prevent these types of scenarios, but mostly it's down to the Service
Provider's best practices or traffic posture model. IE: On-Demand,
On-Demand with monitoring, Always-On monitoring, Always-On monitoring and
mitigation.

Having local mitigation for DDoS attacks is a losing battle, in my opinion.
It's only buying you time to redirect. It does solve problems like attacks
that are low in scale that you can consume with your port capacity, or
quick hit-and-run attacks (1-2min durations). But then you need
auto-mitigation enabled, and that leads to collateral damage most of the
time for legitimate traffic.

Pretty sure other SPs will offer different opinions. This is my technical
opinion, not representing Akamai's or any other company's official position.

From an engineering perspective, when an adversary targets your business
and they have halfway decent attacking nodes, expect an outage. Message
that to the board, but explain that you have every capability to mitigate
these risks, given the SP you go with has enough staff, resources,
capacity, technology, SLA, and knowledge/experience in the attack vector
hitting you.

If you want to "learn more", keep up the engagements with the market DDoS
providers you are communicating with and ask these tough questions. If
anyone "sells" you the perfect solution, they are LYING to you!

On a personal note, thank you for reaching out privately in email and
explaining who you are talking to. Trust me when I tell you, I know the
organization VERY WELL from the other competitor you are looking at and I
will offer you my candid opinion of them, if you'd like. My friend runs
their SOC over there, an old colleague of mine from when I was in the SOC
blocking attacks.

Love this topic!

Dennis

Bob Watson