Networking Pearl Harbor in the Making

Date: Mon, 7 Nov 2005 11:21:20 -0500
From: Eric Gauthier <eric@roxanne.org>
Cc: nanog@merit.edu

I'm not exactly "in the know" on this one, but the heap-overflow advisory
that we've seen indicates that the IOS updates Cisco put out are not patches
for this problem:

  "Cisco has devised counter-measures by implementing extra checks to
  enforce the proper integrity of system timers. This extra validation
  should reduce the possibility of heap-based overflow attack vectors
  achieving remote code execution."
  -- from Cisco's security advisory

We've asked Cisco for a better explanation - namely, are their recommended
updates "patches" for the problem (i.e. repairs), or simply mitigating
updates that make it harder to exploit? The wording of their advisory seems
to indicate the latter, which worries me since it implies that there is a
fundamental problem in IOS, that the problem still exists even after
patching, and that Cisco can't readily repair it. Unfortunately, so far we've
gotten the run-around and haven't been able to get a better answer, again
leading me to believe the worst.

Most exploits (be it IOS or some other target) require multiple things to occur
before the "desired effect" is achieved.

    "buffer overflow" exploits. in general. involve a minimum of two things:
        1) "smashing" memory outside of the area you 'should' have been limited
           to.
        2) having 'some other code' accept/use that 'improperly modified' memory
           believing it to be 'valid' content.

Causing =any= step of the exploit process to fail means that the attempt
_does_not_succeed_.

Re-coding to eliminate all 'possible' buffer overflow situations is a *big*
job. The required field-length checking for every multi-byte copy/move
operation does have a significant negative impact on performance, as well.

Merely _identifying_ the 'tainted' (by being in contact -- directly or
indirectly -- with 'user-supplied' data) data-structures is a task measured
in man-years. As is isolating _all_ the points where such tainting occurs.
Then, and only then, can you begin to -plan- how to remove the taint, whether
by sanity-based bounds-checking, 'clipping' to known limits, explicit length
checks, or whatever else is appropriate.

*AFTER* all that, you can =start= implementing the code that removes taint.

It _can_ be much quicker (in terms of "time to delivery to the field") to
go after one of the 'other things' that has to happen for an exploit to
"work".

...

There actually is automated code to identify and correct stack overflows
on Linux. Formerly StackGuard, then Immunix, it looks like it's now
Novell AppArmor (*shudder*).

Date: Mon, 7 Nov 2005 14:43:54 -0600 (CST)
From: Robert Bonomi

> Re-coding to eliminate all 'possible' buffer overflow situations is a *big*
> job. The required field-length checking for every multi-byte copy/move
> operation does have a significant negative impact on performance, as well.

Getting "owned" can also have a significant negative impact on
performance. Of course, maybe the attacker will be benevolent, so
perhaps all will be okay...

Correctness before speed. Who wants a machine that just gives bad
results faster?

> Merely _identifying_ the 'tainted' (by being in contact -- directly or
> indirectly -- with 'user-supplied' data) data-structures is a task measured
> in man-years. As is isolating _all_ the points where such tainting occurs.

Sounds like a pretty good argument for "do it right the first time".

> Then, and only then, can you begin to -plan- how to remove the taint, whether
> by sanity-based bounds-checking, 'clipping' to known limits, explicit length
> checks, or whatever else is appropriate.

Hopefully the code is modular. For example, running cscope and searching for
strcpy(3) invocations is easier than tracking down implemented-in-place
equivalents.

Eddy