RE: Microsoft spokesperson blames ICANN

Blaming it on ICANN, even indirectly, is about as clueless a move as can be
imagined. ICANN has no direct authority over the root; the USG/DOC/NTIA has
reserved that privilege for itself alone. They have let out the operations
contract to NSI, which manages a.root-servers.net and has been doing so
for years.

MHSC has been working with MS engineering and support staff, for the past
few months, to integrate BIND8.2.2p7 servers with Win2K/DNS/AD. It isn't
easy because of semantic inconsistency, radically diverse architecture
concepts, and [above all] a severe lack of documentation on MS's part (as well
as a few roaches in the SRV update stuff).

From our efforts, it is not at all surprising that someone, at MSFT, munged
the DNS configuration, totally. Even their best guru could have done it, due
to the murky nature of the config. I suspect that there are less than 100
ppl that could even have a clue, in this area, and they don't all have the
same pieces of clue.

Win2K DNS is self-consistent, BIND is self-consistent, they may be mutually
consistent, but that has yet to be determined. MSFT works in a glasshouse
with many of the panes painted-over, others are distorted. Similarly, there
is a tendency, among *nix folk, to discount anything MSFT (a mistake, IMHO).

What's needed is for some(one/group) that has a thorough understanding of both
systems, at the design level, to sort out the bits. This is, properly, a
development project, one that most system admins are unsuited for.
Unfortunately, it is left at the system admin level, IMHO.

[ On Wednesday, January 24, 2001 at 13:09:45 (-0800), Roeland Meyer wrote: ]

Subject: RE: Microsoft spokesperson blames ICANN

> From our efforts, it is not at all surprising that someone, at MSFT, munged
> the DNS configuration, totally. Even their best guru could have done it, due
> to the murky nature of the config. I suspect that there are less than 100
> ppl that could even have a clue, in this area, and they don't all have the
> same pieces of clue.

That's absolutely idiotic (of M$, that is !;-). Even more idiotic than
putting all their nameservers in one basket, so to speak.

I'd bet any high-school kid who had any experience whatsoever at
installing Linux or FreeBSD could no doubt blow a real OS and a native
BIND install onto any sufficiently capable set of four machines in about
an hour or so and provided that someone could cough up at least a
half-baked zone file from somewhere to load on them they'd all be online
and answering to the registered nameserver IP numbers in no time flat.
Certainly in less than what's apparently going to be at least 23 hours
now!

Heck I know a half dozen or more people around the world who would have
put their dislike of M$ away for a short period and loaded a zone file
or two on their own nameservers for M$ if only M$ could have managed to
get the .COM zone updated with new delegations.... What ever happened
in this community to asking the community for help when you're caught
between a rock and a hard place? (Not that a company the size of M$
should have to ask for a handout -- they no doubt have significant IP
connectivity in as many places around the world as almost anyone else!)

MS has nothing and no-one to blame but their own stupidity and arrogance
in this. Meanwhile they're so damn big and "important" to so many users
that this outage is having both a direct and an indirect negative impact
on a lot of ISPs around the world! "Hey! The Internet must be broken
if I can't get to M$.COM!"

> What's needed is for some(one/group) that has a thorough understanding of both
> systems, at the design level, to sort out the bits. This is, properly, a
> development project, one that most system admins are unsuited for.
> Unfortunately, it is left at the system admin level, IMHO.

No, what's needed is for M$ to learn that they need to deploy software
that's capable of the task even if it didn't come from a box and doesn't
have their logo branded on it. Squishing things together that were
never meant to be squished together is only going to cause a big mess.
Err, has already caused a big mess, at least for M$ and those who deal
with them! ;-)

They'd also do well to learn a bit about network geography and just
exactly how authoritative nameserver visibility from various locations
on this wonderful Internet of ours can directly affect their bottom
line!

[ On Wednesday, January 24, 2001 at 13:09:45 (-0800), Roeland Meyer wrote: ]
> Subject: RE: Microsoft spokesperson blames ICANN
>
> From our efforts, it is not at all surprising that someone, at MSFT, munged
> the DNS configuration, totally. Even their best guru could have done it, due
> to the murky nature of the config. I suspect that there are less than 100
> ppl that could even have a clue, in this area, and they don't all have the
> same pieces of clue.

{OBofftopic: hmm, look at the two timestamps, above. did greg reply to roeland's
e-mail before it was written?}

by now i think we are realizing that it's probably more of some kind of
server-level/network-level attack, and not a DNS phuque-up. i got
plenty o'pings earlier without nary a drop, although the nameservers
didn't reply.
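
One way to tell "the host answers pings" apart from "the nameserver answers
queries" is to send the server a DNS query directly. The following is only a
minimal Python sketch and assumes nothing about Microsoft's setup: the
function names, the query ID, and the 3-second timeout are illustrative
choices, and the address in the usage note is a documentation-range
placeholder.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234, qtype: int = 2) -> bytes:
    """Build a minimal DNS query in RFC 1035 wire format (qtype 2 = NS)."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def nameserver_answers(server_ip: str, name: str, timeout: float = 3.0) -> bool:
    """Return True if server_ip sends back any DNS response for name."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeouts and unreachable-host errors both count as "no answer".
            return False
    # A response carries the same ID and has the QR bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)
```

For example, `nameserver_answers("192.0.2.53", "example.com")` returns False
when no reply arrives within the timeout, even if the same address happily
answers ICMP echo requests.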

{Important Point:} nevertheless:

> That's absolutely idiotic (of M$, that is !;-). Even more idiotic than
> putting all their nameservers in one basket, so to speak.
>
> I'd bet any high-school kid who had any experience whatsoever at
> installing Linux or FreeBSD could no doubt blow a real OS and a native
> BIND install onto any sufficiently capable set of four machines in about
> an hour or so and provided that someone could cough up at least a
> half-baked zone file from somewhere to load on them they'd all be online
> and answering to the registered nameserver IP numbers in no time flat.
> Certainly in less than what's apparently going to be at least 23 hours
> now!

{Oblinux: there are a few itty-bitty "server" distro's out there that you
could probably load up in under 15 minutes. also, the e-smith-style
"appliance" distros are also quick to load.}

> Heck I know a half dozen or more people around the world who would have
> put their dislike of M$ away for a short period and loaded a zone file
> or two on their own nameservers for M$ if only M$ could have managed to
> get the .COM zone updated with new delegations.... What ever happened
> in this community to asking the community for help when you're caught
> between a rock and a hard place? (Not that a company the size of M$
> should have to ask for a handout -- they no doubt have significant IP
> connectivity in as many places around the world as almost anyone else!)

whoa, slow down... microsoft apparently hasn't quite figured out what
hit them (and in these later hours there's implications that there is
more than one issue happening here). any large company is gonna take
some non-trivial amount of time to figure things out so that the report
to the upper management (ultimately) will be complete, including not
only what happened, who's responsible, etc., but also what steps were
taken to keep it from happening again. keeping running notes on all of
this just makes it slow. take that resulting time and double it when a
company has claimed (and, y'know, perhaps it's true) in the past that
they possess clue. and finally, take that second time and triple if
it's a public company (where somebody can get sued).

i'm not making excuses for microsoft, but more clueful companies have had
worse times of it, even in the recent past. give 'em a chance.

> MS has nothing and no-one to blame but their own stupidity and arrogance
> in this. Meanwhile they're so damn big and "important" to so many users
> that this outage is having both a direct and an indirect negative impact
> on a lot of ISPs around the world! "Hey! The Internet must be broken
> if I can't get to M$.COM!"

whoa! whoah!! take it easy... chill... let's kick 'em when and where they
deserve it, after all the smoke clears. until then, i think this forum should
be supportive of internet-connected networks that are facing big troubles.
whatever is happening to microsoft today could happen to someone far
dearer tomorrow (or today, of course). we all might learn something
useful from this. (and maybe not.)

> No, what's needed is for M$ to learn that they need to deploy software
> that's capable of the task even if it didn't come from a box and doesn't
> have their logo branded on it. Squishing things together that were
> never meant to be squished together is only going to cause a big mess.
> Err, has already caused a big mess, at least for M$ and those who deal
> with them! ;-)
>
> They'd also do well to learn a bit about network geography and just
> exactly how authoritative nameserver visibility from various locations
> on this wonderful Internet of ours can directly affect their bottom
> line!

try: http://secondary.easydns.com

[ On Wednesday, January 24, 2001 at 20:30:12 (-0500), Henry Yen wrote: ]

Subject: Re: Microsoft spokesperson blames ICANN

> [ On Wednesday, January 24, 2001 at 13:09:45 (-0800), Roeland Meyer wrote: ]

> {OBofftopic: hmm, look at the two timestamps, above. did greg reply to roeland's
> e-mail before it was written?}

As far as I can tell all my system clocks are close enough to true
network time that NTP hasn't been complaining! :-)

Note though that my logs show my reply being sent at 18:01 -0500 and the
message came back to me with the date header intact and reading:

so perhaps the error is actually in your MUA (i.e. in its formation of
the "On ... wrote:" line when preparing the quoted message). Apparently
it gets the "AM" and "PM" wrong when converting from a 24-hour clock to
a 12-hour clock. It should have written "06:01:29PM -0500".
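
For what it's worth, the conversion is easy to get right. A minimal Python
sketch follows; the function name `to_12_hour` is purely illustrative and is
not taken from any MUA. The only traps are midnight and noon, which both
display as 12.

```python
def to_12_hour(hour: int, minute: int, second: int) -> str:
    """Format a 24-hour clock time as a 12-hour time with an AM/PM suffix."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time out of range")
    # Hours 0-11 are AM, hours 12-23 are PM.
    suffix = "AM" if hour < 12 else "PM"
    # 0 and 12 both display as 12; everything else is taken modulo 12.
    hour12 = hour % 12 or 12
    return f"{hour12:02d}:{minute:02d}:{second:02d}{suffix}"
```

With this, `to_12_hour(18, 1, 29)` gives `"06:01:29PM"`, which is the form
the attribution line should have used.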

> whoa, slow down... microsoft apparently hasn't quite figured out what
> hit them (and in these later hours there's implications that there is
> more than one issue happening here).

In this particular case it's totally irrelevant what hit them. They
needed to get at least one solid reliable replacement nameserver up and
running and answering on one of those IP addresses as soon as humanly
possible if they were to try to mitigate the damage. If it were me
running the show and if I had even a hint that there were malicious
agents responsible I'd have grabbed as many raw packets off the network
as I could conveniently and quickly store, then I'd have literally
pulled the plug on at least two of the machines and sent the works off
for forensic analysis while a new, slightly different, and far more
secure, machine was brought in to provide this most critical service.

Given the hindsight gained from reading their announcement (and guessing
what really happened), perhaps they even did that, but it shouldn't have
taken them so many more hours to figure out that the world still wasn't
seeing their DNS no matter what they did to those servers.

Of course it wouldn't have been nearly so critical an issue requiring
such quick and dirty action if they would have had more diverse DNS
servers.
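
As a rough illustration of what "diverse" means here, the sketch below checks
whether a set of nameserver addresses all fall inside one network prefix.
It is only a crude proxy: sharing a /24 is my stand-in for "one basket", the
function names are made up, and the addresses in the usage note are
documentation-range placeholders, not anyone's real nameservers. Real
diversity is about separate networks, routers, and locations, which no
prefix check can prove.

```python
import ipaddress


def distinct_networks(addresses, prefix_len=24):
    """Return the set of IPv4 networks (at prefix_len) covering the addresses."""
    return {
        # strict=False lets a host address stand in for its enclosing network.
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }


def looks_diverse(addresses, prefix_len=24):
    """True if the addresses span more than one network at prefix_len."""
    return len(distinct_networks(addresses, prefix_len)) > 1
```

Four servers at `192.0.2.10` through `192.0.2.13` fail this check, while a
pair at `192.0.2.10` and `198.51.100.10` passes it.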

Part of the problem of course is that they may not have perceived the
full extent of their problem as quickly as some of us from the outside
were imagining it to be, though that's somewhat difficult to understand,
especially given the nature of the discussion in open forums such as
this one....

I wouldn't have been "kicking" them while they were still down if it
wasn't that they'd clearly and obviously tied their own noose and
stepped into it and then pulled the lever on their own trap-door
themselves!

The comedy of errors in their recovery attempts and the enormous delay
in returning their DNS to operational status points out several grave
operational errors, but none of those errors should ever have caused any
visible problems in the first place -- the root cause of their problems
remains in the fact that they did not follow the best common practices
already well documented by others who have learned these lessons from
the school of hard knocks.

I'm going to play devil's advocate here.

* I bet any Linux or FreeBSD box set up by a high-school kid will probably
  die under the load of M$'s zones - the default out-of-the-box config
  is nice, but not *nice*.
* You have no idea whether M$'s DNS servers are serving static zone
  files, back ended to a database, talking to a mapper of some sort,
  whatever.

As someone mentioned, there are such things as maintenance windows, and
explaining to management that you need to break one can sometimes be painful.

That said, I think it being dead for 23 hours is a little strange, but
then we don't know the exact story so we could be pointing the blame
at exactly the wrong place(s).

Adrian

I bet Microsoft had to go get a Unix box and install DNS on it so they could
get back up and running.. lol
Morris Allen
VidcomNet, Inc.

[ On Thursday, January 25, 2001 at 19:17:15 (+0800), Adrian Chadd wrote: ]

Subject: Re: Microsoft spokesperson blames ICANN

>
> I'd bet any high-school kid who had any experience whatsoever at
> installing Linux or FreeBSD could no doubt blow a real OS and a native
> BIND install onto any sufficiently capable set of four machines in about
> an hour or so and provided that someone could cough up at least a
> half-baked zone file from somewhere to load on them they'd all be online
> and answering to the registered nameserver IP numbers in no time flat.
> Certainly in less than what's apparently going to be at least 23 hours
> now!

> I'm going to play devil's advocate here.
>
> * I bet any Linux or FreeBSD box set up by a high-school kid will probably
>   die under the load of M$'s zones - the default out-of-the-box config
>   is nice, but not *nice*.

Well, that's why I said "sufficiently capable machine"..... Give *me* a
pair of 1GHz Xeon processors with >=2MB cache on a dual-bus motherboard,
1GB of RAM, a pair of 1000baseT interfaces (one for a private
administrative interface), a fiber-channel attached RAID array that's
properly tuned for speed, and the latest version of FreeBSD, and we'll
see just how many queries per second such a box can answer! ;-)

Obviously you'd want to install only the bare minimum of software
necessary and then turn off inetd and any other stand-alone network
daemon but named....

> * You have no idea whether M$'s DNS servers are serving static zone
>   files, back ended to a database, talking to a mapper of some sort,
>   whatever.

It doesn't really matter -- that's a back-office implementation issue.
The part that's answering the queries has a terribly simple job to do.

However in theory if they've got a reliable internal nameserver that's,
for example, either insecure or incapable of handling the public query
load, then they can update that one any way they please and let BIND on
the authoritative server do the zone transfer from it. Dynamic DNS is
useless if you don't have your TTLs set right, and if you do have your
TTLs right then getting the SOA right is trivial too, and once you've
done that it doesn't matter if you stick an extra zone transfer in the
path. So long as they're not being total idiots and trying to void
BIND's warranty with <300 sec. TTLs, they'd do just fine.
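
To make the TTL point concrete, here is a minimal Python sketch that flags
records whose TTL falls below a floor. The 300-second floor is the figure
mentioned above; the function name, the `(name, ttl)` record layout, and the
example hostnames are all hypothetical and say nothing about how any real
zone is stored.

```python
MIN_TTL_SECONDS = 300  # the "<300 sec." threshold mentioned above


def short_ttl_records(records, floor=MIN_TTL_SECONDS):
    """Return the (name, ttl) pairs whose TTL is below the floor."""
    return [(name, ttl) for name, ttl in records if ttl < floor]


# Hypothetical zone data, for illustration only.
example_records = [
    ("www.example.com.", 3600),
    ("dyn.example.com.", 60),
    ("mail.example.com.", 300),
]
```

Calling `short_ttl_records(example_records)` returns only the
`dyn.example.com.` entry, since a TTL of exactly 300 seconds is not below the
floor.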