How about a game of chess? (was Re: Vulnerabilities of Interconnection)

Actually, I do not know how to play chess (maybe *Risk*), but your
point is well taken. The intent is not to provide a public recipe for
taking down the Internet; that would be the opposite of the research's
goal to begin with. Regardless, it is a difficult line to tread, and
it is best to err on the side of caution.

As for sampling biases -- that is why it is only anecdotal evidence
and why it needs to be questioned. Reports of Vinny accidentally
announcing his router as AS701 do not make it to the media very often.

That aside, the suggestions of how to model the Internet are very
useful, and the closer these models can get to operational reality the
better. The intent is methodology, not revealing data. Waites'
approach is a good first step, but what comes next? Also, if you know
capacities, how do you model a cascading effect that encompasses
intra-network and inter-network traffic flows?

Also, it might be easier to calculate transition probabilities by
summing across the rows of the adjacency matrix and then dividing the
row components by the sum.
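That row-normalization can be sketched numerically as follows -- a
minimal example using a hypothetical 4-AS adjacency matrix (made-up
topology, purely illustrative):

```python
import numpy as np

# Hypothetical symmetric adjacency matrix for 4 ASes:
# 1 = interconnect present, 0 = absent.
C = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# Transition probabilities: sum across each row, then divide the
# row components by that sum.
row_sums = C.sum(axis=1, keepdims=True)
P = C / row_sums

print(P)
print(P.sum(axis=1))  # each row sums to 1
```

This treats every adjacent system as equally likely to receive the
next packet, which is the simplest possible assumption.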

Bell Labs published papers on security problems in the IP protocol and the
Clipper chip. On the other hand Bell Labs had a policy of not publishing
papers about security issues in protocols such as SS7. Did not publishing
the information keep the telephone network more secure, or did publishing
make the Internet less secure?

Did it make a difference? I don't know. People building new networks
should be aware of the risks so they can address them. We don't keep
fire or building codes a secret, because we want people to build safe
buildings. How do people learn how to build secure networks if the
information is secret?

"sgorman" == <sgorman1@gmu.edu> writes:

    > Also it might be easier to calculate transition
    > probabilities by summing across the rows of the adjacency
    > matrix then dividing the row components by the sum.

I'm not sure that works. That would cause systems not adjacent to i
to contribute to the probability that a packet goes from i to j in one
step, if I understand correctly. There is also a hidden embedded
Markov chain in my formulation.

Firstly, to avoid confusion with terminology, I am using:

A_i - adjacency density vector -- the number of systems adjacent
       to i.
C_ij - connectivity matrix -- representing presence or absence of an
       interconnect between i and j as a 1 or 0. C_ij does not
       represent a Markov process.
P_ij - transition probability matrix -- probability that a packet, now
       in i, will end up in j next.

We have,

        A_i = Sum_j C_ij

In making P_ij, I think it is necessary to calculate according to the
number of interconnections of j -- how big the neighbour is. This
looks right intuitively: if i is a smaller, local ISP or a non-ISP
organization, it will tend to have a few upstream providers and maybe
some private peering. The upstreams will tend to be more heavily
interconnected and peered, and their component of A_i will be larger
than that of the private peering; that is, more of your packets will
go to an upstream than to your peers.

The assumption may or may not be valid; it is difficult to know
without actual traffic statistics for a good sample of links on the
Internet. If we had a near-complete data set of traffic statistics
for the Internet, measured at the same instant, it would be possible
to calculate the P_ij directly:

               pps on link connecting i and j
        P_ij = ------------------------------
               total pps leaving i

but this data is not available in general.
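If such measurements did exist, the direct estimate would be a simple
per-row division. A sketch with a hypothetical (made-up) matrix of
packets-per-second counts:

```python
import numpy as np

# Hypothetical pps counts: T[i, j] = packets per second observed on
# the link from i to j (fabricated numbers for illustration only).
T = np.array([
    [  0, 500, 300,   0],
    [400,   0, 100, 200],
    [350, 150,   0, 100],
    [  0, 250, 150,   0],
], dtype=float)

# P_ij = pps on the link connecting i and j, divided by the
# total pps leaving i.
P = T / T.sum(axis=1, keepdims=True)

print(P.sum(axis=1))  # each row sums to 1 by construction
```

The estimate is only as good as the measurement window: the counts
have to be taken over the same interval for every link, which is
exactly the data we do not have.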

(aside: I believe that because of the scoping property of Little's
Formula this approach would also be valid for analyzing packet flow
within an AS, where i and j represent each router, indexed
arbitrarily)

Consider
                             C_ij A_j
          P_ij = lambda_i --------------
                          Sum_k C_jk A_k

                             Sum_k C_ij C_jk
               = lambda_i ---------------------
                          Sum_k Sum_l C_jk C_kl

where lambda_i is a normalizing factor appropriate to make the sums
across each row of P_ij equal to 1.
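The neighbour-size weighting above can be checked numerically. A
sketch with a hypothetical 4-node adjacency matrix, where lambda_i is
computed explicitly as the row-normalizing constant:

```python
import numpy as np

# Hypothetical adjacency matrix; A_j = Sum_k C_jk is the number of
# systems adjacent to j.
C = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
A = C.sum(axis=1)

# Unnormalized weights, as written above:
#   W[i, j] = C_ij A_j / Sum_k C_jk A_k
# (C @ A)[j] = Sum_k C_jk A_k, broadcast across each row of C * A.
W = C * A / (C @ A)

# lambda_i scales each row of W so that it sums to 1.
P = W / W.sum(axis=1, keepdims=True)

print(P.sum(axis=1))  # each row sums to 1 after normalization
```

Note that here the rows of W do not sum to 1 on their own, which is
why the explicit lambda_i step is needed for this form of the formula.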
             
The product of C_ij with A_j is necessary since we don't want to take
into account the adjacency density of AS j if we are not connected to
it -- it doesn't matter how big it is; if i is not connected to j, a
packet has zero chance of making the transition directly. This is
taken care of by the fact that C_ij == 0 in that instance.

Likewise, the right inner product of C_jk with A_k in the denominator
is necessary for the same reason, except now two hops out.
Nonexistent links are never taken into account in calculating the
transition probabilities.

You might say that the numerator is skewed towards the current state,
and the denominator is skewed towards the next state. This suggests
that there is an implicitly embedded Markov chain -- the denominator
used in calculating the probability of going in one step from i to j
takes into account information about what will happen at the next
step, when the packet leaves j.

-w

Apologies for replying to myself.

I wrote:

    >                     C_ij A_j
    >     P_ij = lambda_i --------------
    >                     Sum_k C_jk A_k

The first factor in the denominator sum should be C_ik, not C_jk.
There is no embedded chain, nor "skew" in the equation; there was skew
in my reasoning. :wink: It also means that lambda_i == 1 -- it is
already normalized.
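A quick numerical check of the corrected formula, again with a
hypothetical 4-node adjacency matrix, confirms that the normalization
is automatic:

```python
import numpy as np

# Hypothetical adjacency matrix; A_j = number of systems adjacent to j.
C = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
A = C.sum(axis=1)

# Corrected formula: P_ij = C_ij A_j / Sum_k C_ik A_k.
# The denominator (C @ A)[i] = Sum_k C_ik A_k now depends on the
# row i, so every row sums to 1 by construction and lambda_i == 1.
P = C * A / (C @ A)[:, None]

print(P.sum(axis=1))  # rows sum to 1 with no extra normalization
```

Since the numerator summed over j is exactly the row's denominator,
each row of P sums to 1 for any node with at least one link.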