"sgorman" == <sgorman1@gmu.edu> writes:

> Also it might be easier to calculate transition

> probabilities by summing across the rows of the adjaceny

> matrix then dividing the row components by the sum.

I'm not sure that that works. That would cause systems not

adjacent to i to contribute to the probability that a packet goes from

i to j in one step, if I understand correctly. There is also a

hidden imbedded Markov chain in my formulation.

Firstly, to avoid confusion with terminology, i am using:

A_i - adjacency density vector -- the number of systems adjacent

to i.

C_ij - connectivity matrix -- representing presence or absence of an

interconnect between i and j as a 1 or 0. C_ij does not

represent a Markov process.

P_ij - transition probability matrix -- probability that a packet, now

in i will end up in j next.

We have,

A_i = Sum_j C_ij

In making P_ij, i think it is necessary to calculate according to the

number of interconnections of j -- how big the neighbour is. This

looks right intuitively: if i is a smaller, local isp or an non-isp

organization they will tend to have a few upstream providers and maybe

some private peering. The upstreams will tend to be more heavily

interconnected and peered and their component of A_i will be larger

than that of the private peering, or more of your packets will go

to an upstream than to your peers.

The assumption may or may not be valid; it is difficult to know

without actual traffic statistics for a good sample of links on the

internet. If we had a close complete data set of traffic statistics

for the internet measured at the same instant, it would be possible to

calculate the P_ij directly:

pps on link connecting i and j

P_ij = ------------------------------

total pps leaving i

but this data is not available available in general.

(aside: I belive that because of the scoping property of Little's

Formula this approach would also be valid for analyzing packet flow

within an AS where i and j represent each router indexed

arbitrarily)

Consider

C_ij A_j

P_ij = lambda_i --------------

Sum_k C_jk A_k

Sum_k C_ij C_jk

= lambda_i ---------------------

Sum_k Sum_l C_jk C_kl

where lambda_i is a normalizing factor appropriate to make the sums

across each row of P_ij equal to 1.

The product of C_ij with A_j is necessary since we don't want to take

into account the adjacency density of ASj if we are not connected to

it -- it doesn't matter how big it is, if i is not connected to j, a

packet has zero chance of making the transition directly. This is

taken care of by the fact that C_ij == 0 in that instance.

Likewise the right inner product of C_jk with A_k in the denominator

is necessary for the same reason, except now two hops out. Non

existent links are never taken into account in calculating the

transition probabilities.

You might say that the numerator is skewed towards the current state,

and the denominator is skewed towards the next state. This suggests

that there is an implicitly imbedded Markov chain -- the denominator

used in calculating the probability of going in one step from i to j

takes into account information about what will happen after at the

next step when it leaves j.

-w