Network Operations "Metrics"

CEOs are judged by their company's profitability, long-term
growth, efficiency, etc. Students are judged on quizzes,
reports, mid-terms and finals. The economy is tracked
through individual sectors, quarter-to-quarter and
year-over-year performance, unemployment, interest
rates, etc.

What metrics are used to measure networks and network
operators?

What "micro" measurements (the equivalent of tracking travel
expenses or cost-of-sales to ensure overall profitability)
are used to ensure good macro-level results (up-time, network
reliability, application performance, happy customers)?

What systems/processes do you use to track all of this
information, and tie it to overall business success?

Thanks.
Pete.

I assume this was a rhetorical question, since you know as well as I do
that all major telcos fly by the seat of their pants.

                                -Bill

I can assure you that is absolutely false, Woody.
-ren

*top post corrected*

> > What systems/processes do you use to track all of this
> > information, and associate it to overall business success?
>

Customers Happy + (Bean Counter Happy && Bean Counter != Crook) + Time =
Overall Business Success

One shouldn't make the model too complex. Know your customers and what they
want. Temper that with proper evaluation of cost (i.e., don't run at a loss by
giving customers bandwidth for free). This is no different than any other
business, and I've yet to see a company with an SLA perform better than a
company without an SLA that cared about its customers.

>I assume this was a rhetorical question, since you know as well as I do
>that all major telcos fly by the seat of their pants.
>
> -Bill

Define major. Are you saying that Rural ILECs aren't major telcos?

>I can assure you that is absolutely false Woody.
>-ren

Do you represent a major telco? If so, were you flying by the seat of your
pants when you top posted? :)

Everyone flies by the seat of their pants. If there were a single proven way,
we wouldn't need NANOG or have any of the issues that are prevalent today.

-Jack

Now the $64M question is which NMS will allow you to calculate that
in real time? ;)

-Jim P.

Honestly, we did it with Nagios (www.nagios.org).

It keeps the bean counters happy. And with good notes on the specific
outages we can account for the downtime and the steps to prevent it in the
future. A combination of working services and happy customers is the best
you can do.
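
As a rough illustration of accounting for downtime from outage notes,
here is a minimal Python sketch. The outage records, service names, and
reporting window are made up for the example, and this is not how Nagios
itself stores or reports availability; it only shows the arithmetic.

```python
from datetime import datetime

# Hypothetical outage log: (service, start, end, note). In practice these
# would come from monitoring history plus the operator's own notes.
OUTAGES = [
    ("mail", datetime(2024, 1, 3, 2, 0), datetime(2024, 1, 3, 2, 45), "disk full"),
    ("mail", datetime(2024, 1, 20, 14, 0), datetime(2024, 1, 20, 14, 15), "reboot"),
    ("dns", datetime(2024, 1, 9, 9, 30), datetime(2024, 1, 9, 9, 40), "bad zone push"),
]

def availability(outages, period_start, period_end):
    """Return {service: percent uptime} for services with recorded outages.

    Overlapping outages for the same service are not merged, so they
    would be counted twice; this sketch assumes they do not overlap.
    """
    period = (period_end - period_start).total_seconds()
    down = {}
    for service, start, end, _note in outages:
        # Clip each outage to the reporting window.
        start = max(start, period_start)
        end = min(end, period_end)
        if end > start:
            down[service] = down.get(service, 0.0) + (end - start).total_seconds()
    return {svc: 100.0 * (1 - secs / period) for svc, secs in down.items()}

report = availability(OUTAGES, datetime(2024, 1, 1), datetime(2024, 2, 1))
for svc, pct in sorted(report.items()):
    print(f"{svc}: {pct:.3f}% up")
```

Keeping the note field next to each outage is what lets the same log
answer both "how much downtime" and "what did we do about it".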

Gerald

> What metrics are used to measure networks and network
> operators?

  Peering and Transit Cost
  Peering and Transit Cost / bit
  Revenue
  Revenue/bit

  Change Management Practices and Successes
  Outages
  Ave Uptime/Device by Type
  Ave # Trouble Tickets / Time
  Customer Turnups/Month
  Customer Call hold times and call lengths

  Peering BW
  Peering Utilization
  Average Backbone Circuit Utilization
  Peak Backbone Circuit Utilization [maybe P95 of circuits, or top 20 busy]

  Packet Loss and Latency within network
  Packet Loss and Latency outside of network on Internet

  Devices managed
  # of employees required
  Ave # of employees / device
  Capex $$ / POP
  Capex $$ / bit or bps
  Traffic/POP
  Dial-up hold times
  Ave Packet Size

  It's interesting to note that two schools of thought exist on defining
  the denominator in many cases - rate (bps) and volume (TB/month).
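
  To make the two denominators concrete, here is a small Python sketch
  that prices the same traffic both ways. The sample values, the
  five-minute polling interval, the monthly cost, and the nearest-rank
  percentile method are all assumptions for illustration; billing
  practices differ between providers.

```python
def p95(samples):
    """95th percentile by the nearest-rank method (integer arithmetic)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]

def cost_per_mbps(monthly_cost, samples_bps):
    """Rate denominator: dollars per Mbps of 95th-percentile traffic."""
    return monthly_cost / (p95(samples_bps) / 1e6)

def cost_per_tb(monthly_cost, samples_bps, interval_s=300):
    """Volume denominator: dollars per TB moved over the sampled period."""
    total_bytes = sum(bps * interval_s / 8 for bps in samples_bps)
    return monthly_cost / (total_bytes / 1e12)

# Hypothetical: 20 five-minute samples, 100 Mbps with one 400 Mbps burst.
samples = [100e6] * 19 + [400e6]
cost = 5000.0  # made-up cost for the sampled period

print("$/Mbps (P95):", cost_per_mbps(cost, samples))
print("$/TB:", cost_per_tb(cost, samples))
```

  The burst is discarded by the P95 rate figure but still counts toward
  the volume figure, which is one reason the two schools can disagree
  about the same link.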

> What "micro" measurements (the equivalent of tracking travel
> expenses or cost-of-sales to ensure overall profitability)
> are used to ensure good macro-level (up-time, network
> reliability, application performance, happy customers)?

  Specific instances of some of the above.
  

> What systems/processes do you use to track all of this
> information, and associate it to overall business success?

  Although folks would love to throw money at vendor XYZ to
  produce SAS-like reports w/ a big dial, I don't believe such
  a tool exists.

  At the end of the day, the formula about making a profit
  without too many upset customers and no financial chicanery
  is the simplest.

  In terms of greater geo-telco-politics, a particular engineering
  group or operations group may pick out 2-10 metrics above and use
  those to justify certain things.

  The vast majority of the metrics above can be derived from
  simple ping scripts and SNMP query scripts stuffing data into
  databases, and running DB reports.
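
  A minimal sketch of that pattern, assuming Python with SQLite as the
  database. The table layout, target names, and probe results below are
  hypothetical; a real poller (ping, SNMP, or otherwise) would call
  record() with live measurements instead of the canned ones here.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) probe table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS probe ("
        " ts INTEGER, target TEXT, rtt_ms REAL, lost INTEGER)"
    )
    return db

def record(db, ts, target, rtt_ms):
    """Store one probe result; rtt_ms is None when no reply came back."""
    db.execute(
        "INSERT INTO probe VALUES (?, ?, ?, ?)",
        (ts, target, rtt_ms, 1 if rtt_ms is None else 0),
    )

def loss_and_latency(db):
    """Per-target report: probes sent, percent lost, mean RTT of replies."""
    return db.execute(
        "SELECT target, COUNT(*), 100.0 * SUM(lost) / COUNT(*), AVG(rtt_ms)"
        " FROM probe GROUP BY target ORDER BY target"
    ).fetchall()

db = open_db()
# Canned results standing in for what a cron-driven poller might collect.
for ts, target, rtt in [
    (0, "core1", 10.0), (60, "core1", 14.0), (120, "core1", None),
    (180, "core1", 12.0), (0, "edge7", 30.0), (60, "edge7", 34.0),
]:
    record(db, ts, target, rtt)
db.commit()

for row in loss_and_latency(db):
    print(row)
```

  Storing lost probes as rows with a NULL RTT keeps the loss percentage
  and the latency average in one table, since AVG() skips NULLs.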

  A trouble ticket system such as Remedy, RT2, or others can
  also serve as a workflow system and provide useful statistics.

  "There are three kinds of lies: lies, damned lies, and statistics."
  - Benjamin Disraeli

  -a