Network Operations Guide

Kasper_Adel · August 22, 2007, 6:40pm

Hello,

My customer grew from a small enterprise to an SP/Mobile operator network very quickly.

I want to guide them on how to operate their network, so I am looking for a document for this.

Off the top of my head (please add your suggestions):

Change Management
Network Management
Daily Operational tasks/process

Let me explain what’s on my mind for each point above:

Change Management: I can think of RANCID and TACACS+ for configuration changes, but they also make hardware changes that affect the network without any strict procedure (example: adding a switch and starting an STP loop).
Network Management: This I can handle, but I would also welcome suggestions: NMS, SNMP, NetFlow…
Daily operational tasks/process: Ticketing system, escalation procedures, troubleshooting documents, backup procedures, inventory…

Your valuable input would be highly appreciated.

Regards,

Kim

michael.dillon · August 22, 2007, 7:22pm

My customer grew from a small enterprise to an SP/Mobile operator
network very quickly.

I want to guide them on how to operate their network, so I am looking
for a document for this.

I've found the following websites quite useful for this:

http://www.google.com/search?hl=en&q=Change+Management
http://www.google.com/search?hl=en&q=Network+Management
http://www.google.com/search?hl=en&q=Daily-Operations-tasks+network

http://www.google.com/search?hl=en&q=Change+Order
http://www.google.com/search?hl=en&q=Trouble-Ticketing-system
http://www.google.com/search?hl=en&q=escalation+procedures
http://www.google.com/search?hl=en&q=troubleshooting+processes
http://www.google.com/search?hl=en&q=backup+procedures
http://www.google.com/search?hl=en&q=Network-Inventory

No, this is not a joke. I checked every one of the above URLs and you
*WILL* find useful information at each one.

And here is an important tip. Install a database server (Oracle,
PostgreSQL) for general use and make sure that everyone has ODBC/JDBC
access to it. Then ban the storage of information in spreadsheets and
make it a habit to ask pointed questions about where data is stored when
you are in meetings. Order the DB Admins to create any tables that
people ask for in the "general use" database but to advise people on
table design, for instance field types (IP addresses are not VARCHAR).
Some attempt should be made to ensure that important fields are
consistent between users since at some future date, tables may need to
be joined. But these rules need to be applied with a light touch because
the goal is to KEEP DATA OUT OF SPREADSHEETS.
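
To make the point about field types concrete, here is a minimal sketch (not anyone's production schema) of storing IPv4 allocations as integers rather than text, so that "which allocation contains this address?" becomes a plain range comparison. The table name `ip_allocation`, its columns, and the helper functions are all hypothetical, and SQLite is used only to keep the example self-contained; PostgreSQL has native `inet` and `cidr` types that would be the more natural choice there.

```python
import ipaddress
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical allocations table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ip_allocation (
            network_start INTEGER NOT NULL,  -- first address, as an integer
            network_end   INTEGER NOT NULL,  -- last address, as an integer
            prefix_len    INTEGER NOT NULL,
            customer      TEXT    NOT NULL,
            PRIMARY KEY (network_start, prefix_len)
        )
        """
    )
    return conn


def add_allocation(conn, cidr, customer):
    """Validate the CIDR string and store it as an integer range (IPv4 only)."""
    net = ipaddress.IPv4Network(cidr)  # raises ValueError on malformed input
    conn.execute(
        "INSERT INTO ip_allocation VALUES (?, ?, ?, ?)",
        (int(net.network_address), int(net.broadcast_address),
         net.prefixlen, customer),
    )


def find_owner(conn, address):
    """Return the customer holding the most specific allocation, or None."""
    addr = int(ipaddress.IPv4Address(address))
    row = conn.execute(
        "SELECT customer FROM ip_allocation "
        "WHERE ? BETWEEN network_start AND network_end "
        "ORDER BY prefix_len DESC LIMIT 1",
        (addr,),
    ).fetchone()
    return row[0] if row else None
```

With the addresses held as text, the same lookup would need string parsing in every query, and a typo such as `192.0.2.300` would be stored without complaint.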

Many modern companies operate using 1950's style processes shuffling
spreadsheets through email instead of shuffling paper through the mail
carts. That is counterproductive. It is far better to emulate
1970's/80's companies who could only justify the vast expense of
computers by simplifying processes and centralizing data on shared
databases.

Once a spreadsheet starts to become a valuable source of data, the data
should be stuffed into the database where it can be shared, kept up to
date, be viewed consistently by all parts of the business, and joined
with other data for reporting purposes. Then, when it makes sense to
build or buy an application centered around a database, you already have
a source of clean consistent data ready and waiting.

--Michael Dillon

Deepak_Jain · August 22, 2007, 8:23pm

Once a spreadsheet starts to become a valuable source of data, the data
should be stuffed into the database where it can be shared, kept up to
date, be viewed consistently by all parts of the business, and joined
with other data for reporting purposes. Then, when it makes sense to
build or buy an application centered around a database, you already have
a source of clean consistent data ready and waiting.

I'm guessing there is a tool somewhere that will take a set of data from a database and present it like a spreadsheet for import/export of updates. Anyone have a pointer?

Is anyone doing anything fancy like exporting their Visio to CSV or some other format for inclusion into a database as metadata? (For viewing, one would grab the data and then ask Visio to render it.)

DJ

michael.dillon · August 22, 2007, 9:36pm

I'm guessing there is a tool somewhere that will take a set
of data from a database and present it like a spreadsheet for
import/export of updates. Anyone have a pointer?

Keeping data clean and consistent is so important to network operations
that I think this is relevant to discuss for a bit.

First of all, assuming Windows is the client OS and the data is on a
server somewhere, you may find it easier to use Access to get at the
data, join tables, sort it and filter out unwanted rows. Then export it
to an Excel spreadsheet. But not all corporate setups include Access, so
you can do much the same directly from Excel. Look in Data->Import
External Data->New Database Query, then select your data source,
table/query/view, columns, etc. At the end you can select "View or edit
query in Microsoft Query" and you will get a query builder that can help
you sift out only the rows and columns that you need to import. These
queries can be saved so that in future you simply rerun the query
(Data->Refresh Data) and get fresh up-to-date data.

For updating a central database, you either need to develop applications
or use a general purpose tool like MS Access. Usually spreadsheets are
used to store fairly straightforward tables so building an update
application is not necessarily that complex. For instance an Excel
spreadsheet template can contain a VB subroutine that takes a row of
data and turns it into an SQL UPDATE or INSERT statement.
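
The post describes doing this in a VB subroutine inside Excel. As a rough illustration of the same idea in Python, the sketch below turns one row (a dict of column name to value) into a parameterized UPDATE, falling back to INSERT when no existing row matches. The function names and the example table are hypothetical; SQLite stands in for whatever server you actually run.

```python
import re
import sqlite3

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    """Allow only plain identifiers, since names cannot be bound as parameters."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def row_to_insert(table, row):
    """Build (sql, params) for an INSERT of one row."""
    cols = [_check(c) for c in row]
    placeholders = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {_check(table)} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, list(row.values())


def row_to_update(table, row, key):
    """Build (sql, params) for an UPDATE of the row identified by `key`."""
    cols = [_check(c) for c in row if c != key]
    if not cols:
        raise ValueError("row has no columns to update besides the key")
    assignments = ", ".join(f"{c} = ?" for c in cols)
    sql = f"UPDATE {_check(table)} SET {assignments} WHERE {_check(key)} = ?"
    return sql, [row[c] for c in cols] + [row[key]]


def save_row(conn, table, row, key):
    """Update the row if its key already exists, otherwise insert it."""
    sql, params = row_to_update(table, row, key)
    if conn.execute(sql, params).rowcount == 0:
        sql, params = row_to_insert(table, row)
        conn.execute(sql, params)
```

Values always travel as bound parameters rather than being pasted into the SQL text, which avoids the quoting problems that hand-built statements from spreadsheet cells tend to run into.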

To an IT person this all may sound rather crude and hardly any better
than just keeping a bunch of spreadsheets, but they probably never have
to deal with the consequences of dirty and inconsistent data.
Spreadsheets tend to breed. They get copied around in email and pretty
soon, people make mistakes and throw away the updated version, not the
old one. Or other people, building a new spreadsheet think that person X
has the definitive spreadsheet with all the latest IP address
allocations, not realizing that this is a second hand copy of another
master spreadsheet, and person X only gets a copy whenever an upgrade
project completes, every few months. Meanwhile, operations is busy
rationalising PoP layout and all the subnetting changes but nobody
writes it down except in the one project manager's planning spreadsheet.

The key principle here is to keep all important data in tables on a
database server, and make sure that everybody understands that these
tables are the one and only true source for this data. And, of course,
make sure that server is backed up properly and you have a
disaster-recovery clone server ready for action when needed.

As long as the data is on a DB server, non-Windows systems should have
no problems with accessing it. Scripting languages, Open Office, web
servers and so on, can all share the same data.

Note that I am suggesting this be done, separate from the kind of
OFFICIAL corporate databases that run financial, ordering and billing
applications. Those databases are always locked down by the DBAs and no
tables are added to them without being properly designed and approved by
DBAs, data architects, app developers, etc. I am suggesting that you run
a separate DB server to attract all the data that usually gets
squirreled away in spreadsheets, to entice employees to share data and
cooperate, without a lot of bureaucracy in the way. DBAs can help a bit,
advise a bit, but they should not be able to forbid people to set up a
table or index or query/view. This suggestion is to treat the DB server
like a general service to all employees, like telephones or meeting
rooms.

--Michael Dillon

Joel_Jaeggli · August 22, 2007, 9:51pm

Deepak Jain wrote:

Once a spreadsheet starts to become a valuable source of data, the data
should be stuffed into the database where it can be shared, kept up to
date, be viewed consistently by all parts of the business, and joined
with other data for reporting purposes. Then, when it makes sense to
build or buy an application centered around a database, you already have
a source of clean consistent data ready and waiting.

I'm guessing there is a tool somewhere that will take a set of data from
a database and present it like a spreadsheet for import/export of
updates. Anyone have a pointer?

This used to be the key feature of relational databases...

In a desktop computing context people do this with microsoft access and
filemaker routinely...

These days however people have gotten so used to doing web interfaces to
tables in sql databases that it's pretty trivial even with a cookbook to
whip something up in php that will allow you view a table, do a select,
and perform an insert update or delete on the changed or added row.

Alex_Harrowell · August 22, 2007, 10:53pm

I’m impressed by this example of simple, decent, practical NANOG advice.

billndotnet · August 22, 2007, 11:10pm

Much of the advice I'm about to offer depends a lot on scale. If you've
got a small network, you'd be fine with Cacti and similar OSS toolkits.
If you're running more than a couple devices, MRTG will always look
attractive as long as you're not responsible for administrating it.
Please, don't try to scale it. There are lots of decent OSS/OTS toolkits
in the wild, take the time to find one you're comfortable with and you'll
save yourself some pain.

If you don't have anyone in-house with the cluepower to install, maintain,
and understand one of these, consider buying, with support. I've been
spending a lot of time elbow-deep in the Monolith platform. Even though I
can build all my own tools, finding a well-organized, full-featured
platform that I can hack to hell and back has been a pretty big boon.

If you're going to roll your own:
Even if only on a contract basis, I'd recommend tapping the skillset of a
good DBA or a professional network toolsmith for advice on organizing the
sheer volume of data involved here. Getting off on a good footing is the
single most important part of a task like this, to minimize how much time
you spend going back and redoing things because they didn't scale or
simply don't apply generally enough.

The hard part in building management tools from scratch is coming up with
a good schema for standardizing how you organize your data. A scheme that
seems to make great sense will be completely obliterated when it first
encounters SNMP based conventions, and heaven help you if you standardize
on the lingo of a single vendor. Take Cisco vs *, for example. The Cisco
standard way of describing things is decent enough, and that's fine until
you decide to bring another vendor into the relationship.

Another reason I support a good standard is having multiple hands in the
cookpot. If you have a team of people working on tools, versus one
dedicated snmp ninja, you can't have multiple designers. It just doesn't
work. This is an area of networking that needs sunlight at all times, to
keep evil things from growing in the code. Ugly hacks and stupid shell
scripts are all well and good, if they're in your personal toolbox bin
directory. You don't want them in your enterprise/production grade tools.
There's no telling what will happen three years down the road.

Organization of network data is usually pretty simple: Devices contain
interfaces and sessions, interfaces contain counters, states, and
descriptions.

Smart pollers won't bother with counters on down/shut interfaces, and
will only check descriptions/labels every so often. Your only frequent
polls should be in/out/error counters. Use 64 bit counters, and account
for rollover even if it's less of an issue at 64 bits.
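
A minimal sketch of the two ideas above, assuming the poller already has raw counter samples in hand (no SNMP library is used here): a wrap-aware delta for 64-bit counters, and a per-cycle decision about what to fetch. The `Interface` class, `plan_poll`, and the "every 12 polls" default for descriptions are hypothetical choices for illustration only.

```python
from dataclasses import dataclass

COUNTER64_MODULUS = 2 ** 64


def counter_delta(previous, current, modulus=COUNTER64_MODULUS):
    """Difference between two samples of a wrapping counter.

    Assumes the counter wrapped at most once between the two samples.
    """
    if not (0 <= previous < modulus and 0 <= current < modulus):
        raise ValueError("sample outside counter range")
    return (current - previous) % modulus


def rate_bps(prev_octets, cur_octets, seconds):
    """Average bits per second between two octet-counter samples."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return counter_delta(prev_octets, cur_octets) * 8 / seconds


@dataclass
class Interface:
    name: str
    oper_up: bool
    polls_since_description_check: int = 0


def plan_poll(iface, description_every=12):
    """Decide what to fetch for one interface in this polling cycle."""
    tasks = ["state"]  # always needed, to notice the interface coming back up
    if iface.oper_up:
        tasks.append("counters")  # skip counters on down/shut interfaces
    if iface.polls_since_description_check >= description_every:
        tasks.append("description")  # labels change rarely; check occasionally
    return tasks
```

One limitation worth knowing: modular arithmetic cannot tell a genuine wrap from a counter reset (for example after a reboot), so a real poller would also want to watch something like the device uptime before trusting a large delta.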

If you're a hosting company or have high traffic on servers, the ARP cache
and the bridge tables are your friend.

The single most important piece of advice I can offer when building your
own tools: Never poll the routing table with SNMP. Ever. Any OTS tool that
says it can, as a feature, well, it's a witch, burn it. (If you need
routes, build something that can speak BGP. It's not that hard, last time
I did it was maybe 50 lines of code plus perl modules.)

- billn

Andrew_Sullivan · August 23, 2007, 3:31am

I'd like to echo this comment, except that instead of just "DBA" I'd
suggest _really strongly_ that you find someone experienced in
relational database design if you're going to build such tools using a
relational database (and I'll bet lunch you'll eventually put this
into some database, if you start to collect data). You can spend a
great deal of money on data gurus later to try to make your
poorly-designed original data model work without interruptions; or
else you can design it with good normalisation at the outset.

Someone who has built a lot of MySQL-backed web sites (and nothing
else) does _not_ qualify as a database designer. Real databases: ask,
for instance, how a new data source will fit into the model if its
fields are wrong, or subtly different, and how new classes of data are
to be accommodated.

My experience suggests to me that four hours of real, solid database
design up front is worth approximately two weeks of re-engineering and
compatibility development after about a year. This is no
exaggeration: the problems of badly normalized data are really bad
once you have a pile of data that you want to be able to rely on. (Of
course, my data geek colleagues are happy to take the consulting fees
later. But save yourself grief up front.) This is as true for
network tools as anything else.

A

Deepak_Jain · August 23, 2007, 7:46am

My experience suggests to me that four hours of real, solid database
design up front is worth approximately two weeks of re-engineering and
compatibility development after about a year. This is no
exaggeration: the problems of badly normalized data are really bad

My god man, that's crazy talk. Planning ahead? Designing for growth? If we did that, what would Operations people do except check off fields in this "toolset" you propose thinking about before implementation!?! Reducing TCO and improving uptimes. Madness, I say.

:_)

To be useful, someone in the organization needs to sit down with the hired designer and describe what kind of data needs to be kept and what kind of questions the business needs answers to ("How many available /24s do we have? How many /30s? How long has it been since 50% of our customer base has had its contact information verified?"). If you follow Alice down this rabbit hole, you will eliminate 95% (or more) of your operational and communications problems -- irrespective of how you store the data. If you combine that with ILM (Information Lifecycle Management) of this knowledge, you could be at high 90s for several years at a time before needing to plumb entropy.
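
As an illustration of how a question like "How many available /24s do we have?" can be answered once allocations are kept as structured data, here is a small sketch using Python's standard `ipaddress` module. The function names and the sample prefixes are hypothetical, and the sketch assumes IPv4 and that every allocation is a valid CIDR block.

```python
import ipaddress


def free_blocks(supernet, allocated):
    """Return the parts of `supernet` not covered by any allocated network."""
    free = [ipaddress.ip_network(supernet)]
    for cidr in allocated:
        used = ipaddress.ip_network(cidr)
        next_free = []
        for block in free:
            if not block.overlaps(used):
                next_free.append(block)  # untouched by this allocation
            elif used != block and used.subnet_of(block):
                # Carve the allocation out of the larger free block.
                next_free.extend(block.address_exclude(used))
            # Otherwise the free block is fully consumed and is dropped.
        free = next_free
    return sorted(free)


def count_available(free, prefix_len):
    """Count aligned blocks of size /prefix_len that are entirely free."""
    return sum(
        2 ** (prefix_len - block.prefixlen)
        for block in free
        if block.prefixlen <= prefix_len
    )
```

For example, `count_available(free_blocks("10.0.0.0/22", ["10.0.1.0/24", "10.0.2.0/30"]), 24)` counts only the /24s with nothing allocated inside them, which is usually what the business question means.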

DJ

Sam_Stickland1 · August 23, 2007, 3:21pm

Bill Nash wrote:

The single most important piece of advice I can offer when building your own tools: Never poll the routing table with SNMP. Ever. Any OTS tool that says it can, as a feature, well, it's a witch, burn it. (

You mean like the Cisco Route Manager?

*Cisco Managed Services Accelerator - Cisco

billndotnet · August 23, 2007, 3:39pm

That 'feature' right there is a great indicator as to just how unhealthy
such an action is. Even if your CPU load is below your threshold, you
can't guarantee that the polling action isn't going to put it over that
threshold to the point of impacting performance.

Break out the Zebra/Quagga install and offload that CPU load. Don't use
SNMP to read routing information.

Ok, I shouldn't be so declarative about it. You can do whatever you want
with your network. I will continue to beat people if/when I catch them
doing it. =)

- billn

billndotnet · August 24, 2007, 5:26pm

I built a perl daemon using Net::BGP and DBI that inserted and removed
routes, on update, into an SQL db. I could then query to my heart's
content, beating up a db with full routes with all the efficiency of SQL.
It's simple as hell and works great.
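
The original daemon was Perl (Net::BGP plus DBI) and is not reproduced here. The sketch below only illustrates the database half of that design in Python: it assumes some BGP listener (not shown, and not implemented here) hands over announced and withdrawn prefixes per peer, and it keeps a table in step with them. The table layout and all function names are hypothetical.

```python
import sqlite3


def open_route_db(path=":memory:"):
    """Open the database and create a hypothetical one-row-per-route table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS route (
            peer     TEXT NOT NULL,
            prefix   TEXT NOT NULL,
            next_hop TEXT NOT NULL,
            as_path  TEXT NOT NULL,  -- space-separated AS numbers
            PRIMARY KEY (peer, prefix)
        )
        """
    )
    return conn


def apply_update(conn, peer, announced=(), withdrawn=()):
    """Apply one UPDATE's worth of changes from `peer` in a single transaction."""
    with conn:
        conn.executemany(
            "DELETE FROM route WHERE peer = ? AND prefix = ?",
            [(peer, prefix) for prefix in withdrawn],
        )
        conn.executemany(
            "INSERT OR REPLACE INTO route (peer, prefix, next_hop, as_path) "
            "VALUES (?, ?, ?, ?)",
            [
                (peer, r["prefix"], r["next_hop"],
                 " ".join(str(asn) for asn in r["as_path"]))
                for r in announced
            ],
        )


def drop_peer(conn, peer):
    """Remove everything learned from a peer, e.g. when its session goes down."""
    with conn:
        conn.execute("DELETE FROM route WHERE peer = ?", (peer,))


def prefixes_originated_by(conn, asn):
    """Prefixes whose AS path ends in `asn` (the origin AS)."""
    rows = conn.execute(
        "SELECT prefix FROM route WHERE as_path = ? OR as_path LIKE ? "
        "ORDER BY prefix",
        (str(asn), f"% {asn}"),
    )
    return [row[0] for row in rows]
```

Storing the AS path as text keeps the sketch short; a design meant for heavy querying would more likely normalise the path into its own table.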

- billn

billndotnet · August 24, 2007, 7:04pm

Just to answer a bunch of off-list mails en masse, no, I do not have a
copy of this code for distribution. I wrote it originally for an
ex-employer, so I don't have a copy. I plan on rebuilding it, it's fairly
simple, so if you have use cases for a tool like this, shoot me an email
off list with your thoughts. Time permitting, I'll rebuild and post. I
need it for some of my own purposes anyway, might as well make it suck
less.

- billn

Sam_Stickland1 · August 31, 2007, 11:21am

Bill Nash wrote:

Thing is, Zebra/Quagga doesn't seem to have any sort of "external connector"
to withdraw the routes in a decent format for analysis purposes (xml, csv,
plain list...).
Last time we tried, Zebra/Quagga broke down when we installed SNMP support (to
locally query the routing table on the soft router).

Anyone know about any solution for such BGP data collection ? (OpenBGPd ?)
Thanks in advance for any hint.

I built a perl daemon using Net::BGP and DBI that inserted and removed routes, on update, into an SQL db. I could then query to my heart's content, beating up a db with full routes with all the efficiency of SQL. It's simple as hell and works great.

It's just a shame that there aren't any BGP extensions in common use that allow it to advertise all the routes in the BGP table, not just the best ones (1). It would also only allow you to monitor BGP routes. Even forming adjacencies with the other routing protocols won't catch everything. Perhaps you need to find stray static routes that got added, but aren't being redistributed into any routing protocol.

Ultimately I feel that this problem would require the vendors to provide decent route reporting mechanisms rather than attempting to gather this information one hop removed.

Sam

1) See the archives for a discussion of the problem. As well as needing to advertise all the routes, the problem is that the BGP withdrawal message does not carry enough information to specify which path is being withdrawn.

Scott_Francis1 · August 31, 2007, 3:25pm

[snip details on keeping e.g. IP assignment data in databases]

To an IT person this all may sound rather crude and hardly any better
than just keeping a bunch of spreadsheets, but they probably never have
to deal with the consequences of dirty and inconsistent data.
Spreadsheets tend to breed. They get copied around in email and pretty
soon, people make mistakes and throw away the updated version, not the
old one. Or other people, building a new spreadsheet think that person X
has the definitive spreadsheet with all the latest IP address
allocations, not realizing that this is a second hand copy of another
master spreadsheet, and person X only gets a copy whenever an upgrade
project completes, every few months. Meanwhile, operations is busy
rationalising PoP layout and all the subnetting changes but nobody
writes it down except in the one project manager's planning spreadsheet.

good suggestions, although I did want to point out that the
authoritative source of truth does not necessarily have to be a
database (at least not initially). Current $employer uses sharepoint,
and as distasteful as I find the borg-like tendencies of the MS
software environment, this tool actually seems to work fairly well
when used in conjunction with the rest of the MS toolkit. Single
authoritative location for information, supports revisions,
checkout/checkin, etc. We currently maintain IP allocation data in a
fairly involved spreadsheet, and while I can definitely see that we'd
have more flexibility if it were in a database with some kind of CGI
frontend, this has served fairly well thus far ...

since everybody knows that there is only one place to find whatever
the spreadsheet is, they tend to just go check out the current version
through their web browser rather than trying to get a copy of whatever
from somebody in email (also helps that there's a fairly small set of
folks - less than 40 - that would ever have any interest in that
data). But then, we're also not an NSP/ISP, so YMMV.

Nathan_Ward · September 8, 2007, 2:30pm

An alternative that I've been meaning to tinker with for some time might be to use OpenBGPd/zebra/quagga dumping to a file, and load that file into SQL.

I like that a bit more than INSERT/DELETE direct to SQL, because those dump files are dumps of all BGP messages, as well as table states, so you can move to any point in time when diagnosing problems.
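
The "move to any point in time" idea can be sketched as a replay over a time-ordered log of messages. The example below does not parse MRT files; it assumes the dump has already been decoded into simple records by some other tool, and the `RouteEvent` shape and `table_at` function are hypothetical.

```python
from typing import NamedTuple


class RouteEvent(NamedTuple):
    timestamp: int        # seconds since the epoch
    peer: str
    action: str           # "announce" or "withdraw"
    prefix: str
    as_path: tuple = ()


def table_at(events, when):
    """Rebuild the per-peer route table as it stood at time `when`."""
    table = {}
    # sorted() is stable, so events sharing a timestamp keep their log order.
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.timestamp > when:
            break
        key = (ev.peer, ev.prefix)
        if ev.action == "announce":
            table[key] = ev.as_path
        elif ev.action == "withdraw":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown action: {ev.action!r}")
    return table
```

Replaying from the start of the log is the simplest approach; with periodic table snapshots in the dump, a replay could begin from the nearest snapshot instead.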

A quick google reveals http://nms.lcs.mit.edu/software/bgp/bgptools/ - tools that can deal with SQL and MRT dump files.