Templating/automating configuration

Because we had different sources of truth which were written in-house, we wound up rolling our own template engine in Python. It took about 3 weeks to write the engine and adapt existing templates. Given a circuit ID, it generates the full config for copy and paste into a terminal session. It also hooks into a configuration parser tool, written in-house, that tracks configured interfaces, so it is easy to see whether the template would overwrite an existing interface.

I used the Jinja2 template engine, along with pyodbc/unixODBC/FreeTDS for access to a Microsoft SQL backend.

The keys for us are:

* extracting information from a source of truth

* validating the information for correctness

* making sure you don't overwrite existing config

* outputting the right templates for the circuit features

It made more sense to write a tool than it did to try to adapt something for our environment.
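
As a rough illustration of the four steps listed above (extract, validate, check for overwrite, render), here is a minimal sketch. It is not our actual tool: the circuit record, the set of configured interfaces, and the function name `render_circuit` are all hypothetical, and the standard library's `string.Template` stands in for Jinja2 so the sketch runs without extra dependencies.

```python
from string import Template

# Hypothetical stand-in for the source of truth (in reality a SQL backend).
CIRCUITS = {
    "CKT-1001": {
        "interface": "GigabitEthernet0/1",
        "customer": "ExampleCo",
        "bandwidth_kbps": 100000,
    },
}

# Hypothetical stand-in for the config parser's list of configured interfaces.
CONFIGURED_INTERFACES = {"GigabitEthernet0/2"}

# string.Template used here in place of a Jinja2 template.
INTERFACE_TEMPLATE = Template(
    "interface ${interface}\n"
    " description ${customer} [${circuit_id}]\n"
    " bandwidth ${bandwidth_kbps}\n"
    " no shutdown\n"
)


def render_circuit(circuit_id, circuits=CIRCUITS, configured=CONFIGURED_INTERFACES):
    """Return the config text for one circuit, or raise if it is unsafe."""
    # 1. Extract the record from the source of truth.
    record = circuits.get(circuit_id)
    if record is None:
        raise KeyError(f"unknown circuit {circuit_id}")

    # 2. Validate the information before using it.
    bandwidth = record["bandwidth_kbps"]
    if not isinstance(bandwidth, int) or bandwidth <= 0:
        raise ValueError(f"{circuit_id}: bad bandwidth {bandwidth!r}")

    # 3. Refuse to overwrite an interface that is already configured.
    if record["interface"] in configured:
        raise ValueError(f"{circuit_id}: {record['interface']} is already in use")

    # 4. Render the template for this circuit.
    return INTERFACE_TEMPLATE.substitute(circuit_id=circuit_id, **record)


if __name__ == "__main__":
    print(render_circuit("CKT-1001"))
```

In practice step 4 would pick between several templates depending on the circuit's features; the sketch only shows one.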

If I had a free hand and unlimited budget, I would find a single app that functions as a source of truth for all circuits and products, which includes a templating engine that hooks in easily.

-Brian

Hi,

Here are some extra pointers:

https://youtube.com/watch?v=C7pkab8n7ys

https://www.nanog.org/sites/default/files/dosdontsnetworkautomation.pdf

https://github.com/coloclue/kees

Kind regards,

Job

Hi Brian,

Because we had different sources of truth which were written in-house, we wound up
rolling our own template engine in Python. It took about 3 weeks to write the
engine and adapt existing templates. Given a circuit ID, it generates the full
config for copy and paste into a terminal session. It also hooks into a
configuration parser tool, written in-house, that tracks configured interfaces, so
it is easy to see whether the template would overwrite an existing interface.

Interesting. I'm going through much the same process at the moment, due to similar requirements - multiple sources of truth, validation that there's no clash with existing configs, but also with a requirement for network-wide atomic operations. The latter has been a strong driver for a custom tool - it's now grabbing an exclusive lock on all the devices, making all the checks, pushing all the config, commit check everywhere, commit everywhere, and only once all the commits succeed, release the locks. If any of those steps fail anywhere, we get to roll back everywhere. (Obviously with appropriate timeouts / back-offs / deadlock prevention, and specific to platforms with sane config management - no vanilla IOS).
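
To make the lock / check / commit / rollback sequence concrete, here is a minimal sketch of the control flow only. It is not the real tool: `FakeDevice` is a hypothetical in-memory stand-in for a device session, `atomic_push` is an invented name, and timeouts and back-offs are left out. Locks are taken in a fixed (sorted) order as one simple way to avoid deadlocks between concurrent runs.

```python
from contextlib import ExitStack


class FakeDevice:
    """Hypothetical in-memory device; fail_on names a step that should fail."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on
        self.running = []      # committed config lines
        self.candidate = None  # uncommitted config lines
        self.previous = None   # config to restore if we roll back a commit
        self.locked = False

    def _step(self, step):
        if step == self.fail_on:
            raise RuntimeError(f"{self.name}: {step} failed")

    def lock(self):
        self._step("lock")
        self.locked = True
        self.previous = None   # a new transaction starts here

    def unlock(self):
        self.locked = False

    def load(self, lines):
        self._step("load")
        self.candidate = self.running + list(lines)

    def commit_check(self):
        self._step("commit_check")

    def commit(self):
        self._step("commit")
        self.previous, self.running = self.running, self.candidate

    def rollback(self):
        if self.previous is not None:  # undo a commit made in this transaction
            self.running, self.previous = self.previous, None
        self.candidate = None


def atomic_push(devices, changes):
    """Apply changes[name] to every device, or leave every device unchanged."""
    ordered = sorted(devices, key=lambda d: d.name)  # fixed lock order
    with ExitStack() as stack:
        for dev in ordered:
            dev.lock()
            stack.callback(dev.unlock)  # released on exit, success or failure
        try:
            for dev in ordered:
                dev.load(changes[dev.name])
            for dev in ordered:
                dev.commit_check()
            for dev in ordered:
                dev.commit()
        except Exception:
            # Any failure anywhere: roll back everywhere while still locked.
            for dev in ordered:
                dev.rollback()
            raise
```

Each phase completes on all devices before the next one starts, so a failed commit check never leaves some devices already committed.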

Did you find anything to give you a leg-up on config parsing, or did you have to do that completely from scratch? At the moment, I'm working with PyEZ (I know, vendor lock-in, but we're firmly a Juniper shop, and going in eyes-open to the lock-in) to build a limited model of just the parts of the config I'm interested in validating, and it seems to be working.

If I had a free hand and unlimited budget, I would find a single app that
functions as a source of truth for all circuits and products, which includes a
templating engine that hooks in easily.

Plus the business buy-in and the resource to go back and standardise all the existing configs, so the application can fully parse and understand the network before it starts. That, and a pony :)

Regards,
Tim.

Hi Brian,

The import process to the database runs directly on our rancid server, reading the downloaded configs out of the appropriate directory within rancid. Most of our gear is Cisco, so the ciscoconfparse module for Python helps a lot with organizing and querying the config. From there, the config is parsed for key items like interface name, description, configured bandwidth, etc., and that info is then added or updated as necessary in the database.
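
To give a feel for the parsing step, here is a minimal sketch that pulls interface name, description, and configured bandwidth out of an IOS-style config. It is not our import code: it uses a plain regex and the standard library instead of ciscoconfparse so that it runs without dependencies, and the function name `parse_interfaces` and the sample config are made up for illustration.

```python
import re

INTERFACE_RE = re.compile(r"^interface (\S+)")
BANDWIDTH_RE = re.compile(r"^bandwidth (\d+)$")


def parse_interfaces(config_text):
    """Map interface name -> {'description': ..., 'bandwidth_kbps': ...}."""
    interfaces = {}
    current = None  # dict for the interface block we are inside, if any
    for line in config_text.splitlines():
        match = INTERFACE_RE.match(line)
        if match:
            current = interfaces.setdefault(
                match.group(1), {"description": None, "bandwidth_kbps": None}
            )
            continue
        if not line.startswith(" "):
            current = None  # an unindented line ends the interface block
            continue
        if current is None:
            continue  # indented line belonging to some other block
        stripped = line.strip()
        if stripped.startswith("description "):
            current["description"] = stripped[len("description "):]
        else:
            bw = BANDWIDTH_RE.match(stripped)
            if bw:
                current["bandwidth_kbps"] = int(bw.group(1))
    return interfaces


SAMPLE = """\
hostname edge1
!
interface GigabitEthernet0/1
 description ExampleCo [CKT-1001]
 bandwidth 100000
 ip address 192.0.2.1 255.255.255.252
!
interface GigabitEthernet0/2
 shutdown
!
"""

if __name__ == "__main__":
    for name, info in parse_interfaces(SAMPLE).items():
        print(name, info)
```

The resulting dictionary is the kind of record that would then be added or updated in the database.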

Because it's dependent on rancid, there is some lag time between when a change is made and when the database gets updated, so we still strongly encourage running the pre-config checks for new circuits. But with PyEZ, it looks like you easily could, after grabbing that lock, validate the existing config before pushing down new config. Lots of possibilities there. I'm envious that you have a vendor-written Python module as a choice!

...

Or, at least, rebuild the existing configs based on the new source of truth, so that subsequent config parsing conforms to a single standard.

Cisco IOS and IOS-XE have config locks on the CLI, as well as
automatic configuration rollbacks and the ability to generate a config
diff on the device. For some reason lots of people seem to forget or
ignore this.

If you are using NAPALM then I believe you can also implement this
through NAPALM.

Cheers,
James.

Job,

Would you be able to provide any further insight into your Don’t #5 – “Don’t agree to change management. Managers are rarely engineers and should not be making technical decisions. (nor should sales)”?

Thanks,
Graham

Hi Graham,

The talk was given in the context of motivating people to start with
network automation and helping them go from 'no automation' one step
further, to 'some automation'.

Would you be able to provide any further insight into your Don’t #5
“Don’t agree to change management.”

I think the development team of the network automation software should
define their own process around change management. If you want to use
kanban? Great! If you want to use a simple FIFO model applied to issues
filed on your private GitHub project? Great! My point was: don't let
someone from higher up dictate how and when you do software releases.

Another aspect is that you most likely will have processes that should
run without any human intervention, such as the nightly update for all
EBGP prefix-filters. You don't want to end up in a situation where a
computer generates those configs and has to hand them over to a human
for some additional checks, who then pushes them out to the
network. Imagine having the computer print out your automatically
generated configs, a human pick them up, review them, and type them back
into a computer for the changes to take effect! That would be terrible.

“Managers are rarely engineers and should not be making technical
decisions. (nor should sales)”

That was a simple point: ideally a manager enables you to do your work,
and trusts you to do the work. If you have a manager who opinionatedly
argues with you on tabs vs spaces or how to push a configuration to a
device, you might find that you don't have enough freedom of movement to
successfully bootstrap the automation project.

In other words: don't roll over and blindly accept what other
(inexperienced) folks within the organisation tell you, try to find your
own path. However, do make sure you steal the good ideas from the
sysadmins: they often are ahead of netops in terms of automation and
understanding idempotency.

Kind regards,

Job

Graham Johnston wrote:

Would you be able to provide any further insight into your Don’t #5
“Don’t agree to change management. Managers are rarely engineers and
should not be making technical decisions. (nor should sales)”.

What do you think the purpose of change control / management is?

Nick

well, http://dilbert.com/strip/1995-05-29

On Wed, 14 Jun 2017 21:35:59 +0100, Nick Hilliard <nick@foobar.org> may
have written:

What do you think the purpose of change control / management is?

To provide employment for change managers of course.

Bureaucratic change control implementations using the ITIL view
of change control with a formal CAB are likely an (over)reaction
to human mistakes causing outages, most of which could probably
be avoided with a simpler less-formal process such as peer or
team review.

Change control functions as a risk transfer away from operations teams to
CAB members, since if things go wrong because of a change, it is now
the CAB's fault. There may also be a bias towards change aversion
if the CAB cannot be held accountable for issues that come from
delaying or rejecting important maintenance.

The overall purpose of change control / management, when applied to
substantial modifications to an operating environment or to the
configuration of business-critical networks/applications, is:

To mitigate the possibility of damage/outages from high-impact / high-risk
changes made by humans to systems and network devices, by
requiring standards of formal written documentation and planning,
combined with peer review and approvals by business and technical
stakeholders for the maintenance time, including evaluation of the
exact proposed configuration changes, implementation plans,
and backout/contingency plans for possible errors or omissions.

But, as with most things, it
can be taken to an unreasonable extreme.

The use of change management procedures has a high
associated cost, because the time and labor to implement
even simple, relatively low-risk changes can increase
dramatically, with unreasonable delays, and extensive test labs
may be necessary. There may actually be increases in
various risks, if any kind of maintenance is delayed or
lost in the paperwork.