The state of TACACS+

I've liked TACACS+ ever since I first used it. That said, I've recently grown to dislike some things about it. I guess there have always been problems, but I've been willing to leave them alone.

I don't have time to give the code a really deep inspection, so I'm interested in others' thoughts about it. I suspect people have just left it alone because it works. Also, I apologize if this is too verbose or technical, or not technical enough, or just hard to read.

History:

TACACS+ was proposed as a standard to the IETF. They never adopted it and let the standards draft expire in 1998. Since then there have been no official changes to the code. Much has happened between now and then. I specifically was interested in parsing tac_plus logs correctly. After finding idiosyncrasies I decided to look at the source and the RFC to see what was really happening.

Logging, or why I got into this mess:

In the accounting log, fields are sometimes logged in a different order. It appears tac_plus logs whatever it receives without parsing or modifying it. That means the remote systems are sending the fields in different orders, so technically the fault lies with them. However, it seems too trusting to take in data and log it without looking at it. This can also cause issues when you send a command like (Cisco) "dir /all nvram:" on a box with many files. The device expands the command to include everything on the nvram (important because you might want to deny access to that command based on something it expanded), but it gets truncated somewhere. I'm not sure whether it's the device buffer that is full, tac_plus, or the logging part; I might tcpdump for a while to see what it looks like on the wire. I'm not sure if there are security implications there.
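
One way to cope with the varying field order when parsing is to key the attribute=value pairs by name instead of by position. The following is a minimal sketch, not anything shipped with tac_plus. It assumes each accounting record is one tab-separated line with six fixed leading fields followed by attribute=value pairs; the field names, the sample lines and the helper name parse_acct_line are made up for illustration, so adjust them to what your logs actually contain.

```python
def parse_acct_line(line):
    """Split one accounting record into fixed header fields and an AV-pair dict."""
    parts = line.rstrip("\n").split("\t")
    # Assumed layout of the leading fields; not taken from any specification.
    header_names = ["timestamp", "nas", "user", "port", "rem_addr", "flag"]
    record = dict(zip(header_names, parts[:6]))
    attrs = {}
    for field in parts[6:]:
        name, sep, value = field.partition("=")
        if not sep:
            # Not an attribute=value pair; keep it rather than drop it silently.
            attrs.setdefault("_unparsed", []).append(field)
            continue
        attrs[name] = value
    record["attrs"] = attrs
    return record


# Two made-up records with the same attributes in a different order.
a = parse_acct_line("Dec 26 10:01:02\t192.0.2.1\trdrake\ttty2\t198.51.100.7\tstop"
                    "\ttask_id=12\tservice=shell\tcmd=show running-config <cr>")
b = parse_acct_line("Dec 26 10:01:09\t192.0.2.9\trdrake\ttty3\t198.51.100.7\tstop"
                    "\tservice=shell\tcmd=show running-config <cr>\ttask_id=40")
print(a["attrs"]["cmd"] == b["attrs"]["cmd"])  # True
```

This does nothing about truncated commands; it only makes the order of the pairs irrelevant to whatever consumes the log.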

Encryption:

The existing security consists of XORing the content with an MD5-derived pad: a running series of 16-byte hashes, each taking the previous hash as part of the seed of the next. A sequence number is used, so simple replay shouldn't be a factor. Depending on how vulnerable iterative MD5 is, and how much time you have to sniff the traffic, I would think this is highly vulnerable to chosen plaintext if you already have a user-level login, or at least to partial known plaintext (assuming they make backups, you can guess that at least some of the packets will contain "show running-config" and other common commands). The encrypted string also isn't padded, so you can guess the command (or password) from the length of the encrypted data.
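
To make the scheme concrete, here is a rough sketch of the pad construction as I read it in the draft: each 16-byte block is the MD5 of the session id, the shared key, the version byte, the sequence number and the previous block, and the body is XORed with the result truncated to the body length. The key, session id and body below are made up, and the function names are mine, not tac_plus's.

```python
import hashlib
import struct


def pseudo_pad(session_id, key, version, seq_no, length):
    """Build the chained-MD5 pad, truncated to `length` bytes."""
    seed = struct.pack("!I", session_id) + key + bytes([version, seq_no])
    pad = b""
    prev = b""
    while len(pad) < length:
        prev = hashlib.md5(seed + prev).digest()  # each hash feeds the next one
        pad += prev
    return pad[:length]


def obfuscate(body, session_id, key, version, seq_no):
    """XOR the body with the pad; calling it again reverses the operation."""
    pad = pseudo_pad(session_id, key, version, seq_no, len(body))
    return bytes(b ^ p for b, p in zip(body, pad))


key = b"example-shared-secret"   # made-up key
body = b"show running-config"    # made-up packet body
wire = obfuscate(body, 0x1234ABCD, key, 0xC0, 1)

# No padding: the obfuscated body is exactly as long as the plaintext.
print(len(wire) == len(body))  # True
# Known plaintext XOR ciphertext gives the pad for this packet without the key.
recovered_pad = bytes(w ^ b for w, b in zip(wire, body))
print(recovered_pad == pseudo_pad(0x1234ABCD, key, 0xC0, 1, len(body)))  # True
```

The last two checks illustrate the length leak and the known-plaintext concern described above; they are not an attack on the key itself.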

For a better description of the encryption you can read the draft: http://tools.ietf.org/html/draft-grant-tacacs-02
I found an article from May, 2000 which shows that the encryption scheme chosen was insufficient even then.
http://www.openwall.com/articles/TACACS+-Protocol-Security

For new crypto I would advise multiple cipher support with negotiation so you know what each client and server is capable of. If the client and server supported multiple keys (with a keyid) it would be easier to roll keys frequently, or if it isn't too much overhead they could use public key.

Clients:

As for clients, Wikipedia lists several that seem to be based on the original open-source tac_plus from Cisco. shrubbery.net has the "official" version that debian and freebsd use. I looked at some of the others and they all seemed to derive from Cisco's code directly or shrubbery.net code, but they retained the name and started doing their own versioning. All the webpages look like they're from 1995. In some cases I think it's intentional but in some ways it shows a lack of care for the code, like it's been dropped since 2000.

Documentation is old:

This only applies to shrubbery.net's version. I didn't look at the other ones that closely. While all of it appears valid, one Q&A in the FAQ was about IOS 10.3/11.0. Performance questions use the sparc 2 as a target machine. There isn't an INSTALL or README, just the FAQ/CHANGES/COPYING (and a tac_plus.conf manpage), so the learning curve for new users is probably pretty steep. Also there isn't a clear maintainer. The best email address I found was listed in the tacacs+.spec file, for packaging on rpm systems.

If you hit the website they give some hints with some outdated, though still functional links. And they list the official email as tac_plus@shrubbery.net

Conclusion:

Did everyone already know this but me? If so have you moved to Kerberos? Can Kerberos do everything TACACS+ was doing for router authorization? I've got gear that only supports radius and tacacsplus, so in some cases I have no choice but to use one of those, neither of which I would trust over an unencrypted wire. If TACACS+ isn't a dead end then it needs a push to bring the protocol to a new version. There are big name vendors involved in making supported clients and servers. There should be someone invested in keeping it secure and adding features.

I don't understand why vendors and operators keep turning to TACACS. It
seems like they're often looking to Cisco as some paragon of best security
practices. It's a vulnerable protocol, but sometimes the only thing to
choose from.

One approach to secure devices that can support only TACACS or RADIUS:
Deploy a small embedded *nix machine (Soekris, Raspberry Pi, etc.) that
runs a RADSEC (for RADIUS) or stunnel (for TACACS) proxy. Attach it to a
short copper with 802.1q, take weak xor'ed requests in on one tag, wrap the
requests with TLS, and forward out another tag towards your central AAA box.

Kerberos or more certificate-based SSH on routers would be super.
SSH with certificates is nice in that it allows authenticators out in the
field to verify clients "offline", without needing a central AAA server.
However, the tradeoff is that you must then make sure all the clocks are
correct and in-sync, and root certificates are verified.

> TACACS+ was proposed as a standard to the IETF. They never adopted
> it and let the standards draft expire in 1998. Since then there

If the continued existence of TACACS+ can be justified at the IETF level, in
parallel with RADIUS and Diameter, I have some interest in the subject and
would be ready to work on a draft.

> Encryption:
>
> For new crypto I would advise multiple cipher support with
> negotiation so you know what each client and server is capable of.
> If the client and server supported multiple keys (with a keyid) it

It seems encryption is your only/major woe? Personally I don't like how we
need to keep reimplementing crypto at the per-application level. We're living
in a world where crypto should be standard for all connections, not an
application issue. There are some solutions to this, like the BEEP framework
or new L4 protocols like QUIC and MinimaLT, any of which I think would be
workable as a mandatory transport for TACACS.

> Clients:
>
> "official" version that debian and freebsd use. I looked at some of
> the others and they all seemed to derive from Cisco's code directly

There is also commercial server 'radiator' which does radius and tacacs
amongst others.

> Did everyone already know this but me? If so have you moved to

I think I missed the key revelation. The naive encryption? The limited amount
of software available?

> Kerberos? Can Kerberos do everything TACACS+ was doing for router

I think from networker point of view, it's radiator or tacacs, if it has to
work today without new software. And if it can require new software, it can be
pretty much arbitrary new protocol, if sound justification can be found.

I don't think radius nor kerberos nor ssh with certificates supports
command authorization, do they?

Nor accounting...

I think this is probably sufficient justification for TACACS+. I'm not sure if
command authorization is sufficient, as you can deliver group via radius which
maps to authorized commands.
But if you must support accounting, per-command authorization comes as free
gift more or less.

Hi,

> Nor accounting...

> I think this is probably sufficient justification for TACACS+. I'm not sure if
> command authorization is sufficient, as you can deliver group via radius which
> maps to authorized commands.
> But if you must support accounting, per-command authorization comes as free
> gift more or less.

Yes. Per-command auth and accounting is needed.

So what we need is tacacs over TLS (sctp / ipv6)

I agree tacacs is long in the tooth and needs to be revisited and invested
in. Please take my money (serious)

CB

RADIUS does not support command authorization or accounting.

-jav

Given the problem of remote auth, the choice of protocols is restricted to
whatever protocols the relying-party device supports.

This is the problem: You are at the mercy of your router vendor, to
support the authentication protocol functionality. Things are
workable, but in a sad state.

Obviously, providing highly robust, highly secure remote authentication, is
not a high priority among the router vendors. They pay lip service to
the whole thing.

In many cases you might be better off with local auth.

How do you feel about having to wait 30 seconds between every command you
enter to troubleshoot, to fail to the second server, if the TACACS or
RADIUS system is nonresponsive, because the dumb router can't remember
which TACACS servers are up and which ones are down, and always tries the
first one in the list first? At least RADIUS has the concept of a
"dead timer" :)

By all rights, routers should be implementing authorization using LDAP
over TLS, with a locally cached persistent copy of the directory and
credentials (so users can still log in, with their command exec rights
cached, in case of network outages), and authentication with either a
user SSH public key published in LDAP, Kerberos/GSSAPI with smartcard
and other two-factor/OTP support, or LDAP BIND using SASL.

RADIUS and TACACS+ are what you get, because they've been there forever,
and frequently enough deemed "good enough".

Some routers have limited Kerberos support; although, usually, not
support for Kerberos ticket forwarding SPNEGO / Negotiate authentication
using GSSAPI over SSH.

(Over encrypted Telnet, Yes)

RADIUS and TACACS+, without IPSEC or TLS encapsulation of all the traffic
are both highly insecure by today's standards, and in theory should not
be used.

Unfortunately; on many network devices, these are your only native
central authentication options!

Fallback plan:
The network should be designed so such connections are not allowed to cross
an untrusted Layer 2 domain.

If an attacker can sniff auth traffic --- TACACS+ is particularly
susceptible to decryption of the entire session including user credentials,
whereas RADIUS is particularly susceptible to the possibility of
authentication replay.

Depending on the router vendor; the available functionality with each
protocol, varies.....

Cisco is most noted for providing rich functionality over TACACS+ for shell
authorization and accounting,
and providing very limited RADIUS support.

It is not that RADIUS is limited --- it's that your device vendor's RADIUS
featureset is limited -- which, for all intents and purposes, means,
the features available to you are more limited, if you use such gear.

> Hi,
> it is with radius afaik ...
> RADIUS does not support command authorization or accounting.

RADIUS protocol supports accounting; and there is no reason RADIUS
start-stop accounting events cannot be sent for every shell command ---
this is not a protocol limitation, this is a device implementation
limitation.

Some devices can provide per-command authorization by embedding the command
being run in an Access-Request.

RADIUS protocol response messages can encapsulate any attribute-value pair
that can be sent in a TACACS response, using Vendor-Specific Attributes.

There is a restriction on IOS devices that arbitrarily forbids certain
vendor-specific attribute-value pairs from being encapsulated in the
RADIUS reply message. The lack of per-command authorization is a
limitation of the router's software, not of the RADIUS protocol.

http://wiki.freeradius.org/vendor/Cisco#Command-Authorization

' cisco-avpair = "shell:cmd=show"
would do the trick to authorize the "show" command. except that there is a
tiny note for the commands "cmd" and "cmd-arg"
saying that they cannot be used for encapsulation in the Vendor-Specific
space.
These two are the ONLY ones.'

Are you talking about Cisco routers? The default timeout value for TACACS+ is five seconds, so I’m not sure where you’re coming up with thirty seconds, unless you have seven servers listed on the router and the first six are dead/unreachable.

-jav

> Are you talking about Cisco routers? The default timeout value for TACACS+
> is five seconds, so I’m not sure where you’re coming up with thirty
> seconds, unless you have seven servers listed on the router and the first
> six are dead/unreachable.

Even 5 seconds extra for each command may hinder operators, to the extent
it would be intolerable; shell commands should run almost
instantaneously.... this is not a GUI, with an hourglass. Real-time
responsiveness in a shell is crucial --- which remote auth should not
change. Sometimes operators paste a buffer with a fair number of
commands, not expecting a second delay between each command --- a
repeated delay, may also break a pasted sequence.

It is very possible for two of three auth servers to be unreachable, in
case of a network break, but that isn't necessary. The "response
timeout" might be 5 seconds, but in reality, there are cases where you
would wait longer, and that is tragic, since there are some obvious
alternative approaches that would have had results that would be more
'friendly' to the interactive user.

(Like remembering which server is working for a while, or remembering
that all servers are down -- for a while, and having a 50ms timeout,
with all servers queried in parallel, instead of a 5 seconds timeout)
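
A rough sketch of the "remember which server is working for a while" idea, written as client-side logic in Python purely for illustration. Nothing here is taken from any router implementation; the class name, the dead_time value and the addresses are made up.

```python
import time


class ServerPicker:
    """Prefer servers not marked dead; retry a dead one only after dead_time."""

    def __init__(self, servers, dead_time=300.0, clock=time.monotonic):
        self.servers = list(servers)
        self.dead_time = dead_time
        self.clock = clock
        self.dead_until = {}  # server -> time at which it may be retried

    def candidates(self):
        now = self.clock()
        alive = [s for s in self.servers
                 if s not in self.dead_until or self.dead_until[s] <= now]
        # If every server is marked dead, try them all rather than give up.
        return alive or list(self.servers)

    def mark_dead(self, server):
        self.dead_until[server] = self.clock() + self.dead_time

    def mark_alive(self, server):
        self.dead_until.pop(server, None)


# Usage with a fake clock so the behaviour is easy to follow.
fake_now = [0.0]
picker = ServerPicker(["10.0.0.1", "10.0.0.2"], dead_time=60.0,
                      clock=lambda: fake_now[0])
picker.mark_dead("10.0.0.1")
first = picker.candidates()   # the dead server is skipped
fake_now[0] = 61.0
later = picker.candidates()   # retried once the dead timer has expired
print(first, later)
```

Querying all candidates in parallel with a short timeout would sit on top of this; the sketch only covers the bookkeeping.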

-jav

Picking back up where this left off last year, because I apparently only work on TACACS during the holidays :)

> Even 5 seconds extra for each command may hinder operators, to the extent
> it would be intolerable; shell commands should run almost
> instantaneously.... this is not a GUI, with an hourglass. Real-time
> responsiveness in a shell is crucial --- which remote auth should not
> change. Sometimes operators paste a buffer with a fair number of
> commands, not expecting a second delay between each command --- a
> repeated delay, may also break a pasted sequence.
>
> It is very possible for two of three auth servers to be unreachable, in
> case of a network break, but that isn't necessary. The "response
> timeout" might be 5 seconds, but in reality, there are cases where you
> would wait longer, and that is tragic, since there are some obvious
> alternative approaches that would have had results that would be more
> 'friendly' to the interactive user.
>
> (Like remembering which server is working for a while, or remembering
> that all servers are down -- for a while, and having a 50ms timeout,
> with all servers queried in parallel, instead of a 5 seconds timeout)

I think this needs to be part of the specification.

I'm sure the reason they didn't do parallel queries was because of both network and CPU load back when the protocol was drafted. But it might be good to have local caching of authentication so that can happen even when servers are down or slow. Authorization could be updated to send the permissions to the router for local handling. Then if the server dies while a session is open only accounting would be affected.

That does increase the vendors/implementors work but it might be doable in phases and with partial support with the clients and servers negotiating what is possible. The biggest drawback to making things like this better is you don't gain much except during outages and if you increase complexity too much you make it wide open for bugs.

Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible). It's possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT would address these problems too. It's possible that if we did the transport with BEEP it would also provide this, but I'm reading the docs and I don't think it goes that far in terms of connection assurance.

--
-JH

So, here is my TACACS RFC christmas list:

1. underlying crypto
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
3. authentication and authorization caching and/or something else

> Picking back up where this left off last year, because I apparently only
> work on TACACS during the holidays :)

avoiding relatives? :)

>> Even 5 seconds extra for each command may hinder operators, to the extent
>> it would be intolerable; shell commands should run almost
>> instantaneously.... this is not a GUI, with an hourglass. Real-time
>> responsiveness in a shell is crucial --- which remote auth should not
>> change. Sometimes operators paste a buffer with a fair number of
>> commands, not expecting a second delay between each command --- a
>> repeated delay, may also break a pasted sequence.
>>
>> It is very possible for two of three auth servers to be unreachable, in
>> case of a network break, but that isn't necessary. The "response
>> timeout" might be 5 seconds, but in reality, there are cases where you
>> would wait longer, and that is tragic, since there are some obvious
>> alternative approaches that would have had results that would be more
>> 'friendly' to the interactive user.
>>
>> (Like remembering which server is working for a while, or remembering
>> that all servers are down -- for a while, and having a 50ms timeout,
>> with all servers queried in parallel, instead of a 5 seconds timeout)
>
> I think this needs to be part of the specification.
>
> I'm sure the reason they didn't do parallel queries was because of both
> network and CPU load back when the protocol was drafted. But it might be
> good to have local caching of authentication so that can happen even when
> servers are down or slow. Authorization could be updated to send the
> permissions to the router for local handling. Then if the server dies while
> a session is open only accounting would be affected.

Juniper, at least, does the authorization cache on the device trick...
(or really scoping of commands/areas a user is permitted via a local
cache file in /var/tmp)

> That does increase the vendors/implementors work but it might be doable in
> phases and with partial support with the clients and servers negotiating
> what is possible. The biggest drawback to making things like this better is
> you don't gain much except during outages and if you increase complexity too
> much you make it wide open for bugs.

and I wonder what percentage of 'users' a vendor has actually USE tac+
(or even radius). I bet it's shockingly low...

> Maybe there is a simpler solution that keeps you happy about redundancy but
> doesn't increase complexity that much (possibly anycast tacacs, but the
> session basis of the protocol has always made that not feasible). It's

does it really? :)

> possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT
> would address these problems too. It's possible that if we did the
> transport with BEEP it would also provide this, but I'm reading the docs and
> I don't think it goes that far in terms of connection assurance.
>
> So, here is my TACACS RFC christmas list:
>
> 1. underlying crypto

juniper, cisco, arista, sun, linux, freebsd still can't get TCP-AO working...
they don't all have ssl libraries in their "os" either...

Getting to some answer other than: "F-it, put it in clear text" for new
protocols on routers really is a bit painful... not to mention ITAR
sorts of problems that arise.

-chris

[snip]

> Juniper, at least, does the authorization cache on the device trick...

That seems nice...

> and I wonder what percentage of 'users' a vendor has actually USE tac+
> (or even radius). I bet it's shockingly low...

Well, the percentage of users doing per-command authorization is
probably much lower than the percentage simply using Tac+ for login
authentication and accounting only or accounting and exec
authorization.

What happens in this case in terms of failure handling is probably OK
for the common scenario.

For many use cases it should probably be a workable tradeoff to simply
have AAA server reply with the shell:priv-lvl=1 or shell:priv-lvl=10,
and make the choice to authorize commands locally by customizing
which commands different privilege level numbers have, and make sure
all devices have the same scheme; limiting AAA usage to once per
shell.
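
The tradeoff can be pictured roughly like this: the AAA server is asked once per shell and returns a privilege level, and every later command is checked against a table held on the device. This is only a Python sketch of the decision logic, not router configuration; the command table, its levels and the function names are all made up for illustration.

```python
# Hypothetical per-device table: minimum privilege level per command prefix.
COMMAND_LEVELS = {
    "show": 1,
    "ping": 1,
    "clear counters": 10,
    "configure": 15,
}


def parse_priv_lvl(av_pair):
    """Extract N from a 'shell:priv-lvl=N' attribute-value pair."""
    name, _, value = av_pair.partition("=")
    if name != "shell:priv-lvl":
        raise ValueError("not a priv-lvl pair: %r" % av_pair)
    return int(value)


def allowed(command, session_level, default_level=15):
    """Decide locally; the longest matching prefix wins, and commands
    missing from the table require the highest level."""
    matches = [p for p in COMMAND_LEVELS
               if command == p or command.startswith(p + " ")]
    needed = COMMAND_LEVELS[max(matches, key=len)] if matches else default_level
    return session_level >= needed


level = parse_priv_lvl("shell:priv-lvl=10")   # received once, at login
print(allowed("show ip route", level))        # True
print(allowed("configure terminal", level))   # False
```

The catch, as noted above, is that the table has to be identical on every device for the levels to mean the same thing everywhere.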

The cases where that's no solution, are most likely PCI or other
higher security environments where the usability problems with
TACACS+ failover simply have to be accepted, use a dedicated OOB
network for AAA servers, and a HA clustered pair of AAA servers
dedicated to each and every site --- sharing a virtual service IP
address.

> So, here is my TACACS RFC christmas list:
> 1. underlying crypto

RADIUS over TCP and DIAMETER have underlying crypto.
Rfc6613: TLS or IPsec transport is shown as mandatory for RADIUS over TCP.

> Getting to some answer other than: "F-it, put it in clear text" for new
> protocols on routers really is a bit painful... not to mention ITAR
> sorts of problems that arise.

The average cheap-o smartphone ships with a TLS library; I think
it's safe to say your router should have one. They shouldn't have
too many problems... after all, this type of equipment already
includes SSH protocol.

So why not have an option for setting up a SSH session to tunnel
authentication requests over?

> -chris

> 2. ssh host key authentication - having the router ask tacacs for an
> authorized_keys list for rdrake. I'm willing to let this go because many

I would be content for them to just support OpenSSH CA
certificate-based authorization of a user's SSH key.

If the key is signed by a trusted SSH CA, valid and not expired, and
the session would be valid according to the certificate, then they
can authenticate using one of their listed principals.
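
The three conditions above can be sketched as follows. This assumes the certificate has already been parsed and its signature verified elsewhere (which is the hard part and is not shown); the Cert class, its field names and all the values are hypothetical and do not correspond to any OpenSSH API.

```python
from dataclasses import dataclass


@dataclass
class Cert:
    """Already-parsed certificate fields; signature checking is assumed done."""
    ca_fingerprint: str
    valid_after: int    # seconds since the epoch
    valid_before: int
    principals: tuple


def may_login(cert, username, now, trusted_cas):
    """Apply the checks in order: trusted CA, validity window, listed principal."""
    if cert.ca_fingerprint not in trusted_cas:
        return False
    if not (cert.valid_after <= now < cert.valid_before):
        return False
    return username in cert.principals


# Made-up example values.
cert = Cert("SHA256:example-ca", 1000, 2000, ("rdrake", "noc"))
trusted = {"SHA256:example-ca"}
print(may_login(cert, "rdrake", 1500, trusted))  # True
print(may_login(cert, "rdrake", 2500, trusted))  # False, certificate expired
```

Note that the validity check depends on the device clock, which is the clock-sync tradeoff mentioned earlier in the thread.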

Authenticate using key signed by valid certificate as first factor,
perform second factor authentication against Kerberos server,
authorize against LDAP or Tacacs server.

> vendors are finding ways to do key distribution, but I'd still like to have
> a standard (https://code.google.com/p/openssh-lpk/ for how to do this over
> LDAP in UNIX)

SSSD is handling this on Redhat.
It's probably best to consider that how to use an "openssh public ssh
key" is specific to the OpenSSH application.

It makes sense that if the public key is for use with GPG/PGP to
authenticate, etc, then the LDAP attribute should be something
different, again specific to the application and the key format that
application uses.

http://docs.fedoraproject.org/en-US/Fedora/18/html/FreeIPA_Guide/user-keys.html
AuthorizedKeysCommand or PubKeyAgent is used on the openssh server.

But within the single-signon daemon SSSD-Ldap; the LDAP attribute
for a user object's SSH key is a configurable setting.

Within the IPA LDAP schema, there is an added ipaSshPubKey user
attribute. I think this is as close as you get to a 'standard' for now.

dn: cn=schema
add:attributeTypes: ( 2.16.840.1.113730.3.8.11.31 NAME 'ipaSshPubKey'
     DESC 'SSH public key' EQUALITY octetStringMatch SYNTAX
1.3.6.1.4.1.1466.115.121.1.40 X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.11 NAME
'ipaSshGroupOfPubKeys' ABSTRACT MAY ipaSshPubKey X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.12 NAME 'ipaSshUser'
SUP ipaSshGroupOfPubKeys AUXILIARY X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.13 NAME 'ipaSshHost' SUP
ipaSshGroupOfPubKeys AUXILIARY X-ORIGIN 'IPA v3' )

> Rfc6613: TLS or IPsec transport is shown as mandatory for RADIUS over TCP.

sweet. can you ref conforming implementations?

randy

We are able to implement TACACS+. It is my understanding that this is a
fairly old protocol, so are you saying there are numerous bugs that still
need to be fixed?

A question I have: TACACS+ is usually hosted on a server, and networking
devices are configured to reach out to the server for authentication. What
happens if the device can't reach the server because the device's network
connection is offline? Our goal with TACACS+ is to not have any
default/saved passwords. Every employee will have their own username
and password. That way if an employee gets hired/fired, we can enable or
disable their account. We are trying to avoid having any organization-wide
or network-wide default username or password. Is this possible? Do the
devices keep a log of the last successful username/password combinations
that worked in case the device goes offline?

Colton,

Yes, that's the 'normal' way of setting it up. Basically you still have to
configure a root user, but that user name and password is kept locked up
and only accessed in case of catastrophic failure of the remote
authentication system. An important note is to make sure that the fail
safe password can't be accessed without having several people engaged so it
can't be used without many people knowing.

Scott Helms
Vice President of Technology
ZCorum
(678) 507-5000

Scott,

Thanks for the response. How do you make sure the failsafe and/or root
password that is stored in the device in case remote auth fails can't be
accessed without having several employees engaged? Are there any mechanisms
for doing so?

My fear would be that we hire an outsourced tech. After a certain amount
of time we would have to let this part-timer go, and would disable his or
her username and password in TACACS+. However, if that tech still knows the
root password they could still remotely log in to our network and cause
havoc. The thought of having to change the root password on hundreds of
devices every time an employee is let go doesn't sound appealing either. To
make matters worse we are using an outsourced firm for some network
management, so hiring and firing is fairly frequent.

In the Cisco world the AAA config is typically set up to try tacacs first,
and local accounts second. The local account is only usable if tacacs is
unavailable. Knowledge of the local username/password does not equate to
full time access with that credential. Also, you would usually filter the
incoming SSH sessions to only permit a particular management IP range; the
local credential, or tacacs credential, shouldn't be usable from any
arbitrary network.
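
The "tacacs first, local second" behaviour can be sketched like this. It is only a Python model of the decision logic, not Cisco configuration, and every name in it is made up. The point it illustrates is that a reject from TACACS+ is final, and the local account is consulted only when no server answers. The plain-text password comparison is a simplification for the sketch.

```python
class Unreachable(Exception):
    """Raised when no TACACS+ server answered at all."""


def login(username, password, tacacs_check, local_users):
    """Try TACACS+ first; fall back to local accounts only if it is unreachable."""
    try:
        return tacacs_check(username, password)   # a reject is final, no fallback
    except Unreachable:
        return local_users.get(username) == password


# Stand-ins for a real TACACS+ exchange.
def tacacs_rejects(username, password):
    return False


def tacacs_down(username, password):
    raise Unreachable()


local = {"failsafe": "made-up-secret"}
print(login("failsafe", "made-up-secret", tacacs_rejects, local))  # False
print(login("failsafe", "made-up-secret", tacacs_down, local))     # True
```

So an ex-employee who still knows the local password would also need TACACS+ to be unreachable from the device and a source address inside the permitted management range.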