ISC DHCP server failover

Summers_William · March 17, 2010, 2:01pm

Greetings Nanog members,

I am wondering if anyone has implemented the failover features of ISC DHCP? And if so, how successful has failover been in your environment?

Many thanks,

William Summers
Network Administrator
Information Technology Services
Deerfield Academy

sthaug · March 17, 2010, 2:09pm

I am wondering if anyone has implemented the failover features of ISC DHCP? And if so, how successful has failover been in your environment?

Yes, some of us have implemented DHCP failover using ISC DHCP. However,
you are much more likely to get answers to ISC DHCP questions if you ask
on the dhcp-users@lists.isc.org list. See

dhcp-users Info Page

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Dan_White · March 17, 2010, 2:22pm

We've been running version 4 in a failover scenario for a couple of years.
It's worked very well for us in an ISP environment. We serve several
networks from a pair of servers.

We've experienced two types of problems from time to time:

The servers stop balancing their addresses, and one server starts to
exhibit 'peer holds all free leases' in its logs, in which case we need to
restart the dhcpd process(es) to force a rebalance.

In some cases, and I'm not sure which equipment may be to blame, if one
server goes down then the other server will not hand out addresses to
clients which had originally received addresses from the failed server.
We've dealt with that by balancing our lease times with our MTTR for a
failed server.
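
The lease-time versus MTTR trade-off can be sketched with a little arithmetic. This is a rough illustration, not something from the original post: it assumes clients renew at the RFC 2131 default T1 of half the lease time, so a client that was about to renew when the server failed has about half its lease left. The function name and the 4-hour MTTR are hypothetical.

```python
def min_lease_for_mttr(mttr_seconds: float, t1_fraction: float = 0.5) -> float:
    """Smallest lease time whose worst-case remainder still covers the MTTR.

    Assumption: a client renews at t1_fraction of its lease, so in the worst
    case it has (1 - t1_fraction) * lease left when the server goes down.
    """
    if not 0 < t1_fraction < 1:
        raise ValueError("t1_fraction must be between 0 and 1")
    return mttr_seconds / (1 - t1_fraction)


# Hypothetical figure: four hours to repair a failed server.
lease_seconds = min_lease_for_mttr(4 * 3600)
print(lease_seconds)
```

With these assumptions, a 4-hour MTTR suggests leases of at least 8 hours. Failover's own limits on lease length (such as the MCLT) are ignored here.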

Blake_Covarrubias · March 17, 2010, 9:08pm

We've experienced two types of problems from time to time:

The servers stop balancing their addresses, and one server starts to
exhibit 'peer holds all free leases' in its logs, in which case we need to
restart the dhcpd process(es) to force a rebalance.

We experience this problem from time to time as well, and I have yet to find its cause. We also 'fix' it by restarting the dhcpd daemon.

In some cases, and I'm not sure which equipment may be to blame, if one
server goes down then the other server will not hand out addresses to
clients which had originally received addresses from the failed server.
We've dealt with that by balancing our lease times with our MTTR for a
failed server.

From the dhcpd.conf man page it seems the solution to this problem is to put the remaining active server into the PARTNER-DOWN state.

Aside from the 'peer holds all free leases' error Dan mentioned, ISC DHCP's load balancing and failover has worked very well for us.

Raymond_Dijkxhoorn · March 17, 2010, 10:49pm

Hi!

I am wondering if anyone has implemented the failover features of ISC DHCP? And if so, how successful has failover been in your environment?

We run it at various locations and this works pretty well.
Student dormitories, and so on.

Bye,
Raymond.

David_W_Hankins · March 19, 2010, 11:26pm

The servers stop balancing their addresses, and one server starts to
exhibit 'peer holds all free leases' in its logs, in which case we need to
restart the dhcpd process(es) to force a rebalance.

If restarting one or both dhcpd processes corrects a pool balancing
problem, then I suspect what you're looking at is a bug where the
servers would fail to schedule a reconnection if the failover socket
is lost in a particular way. Because the protocol also uses a message
exchange inside the TCP channel to determine if the socket is up
(rather than just TCP keepalives) this can sometimes happen even
without a network outage during load spikes or other brief hiccups on
the partner DHCP server.

So far as I know that particular problem is fixed in current
maintenance releases (4.1.1, 4.0.2), so I'm curious if you are
using them.

But it's also possible that your pools need rebalancing more often
than the default minimum rebalance interval.

In some cases, and I'm not sure which equipment may be to blame, if one
server goes down then the other server will not hand out addresses to
clients which had originally received addresses from the failed server.
We've dealt with that by balancing our lease times with our MTTR for a
failed server.

To me this sounds like another symptom of the failover connection
between the servers failing and not being reconnected. It would
explain why the server doesn't have the partner's "recently active"
bindings: it wouldn't have received updated information if the socket
was inactive.

My own opinion of the failover software is that it may be clunky and
hard to use (we're working on that), but it is reliable.

One of the warnings I want to give when someone is interested in
deploying failover pairs of ISC DHCP servers is to still be prepared
to react to a server failure. Unlike other fault tolerance protocols,
where losing communication with the peer can safely be taken to mean
that the peer is off-line, DHCP servers can go out of communication
with each other while still being able to reach clients.

To cope with this, failover segregates the idea of entering a
"communications-interrupted" state from a "partner-down" state, and
many rules govern how servers can allocate or extend leases to ensure
there are no addressing conflicts (caused by the servers giving one
IP address to two different clients without the knowledge of the
other server).

Failover essentially bridges the gap of a server outage by giving each
server in the pair roughly half of the remaining pool of unallocated
addresses, which they individually allocate from normally and when
operating in communications-interrupted.

Either server can continue to extend already active leases, letting
clients keep the addresses they already have, but if it runs out of
free* leases, or if the current clients' leases are allowed to expire,
it won't be able to admit any new clients or extend expired leases.

Because the software can't detect if its peer is truly off-line, the
operator must manually move the surviving server to partner-down to
inform it that it's operating alone, so that it can use expired leases
or leases in the peer's free pool.
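
A toy model may make the free-pool split and the effect of partner-down easier to picture. It is only a sketch of the behaviour described in this post, not ISC DHCP's implementation: the class and method names are hypothetical, and it leaves out the MCLT, lease expiry and pool rebalancing entirely.

```python
class ToyFailoverPair:
    """Toy model only: no MCLT, no lease expiry, no pool rebalancing."""

    def __init__(self, addresses):
        addrs = sorted(addresses)
        half = len(addrs) // 2
        # Each server starts with roughly half of the unallocated addresses.
        self.free = {"primary": set(addrs[:half]), "secondary": set(addrs[half:])}
        self.active = {}           # address -> client holding it
        self.partner_down = set()  # servers the operator has told they are alone

    @staticmethod
    def peer_of(server):
        return "secondary" if server == "primary" else "primary"

    def allocate(self, server, client):
        # A server normally allocates only from its own share of the pool.
        usable = set(self.free[server])
        if server in self.partner_down:
            # Only in partner-down may it also use the peer's free addresses.
            usable |= self.free[self.peer_of(server)]
        if not usable:
            return None  # no free address: the new client cannot be admitted
        addr = min(usable)
        for pool in self.free.values():
            pool.discard(addr)
        self.active[addr] = client
        return addr


# Four host numbers; the secondary is assumed dead and never allocates.
pair = ToyFailoverPair(range(10, 14))
first = pair.allocate("primary", "client-a")
second = pair.allocate("primary", "client-b")
stuck = pair.allocate("primary", "client-c")    # primary's share is used up
pair.partner_down.add("primary")                # the operator steps in
rescued = pair.allocate("primary", "client-c")
print(first, second, stuck, rescued)
```

In this model the third client is refused until the operator marks the primary as partner-down, after which it draws from what was the secondary's share.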

Most people have a bad experience with failover, and therefore form
a poor opinion of it, because of experiencing such an event without
knowing of the need to transition the server state explicitly during
an outage. It isn't as automatic as the word 'failover' makes it
sound.

For now the lesson is that failover gives you precious hours or days
to find a terminal and repair the partner server or put the surviving
server into partner-down. It's also quite a good idea to monitor the
failover state of your servers and ensure they aren't spending a great
deal of time in communications-interrupted.
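
One way to do that monitoring is to read the failover state that dhcpd records in its leases file. The sketch below is an assumption-laden example, not an ISC tool: it assumes the state block looks like the sample text (check it against your own dhcpd.leases), that the newest block for a peer is the last one in the file, and that timestamps are UTC. The function names, peer name and 15-minute threshold are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed layout of a failover state block in dhcpd.leases.
STATE_RE = re.compile(
    r'failover peer "(?P<peer>[^"]+)" state \{\s*'
    r'my state (?P<my>[\w-]+) at \d+ (?P<my_at>[\d/]+ [\d:]+);\s*'
    r'(?:partner|peer) state (?P<partner>[\w-]+) at \d+ (?P<partner_at>[\d/]+ [\d:]+);'
)


def latest_failover_states(leases_text):
    """Return {peer name: (my state, partner state, since)}; later blocks win."""
    states = {}
    for m in STATE_RE.finditer(leases_text):
        since = datetime.strptime(m["my_at"], "%Y/%m/%d %H:%M:%S")
        states[m["peer"]] = (m["my"], m["partner"], since.replace(tzinfo=timezone.utc))
    return states


def needs_attention(state, since, now, max_interrupted_seconds=900):
    """Flag long communications-interrupted spells and any other abnormal state."""
    if state == "normal":
        return False
    if state == "communications-interrupted":
        return (now - since).total_seconds() > max_interrupted_seconds
    return True


SAMPLE = '''
failover peer "dhcp-failover" state {
  my state normal at 3 2010/03/17 14:00:00;
  partner state normal at 3 2010/03/17 14:00:00;
}
failover peer "dhcp-failover" state {
  my state communications-interrupted at 5 2010/03/19 23:00:00;
  partner state normal at 3 2010/03/17 14:00:00;
}
'''

my_state, partner_state, since = latest_failover_states(SAMPLE)["dhcp-failover"]
print(my_state, needs_attention(my_state, since, since + timedelta(hours=1)))
```

A cron job or monitoring plugin could run a check like this against the live leases file and page someone when it reports that attention is needed.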

NEW in 4.2.0:

There is a new configuration option intended for servers sharing a
"Heartbeat Cable", or similar situations where the operator is
convinced with certainty that a failover socket disconnect likely
implies the peer is truly down. A configurable timeout can now be
entered such that the server automatically enters partner-down.
Note that the failover protocol still requires an "MCLT delay" before
the server is allowed to use the peer's free leases, but this is
not normally a problem in usual operation.

There is a new optimization that significantly increases endurance
during communications-interrupted and in many cases could mean you
could remain in that state indefinitely, but it won't help you for
example if the active load requires the full free pool; it can't
allocate the peer's leases.

So it doesn't replace entering partner-down for long-term outages.

These features were both provided in 4.2.0 (a1 and a2 respectively
as memory serves), currently an alpha which I hope to move to its
first beta soon.

* I'm simplifying a lot to make this shorter and easier to read, and
also using language that failover overloads. If you really want to
know all about failover internals, ask us on dhcp-users.

Mike2 · March 20, 2010, 12:10am

David W. Hankins wrote:

  The servers stop balancing their addresses, and one server starts to
exhibit 'peer holds all free leases' in its logs, in which case we need to
restart the dhcpd process(es) to force a rebalance.

If restarting one or both dhcpd processes corrects a pool balancing
problem, then I suspect what you're looking at is a bug where the
servers would fail to schedule a reconnection if the failover socket
is lost in a particular way. Because the protocol also uses a message
exchange inside the TCP channel to determine if the socket is up
(rather than just TCP keepalives) this can sometimes happen even
without a network outage during load spikes or other brief hiccups on

With all due respect and acknowledgment of the tremendous contributions of ISC and you yourself Mr. Hankins, I have to comment that failover in isc-dhcp is broken by design because it requires the amount of handholding and operator thinking in the event of a failure that you explained to us at length is required. Failure needs to be handled automatically and without any intervention at all, otherwise you might as well not have it and I think most network operators would agree.

I am certainly not prepared to develop proof of concept code or go the full route of developing such a server myself, however, I believe firmly that a failover implementation in dhcp could be designed as a counterpoint to the current implementation that is reliable, simple, scalable and requiring no special procedures once a 'break' occurs. The method used by isc-dhcpd, I think, creates the problem of the potential for unreliable failover because it's not designed for the 'right' problem.

But there are example implementations - such as vrrp/carp - that would form the basis of trustworthy dhcp failover protocol. Your key issues are a) broadcast discovery packets, which every listening host on the lan segment (such as 1 or more slaves) can easily respond to, and b) unicast frames from relay agents and others, which could easily be handled by a virtual mac/shared ip address by a group of slaves. This means that redundancy of more than 2 hosts is already possible.

The last pieces are protocol for servers to join and leave the pool of hosts serving dhcp, a master election protocol that pre-determines the order of slaves to fail over to in order to avoid the half-brain syndrome, a sanity checking protocol to ensure the elected master is sane and kicking (e.g. the slaves all hit the master with, what else, dhcp requests), and a well defined group database update protocol over the network so that leases hit some fixed storage somewhere, sometime.

Just my $0.02 worth.

Mike-

sthaug · March 20, 2010, 8:43am

With all due respect and acknowledgment of the tremendous contributions
of ISC and you yourself Mr. Hankins, I have to comment that failover in
isc-dhcp is broken by design because it requires the amount of
handholding and operator thinking in the event of a failure that you
explained to us at length is required. Failure needs to be handled
automatically and without any intervention at all, otherwise you might
as well not have it and I think most network operators would agree.

Note that this method of handling failover is inherent in the original
DHCP failover design. See

http://tools.ietf.org/id/draft-ietf-dhc-failover-12.txt

Specifically, quoting from the above draft,

"While this technique works in some domains, having the only server to
which a DHCP client can communicate voluntarily shut itself down seems
like something worth avoiding.

The failover protocol will operate correctly while both servers are
unable to communicate, whether they are both running or not. At some
point there may be resource contention, and if one of the servers is
actually down, then the operator can inform the operational server and
the operational server will be able to use all of the failed server's
resources."

I certainly cannot speak for "most network operators". However, I will
note that I have been aware of this behavior of the ISC DHCP server
as long as I have been running it in failover mode.

I am certainly not prepared to develop proof of concept code or go the
full route of developing such a server myself, however, I believe firmly
that a failover implementation in dhcp could be designed as a
counterpoint to the current implementation that is reliable, simple,
scalable and requiring no special procedures once a 'break' occurs.

And which implements the failover protocol in the IETF draft?

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Dan_White · March 20, 2010, 6:20pm

I don't want to defend bad code where it may exist, but I view the problems
we've encountered with ISC DHCP to be minor compared to the benefits.

It may not be fair to compare DHCP failover to redundancy in a routing
scenario. In a routing failure, I'd be highly motivated to find the root
cause, open tickets, and get the problem fixed.

In a scenario where a couple of customers are unable to pull an IP address,
every few months, I'm OK with manual intervention as long as state is
maintained. I'd argue that it's more important to maintain data integrity
(no two servers think they own the same IP) than availability (where one
server is too aggressive and corrupts data).

That's true of much of the open source software I use, such as cyrus
(email) replication and openldap synchronization.

Given the resources I and others in my company have to deal with issues,
it's always a matter of putting out the biggest fire. If/when problems with
DHCP failover become a big enough issue, we'll spend the time to find out
what in our network is causing the issue and fix it, or find out what the
bug is in the software and open a bug report.

All problems are fixable given enough resources, and enough motivation.

Leo_Bicknell1 · March 20, 2010, 8:25pm

In a message written on Fri, Mar 19, 2010 at 05:10:04PM -0700, Mike wrote:

I am certainly not prepared to develop proof of concept code or go the
full route of developing such a server myself, however, I believe firmly
that a failover implementation in dhcp could be designed as a
counterpoint to the current implementation that is reliable, simple,
scalable and requiring no special procedures once a 'break' occurs. The
method used by isc-dhcpd, I think, creates the problem of the potential
for unreliable failover because it's not designed for the 'right'
problem. But there are example implementations - such as vrrp/carp -
that would form the basis of trustworthy dhcp failover protocol. Your

[snip technical bits]

Your method might work well where there is a LAN segment with two
DHCP servers on it, and I'm sure that's how some people operate.
However your method doesn't cover a much more common, and difficult
case.

Consider a DHCP server in Chicago and one in New York, performing
DHCP for clients in Chicago, Cleveland, Pittsburgh, Buffalo, Albany,
and New York. When the network is broken, Chicago may still need
to serve Cleveland and Pittsburgh, and New York may need to serve
Buffalo and Albany, and yet Chicago and New York cannot communicate
during that time. Also, you want to be sure when they come back
together there are no conflicts, for instance maybe Rochester can
reach both Chicago and New York while those two cannot talk to each
other.

LAN discovery does not work for servers 1000 miles apart. All-or-nothing
failure doesn't work when each server may see part of the
clients.

I do think the DHCP failover protocol was perhaps over-engineered
which I think is the gist of your post, but unfortunately unlike
VRRP it's not always two things on the same local LAN. Perhaps
there is a market for DHCP redundancy "lite" where it only handles the
case of two servers on the same LAN, I dunno.

David_W_Hankins · March 21, 2010, 5:33pm

First let me say that I wasn't involved in failover's design, I'm only
a sort of "maintainer," so the criticism is not offending me in the
slightest.

Failover definitely busied itself with the cross-country,
geographically diverse DHCP server situation, hoping that by solving
that they are also giving "HA", heartbeat-cable types of folks a tool
they can also use, although it isn't explicitly designed for that
purpose alone. That does tend to leave this community a little
under-served and unhappy, which was my motivation for failover
features in 4.2 to try and support their needs better (auto partner-
down, greater endurance in comms-interrupted).

What you describe for an alternative (although I will criticize it
slightly in suggesting you are under-estimating DHCP's needs; the
question of message delivery is really not relevant) are the building
blocks for something I would refer to as "DHCP Server Clustering".

I fully endorse it.

That is a set of separate programs that work together to appear from
the outside to be a single DHCP server (as those terms are defined in
RFC), and the ways in which you can build-in redundancy and self-
healing (self-restarting components, component failures only affect a
subset of services, redundant processes that cover gaps in coverage,
etc).

In short, you're describing one of our key motivations for migrating
ISC DHCP to the BIND 10 framework.

That gives us a complete set of tools. Within the same rack, you will
ultimately be able to implement a "single server" from all outside
observance that is actually implemented in a redundant way across
(N+1) systems* or CPUs within one system, while still maintaining a
failover ability to tie two such geographically diverse clusters
together (not to mention co-habitation with BIND 10's DNS services
in the same configuration and monitoring plane) that don't actually
have to be clusters if you don't want all that baggage either.

So everyone's happy.

Unfortunately at the moment we are still collecting sponsors for the
DHCP-in-BIND-10 project, and no shovels have been turned. But I'm
confident the work will proceed (and if anyone wishes to help as a
sponsor or a participant, please contact us! We are in Anaheim this
week, and there is also a link in my signature you can click).

In the meantime, failover is a tool we have whereas DHCP clustering
software is so far only a tool we want to create.

* Some objects in the future-mirror may be further away than they
appear.