A site for solving at least some of your technical problems...
Last night we had a fairly large-scale power outage. The power was out for over an hour.
I restarted the computer in the middle of the night, but when I came into the office in the morning, the LAN computers and phones were not connecting to the Internet.
I fairly quickly ruled out the firewall as the issue; it was in place as expected.
Looking further, I noticed that my PCs would find some DHCP info, but the DNS IP addresses were a local IP address (such as 10.0.10.1) instead of my Internet provider's DNS IPs. So the culprit had to be the DHCP server.
Sure enough, the server was not even running.
The first error I hit was a permission error. The server was not able to open the lease file for appending (mode "a", which means write permission is required).
Here are the messages I can see in my /var/log/syslog file. They also show up in the systemd journal for the dhcpd service (viewable with journalctl). We can see that the error is:
Can't open /var/lib/dhcp/dhcpd.leases for append.
The rest looks okay, except for the apparmor error...
Jul 8 07:20:14 monster dhcpd[24395]: Internet Systems Consortium DHCP Server 4.4.1
Jul 8 07:20:14 monster dhcpd[24395]: Copyright 2004-2018 Internet Systems Consortium.
Jul 8 07:20:14 monster dhcpd[24395]: All rights reserved.
Jul 8 07:20:14 monster dhcpd[24395]: For info, please visit https://www.isc.org/software/dhcp/
Jul 8 07:20:14 monster dhcpd[24395]: Config file: /etc/dhcp/dhcpd.conf
Jul 8 07:20:14 monster dhcpd[24395]: Database file: /var/lib/dhcp/dhcpd.leases
Jul 8 07:20:14 monster dhcpd[24395]: PID file: /run/dhcp-server/dhcpd.pid
Jul 8 07:20:14 monster kernel: [17540.573341] audit: type=1400 audit(1720448414.202:598): apparmor="DENIED" operation="capable" profile="/usr/sbin/dhcpd" pid=24395 comm="dhcpd" capability=1 capname="dac_override"
Jul 8 07:20:14 monster dhcpd[24395]: Internet Systems Consortium DHCP Server 4.4.1
Jul 8 07:20:14 monster dhcpd[24395]: Copyright 2004-2018 Internet Systems Consortium.
Jul 8 07:20:14 monster dhcpd[24395]: All rights reserved.
Jul 8 07:20:14 monster dhcpd[24395]: For info, please visit https://www.isc.org/software/dhcp/
Jul 8 07:20:14 monster dhcpd[24395]: Can't open /var/lib/dhcp/dhcpd.leases for append.
Jul 8 07:20:14 monster dhcpd[24395]:
Jul 8 07:20:14 monster dhcpd[24395]: If you think you have received this message due to a bug rather
Jul 8 07:20:14 monster dhcpd[24395]: than a configuration issue please read the section on submitting
Jul 8 07:20:14 monster dhcpd[24395]: bugs on either our web page at www.isc.org or in the README file
Jul 8 07:20:14 monster dhcpd[24395]: before submitting a bug. These pages explain the proper
Jul 8 07:20:14 monster dhcpd[24395]: process and the information we find helpful for debugging.
Jul 8 07:20:14 monster dhcpd[24395]:
Jul 8 07:20:14 monster dhcpd[24395]: exiting.
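The kernel audit line in that log packs several key=value fields into one long string, which makes it easy to skim past. As a minimal sketch (the function name parse_audit_fields is my own, not part of any tool), this is one way to split it so the operation and the capability name stand out:

```python
import re

# The audit line from the log above, copied verbatim.
AUDIT_LINE = (
    'audit: type=1400 audit(1720448414.202:598): apparmor="DENIED" '
    'operation="capable" profile="/usr/sbin/dhcpd" pid=24395 '
    'comm="dhcpd" capability=1 capname="dac_override"'
)

def parse_audit_fields(line):
    """Return the key=value pairs of an audit line as a dict of strings.

    Hypothetical helper for illustration only.
    """
    fields = {}
    # A value is either double-quoted or a run of non-space characters.
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line):
        fields[key] = quoted if quoted else bare
    return fields

fields = parse_audit_fields(AUDIT_LINE)
print(fields["apparmor"], fields["operation"], fields["capname"])
```

Read this way, the denial is about a capability (operation "capable", capname "dac_override") for the /usr/sbin/dhcpd profile, not about a specific file path.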
The apparmor error is what made me think that apparmor was probably what prevented the append. However, when I looked at the profile file, it looked just fine.
# vim:syntax=apparmor
# Last Modified: Mon Jan 25 11:06:45 2016
# Author: Jamie Strandboge <jamie@canonical.com>

#include <tunables/global>

/usr/sbin/dhcpd {
  #include <abstractions/base>
  #include <abstractions/nameservice>
  #include <abstractions/ssl_keys>

  capability chown,
  capability net_bind_service,
  capability net_raw,
  capability setgid,
  capability setuid,

  network inet raw,
  network packet packet,
  network packet raw,

  @{PROC}/[0-9]*/net/dev r,
  @{PROC}/[0-9]*/net/{dev,if_inet6} r,
  owner @{PROC}/@{pid}/comm rw,
  owner @{PROC}/@{pid}/task/[0-9]*/comm rw,

  # LP: #1926139
  @{PROC}/cmdline r,

  /etc/hosts.allow r,
  /etc/hosts.deny r,

  /etc/dhcp/ r,
  /etc/dhcp/** r,
  /etc/dhcpd{,6}.conf r,
  /etc/dhcpd{,6}_ldap.conf r,

  /usr/sbin/dhcpd mr,

  /var/lib/dhcp/dhcpd{,6}.leases* lrw,
  /var/log/ r,
  /var/log/** rw,
  /{,var/}run/{,dhcp-server/}dhcpd{,6}.pid rw,

  # isc-dhcp-server-ldap
  /etc/ldap/ldap.conf r,

  # LTSP. See:
  # http://www.ltsp.org/~sbalneav/LTSPManual.html
  # https://wiki.edubuntu.org/
  /etc/ltsp/ r,
  /etc/ltsp/** r,
  /etc/dhcpd{,6}-k12ltsp.conf r,
  /etc/dhcpd{,6}.leases* lrw,
  /ltsp/ r,
  /ltsp/** r,

  # Eucalyptus
  /{,var/}run/eucalyptus/net/ r,
  /{,var/}run/eucalyptus/net/** r,
  /{,var/}run/eucalyptus/net/*.pid lrw,
  /{,var/}run/eucalyptus/net/*.leases* lrw,
  /{,var/}run/eucalyptus/net/*.trace lrw,

  # wicd
  /var/lib/wicd/* r,

  # access to bind9 keys for dynamic update
  # It's expected that users will generate one key per zone and have it
  # stored in both /etc/bind9 (for bind to access) and /etc/dhcp/ddns-keys
  # (for dhcpd to access).
  /etc/dhcp/ddns-keys/** r,

  # allow packages to re-use dhcpd and provide their own specific directories
  #include <dhcpd.d>

  # Site-specific additions and overrides. See local/README for details.
  #include <local/usr.sbin.dhcpd>
}
In our case, the important line is the following:
/var/lib/dhcp/dhcpd{,6}.leases* lrw,
This matches our lease files (including the backups) and gives "l", "r", and "w" permissions (link, read, and write), which is sufficient for an append.
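To convince yourself which paths that rule covers, here is a rough sketch that translates a small subset of apparmor globbing into a regular expression. It is my own approximation, not apparmor's actual matcher: it only handles {a,b} alternation, * (anything except "/"), and ** (anything), and the function name is made up for this example.

```python
import re

def apparmor_glob_to_regex(pattern):
    """Translate a small subset of apparmor globbing into a compiled regex.

    Hypothetical helper: handles {a,b}, ** and * only; '?', '[...]' and
    nested braces are not supported in this sketch.
    """
    out = []
    i = 0
    in_braces = False
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            out.append(".*")        # ** may cross directory separators
            i += 2
            continue
        if ch == "*":
            out.append("[^/]*")     # * stays within one path component
        elif ch == "{":
            out.append("(?:")
            in_braces = True
        elif ch == "}":
            out.append(")")
            in_braces = False
        elif ch == "," and in_braces:
            out.append("|")
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))

rule = apparmor_glob_to_regex("/var/lib/dhcp/dhcpd{,6}.leases*")
for path in ("/var/lib/dhcp/dhcpd.leases",
             "/var/lib/dhcp/dhcpd6.leases",
             "/var/lib/dhcp/dhcpd.leases~",
             "/var/lib/dhcp/dhcpd.conf"):
    print(path, bool(rule.fullmatch(path)))
```

The first three paths match and the last one does not, which agrees with the reading that the rule covers the IPv4 and IPv6 lease files plus their backups.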
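No idea what would make apparmor think that we did not have write permissions. One clue: the denial in the log is for the dac_override capability (operation="capable"), not for a file path. That capability is what lets a process bypass the ordinary owner/mode checks on a file, so the ownership or mode of the lease file was probably off, rather than a rule missing from the profile. In the end I rebooted, and the permission error disappeared. I'm thinking that since the power went off all at once, the computer did not go through a clean boot cycle the first time.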
Note: in newer versions of the ISC DHCP service, the startup makes sure that the files exist and have the proper user/group permissions, etc. Therefore, there is no reason to tamper with chown/chmod on the lease files. These get reset anyway each time you start the service (systemctl start isc-dhcp-server includes commands to that effect).
This was one really weird issue. I don't ever recall getting such an error from apparmor. I'm glad just rebooting worked, though. Easy enough, but it took me a while to think I should do that.
As for the next error, I'm glad I have access to the source code, because that way I could see that it was indeed a fatal error. That is really not clear in the logs.
Interface eno2 matches multiple shared networks
Here is the full output. As you can see, that one line is repeated, but it is also buried in a much larger set of other messages, none of which clearly states that this is the error.
Jul 8 08:34:37 monster systemd[1]: Started ISC DHCP IPv4 server.
Jul 8 08:34:37 monster dhcpd[11474]: Internet Systems Consortium DHCP Server 4.4.1
Jul 8 08:34:37 monster sh[11474]: Internet Systems Consortium DHCP Server 4.4.1
Jul 8 08:34:37 monster sh[11474]: Copyright 2004-2018 Internet Systems Consortium.
Jul 8 08:34:37 monster sh[11474]: All rights reserved.
Jul 8 08:34:37 monster sh[11474]: For info, please visit https://www.isc.org/software/dhcp/
Jul 8 08:34:37 monster dhcpd[11474]: Copyright 2004-2018 Internet Systems Consortium.
Jul 8 08:34:37 monster dhcpd[11474]: All rights reserved.
Jul 8 08:34:37 monster dhcpd[11474]: For info, please visit https://www.isc.org/software/dhcp/
Jul 8 08:34:37 monster dhcpd[11474]: Config file: /etc/dhcp/dhcpd.conf
Jul 8 08:34:37 monster sh[11474]: Config file: /etc/dhcp/dhcpd.conf
Jul 8 08:34:37 monster sh[11474]: Database file: /var/lib/dhcp/dhcpd.leases
Jul 8 08:34:37 monster sh[11474]: PID file: /run/dhcp-server/dhcpd.pid
Jul 8 08:34:37 monster dhcpd[11474]: Database file: /var/lib/dhcp/dhcpd.leases
Jul 8 08:34:37 monster sh[11474]: Wrote 0 deleted host decls to leases file.
Jul 8 08:34:37 monster sh[11474]: Wrote 0 new dynamic host decls to leases file.
Jul 8 08:34:37 monster sh[11474]: Wrote 0 leases to leases file.
Jul 8 08:34:37 monster dhcpd[11474]: PID file: /run/dhcp-server/dhcpd.pid
Jul 8 08:34:37 monster dhcpd[11474]: Internet Systems Consortium DHCP Server 4.4.1
Jul 8 08:34:37 monster dhcpd[11474]: Copyright 2004-2018 Internet Systems Consortium.
Jul 8 08:34:37 monster dhcpd[11474]: All rights reserved.
Jul 8 08:34:37 monster dhcpd[11474]: For info, please visit https://www.isc.org/software/dhcp/
Jul 8 08:34:37 monster dhcpd[11474]: Wrote 0 deleted host decls to leases file.
Jul 8 08:34:37 monster dhcpd[11474]: Wrote 0 new dynamic host decls to leases file.
Jul 8 08:34:37 monster dhcpd[11474]: Wrote 0 leases to leases file.
Jul 8 08:34:37 monster dhcpd[11474]: Interface eno2 matches multiple shared networks
Jul 8 08:34:37 monster sh[11474]: Interface eno2 matches multiple shared networks
Jul 8 08:34:37 monster sh[11474]: If you think you have received this message due to a bug rather
Jul 8 08:34:37 monster sh[11474]: than a configuration issue please read the section on submitting
Jul 8 08:34:37 monster sh[11474]: bugs on either our web page at www.isc.org or in the README file
Jul 8 08:34:37 monster sh[11474]: before submitting a bug. These pages explain the proper
Jul 8 08:34:37 monster sh[11474]: process and the information we find helpful for debugging.
Jul 8 08:34:37 monster sh[11474]: exiting.
Jul 8 08:34:37 monster dhcpd[11474]:
Jul 8 08:34:37 monster dhcpd[11474]: If you think you have received this message due to a bug rather
Jul 8 08:34:37 monster dhcpd[11474]: than a configuration issue please read the section on submitting
Jul 8 08:34:37 monster dhcpd[11474]: bugs on either our web page at www.isc.org or in the README file
Jul 8 08:34:37 monster dhcpd[11474]: before submitting a bug. These pages explain the proper
Jul 8 08:34:37 monster dhcpd[11474]: process and the information we find helpful for debugging.
Jul 8 08:34:37 monster dhcpd[11474]:
Jul 8 08:34:37 monster dhcpd[11474]: exiting.
Jul 8 08:34:37 monster systemd[1]: isc-dhcp-server.service: Main process exited, code=exited, status=1/FAILURE
Jul 8 08:34:37 monster systemd[1]: isc-dhcp-server.service: Failed with result 'exit-code'.
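In both logs on this page, the real error is the last non-empty dhcpd message printed before the "If you think you have received this message due to a bug" notice. Assuming that pattern holds (it is what I observe here, not a documented guarantee), a small sketch like the following can pull the fatal line out of the noise. The function name and the regular expression are my own.

```python
import re

BUG_NOTICE = "If you think you have received this message due to a bug"

# "<month> <day> <time> <host> <tag>[<pid>]: <message>", as in the log above.
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+ [\d:]+ \S+ (?P<tag>[^\[\s]+)\[\d+\]: ?(?P<msg>.*)$"
)

def find_fatal_messages(lines, tag="dhcpd"):
    """Return the messages logged by `tag` right before the bug notice.

    Hypothetical helper; relies on the ordering observed in these logs.
    """
    fatal = []
    previous = None
    for line in lines:
        m = SYSLOG_RE.match(line)
        if not m or m.group("tag") != tag:
            continue  # skip other programs (systemd, kernel, sh, ...)
        msg = m.group("msg").strip()
        if msg.startswith(BUG_NOTICE):
            if previous is not None:
                fatal.append(previous)
        elif msg:
            previous = msg  # remember the last non-empty message
    return fatal

# A few lines taken from the log above.
sample = [
    "Jul 8 08:34:37 monster dhcpd[11474]: Wrote 0 leases to leases file.",
    "Jul 8 08:34:37 monster dhcpd[11474]: Interface eno2 matches multiple shared networks",
    "Jul 8 08:34:37 monster sh[11474]: Interface eno2 matches multiple shared networks",
    "Jul 8 08:34:37 monster sh[11474]: If you think you have received this message due to a bug rather",
    "Jul 8 08:34:37 monster dhcpd[11474]:",
    "Jul 8 08:34:37 monster dhcpd[11474]: If you think you have received this message due to a bug rather",
    "Jul 8 08:34:37 monster dhcpd[11474]: exiting.",
]
print(find_fatal_messages(sample))
```

Filtering on one tag also takes care of the duplicated sh/dhcpd entries, since each message is otherwise reported twice.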
As you can imagine, the error is quite misleading. The issue is not that there are multiple networks on the eno2 interface. I do have like 7 IPs assigned to that interface.
Reading between the lines, what the error is saying is that I have two separate subnet definitions in my /etc/dhcp/dhcpd.conf file and they both match the same interface (eno2). The server treats each standalone subnet block as its own shared network, so an interface that matches two of them is a fatal error.
A subnet definition can look like this:
subnet 192.168.1.0 netmask 255.255.255.0 { }
This tells the DHCP service to communicate with computers that match the subnet and netmask. But the way the DHCP system works is not specific to an IP address. Instead, it is specific to an interface. So we want to use the following command:
ip address
and search for an address that falls within 192.168.1.0/24. The interface carrying that address is the one on which the DHCP server hands out the addresses defined by that subnet block. Therefore, it makes sense that you would not be able to have two independent subnets defined against the same interface. For a request arriving on that interface, DHCP would not be able to decide whether to serve subnet A or subnet B.
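That check can be done by hand, but here is a minimal sketch of the same idea using Python's ipaddress module. The interface names and addresses below are made-up examples (yours come from the ip address output), the function name is my own, and the sketch assumes every subnet block stands on its own in dhcpd.conf.

```python
import ipaddress
from collections import defaultdict

def subnets_per_interface(interface_addrs, declared_subnets):
    """Map each interface to the declared subnets that one of its IPs falls in.

    Hypothetical helper: interface_addrs is {name: ["ip/prefix", ...]} and
    declared_subnets is a list of "network/prefix" strings from dhcpd.conf.
    """
    result = defaultdict(list)
    for ifname, addrs in interface_addrs.items():
        for subnet in declared_subnets:
            net = ipaddress.ip_network(subnet)
            if any(ipaddress.ip_interface(a).ip in net for a in addrs):
                result[ifname].append(subnet)
    return dict(result)

# Hypothetical data for illustration only.
interface_addrs = {
    "eno1": ["203.0.113.10/24"],
    "eno2": ["192.168.1.1/24", "10.0.10.1/24", "192.168.2.1/24"],
}
declared_subnets = ["192.168.1.0/24", "10.0.10.0/24"]

matches = subnets_per_interface(interface_addrs, declared_subnets)
conflicts = {name: nets for name, nets in matches.items() if len(nets) > 1}
print(conflicts)
```

Any interface listed in conflicts is a candidate for the "matches multiple shared networks" error. Note that extra IPs with no declared subnet (192.168.2.1 here) are harmless in this check.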
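All I had to do was make sure that each subnet pointed to a different interface, or comment out the offending one. (If you really need two subnets on the same interface, dhcpd expects them to be grouped inside a single shared-network block.) Note that if some of your interfaces do not have a matching subnet definition, you get a warning. That shows in red, but it can be ignored.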
I'm not too sure I understand why it started failing in my case. I don't recall moving an IP address to another interface, but I probably did. It makes sense that this did not break DHCP until the next reboot: the server was not restarted in between, so it never noticed that an interface now matched two subnet definitions.
My main problem here is the error message. It really does not mean what one would think it means...