Today, a fax machine at the office started complaining that it couldn’t send emails. No useful error messages or anything…
After some digging, it turned out the fax machine was getting SERVFAIL responses from the name server. This name server carries slave zones for certain domains used for critical infrastructure, and it was having trouble getting updates from upstream.
Unfortunately the error message was not very helpful:
Mar 11 11:16:04 ns3 named: transfer of 'fubra.it/IN' from 18.104.22.168#53: failed while receiving responses: CNAME and other data
Mar 11 11:16:04 ns3 named: transfer of 'fubra.it/IN' from 22.214.171.124#53: end of transfer
A little more searching turned up a useful command that is installed by the bind package: named-checkzone.
This command made it easy to see where the error came from. First I grabbed the zone using dig with the axfr option:
dig @ns1.fubra.com fubra.it axfr > /root/db.fubra.it
Next I used named-checkzone to parse the zone and reveal the problem:
named-checkzone -d fubra.it. /root/db.fubra.it
This returned the following:
loading "fubra.it." from "/root/db.fubra.it" class "IN"
dns_master_load: /root/db.fubra.it:69: code.fubra.it: CNAME and other data
zone fubra.it/IN: loading master file /root/db.fubra.it: CNAME and other data
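Line 69 of the zone file revealed the problem: there was both an address record and a CNAME record for the same name, code.fubra.it. DNS does not allow a CNAME to share its owner name with any other record type (DNSSEC records aside), so BIND rejected the whole zone. Two minutes later the problem was fixed.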
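If named-checkzone is not to hand, the same "CNAME and other data" condition can be spotted with a few lines of awk. This is only a rough sketch, not a replacement for named-checkzone. The function name find_cname_conflicts is made up for illustration, and the records below are invented sample data using reserved example names and addresses, not the real fubra.it zone. It assumes the fully expanded one-record-per-line form that dig prints (name, TTL, class, type, data), so a hand-written master file with $ORIGIN or omitted fields would need a different approach.

```shell
# Hypothetical helper: reads records in "name TTL class type rdata" form
# (the layout dig prints for an AXFR) on stdin and prints every owner name
# that has a CNAME together with the other record types found at that name.
find_cname_conflicts() {
  awk '
    /^[[:space:]]*;/ || NF < 5 { next }          # skip comments, blanks, short lines
    {
      owner = tolower($1)                        # DNS names are case-insensitive
      rtype = toupper($4)
    }
    rtype == "CNAME" { has_cname[owner] = 1; next }
    rtype == "RRSIG" || rtype == "NSEC" { next } # DNSSEC records may sit beside a CNAME
    { others[owner] = others[owner] " " rtype }
    END {
      for (owner in has_cname)
        if (owner in others)
          print owner ":" others[owner]
    }
  ' | sort
}

# Invented sample zone with one deliberate conflict at code.example.com.
find_cname_conflicts <<'EOF'
example.com.       3600 IN SOA   ns1.example.com. hostmaster.example.com. 1 7200 900 1209600 3600
example.com.       3600 IN NS    ns1.example.com.
ns1.example.com.   3600 IN A     192.0.2.1
code.example.com.  3600 IN A     192.0.2.10
code.example.com.  3600 IN CNAME www.example.com.
www.example.com.   3600 IN A     192.0.2.20
EOF
# prints: code.example.com.: A
```

Against a saved transfer such as the one above, it could be run as find_cname_conflicts < /root/db.fubra.it. Unlike named-checkzone it does not report line numbers, so it only narrows down which name to look for.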