Troubleshooting DNS Using dig to Figure Out Dropped Emails

"Matt, some of my emails to you are bouncing back, saying that the mail server rejected the recipient". This isn't a message you want to get from the CTO of a company you applied to for a DevOps position. Yet that's where I found myself a little over a year ago: I had just moved to a new country, I was still up to my neck in moving-related chores, and I was looking for a job at the same time.

I didn't have the spare energy to invest in this, even though it was pretty critical to the job search. I messed around with the SPF records and thought I was done. Thankfully, the CTO who alerted me to this took the time to walk me through the issue. Thanks, Oliver!

I've learned a lot more about DNS since then and I decided to look back at the problem and describe it and the solution to illustrate how to troubleshoot general DNS issues. This writeup is based on Oliver's walkthrough.

First, a description of the problem. Some emails would bounce back with a message stating that:

Google tried to deliver your message, but it was rejected by the server for the recipient domain mattscodecave.com by smtp.secureserver.net.

Two pieces of information in that message are important. The first one is the domain for my email address, mattscodecave.com. The second one is the name of the server actually handling the email, smtp.secureserver.net. That server is telling the sender that it doesn't accept emails for the domain mattscodecave.com. No wonder, since my email is taken care of by Gmail. So why is some poor server getting messages for a recipient that doesn't belong to it?

Email uses DNS to route emails to the correct server. If emails are going to the wrong server, something must be wrong in the DNS records. The DNS records responsible for routing emails are called MX records. Let's use the *nix dig utility to check the MX records for the domain mattscodecave.com:

    dig MX mattscodecave.com
    # <snip>
    ;; ANSWER SECTION:
    mattscodecave.com. 172817 IN MX 10 alt2.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt3.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt4.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 1 aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt1.aspmx.l.google.com.
    # <snip>

Those entries are correct. They tell all the servers out there to push any emails headed my way to Google's mail servers.
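When all you care about is the records themselves, dig's +short option trims the output down to one "preference host" pair per line. Here is a minimal sketch of a wrapper around it. The function name mx_records is made up for illustration; it assumes dig is installed and optionally takes a specific name server to ask.

```shell
# mx_records DOMAIN [SERVER]
# Print the MX records for DOMAIN, one "preference host" pair per line,
# sorted by preference (the lowest number is tried first).
# If SERVER is given, ask that name server directly instead of the
# system's default resolver.
mx_records() {
    domain=$1
    server=${2:-}
    if [ -n "$server" ]; then
        dig +short MX "$domain" "@$server" | sort -k1,1n -k2,2
    else
        dig +short MX "$domain" | sort -k1,1n -k2,2
    fi
}
```

Running mx_records mattscodecave.com should list the same Google servers as above, with aspmx.l.google.com first because it has the lowest preference value.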

Usually, more than one authoritative name server is in charge of a domain, so let's get all the name servers (NS records) for my domain:

    dig NS mattscodecave.com
    # <snip>
    ;; ANSWER SECTION:
    mattscodecave.com. 172800 IN NS ns1.dnsowl.com.
    mattscodecave.com. 172800 IN NS ns2.dnsowl.com.
    mattscodecave.com. 172800 IN NS ns3.dnsowl.com.
    # <snip>

Now we have to ask each of these servers for its MX records and make sure they are correct.

    dig MX @ns1.dnsowl.com mattscodecave.com
    # <snip>
    ;; ANSWER SECTION:
    mattscodecave.com. 172817 IN MX 10 alt2.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt3.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt4.aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 1 aspmx.l.google.com.
    mattscodecave.com. 172817 IN MX 10 alt1.aspmx.l.google.com.
    # <snip>

The entries are good. We repeat the same procedure for servers ns2.dnsowl.com and ns3.dnsowl.com. All the records come back good. Bummer.
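Asking each name server by hand gets tedious, so the same check can be sketched as a small loop. This is only an illustration: the function names are hypothetical, it assumes dig is installed, and it only covers the name servers that the NS query returns.

```shell
# Hypothetical helper: the name servers returned by an NS query for DOMAIN.
zone_name_servers() {
    dig +short NS "$1" | sort
}

# Hypothetical helper: sorted MX answers for DOMAIN from one name server.
mx_from() {
    dig +short MX "$1" "@$2" | sort -k1,1n -k2,2
}

# check_mx_consistency DOMAIN
# Ask every name server for DOMAIN's MX records and compare each answer
# with the answer of the first server in the list.
# Returns 0 when all servers agree and 1 when at least one differs.
check_mx_consistency() {
    domain=$1
    reference=""
    have_reference=no
    status=0
    for ns in $(zone_name_servers "$domain"); do
        answer=$(mx_from "$domain" "$ns")
        if [ "$have_reference" = no ]; then
            reference=$answer
            have_reference=yes
            echo "reference $ns"
        elif [ "$answer" = "$reference" ]; then
            echo "same      $ns"
        else
            echo "DIFFERENT $ns"
            status=1
        fi
    done
    return "$status"
}
```

With check_mx_consistency mattscodecave.com, every line marked DIFFERENT would be a server worth querying by hand with dig MX @server to see what it actually hands out.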

But wait! What if there's some other DNS server handing out bad information? To find out, we have to start at the root DNS servers. We can use dig NS with the @ syntax again, but we have to get the address of a root server first. IANA sports a list of them, so let's pick the first one and run with it:

    dig NS @a.root-servers.net mattscodecave.com
    # <snip>
    ;; AUTHORITY SECTION:
    com. 172800 IN NS a.gtld-servers.net.
    com. 172800 IN NS b.gtld-servers.net.
    com. 172800 IN NS c.gtld-servers.net.
    com. 172800 IN NS d.gtld-servers.net.
    com. 172800 IN NS e.gtld-servers.net.
    # <snip>

The root server response points us at the authoritative servers for the .com domain. We use the first one from that list and repeat the query:

    dig NS @a.gtld-servers.net mattscodecave.com
    # <snip>
    ;; AUTHORITY SECTION:
    mattscodecave.com. 172800 IN NS ns1.digitalocean.com.
    mattscodecave.com. 172800 IN NS ns2.digitalocean.com.
    mattscodecave.com. 172800 IN NS ns3.digitalocean.com.
    mattscodecave.com. 172800 IN NS ns1.dnsowl.com.
    mattscodecave.com. 172800 IN NS ns2.dnsowl.com.
    mattscodecave.com. 172800 IN NS ns3.dnsowl.com.
    # <snip>

AHA! There are three other authoritative name servers that represent my domain to the world. They belong to DigitalOcean. At that time, I was hosting this blog on one of their VPSs and I had recently moved my domain to another registrar, but I forgot to clean up the DNS entries.

Let's see what kind of MX records these DigitalOcean servers have:
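The comparison that exposed the problem, the name servers the .com servers delegate to versus the name servers a plain NS query returns, can also be sketched as a script. Treat this as a rough illustration: the function names are invented, the parent server (a.gtld-servers.net here) has to be supplied by hand, and it relies on dig's +noall and +authority options to print only the AUTHORITY section. dig also has a +trace option that walks the delegation from the root servers on its own, which is handy for eyeballing the same thing.

```shell
# delegated_name_servers DOMAIN PARENT_SERVER
# Ask a parent-zone server (for example a .com TLD server) which name
# servers it delegates DOMAIN to. The referral arrives in the AUTHORITY
# section, so print only that section and keep the target of each NS record.
delegated_name_servers() {
    dig +norecurse +noall +authority NS "$1" "@$2" |
        awk '$4 == "NS" { print tolower($5) }' | sort -u
}

# advertised_name_servers DOMAIN
# The NS records as returned through the default resolver.
advertised_name_servers() {
    dig +short NS "$1" | tr '[:upper:]' '[:lower:]' | sort -u
}

# delegation_mismatch DOMAIN PARENT_SERVER
# Print every name server that shows up in only one of the two lists.
delegation_mismatch() {
    parent_list=$(delegated_name_servers "$1" "$2")
    zone_list=$(advertised_name_servers "$1")
    for ns in $parent_list; do
        printf '%s\n' "$zone_list" | grep -qxF "$ns" ||
            echo "only in delegation: $ns"
    done
    for ns in $zone_list; do
        printf '%s\n' "$parent_list" | grep -qxF "$ns" ||
            echo "only in zone: $ns"
    done
}
```

In the situation described here, delegation_mismatch mattscodecave.com a.gtld-servers.net would presumably have flagged the three digitalocean.com servers as "only in delegation".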

    dig MX @ns1.digitalocean.com mattscodecave.com
    # <snip>
    ;; ANSWER SECTION:
    mattscodecave.com. 1800 IN MX 10 mailstore1.secureserver.net.
    mattscodecave.com. 1800 IN MX 0 smtp.secureserver.net.
    # <snip>

This is it! The DigitalOcean DNS servers were pointing at the wrong email server. Since there were 6 authoritative servers, 3 of which had the correct records and 3 the incorrect ones, some emails would go through while others wouldn't, depending on which name server the sender's resolver happened to ask.

I could have updated the MX records on the DigitalOcean servers, but instead I logged into my DigitalOcean account and disabled the name servers for my domain. Three name servers are plenty. A little while later, I queried the a.gtld-servers.net server about the authoritative name servers for my domain again (dig NS @a.gtld-servers.net mattscodecave.com) and the only entries in the AUTHORITY section were the correct dnsowl.com servers. Victory!
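That last check, making sure only the expected provider's name servers are left, is easy to script as a filter. The sketch below is an assumption-laden illustration: unexpected_name_servers is a made-up name, and it simply reads host names on standard input and prints the ones outside the given domain suffix.

```shell
# unexpected_name_servers SUFFIX
# Read name-server host names on stdin, one per line, and print those that
# do not belong to SUFFIX. No output means every server is an expected one.
unexpected_name_servers() {
    suffix=$1
    while IFS= read -r ns; do
        ns=${ns%.}                      # drop the trailing root dot, if any
        case "$ns" in
            "") ;;                      # skip blank lines
            *."$suffix"|"$suffix") ;;   # belongs to the expected provider
            *) printf '%s\n' "$ns" ;;   # anything else is suspicious
        esac
    done
}
```

One way to feed it would be dig +noall +authority NS mattscodecave.com @a.gtld-servers.net | awk '$4 == "NS" { print $5 }' | unexpected_name_servers dnsowl.com, which should print nothing once the stale delegation is gone.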

The above should be enough to troubleshoot a lot of DNS issues.

This work is licensed under a Creative Commons Attribution 4.0 International License.