I've been running a small SMTP and IMAP mail server for many years, hosting a handful of individual mailboxes. It's hard to say when exactly I started. whois says I registered the tablix.org domain in 2005 and I remember hosting a mailing list for my colleagues at the university a bit before that, so I think it's safe to say it's been around 15 years.

Although I don't jump right away on every email-related novelty, I've tried to keep the server up-to-date with well accepted standards over the years. Some of these came for free with Debian updates. Others needed some manual work. For example, I have SPF records and DKIM message signing setup on the domains I use. The server is hosted on commercial static IP space (with the very same IP it first went on-line) and I've made sure with the ISP that correct reverse DNS records are in place.

Image by Andreas Trepte CC BY-SA 2.5

From the beginning I've been worrying that my server would be used for sending spam. So I always made sure I did not have an open relay and put in place throughput restrictions and monitoring that would alert me about unusual traffic. In any case, the amount of outgoing mail has stayed pretty minimal over the years. Since I'm hosting just a few personal accounts these days, there have been less than 1000 messages sent to remote servers over SMTP in the last 12 months. I've given up on hosting mailing lists many years ago.

All of this effort paid off and, as far as I'm aware, my server was never listed on any of the public spam black lists.

So why am I writing all of this? Unfortunately, email is starting to become synonymous with Google's mail, and Google's machines have decided that mail from my server is simply not worth receiving. Being a good administrator and a well-behaved player on the network is no longer enough:

550-5.7.1 [...] Our system has detected that this 550-5.7.1 message is likely unsolicited mail. To reduce the amount of spam sent 550-5.7.1 to Gmail, this message has been blocked. Please visit 550-5.7.1 https://support.google.com/mail/?p=UnsolicitedMessageError 550 5.7.1 for more information. ... - gsmtp

Since mid-December last year, I'm regularly seeing SMTP errors like these. Sometimes the same message re-sent right away will not bounce again. Sometimes rephrasing the subject will fix it. Sometimes all mail from all accounts gets blocked for weeks on end until some lucky bit flips somewhere and mail mysteriously gets through again. Since many organizations use Gmail for mail hosting this doesn't happen just for ...@gmail.com addresses. Now every time I write a mail I wonder whether Google's AI will let it through or not. Only when something like this happens you realize just how impossible it is to talk to someone on the modern internet without having Google somewhere in the middle.

Of course, the 550 SMTP error helpfully links to a wholly unhelpful troubleshooting page. It vaguely refers to suspicious looking text and IP history. It points to Bulk Sender Guidelines, but I have trouble seeing myself as a bulk sender with 10 messages sent last week in total. It points to the Postmaster Tools which, after letting me jump through some hoops to authenticate, tells me I'm too small a fish and has no actual data to show.

So far Google has blocked personal messages to friends and family in multiple languages, as well as business mail. I stopped guessing what text their algorithms deem suspicious. What kind of intelligence sees a reply, with the original message referenced in the In-Reply-To header and part quoted, and considers it unsolicited? I don't discount the possibility that there is something misconfigured at my end, but since Google gives no hint and various third-party tools I've tried don't report anything suspicious I've ran out of ideas where else to look.

My server isn't alone with this problem. At work we use Google's mail hosting and I've seen this trigger happy filter from the other end. Just recently I've overlooked an important mail because it ended up in the spam folder. I guess it was pure luck it didn't get rejected at the SMTP layer. With my work email address I'm subscribed to several mailing lists of open source software projects and regularly Google will decide to block this traffic. I know since Mailman sends me a notification that my address caused excessive bounces. What system decides, after months of watching me read these messages and not once seeing me mark one as spam, that I suddenly don't want to receive them ever again?

I wonder. Google as a company is famously focused on machine learning through automated analytics and bare minimum of human contact. What kind of a signal can they possibly use to train these SMTP rejects? Mail gets rejected at the SMTP level without user's knowledge. There is no way for a recipient to mark it as not-spam since they don't know the message ever existed. In contrast to merely classifying mail into spam/non-spam folders, it's impossible for an unprivileged human to tell the machine it has made a mistake. Only the sender knows the mail got rejected and they don't have any way to report it either. One half of the feedback loop appears to be missing.

I'm sure there is no malicious intent behind this and that there are some very smart people working on spam prevention at Google. However for a metric driven company where a majority of messages are only passed with-in the walled garden, I can see how there's little motivation to work well with mail coming from outside. If all training data is people marking external mail as spam and there's much less data about false positives, I guess it's easy to arrive to a prior that all external mail is spam even with best intentions.

This is my second rant about Google in a short while. I'm mostly indifferent to their search index policies, however this mail problem is much more frustrating. I can switch search engines, but I can't tell other people to go off Gmail. Email used to work, from its 7-bit days onward. It was one standard thing that you could rely on in the ever changing mess of messaging web apps and proprietary lock-ins. And now it's increasingly broken. I hope people realize that if they don't get a reply, perhaps it's because some machine somewhere decided for them that they don't need to know about it.