Shopify breaks email for whole domain
Shopify and Google G Suite broke everything email:
Today a client reported that ALL internal email from one user to another was going to the SPAM folder. Our monitoring had caught the issue as well, and we were already working on a fix. So what happened to break all the email? First, the client’s email is hosted by G Suite. After a bit of tracking, we found that Shopify’s SPF record broke everything. How exactly is it possible that Shopify broke the client’s internal user-to-user G Suite email?
SPF and DMARC
Our customer’s SPF record was: “v=spf1 include:shops.shopify.com include:_spf.google.com include:sendgrid.net ~all” and their DMARC record was: “v=DMARC1; rua=mailto:firstname.lastname@example.org; ruf=mailto:email@example.com; p=quarantine; sp=none; fo=1;”
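For readers who want to see what is actually inside those two records, here is a minimal Python sketch that splits them apart. The function names `parse_spf` and `parse_dmarc` are made up for this post, and the parsing is deliberately simplified (it ignores qualifiers on “include” and every SPF mechanism other than “include” and “all”).

```python
def parse_spf(record: str) -> dict:
    """Split an SPF TXT record into its include targets and its 'all' term."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    # Keep the include targets in the order they appear in the record.
    includes = [t.split(":", 1)[1] for t in terms[1:]
                if t.lower().startswith("include:")]
    # The 'all' term may carry a qualifier such as ~ (softfail) or - (fail).
    all_term = next((t for t in terms[1:]
                     if t.lstrip("+-~?").lower() == "all"), None)
    return {"includes": includes, "all": all_term}


def parse_dmarc(record: str) -> dict:
    """Turn a DMARC TXT record into a tag -> value dictionary."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip().lower()] = value.strip()
    return tags


SPF = ("v=spf1 include:shops.shopify.com include:_spf.google.com "
       "include:sendgrid.net ~all")
DMARC = ("v=DMARC1; rua=mailto:firstname.lastname@example.org; "
         "ruf=mailto:email@example.com; p=quarantine; sp=none; fo=1;")

spf = parse_spf(SPF)
dmarc = parse_dmarc(DMARC)
print(spf["includes"])         # the three include targets, Shopify first
print(dmarc["p"], dmarc["sp"])  # domain policy and subdomain policy
```

The two things that matter for the rest of this story are that shops.shopify.com is the first include and that the domain policy is “quarantine”.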
Earlier this week (up to yesterday) shops.shopify.com had an SPF record. Today it does not. Earlier this week, all email was working well. Today it is not. So here is what was happening today:
- One G Suite user would send email to another internal G Suite user (at the same company and same domain).
- The outgoing G Suite server would send it ‘out’ to the incoming G Suite server.
- The incoming G Suite server would then receive the email, look up the SPF record.
- Then it gets a little more complex – G Suite would “permerror” the entire SPF record lookup, just because one of the includes did not resolve.
- The receiving G Suite server would then look up the DMARC record, see our quarantine policy, and put the email in the spam folder.
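To make the chain above concrete, here is a rough Python sketch of it. This is a toy model, not how Google implements anything: `FAKE_SPF_ZONE`, `evaluate_spf` and `dmarc_disposition` are hypothetical names, the IP addresses are documentation placeholders, DNS is replaced by a dictionary, and DMARC alignment is reduced to “SPF passed or an aligned DKIM signature passed”. It also assumes there was no aligned DKIM pass to rescue the message.

```python
# Hypothetical stand-in for DNS: domain -> set of authorized sender IPs.
# A missing key models a domain that has no SPF record at all.
FAKE_SPF_ZONE = {
    "_spf.google.com": {"192.0.2.10"},
    "sendgrid.net": {"192.0.2.20"},
    # "shops.shopify.com" is deliberately absent, as it was today.
}


def evaluate_spf(includes, sender_ip, zone, default="softfail"):
    """Walk the includes in order; a target without a record is a permerror."""
    for target in includes:
        if target not in zone:
            return "permerror"   # the whole evaluation stops here
        if sender_ip in zone[target]:
            return "pass"
    return default               # fell through to the ~all term


def dmarc_disposition(spf_result, dkim_aligned_pass, policy):
    """Simplified DMARC: apply the policy only if neither SPF nor DKIM passed."""
    if spf_result == "pass" or dkim_aligned_pass:
        return "deliver"
    return {"none": "deliver", "quarantine": "spam", "reject": "reject"}[policy]


includes = ["shops.shopify.com", "_spf.google.com", "sendgrid.net"]

# Today: mail from a Google server, but the Shopify record is gone.
spf_result = evaluate_spf(includes, "192.0.2.10", FAKE_SPF_ZONE)
outcome = dmarc_disposition(spf_result, dkim_aligned_pass=False,
                            policy="quarantine")
print(spf_result, outcome)  # permerror spam

# Earlier this week: same mail, Shopify record still present.
healthy_zone = dict(FAKE_SPF_ZONE)
healthy_zone["shops.shopify.com"] = {"192.0.2.30"}
spf_before = evaluate_spf(includes, "192.0.2.10", healthy_zone)
outcome_before = dmarc_disposition(spf_before, dkim_aligned_pass=False,
                                   policy="quarantine")
print(spf_before, outcome_before)  # pass deliver
```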
Should G Suite have returned a “permerror” and flagged the email (from G Suite, mind you) just because one unrelated record (from Shopify) did not resolve? In my opinion NO! RFC 7208 also has recommendations about ‘void’ lookups (lookups that return RCODE 0 with no answers, or RCODE 3), and the RFC recommends allowing TWO ‘void’ lookups before giving a “permerror”. But it appears that G Suite gives up after a single ‘void’ lookup and issues a “permerror”. To be fair, RFC 7208 section 5.2 also says that an “include” whose target has no SPF record evaluates to “permerror”, so Google may simply be applying that rule. Either way, not good in this case!!
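The ‘void’ lookup rule is easy to show in a few lines of Python. This is only a sketch of the counting rule as I read RFC 7208; the names `is_void` and `void_limit_exceeded` are invented here, and each DNS answer is reduced to a made-up pair of (RCODE, number of answers).

```python
NOERROR, NXDOMAIN = 0, 3  # DNS response codes


def is_void(rcode: int, answer_count: int) -> bool:
    """A void lookup is NXDOMAIN, or NOERROR that came back with no answers."""
    return rcode == NXDOMAIN or (rcode == NOERROR and answer_count == 0)


def void_limit_exceeded(lookups, limit=2):
    """Return True once more than `limit` void lookups have been seen."""
    voids = 0
    for rcode, answers in lookups:
        if is_void(rcode, answers):
            voids += 1
            if voids > limit:
                return True
    return False


# Hypothetical lookups for our record: one void answer, then two good ones.
lookups = [(NOERROR, 0), (NOERROR, 3), (NOERROR, 2)]

print(void_limit_exceeded(lookups))           # False with the default of two
print(void_limit_exceeded(lookups, limit=0))  # True for a receiver that tolerates none
```

With the recommended limit, a single missing record on its own would not trip this particular rule.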
Of course the root problem was the Shopify SPF record disappearing, and that record is required for the Internet to receive validated email. But G Suite is still fragile, as demonstrated by being broken so easily. It is still annoying that I had to spend a couple of hours on this just to keep email flowing and inform everyone what was going on. Shopify has to fix their problem, but it would be nice if Google updated its code so it does not break G Suite when an external company makes a mistake.
We monitor all records and email, which let us catch the record issue quickly, but not before a few internal emails went to spam. Worse, some emails to customers (from G Suite, sendgrid.net and Shopify) were rejected outright, including purchase receipts! This affected all email for the domain. How it was treated depended on the receiving email server: some put it in Junk, some rejected it outright.
Especially ironic that internal emails within the same domain were put in Spam folder.
We removed Shopify from our SPF record and I recommend you do the same. Then we had to set our DMARC record policy to “none”. The first fixed the issue with receiving G Suite email. The second is a fix for Shopify (otherwise those emails would have been sent to spam). So for the moment that domain’s email sending is not as secure as we would like, but we have to wait for Shopify to fix things.
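If you want to see exactly what the two changes look like, here is a small Python sketch that applies them to the records quoted earlier in this post. The helpers `remove_include` and `set_dmarc_policy` are hypothetical and only do plain string editing. You would still publish the results through your DNS provider.

```python
def remove_include(spf_record: str, domain: str) -> str:
    """Drop one include:<domain> term and leave everything else in place."""
    unwanted = f"include:{domain}".lower()
    kept = [term for term in spf_record.split() if term.lower() != unwanted]
    return " ".join(kept)


def set_dmarc_policy(dmarc_record: str, policy: str) -> str:
    """Replace the p= tag (not sp=) with a new policy value."""
    parts = [p.strip() for p in dmarc_record.split(";") if p.strip()]
    updated = [f"p={policy}" if p.lower().startswith("p=") else p
               for p in parts]
    return "; ".join(updated) + ";"


OLD_SPF = ("v=spf1 include:shops.shopify.com include:_spf.google.com "
           "include:sendgrid.net ~all")
OLD_DMARC = ("v=DMARC1; rua=mailto:firstname.lastname@example.org; "
             "ruf=mailto:email@example.com; p=quarantine; sp=none; fo=1;")

new_spf = remove_include(OLD_SPF, "shops.shopify.com")
new_dmarc = set_dmarc_policy(OLD_DMARC, "none")
print(new_spf)
print(new_dmarc)
```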
We will check the Shopify record every day to see when they fix it. (We also opened a ticket with Shopify, which they said got ‘escalated’.)
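A daily check like that can be tiny. Below is a Python sketch of the idea, not our actual monitoring. The DNS lookup is passed in as a function so the example runs without network access, and `fake_lookup` with its TXT strings is made-up test data. In real use the lookup could wrap something like dnspython’s `dns.resolver.resolve(domain, "TXT")`, with its NXDOMAIN and NoAnswer exceptions turned into an empty list. That wiring is an assumption and is left out here.

```python
from typing import Callable, Dict, Iterable, List


def has_spf_record(txt_records: Iterable[str]) -> bool:
    """True if any TXT string is an SPF record (starts with v=spf1)."""
    for txt in txt_records:
        stripped = txt.strip().strip('"').lower()
        if stripped == "v=spf1" or stripped.startswith("v=spf1 "):
            return True
    return False


def check_includes(includes: List[str],
                   lookup_txt: Callable[[str], List[str]]) -> Dict[str, bool]:
    """Map each include target to whether it currently publishes SPF."""
    return {domain: has_spf_record(lookup_txt(domain)) for domain in includes}


def fake_lookup(domain: str) -> List[str]:
    """Made-up TXT data standing in for DNS; Shopify returns nothing."""
    zone = {
        "_spf.google.com": ["v=spf1 ip4:192.0.2.0/24 ~all"],
        "sendgrid.net": ["some-verification-token", "v=spf1 ip4:198.51.100.0/24 ~all"],
    }
    return zone.get(domain, [])


includes = ["shops.shopify.com", "_spf.google.com", "sendgrid.net"]
status = check_includes(includes, fake_lookup)
broken = [domain for domain, ok in status.items() if not ok]
print(broken)  # ['shops.shopify.com']
```

Run once a day, the same check would tell us the moment the Shopify record comes back.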
As for Google, there is no easy way for us to tell from the outside whether they have changed the void lookup behavior, but hopefully they will so this is not such a problem in the future.
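Update: Shopify got back to us, told us it was our fault, and said the order of the elements in our SPF TXT record was wrong. This misses the point. Receivers evaluate SPF mechanisms from left to right and stop at the first match, so moving “include:shops.shopify.com” to the end would at best be a workaround: mail from Google and SendGrid would match before the broken include is reached, but Shopify’s own mail, and anything else that gets that far, would still hit a “permerror”. SHOPIFY has their SPF record for shops.shopify.com MISSING IN ACTION and they need to put it back or everything will remain broken.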
Why are some large companies that rely on email as a core part of their business so clueless about what is these days just basic functionality?