I have a cloud provider partner hosting a SmarterMail 15.x server for us. We also subscribe to a backup MX service, which should receive email only when the SmarterMail server is down. In my MX records, the SmarterMail server has priority 0 and the two backup servers have priorities 10 and 20, respectively.
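To make the expected behavior concrete, here is a minimal sketch of how a sending server is supposed to order the MX hosts: lowest preference number first. The hostnames are hypothetical placeholders; only the preference values 0, 10, and 20 come from my setup.

```python
# Hypothetical hostnames; only the preference values (0, 10, 20) are real.
mx_records = [
    (20, "backup2.example.net"),
    (0, "mail.example.com"),      # SmarterMail primary
    (10, "backup1.example.net"),
]


def delivery_order(records):
    """Return hosts in the order a sender should try them (lowest preference first)."""
    return [host for _, host in sorted(records, key=lambda record: record[0])]


print(delivery_order(mx_records))
```

A sender that follows this order should reach a backup host only after its attempt against the primary has failed, which is why deliveries landing directly on the backups are surprising.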
Recently we had to track down a rash of missing emails that never reached their intended recipients. During the investigation we found that Office 365 skips the primary server and sends emails directly to the backup servers. When that happens, the backup server relays the email to us, so it no longer arrives from an IP address listed in the sender's SPF record. Because the sender's SPF policy is set to hard fail (-all is the default at Office 365), the SPF check fails and the emails just vanish.
We deal mostly with about 5 domains, and one of them is HomeDepot.com. They use ~all in their SPF record, so their emails make it through the SPF test and are still delivered when they pass through the backup MX server. Emails from the other domains all get rejected (most of the time without any NDRs) because of the hard-fail SPF policy.
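The difference between -all and ~all on a relayed message can be sketched as follows. This is a deliberately simplified SPF evaluator that understands only ip4: mechanisms and a final all term; real records (Office 365 included) normally use include: and need DNS lookups. The IP addresses and records are made-up documentation values, not the actual senders' data.

```python
import ipaddress

# Map SPF qualifiers to their results.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_result(record: str, client_ip: str) -> str:
    """Evaluate a simplified SPF record (ip4: mechanisms and 'all' only)."""
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        qualifier = "+"              # default qualifier when none is given
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return QUALIFIERS[qualifier]
        elif term == "all":
            return QUALIFIERS[qualifier]
    return "neutral"


# Assumed example values: the sender's own range and a backup MX address.
hard = "v=spf1 ip4:192.0.2.0/24 -all"
soft = "v=spf1 ip4:192.0.2.0/24 ~all"
sender_ip = "192.0.2.10"      # direct delivery from the sender
backup_mx_ip = "198.51.100.7"  # relay through the backup MX

print(spf_result(hard, sender_ip))     # direct delivery
print(spf_result(hard, backup_mx_ip))  # relayed, hard-fail policy
print(spf_result(soft, backup_mx_ip))  # relayed, soft-fail policy
```

Under these assumptions the direct delivery passes, the relayed message fails under -all, and the same relayed message only soft-fails under ~all, which matches what we see with HomeDepot.com versus the other domains.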
I have already implemented some workarounds, like using ~all instead of -all, to make sure the emails come through, but I am at a loss when it comes to figuring out why Office 365 keeps going to the backup server without even trying the primary server. Has anyone had experience with this, or does anyone know what is going on? I have my max connections on SmarterMail set to 1000, so I don't believe that is the issue here. On a daily basis my average connection count is in the 10-15 range. Not exactly an apples-to-apples comparison, but I don't think that is where the issue lies.
Looking forward to hearing from anyone who might have some bright ideas to try.