Inbound MX servers getting hung up
Problem reported by George Ascione - 10/1/2015 at 5:38 PM
I've been experiencing a weird issue since I upgraded from v12 to v14. My MX servers (simple Exim-based machines) pass all inbound mail traffic to my SmarterMail server. They were never configured as anything radical, just whitelisted on the SmarterMail box to keep them from getting throttled. Ever since I upgraded to v14, my MX servers sometimes get kicked from SMTP connections with the error "Remote host mailserver.xxxxxxxxxx.com [xx.xxx.xx.xxx] closed connection in response to sending data block".
These messages get queued on the Exim machine, and most never make it out until I manually log onto the server and force the queue out with 'exim -qff'.
I went as far as adding the MX servers as inbound gateways to see if that would help, but the problem persists. This is queuing hundreds of emails that should be passed no matter what.
Any help would be greatly appreciated.  Thank you.

3 Replies

George Ascione Replied
It doesn't seem like anyone else is having this issue.
Anyway, I've done some more digging and the error on the SmarterMail side is the following: "554 5.4.6 Hop count exceeded – possible mail loop"
This never happened before upgrading to v14, so something in this release appears to be causing the issue. We increased the hop count from 20 to 200 just to see if it would work, and a few of the stuck messages seem to be passing through. It looks like someone else had this issue as well and posted on these forums, with no responses on that thread either.
Not sure if a dev monitors here, but any input on this would be appreciated.  Thanks.
Joe Wolf Replied
Exim is having problems sending messages to the latest SM 14.  I'm not sure who to blame but we didn't have the problem until the latest version of SM 14.
To temporarily fix the problem, edit your exim.conf. In the "Transports Configuration" section you'll have your "remote smtp" transport section. You need to add "hosts_avoid_tls = *" to that section, so it will look something like:
  driver = smtp
  # set interface only if you send from a specific IP address
  interface = xxx.xxx.xxx.xxx
  hosts_avoid_tls = *
This fixed the problem for our Exim server to send to SM 14 but this doesn't fix the millions of other Exim servers that are probably having the same problem sending to SM 14.  You lose TLS on Exim, but at least you can communicate.
There's probably another way to accomplish this, but the issue remains that Exim and SM 14 don't get along very well. We're running Exim 4.86, and I've even recompiled Exim and reset exim.conf to the defaults. The problem still exists unless you drop TLS for the outbound transport.
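A narrower variant of the same workaround is sketched below. It assumes that only the SmarterMail server fails the TLS handshake, so hosts_avoid_tls (a host list option of Exim's smtp transport) names just that host instead of "*", and TLS stays in use for every other destination. The address is the masked placeholder used in this thread, not a real value, and nobody in the thread has confirmed this variant.

```
  driver = smtp
  # Assumption: only the SmarterMail host needs TLS disabled.
  # Replace the masked placeholder with that server's IP address or host name.
  hosts_avoid_tls = xx.xxx.xx.xxx
```

After editing exim.conf, restarting the Exim daemon and forcing a queue run with 'exim -qff' should retry the messages that were stuck.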
Thanks, -Joe
George Ascione Replied
Hello Joe;
Thanks for the reply; we applied the fix. In our case, external Exim mail systems will not have an issue sending to us. In fact, our configuration has engineered out that problem by accident. Our system has a cluster of Exim servers that pre-processes all inbound mail, and the MX records for every domain point to that Exim cluster. Mail is only delivered to the SmarterTools servers after this pre-processing, which rejects almost 40% of our inbound traffic for failures of various kinds. So inbound mail is almost never sent directly to the SM servers; it is relayed to them from our border MX servers after the RFC-compliance checks are complete. As long as our Exim servers have this fix, we should be good to go.
