SM16 June 16 Update - Undeliverables & System.Net.Sockets.SocketException in Logs
Problem reported by Shaun Peet - June 17, 2017 at 8:47 AM
Resolved
We've had numerous reports of undeliverables since installing the latest update from yesterday.  Here's some of the sample logs:
 
[2017.06.17] 03:18:05 [36580] Sending remote mail for REDACTED@orilliagirlshockey.com
[2017.06.17] 03:19:50 [36580] Delivery for REDACTED@orilliagirlshockey.com to REDACTED@gmail.com has completed (Bounced)
[2017.06.17] 03:19:50 [36580] Delivery for REDACTED@orilliagirlshockey.com to REDACTED@rogers.com has completed (Bounced)
[2017.06.17] 03:19:50 [36580] Delivery finished for REDACTED@orilliagirlshockey.com at 3:19:50 AM    [id:-1342671036580]
[2017.06.17] 03:19:57 [46154] Starting local delivery to REDACTED@orilliagirlshockey.com
[2017.06.17] 03:19:57 [46154] Delivery for  to REDACTED@orilliagirlshockey.com has completed (Forwarded Delivered) Filter: None
[2017.06.17] 03:19:57 [46154] End delivery to REDACTED@orilliagirlshockey.com
 
 
And then when showing related rows (ie details) here's the errors:
 
[2017.06.17] 03:18:05 [36580] Sending remote mail for REDACTED@orilliagirlshockey.com
[2017.06.17] 03:18:05 [36580] Spam check results: [_COMMTOUCH: 2,Unknown], [_MESSAGESNIFFER: 0,code:0]
[2017.06.17] 03:18:05 [36580] Initiating connection to 64.233.183.27
[2017.06.17] 03:18:05 [36580] Connecting to 64.233.183.27:25 (Id: 1)
[2017.06.17] 03:18:05 [36580] Exception connecting to 64.233.183.27 (Id: 1)
[2017.06.17] 03:18:05 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:18:05 [36580] Connection to 64.233.183.27 failed (Id: 1)
[2017.06.17] 03:18:05 [36580] Initiating connection to 2607:f8b0:4001:c15::1b
[2017.06.17] 03:18:05 [36580] Connecting to [2607:f8b0:4001:c15::1b]:25 (Id: 2)
[2017.06.17] 03:18:26 [36580] Exception connecting to 2607:f8b0:4001:c15::1b (Id: 2)
[2017.06.17] 03:18:26 [36580] System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [2607:f8b0:4001:c15::1b]:25
[2017.06.17] 03:18:26 [36580] Connection to 2607:f8b0:4001:c15::1b failed (Id: 2)
[2017.06.17] 03:18:26 [36580] Initiating connection to 173.194.205.27
[2017.06.17] 03:18:26 [36580] Connecting to 173.194.205.27:25 (Id: 3)
[2017.06.17] 03:18:26 [36580] Exception connecting to 173.194.205.27 (Id: 3)
[2017.06.17] 03:18:26 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:18:26 [36580] Connection to 173.194.205.27 failed (Id: 3)
[2017.06.17] 03:18:26 [36580] Initiating connection to 2607:f8b0:400d:c02::1b
[2017.06.17] 03:18:26 [36580] Connecting to [2607:f8b0:400d:c02::1b]:25 (Id: 4)
[2017.06.17] 03:18:47 [36580] Exception connecting to 2607:f8b0:400d:c02::1b (Id: 4)
[2017.06.17] 03:18:47 [36580] System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [2607:f8b0:400d:c02::1b]:25
[2017.06.17] 03:18:47 [36580] Connection to 2607:f8b0:400d:c02::1b failed (Id: 4)
[2017.06.17] 03:18:47 [36580] Initiating connection to 74.125.139.26
[2017.06.17] 03:18:47 [36580] Connecting to 74.125.139.26:25 (Id: 5)
[2017.06.17] 03:18:47 [36580] Exception connecting to 74.125.139.26 (Id: 5)
[2017.06.17] 03:18:47 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:18:47 [36580] Connection to 74.125.139.26 failed (Id: 5)
[2017.06.17] 03:18:47 [36580] Initiating connection to 2607:f8b0:400c:c05::1a
[2017.06.17] 03:18:47 [36580] Connecting to [2607:f8b0:400c:c05::1a]:25 (Id: 6)
[2017.06.17] 03:19:08 [36580] Exception connecting to 2607:f8b0:400c:c05::1a (Id: 6)
[2017.06.17] 03:19:08 [36580] System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [2607:f8b0:400c:c05::1a]:25
[2017.06.17] 03:19:08 [36580] Connection to 2607:f8b0:400c:c05::1a failed (Id: 6)
[2017.06.17] 03:19:08 [36580] Initiating connection to 64.233.190.27
[2017.06.17] 03:19:08 [36580] Connecting to 64.233.190.27:25 (Id: 7)
[2017.06.17] 03:19:08 [36580] Exception connecting to 64.233.190.27 (Id: 7)
[2017.06.17] 03:19:08 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:19:08 [36580] Connection to 64.233.190.27 failed (Id: 7)
[2017.06.17] 03:19:08 [36580] Initiating connection to 2800:3f0:4003:c01::1a
[2017.06.17] 03:19:08 [36580] Connecting to [2800:3f0:4003:c01::1a]:25 (Id: 8)
[2017.06.17] 03:19:29 [36580] Exception connecting to 2800:3f0:4003:c01::1a (Id: 8)
[2017.06.17] 03:19:29 [36580] System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [2800:3f0:4003:c01::1a]:25
[2017.06.17] 03:19:29 [36580] Connection to 2800:3f0:4003:c01::1a failed (Id: 8)
[2017.06.17] 03:19:29 [36580] Initiating connection to 209.85.203.26
[2017.06.17] 03:19:29 [36580] Connecting to 209.85.203.26:25 (Id: 9)
[2017.06.17] 03:19:29 [36580] Exception connecting to 209.85.203.26 (Id: 9)
[2017.06.17] 03:19:29 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:19:29 [36580] Connection to 209.85.203.26 failed (Id: 9)
[2017.06.17] 03:19:29 [36580] Initiating connection to 2a00:1450:400b:c03::1b
[2017.06.17] 03:19:29 [36580] Connecting to [2a00:1450:400b:c03::1b]:25 (Id: 10)
[2017.06.17] 03:19:50 [36580] Exception connecting to 2a00:1450:400b:c03::1b (Id: 10)
[2017.06.17] 03:19:50 [36580] System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [2a00:1450:400b:c03::1b]:25
[2017.06.17] 03:19:50 [36580] Connection to 2a00:1450:400b:c03::1b failed (Id: 10)
[2017.06.17] 03:19:50 [36580] Initiating connection to 63.250.193.252
[2017.06.17] 03:19:50 [36580] Connecting to 63.250.193.252:25 (Id: 11)
[2017.06.17] 03:19:50 [36580] Exception connecting to 63.250.193.252 (Id: 11)
[2017.06.17] 03:19:50 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:19:50 [36580] Connection to 63.250.193.252 failed (Id: 11)
[2017.06.17] 03:19:50 [36580] Initiating connection to 98.139.214.154
[2017.06.17] 03:19:50 [36580] Connecting to 98.139.214.154:25 (Id: 12)
[2017.06.17] 03:19:50 [36580] Exception connecting to 98.139.214.154 (Id: 12)
[2017.06.17] 03:19:50 [36580] System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
[2017.06.17] 03:19:50 [36580] Connection to 98.139.214.154 failed (Id: 12)
[2017.06.17] 03:19:50 [36580] Bounce email written to -1055163246154.eml
[2017.06.17] 03:19:50 [36580] Delivery for REDACTED@orilliagirlshockey.com to REDACTED@gmail.com has completed (Bounced)
[2017.06.17] 03:19:50 [36580] Delivery for REDACTED@orilliagirlshockey.com to REDACTED@rogers.com has completed (Bounced)
[2017.06.17] 03:19:50 [36580] Delivery finished for REDACTED@orilliagirlshockey.com at 3:19:50 AM    [id:-1342671036580]
 
 

24 Replies

Reply to Thread
0
Shaun Peet Replied
When I search the delivery logs for System.Net.Sockets.SocketException there are THOUSANDS of these errors from the past 12 hours alone.
 
When I search from yesterday (June 16), when we applied the June 16 update, these appeared to start around 10pm and we installed the update mid-afternoon.  I'm going to bounce the service and see if that does anything.
 
Searching the logs from Thursday (June 15) for the SocketException there were a handful but certainly nothing like what we're seeing currently.
 
0
Shaun Peet Replied
Bouncing the mail service and IIS app pool didn't solve the problem.  I see no way to download a specific minor version prior to last Friday's update but luckily I have a custom build available from about 10 days ago.  Will try that and see if that fixes the problem.
0
Shaun Peet Replied
Reverting to an earlier minor version of 16 did not solve the issue but it did reveal a little more details about what's happening.  We're seeing a ton of the dreaded "601 attempted to deliver to the following IP's" errors.  This is acting like a blacklist situation but we can't find any of the domains / IPs listed.
 
Commence hair pulling.... now.
 
0
Derek Curtis Replied
Employee Post
No DNS issues either, right? SPF or DMARC problems?
Derek Curtis
COO
SmarterTools Inc.
(877) 357-6278
0
Derek Curtis Replied
Employee Post
I know you bounced the service..did you try a full reboot as well? I meant to add, as this isn't something anyone else has reported, or is reporting. Nothing in our support queue about it...
Derek Curtis
COO
SmarterTools Inc.
(877) 357-6278
0
Shaun Peet Replied
Not that I can tell. I'm not 100% sure whether or not it's happening to every domain that we host either. From my own mailbox I'm able to send to my personal gmail account - either via an alias or directly, and by sending from the webmail or from Outlook (using Smartermail) or from Outlook (using our other Exchange server). All that works. So I'm still testing and trying to figure out the source of the problem as I watch our spool count keep getting higher and higher and higher...
0
Tim Uzzanti Replied
Employee Post
Sounds like you might be having DNS issues or connectivity issues. Have you modified anything with your firewall? Or, there is something going on with your TCP/IP stack on the machine. I would definitely reboot to make sure Windows itself is healthy. Check for any Windows Updates etc.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Shaun Peet Replied
I've gotten permission from a couple of people to use their accounts for testing, so I've got 3 valid "from" addresses on 3 different domains to us. One is our own domain, but the other two are setup identically in terms of DNS records, SPF, etc. Neither have DKIM or DMARC setup. One account I can send to my gmail account and it gets delivered pretty much instantly. The other account has been sitting in the spool now for 5 minutes and now has the status of "Remote Delivery" with the "Error 601 Attemtped to send the message to the following ip's..."
0
Shaun Peet Replied
Ok trying a reboot now...
0
Shaun Peet Replied
While I reboot I'm still wondering - why would there be network / firewall issues on the server sending from one domain and not the other?
0
Tim Uzzanti Replied
Employee Post
If you're getting a 601 from the destination server, that means the server is having issues responding back. Back in 2014 google was having a bad issue and it was rejecting everyone and we were seeing them as 601 issues.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Tim Uzzanti Replied
Employee Post
You can always verify by telnetting into the destination server and simulating in an SMTP session to see how far you can get.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Shaun Peet Replied
After rebooting the spool continues to climb. Tried a Telnet session and that put a test message into the spool. Also tried sending another message from an account known to be having issues to my gmail account and both of those ended up with the same 601 error after sitting in the spool for a couple minutes as "pending".

Sent another test message to my gmail account from a different domain but still on the same server - no issues.

Also, for the domain(s) which are having deliverability issues it's not just gmail - it's pretty much every recipient domain outside of our own mail server as far as I can tell. We've got 450-ish domains on the server and I have no idea how many are having deliverability problems but we've heard from a handful and see an awful lot of different domains as the sender sitting in the spool.

Another weird (I think) thing was that after rebooting there was a TrustedInstaller.exe process running (still is, actually, after 15 minutes or so). Windows Update wasn't doing anything and didn't claim to have any updates available either - the last update it did was on the 14th. I manually told it to check for updates and it found 4 of them - after installing those it wants another reboot so I'll do that again and see if we're any closer.
0
Tim Uzzanti Replied
Employee Post
Your server is having a hard time sending messages to remote servers? Correct?

So, you would need to Telnet into one of those remote servers from your server to see what the session looks like and why its being rejected.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Shaun Peet Replied
After that reboot there's still a TrustedInstaller.exe process running but it's not using anywhere near the amount of memory prior to that last restart. Still seems a bit strange but moving back to the issue at hand...

Our spool count continues to climb.

Using the webmail, sending a message to my gmail account from a user in the @oakvillerangershockey.ca domain works; the message arrives almost immediately. Then sending a message from the @riversideminorhockey.ca domain does not work. But we've got a different error message from the bounce:

Failed Recipient: REDACTED@gmail.com
Reason: Remote host said: 550 5.7.1 [2001:0:5ef5:79fb:2013:24c1:27d1:729d] Our system has detected that
5.7.1 this message does not meet IPv6 sending guidelines regarding PTR
5.7.1 records and authentication. Please review
5.7.1 https://support.google.com/mail/?p=IPv6AuthError for more information
5.7.1 . n13si2324500itn.42 - gsmtp

Huh? We're not using IPv6 for sending.
0
Tim Uzzanti Replied
Employee Post
Sounds like there might be a configuration change on the server or some sort of networking change that could be prioritizing IPv6 for certain routes etc. It could be any number of things, with almost all of them not related to SmarterMail itself. But, we can help you with that but would require an emergency support incident over the weekend so that one of our technicians could go through everything. It would be impossible to diagnose via the community.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Tim Uzzanti Replied
Employee Post
Maybe something changed on your DNS server? Do you manage it? Maybe someone enabled IPv6 on your network and its now being prioritized needing now appropriate records to accomidate?
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Shaun Peet Replied
Ok thanks Tim. I might have to end up doing that. For now, I've disabled IP6 on the network adapter on the server - we'll have to see if that has any effect. We also have a few IP6 addresses listed in the Bindings section of SmarterMail, although none of them are set to use any Ports. Should those be deleted from that page if we're not using IP6?

I've also updated the title of this thread - I had "major bug?" with a question mark to begin with since I wasn't completely sure this was a SM issue, although to be fair nothing else changed from yesterday on that server except for installing the latest minor.

If we do end up with an emergency support ticket and there actually is a bug in SM which is using IP6 even when it's not supposed to - do we get that ticket refunded? It wasn't clear on your support page if that was the case or not.
0
Tim Uzzanti Replied
Employee Post
Yes, any ticket opened related to a SmarterMail bug is always refunded. For shits and giggles ... know you hate my profanity :)... I sent some messages to Gmail from a variety of SM 16.x servers and all works well. Also, no reported issues from other customers which leads me to being a side effect of a configuration change on the server, DNS, server that DNS is on, windows update, network provider etc. Hard to tell.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Shaun Peet Replied
Emergency support ticket opened. Hopefully Von-Austin can work his magic. If it's just a misconfiguration on our part then hopefully that's easily found and corrected.

I know I kept referring to gmail but that's only because my personal account for testing is gmail. To be clear, the domains having deliverability issues seem unable to send mail to any external email address.

In addition, the only thing that changed on the mail server (that I know of) since yesterday afternoon was the installation of the latest SM16 minor. Although, to be fair I *think* the issues started around 10pm whereas SM was updated around 3pm. But I don't know that for certain either. If our ISP changed something on us then that also isn't something that I know about.

On the profanity thing - I'm not adverse to it personally but I don't think it provides people with a positive reflection on those who go blue in a public area. Particularly when a company representative of any kind (let alone the CEO) does so on their own public forums given that when things are written down there's opportunity to choose words carefully. As I've said before if it's an attempt to be funny there are other ways which wouldn't so quickly detract from public perception. I know that sounds preachy but virtually my entire customer base is in youth sports and if I were to do that on our forum (which uses SmarterTrack) then it would cost me current and future business.
4
Shaun Peet Replied
For anyone reading this, just a quick update...
 
We ended up creating an emergency support ticket over the weekend and firstly I'll say that the support received was tremendous - so kudos to the SM team, including Von-Austin who handled the ticket and Derek & Tim for commenting in the community on the weekend.
 
Our main issue with undeliverable emails has been corrected.  However, figuring out exactly what happened & why is looking more like a plane crash in that there were probably several factors, some related and some unrelated, contributing to the problem.  It was a mixture of the software behaving differently than in previous versions combined with (mis)configuration settings.  To their credit, SmarterTools committed to refunding the full ticket because there was it least some elements of this caused by a bug.
 
In our case, one of the contributing factors was that our SmarterMail installation had been running on a server for about 5 years or so, moved to a different server, and has been running on that server for a few years now.
 
We also have integrated our own custom built CMS solution to SmarterMail via web services to manage users & aliases for each of the domains we host both the website and email for.  A few years ago we noticed that it was possible to delete the primary administrator account for a domain using the web services.  That didn't prevent the domain from working but it caused some strange issues from time to time when doing various things.  If one of our customers did that and noticed strange things, we'd fix it for them by re-assigning a primary administrator.
 
In at least a few of the domains that were having deliverability issues with the latest minor, we discovered that the domain didn't have a primary administrator set, which prevented the UI in SM16 from working properly when managing the domain.
 
On top of that, most domains did not have the IP4 address set for the domain (which I think is the correct setting), but some had the IP of the new server, and some had the IP of the old server.  The ones that had the IP set to use the old server for outgoing were the ones causing undeliverables.  Setting the IP for outgoing as either not set or as the IP of the current server fixed the issue.
 
So there probably was something within the latest minor SM16 version that started to use the Outgoing IP setting defined at the domain level rather than using the global setting for the server.  The old behavior may have been by design or a bug, and the new behavior may be by design or a bug - but in any case the behavior changed which is essentially why this started after installing the latest minor.
 
We're still working towards a full resolution to identify all the contributing factors here and how to both correct potential misconfiguration as well as prevent future misconfiguration.  One thing in particular which we should have logged as a bug a long time ago was that the web services allows the primary administrator to be deleted - something that's not possible to do within the UI of SmarterMail itself.  We corrected that in our own implementation of the web services and forgot to file a bug report.  Lesson learned!
 
0
Shaun Peet Replied
I think we can change the status of this to "resolved".
 
I'll add the feature request to prevent deleting the primary administrator via the web services (and API, in case it's possible there too) as a separate thread.
 
Thanks again to all those involved in helping get this sorted out!
0
Tim Uzzanti Replied
Employee Post
Thanks for the follow-up and glad it all got worked out.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
1
Jade Benson Replied
I stumbled across this thread after reading the release notes to the latest SM17 beta version.
After reading Shaun's explanation of what went wrong, I suspect that he missed an important step when moving smartermail services to a new server with a different ip or changing the servers IP.
The domainConfig.xml file needs to be updated for each domain hosted on the smartermail server when changing the servers IP or moving SM to a new server with a different IP.

In past migrations we've used powergrep to recursively run through the smartermail data directory to find and replace the current IP with the new one, after completing this and similar changes to C:\Program Files (x86)\SmarterTools\SmarterMail\Service\mailConfig.xml

Once complete you can start the smartermail services and you should be ready to go.

Reply to Thread