mail delivery issues since build 7153
Problem reported by Christian Schmit - 8/14/2019 at 2:16 AM
Resolved
Since build 7153 we have noticed increased delivery failures for local and remote recipients. Another sign that something changed between build 7125 and build 7153 is that the SmarterMail statistics show a huge increase in "SMTP Terminated connections" under "SMTP Out Sessions". We reverted to build 7125, which solved the delivery issue for us.

Is anyone else running build 7153 seeing an increase in "SMTP Terminated connections" under "SMTP Out Sessions" in the SmarterMail statistics?

We also saw many more "mailbox is locked" entries in delivery.log with build 7153. Maybe both issues are related. We have tickets open with support for this.

Christian

57 Replies

0
David Jamell Replied
0
Thomas Lange Replied
Looks like we have similar issues with our SmarterMail system:
We are receiving system messages with delivery notifications for all outgoing emails, but only for the "journal@" recipient/account that our SmarterMail events add automatically for our message-archiving software.

Our events "Journal Incoming" and "Journal Outgoing" both just do "Add recipient" journal@ourdomain.de,
and under "Spool" and "Waiting for Delivery" there are many hundreds/thousands of emails. Most of them are deliveries to our local journal user account (it looks like only this last/added recipient is failing; the original recipients are already ok/success)!
It could even be that some messages have "looping issues", because there are often many, many text lines with "Failed" :-(

I can even see "mailbox is locked" messages in the delivery log for the journal account. Re-indexing was no solution; we are still seeing these issues/system messages with delivery failures:
[2019.08.14] 17:20:46.631 [40279] Add message to mailbox call failed.  Reason: Access to the grp file is locked, monitor enter failed.
[2019.08.14] 17:20:46.631 [40279] Delivery attempt to journal@xxxxxxxxxx.de to folder 'Inbox' failed.  Mailbox is locked.
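To see which mailboxes are hit most often, lines like the two above can be counted per recipient. This is only a rough sketch: the regular expression is modelled on the two log lines quoted in this post (other builds may word the message differently), and the function name `count_locked_mailboxes` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the log lines quoted above; this wording is an assumption
# for any other build or log level.
LOCKED_RE = re.compile(
    r"Delivery attempt to (?P<rcpt>\S+) to folder '(?P<folder>[^']*)' failed\.\s+Mailbox is locked\."
)


def count_locked_mailboxes(lines):
    """Return a Counter mapping recipient address to its number of 'Mailbox is locked' failures."""
    counts = Counter()
    for line in lines:
        match = LOCKED_RE.search(line)
        if match:
            counts[match.group("rcpt").lower()] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[2019.08.14] 17:20:46.631 [40279] Add message to mailbox call failed.  "
        "Reason: Access to the grp file is locked, monitor enter failed.",
        "[2019.08.14] 17:20:46.631 [40279] Delivery attempt to journal@xxxxxxxxxx.de "
        "to folder 'Inbox' failed.  Mailbox is locked.",
    ]
    for rcpt, n in count_locked_mailboxes(sample).most_common():
        print(rcpt, n)
```

To run it against a real log, pass an open file object instead of the sample list; the function only iterates over lines.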

I tried the system messages with an empty from address, and even with "postmaster" entered as the from address; this did not make any difference to the failure emails.

And our users think that their original destination was not working; they do not notice that in our case it is just our "journal" recipient failing!

Tested with v17 / build 7153 and even with the MAPI beta 7159.
Looks like the bug/issue is not fixed yet.
2
Kyle Kerst Replied
Employee Post
Thomas, if you're still having looping issues please deploy the custom build below which includes fixes for this as well as some automated forwarding delay problems: 

Kyle Kerst
Technical Support Specialist
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Bruce Replied
I am getting 1,000s of emails looping on multiple SmarterMail servers with the subject line:

 Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed:Failed: Failed: Failed: Failed: Failed: Failed:Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed:

For each loop this gets longer.
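Because the subject grows by one "Failed:" per loop, the number of leading "Failed:" tokens is a cheap way to spot looping messages in a spool listing. A minimal sketch, assuming the subject has already been extracted as a string; the function names and the threshold of 3 are made up for illustration and are not SmarterMail settings.

```python
import re

# One or more leading "Failed:" tokens, with or without spaces between them
# (the subject quoted above contains both "Failed: Failed:" and "Failed:Failed:").
FAILED_PREFIX_RE = re.compile(r"^\s*((?:Failed:\s*)+)", re.IGNORECASE)


def failed_depth(subject):
    """Return how many consecutive 'Failed:' tokens the subject starts with."""
    match = FAILED_PREFIX_RE.match(subject)
    if not match:
        return 0
    return len(re.findall(r"Failed:", match.group(1), re.IGNORECASE))


def looks_like_loop(subject, threshold=3):
    """Treat a subject as looping once it carries `threshold` or more prefixes (arbitrary cut-off)."""
    return failed_depth(subject) >= threshold


if __name__ == "__main__":
    for subject in ["Failed: Failed:Failed: Camera alert", "Failed: Camera alert", "Camera alert"]:
        print(failed_depth(subject), looks_like_loop(subject), subject)
```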

Will this custom build fix this?
1
Employee Replied
Employee Post
Hi Bruce.  Yes, build 7164 will fix this looping.
0
Bruce Replied
Thanks, Rod, for the quick reply.
2
Thomas Lange Replied
Thanks Kyle for the link to this custom build.
(btw: I suppose the MAPI-beta needs an updated/fixed build, too)

Bruce: I just installed custom build 7164 - looks good now!
But keep in mind: if your spool queue is already filled with thousands of messages (we had 3,000 up to 8,000 waiting for delivery!), you need to stop the SmarterMail service, then RENAME the spool directory, and restart the SmarterMail service. SmarterMail will then re-create a new spool directory, including its subdirectories.

Keep in mind you could lose emails: whatever was in the old spool will not be sent anymore! That's why I suggest renaming this folder rather than deleting it, so you have the chance to check its contents/files manually.
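The rename step could be scripted roughly like this. It is only a sketch: the function name `set_aside_spool` is made up, the spool path is whatever your installation uses (no default path is assumed here), and it expects the SmarterMail service to have been stopped by hand already. It does not stop or start the service itself.

```python
import os
import time


def set_aside_spool(spool_dir):
    """Rename spool_dir to '<spool_dir>.old-YYYYMMDD-HHMMSS' and return the new path.

    Nothing is deleted, so the queued files can still be inspected by hand.
    The mail service must already be stopped; this function does not check that.
    """
    spool_dir = os.path.normpath(spool_dir)
    if not os.path.isdir(spool_dir):
        raise FileNotFoundError(f"spool directory not found: {spool_dir}")
    target = f"{spool_dir}.old-{time.strftime('%Y%m%d-%H%M%S')}"
    if os.path.exists(target):
        raise FileExistsError(f"refusing to overwrite existing directory: {target}")
    os.rename(spool_dir, target)
    return target


# Example call (the path is a placeholder, not a known SmarterMail default):
# old_copy = set_aside_spool(r"D:\Mail\Spool")
```

After the service is restarted and has re-created the spool, the renamed directory can be reviewed and removed once nothing in it is needed.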

0
Bruce Replied
I was able to clean the spool queues, but I have just been woken at 2am by a throttle notification and checked the spool on one of the servers and it has occurred again, 1,000's of emails with 'Failed: Failed:' subjects.

It looks like 7164  has not fixed the problem and the loop is still occurring.
0
Bruce Replied
It looks like this happens when a Failed Delivery notification from 'System Administrator' goes to a mailbox that is full: because the notification cannot be delivered, it gets stuck in a loop.
0
Stefan Mössner Replied
Hello all,

I had the same issue with SmarterMail build 7153. The new custom build 7164 and deleting the spool files helped me get SmarterMail running performantly again.

Thank You.
0
Bruce Replied
I have not seen any more since 2am this morning, so I am hoping it was just a hangover from the previous version.
0
Stefan Mössner Replied
Hello all,

Unfortunately it only worked for nearly one day. Again there were some mails with "Failed:". For notifications from firewalls, IP cameras etc. I'm using some addresses of my internal mail domain which don't exist as mailboxes. I had to delete these mails to stop SmarterMail from trying to send them again and again. Past releases of SmarterMail didn't have this issue. Why doesn't SmarterMail stop sending these mails after a few retries?

Kind Regards.
0
Bruce Replied
I am also still getting this, but now only on one server, which has 12,000 mailboxes. The issue only seems to affect one mailbox that sends email notifications: the mailbox is full, so failed emails cannot be returned, and you end up with an email loop with the subject line:

Failed: Failed: Failed: Failed: Failed: Failed: Failed: 
0
Bruce Replied
OK, just found another mailbox that this is also happening to, which is also rejecting email because the mailbox is full.

As a temporary fix I am increasing the size of the affected mailboxes so that they can receive the 'Failed' email and stop the loop.
0
Bruce Replied
Is there a fix for this yet?

The issue was fixed for when a mailbox does not exist, but not for when a mailbox is full.
0
Stefan Mössner Replied
It's not even fixed when the sending mailbox doesn't exist.
0
Kyle Kerst Replied
Employee Post
Hello everyone, thanks for keeping us posted on your results here. We're not seeing the same looping behavior during testing, so I suggest submitting a support ticket so we can take a closer look at this. One thing that works after you've upgraded to 7164 is to add an SMTP Block under Settings>Security>SMTP Blocks for the offending from addresses, then remove the already existing messages in the spool. Once cleared out you should be able to remove the SMTP Block and resume normal operation. Please give that a shot and reach out to support if these issues persist. Thanks!
0
Stefan Mössner Replied
Hello Kyle,

I don't think that this will help, because this behavior happens again and again. I get monitoring mails every day, so there are mails in the spool folders with "Failed" in the subject even if they are delivered correctly.

This is new behavior since build 7153. I can't open a support ticket because I use the free version of SmarterMail.

Kind Regards.
1
Bruce Replied
The SMTP Blocking is not an option as these are local mailboxes that send the emails.

I will explain the behaviour I am seeing in the hope that you can replicate it or know how to fix it.

When the email fails to be received, SmarterMail creates a 'Failed' email from the System Administrator. I think this is the 'Delivery Status Failure' email in the 'System Messages'.

If the 'Failed' email from the System Administrator can not be delivered because the sender's mailbox is full it ends up in a loop with SmarterMail sending 'Failed' emails for the 'Failed' emails.

I have fixed the issue on my system as follows: when I spot the 'Failed' emails in the spool, I increase the disk space of the full mailbox so that it can receive the 'Failed' email from the System Administrator, which stops the loop.
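The behaviour described above, and the usual guard against it, can be shown with a toy model. Per RFC 5321, failure notifications are sent with an empty return path, so a notification that itself cannot be delivered has nowhere to bounce to and is dropped. The names below (`Message`, `make_bounce`, `deliver_all`) are made up for illustration; this is not SmarterMail's code and says nothing about how the affected builds work internally.

```python
from dataclasses import dataclass


@dataclass
class Message:
    return_path: str  # envelope sender; "" marks a bounce/DSN
    recipient: str
    subject: str


def make_bounce(original):
    """Build a failure notice for `original`, or None if it must not be bounced."""
    if original.return_path == "":
        # The undeliverable message is already a bounce: drop it instead of looping.
        return None
    return Message(return_path="", recipient=original.return_path,
                   subject="Failed: " + original.subject)


def deliver_all(first, can_deliver, max_steps=100):
    """Deliver `first`, bouncing on failure; return the subjects of every message handled."""
    handled = []
    current = first
    while current is not None and len(handled) < max_steps:
        handled.append(current.subject)
        if can_deliver(current.recipient):
            break
        current = make_bounce(current)
    return handled


if __name__ == "__main__":
    # Sender's mailbox is full and the recipient does not exist: nothing can be delivered.
    first = Message(return_path="full@example.com", recipient="missing@example.com", subject="Hello")
    print(deliver_all(first, can_deliver=lambda address: False))
```

With the guard in `make_bounce`, the chain stops after a single "Failed:" message. Without the empty-return-path check, each failed notice would produce another one with a longer subject, which matches the subjects seen in the spool.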
3
Sébastien Riccio Replied
SM build 7125 was not affected.

The problem appeared in custom build 7139 that we had to install to fix newsletter issues after customer complaints.
I opened a ticket explaining the issue precisely (the same as Bruce is describing here); that was on 20 July (1 month ago).
I also specified in follow-ups that it appeared after installing build 7139, so they are aware of it and can better locate the code changes that could be the source of the problem.

After some support ping-pong resulting in new custom builds to install, the problem is still here.

In the meantime 7153 was publicly released containing the problem... (I really don't get why they released a public version with a _known_ major issue.)

For a month we have been spending an unbelievable amount of time each day monitoring/cleaning the spool of these NDRs before the queue fills up with 100K or more of these FAIL mails, creating locking problems on the mailboxes and even server lockups...

Everybody is angry and tired here: our tech staff, our customers (threatening to use another service...).
This really needs to be the number one issue to be resolved.

0
Ronald Raley Replied
Just received a new custom build from support this morning. Will try to upgrade/fix tonight.

Thanks,
Ron
0
Sébastien Riccio Replied
Cool. Crossing fingers
0
Ronald Raley, which new version is it?
0
Ronald Raley Replied
2
Ronald Raley Replied
Seems to have broken something else.

Now, a user cannot delete a message in webmail without a browser refresh.

It basically broke auto refreshing.

Ron
0
Sébastien Riccio Replied
Really?
0
David Finley Replied
Why not release these builds as you fix issues?
http://www.interactivewebs.com
0
Bruce Replied
Ron, are you saying not to install this latest fix?

Is anyone else seeing that webmail auto-refresh is broken?
0
David Finley Replied
I just tested on Safari on Build 7165 (Aug 14, 2019)
- No problem deleting. Found it worked as one would expect.
0
Bruce Replied
Thanks David.

I have updated 3 servers with Build 7165 and not seeing an issue with deleting emails.

After installing an upgrade of SmarterMail you do need to refresh your browser for auto-refresh to start working again in the webmail; it might have been this that caused Ron's delete problem.
0
Ronald Raley Replied
Our refresh issue resolved itself with a server restart last night. But still seeing FAIL FAIL FAIL FAIL looping issue.

Ron
0
Bruce Replied
Build 7165 is worse than it was previously for us.

Get a huge number of these Failed emails now.

The issue has shifted back from a problem with full mailboxes to non-existing mailboxes again.
0
Bruce Replied
Previously, every time SmarterTools came out with a new major release it would be buggy and unusable for the first 6 months, which was bad enough.

Now every month you release a buggy and unstable version, and then release custom builds all month to try and fix the issues before it all starts over the next month!

Come on SmarterTools, get some software testers. If this pattern continues we will certainly be moving to a competitor and stop paying you $5,000 per year to be your beta testers.
1
Kyle Kerst Replied
Employee Post
Hello everyone, sorry for the delay here. We've identified a fix for the few remaining scenarios causing these issues and I've included the custom build below. Please give this a shot and let us know if you see any further problems. 


As to the refresh issues, we're not seeing these problems here. Can you submit a support ticket on that one?
0
Stefan Mössner Replied
Hi Kyle,

it looks like this build (7172) solved the issue. Tonight there were no more mails with "Failed" in the spool folders. Thank you for solving the issue.

Kind Regards.
1
Bruce Replied
Looks like the issue is resolved for us also with build 7172.

Thank you.
1
Christian Schmit Replied
We are also on build 7172 since last night and everything seems to be working as it should.



1
Sébastien Riccio Replied
Yes, it seems build 7172 is way better at handling these infinite bounce loops.
We're still trying different scenarios that we had before, to be sure they are all handled now.
0
domains Replied
We have applied Build 7172 but are still having mail rejections with a "Mailbox is locked" error
0
Kyle Kerst Replied
Employee Post
Hello everyone! Thanks for the updates on your results with this, I'm glad to hear we were able to get to the bottom of this and resolve it. Please let us know if you run into any additional scenarios that result in this behavior. 

Hello Domains, the mailbox locked behavior is an expected result if an existing delivery session is taking place, or POP access is active on the account. Does the account experiencing those errors typically receive a large amount of email? You may also want to adjust your retry intervals to be something like this for best results (Settings>General>Spool):

1, 1, 5, 5, 15, 30, 30, 30, 30, 60, 90, 120, 240, 480, 960, 1440, 2880

These retry intervals account for local delivery/forwarding delays as well as legitimate (longer) failures to deliver. Please give that a shot and if you continue to have issues please submit a ticket so we can get to the bottom of this. 
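To get a feel for what this schedule means in practice, the cumulative waiting time can be added up. The sketch below assumes the values are minutes and that each interval is waited after the previous attempt; neither is stated explicitly in this thread, so treat both as assumptions. The function name is made up for illustration.

```python
# Retry intervals as suggested above; the unit (minutes) is an assumption.
RETRY_INTERVALS = [1, 1, 5, 5, 15, 30, 30, 30, 30, 60, 90, 120, 240, 480, 960, 1440, 2880]


def cumulative_schedule(intervals):
    """Return the elapsed time at each retry, assuming every interval follows the previous attempt."""
    total = 0
    schedule = []
    for minutes in intervals:
        total += minutes
        schedule.append(total)
    return schedule


if __name__ == "__main__":
    for attempt, elapsed in enumerate(cumulative_schedule(RETRY_INTERVALS), start=1):
        print(f"retry {attempt:2d}: {elapsed:5d} min elapsed ({elapsed / 60:.1f} h)")
```

Under those assumptions the 17 retries span 6,417 minutes in total, roughly four and a half days, with the first four retries all falling within the first 12 minutes.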
0
domains Replied
Kyle,
Yes, the accounts get a very large volume of email, but there is no POP access to them. I have tweaked the spool settings and will monitor. A trouble ticket was submitted last week, but it hasn't been resolved.
0
Sébastien Riccio Replied
Still not convinced why retry intervals are involved in this ....
0
Kyle Kerst Replied
Employee Post
Domains, happy to hear you were able to get the retry intervals corrected. I'll take a look at your ticket and see if I can offer further guidance there as well. 

Sebastien, the retry intervals are a necessary change due to some modifications to how we handle delivery sessions re: forwarding and system message notification delivery. These retry intervals should be present by default on new installations, but existing installations will need this update for optimal performance.
0
Sébastien Riccio Replied
Hello Kyle, we would be interested to understand why these changes were necessary. Thank you

Edit: Retries are in case of failure ? It's kinda misleading
0
Stefan Mössner Replied
Hi Kyle,

thank you for the information about the need to change the spool settings on existing installations. Do we have to restart SmarterMail after changing this?

Kind Regards.
0
Kyle Kerst Replied
Employee Post
Sebastien, from what I understand these changes were essentially to make the spooling/delivery process more efficient, eliminating or mitigating as much as possible the cases where we would have had to wait to deliver in the past. I can gather more specific details for you on this though if you can ping me on your existing ticket. Retry intervals are designed to handle delivery failures first and foremost, but apply to delivery in general, and will be used for all delivery sessions. This is why we have a couple of short retries in there at the start, followed by more typical durations to account for legitimate failures or locked mailboxes.

Stefan, happy to help! You should not need to restart the SM service to allow these changes to take effect. 
0
Alex Clarke Replied
So... will a future build of SM automatically adjust the retry intervals, or is this a change that we will manually need to make?
0
Neal Culiner Replied
If SmarterTools ran the builds 1 week prior to public release would this situation occur?
1
Alex Clarke Replied
"If SmarterTools ran the builds 1 week prior to public release would this situation occur?"
Probably. 1 week isn't long enough to test a release and some bugs only expose themselves under certain conditions.
0
Kyle Kerst Replied
Employee Post
Alex, the retry intervals should be updated in new installs going forward.
0
Alex Clarke Replied
Thanks for the confirmation Kyle!

Any idea when the next release will be please?
0
Kyle Kerst Replied
Employee Post
You're very welcome Alex, always happy to help and will provide information as we get it :-) The next release date isn't something I'm privy to at the moment, but I would expect it in the very near future as we've been focused on bug fixes pretty heavily over the last week or so and there's a lot of good stuff incoming. Have a great day!
0
Patrick Mattson Replied
OK, so I understand I have two emails stuck doing the "Failed: Failed: Failed:" loop. When will the fix be released officially? I have run into issues since upgrading from 14 and have been afraid to upgrade since.
0
Kyle Kerst Replied
Employee Post
Patrick, this is the loop behavior being discussed above, and I recommend performing a minor upgrade to the custom build I linked previously: 


This build has now been running in several environments for a period of time with no further looping. The minor upgrade itself should be entirely painless, but be sure to take a full backup of your environment before getting started: 

0
Dillon Yonash Replied
Can you confirm that these fixes have been rolled into the standard (non-custom) release of build 7188?  I think this describes what I was seeing in 7153, but I didn't find this thread until today.  I installed 7188 earlier this week.
0
Kyle Kerst Replied
Employee Post
These fixes were rolled into 7188, and you should no longer see long delivery delays. What do these delays look like? Are you seeing them stall in the spool or are you receiving user complaints of delayed delivery? 
1
Bruce Replied
Looks like the issue still exists in Build 7188 under some circumstances.

I had a loop of 22,000 emails with the subject line;

Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: Failed: ..........................

This occurred after disabling a mail domain. The failed emails were addressed to the same address that sent them.

The delivery log for one of these emails is below; I have redacted the real email address and replaced it with example@example.com:

[2019.10.15] 17:14:11.800 [65092] Delivery started for example@example.com (via bypass) at 17:14:11
[2019.10.15] 17:14:17.826 [65092] Added to SpamCheckQueue (0 queued; 4/150 processing)
[2019.10.15] 17:14:17.826 [65092] [SpamCheckQueue] Begin Processing.
[2019.10.15] 17:14:19.717 [65092] Starting Spam Checks.
[2019.10.15] 17:14:19.717 [65092] Skipping spam checks: User authenticated
[2019.10.15] 17:14:19.717 [65092] Spam Checks completed.
[2019.10.15] 17:14:19.717 [65092] Removed from SpamCheckQueue (0 queued or processing)
[2019.10.15] 17:14:20.857 [65092] Added to LocalDeliveryQueue (0 queued; 3/50 processing)
[2019.10.15] 17:14:20.857 [65092] [LocalDeliveryQueue] Begin Processing.
[2019.10.15] 17:14:20.857 [65092] Starting local delivery to example@example.com
[2019.10.15] 17:14:20.874 [65092] Delivery for example@example.com to example@example.com has bounced. Reason: The domain is disabled
[2019.10.15] 17:14:20.998 [65092] DSN email written to 56365102 with status failed to example@example.com
[2019.10.15] 17:14:21.013 [65092] Delivery for example@example.com to example@example.com has completed (Bounced)
[2019.10.15] 17:14:21.013 [65092] Removed from LocalDeliveryQueue (0 queued or processing)
[2019.10.15] 17:14:23.873 [65092] Removing Spool message: Killed: False, Failed: False, Finished: True
[2019.10.15] 17:14:23.873 [65092] Delivery finished for example@example.com at 17:14:23 [id:56365092]
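A delivery log like the one above can be scanned for this pattern, a message bounced back to the same address that sent it, with a few lines of Python. This is a sketch only: the regular expression is modelled on the log lines quoted in this post, the function names are made up, and it accepts the log either with one entry per line or run together on a single line.

```python
import re

# Modelled on the "has bounced" line quoted above. The reason text runs until
# the next "[YYYY.MM.DD]" timestamp or the end of the line, whichever comes first.
BOUNCE_RE = re.compile(
    r"\[(?P<session>\d+)\] Delivery for (?P<sender>\S+) to (?P<rcpt>\S+) has bounced\. "
    r"Reason: (?P<reason>.+?)(?=\s*\[\d{4}\.\d{2}\.\d{2}\]|\s*$)",
    re.MULTILINE,
)


def find_bounces(log_text):
    """Return one dict (session, sender, rcpt, reason) per bounce entry found in log_text."""
    return [match.groupdict() for match in BOUNCE_RE.finditer(log_text)]


def self_addressed(bounces):
    """Keep only bounces where the message was sent to its own sender address."""
    return [b for b in bounces if b["sender"].lower() == b["rcpt"].lower()]


if __name__ == "__main__":
    sample = (
        "[2019.10.15] 17:14:20.857 [65092] Starting local delivery to example@example.com\n"
        "[2019.10.15] 17:14:20.874 [65092] Delivery for example@example.com to "
        "example@example.com has bounced. Reason: The domain is disabled\n"
    )
    for bounce in self_addressed(find_bounces(sample)):
        print(bounce["session"], bounce["rcpt"], bounce["reason"])
```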

