6
Some messages remain in the spool for hours without even a delivery attempt being made
Problem reported by Gabriele Maoret - SERSIS - 6/1/2020 at 3:26 AM
Resolved
Some messages remain in the spool for hours without a single delivery attempt being made, and the recipient status stays at PENDING.

After a while they simply fail.

This is a big issue for our customers.


Can you figure out why this is happening?


EG:






104 Replies

0
Gabriele Maoret - SERSIS Replied
Restarting the SmarterMail service and/or rebooting the server doesn't solve the issue.
1
Thomas Lange Replied
Hi Gabriele,

we are not on build 7454 yet - we are still on 7451.

If I remember correctly, there were issues some months ago with messages sitting in the spool and then failing. This was already fixed, and in addition support suggested more frequent retries for the spool:

Settings / General / Spool - Retry Intervals (Minutes, separated by comma)
1, 1, 5, 5, 15, 30, 30, 30, 30, 60, 90, 120, 240, 480, 960, 1440, 2880

Perhaps this helps for your SmarterMail installation. Otherwise SmarterTools should have a closer look.
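
To get a feel for what this retry list means in practice, here is a minimal Python sketch that adds up the intervals. It assumes each interval is counted from the previous attempt, which is my reading of the setting and not something confirmed in this thread; the function names are made up for illustration.

```python
from datetime import timedelta

# Retry intervals in minutes, copied from the setting quoted above.
RETRY_INTERVALS = "1, 1, 5, 5, 15, 30, 30, 30, 30, 60, 90, 120, 240, 480, 960, 1440, 2880"


def parse_intervals(text):
    """Turn a comma-separated string of minutes into a list of ints."""
    return [int(part) for part in text.split(",") if part.strip()]


def cumulative_schedule(intervals):
    """Return the elapsed time at which each retry would fire.

    Assumption: every interval is measured from the previous attempt.
    """
    elapsed = 0
    schedule = []
    for minutes in intervals:
        elapsed += minutes
        schedule.append(timedelta(minutes=elapsed))
    return schedule


intervals = parse_intervals(RETRY_INTERVALS)
schedule = cumulative_schedule(intervals)
for attempt, offset in enumerate(schedule, start=1):
    print(f"retry {attempt:2d}: +{offset}")
```

Under that assumption the 17 retries are spread over 6417 minutes, so the last one would fire roughly four and a half days after the first failed attempt.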
0
Gabriele Maoret - SERSIS Replied
Looking more closely at the emails that remain "blocked" in the spool, I noticed one thing: some of them have NEXT ATTEMPT ("PROSSIMO TENTATIVO" in Italian) set to a time in the past (it is now 12:22 here).

Could this be the problem?

0
Gabriele Maoret - SERSIS Replied
Am I the only one with this issue?
I'm getting more and more messages that sit for hours in REMOTE DELIVERY state, with 0 ATTEMPTS and NEXT ATTEMPT in the past!

Example (actual time 16:51 24H format):



0
Gabriele Maoret - SERSIS Replied
I think I've figured out what's happening:

All the connections in that state are to an Aruba SMTP server with IP address 62.149.157.166.

If I try to connect to this IP on port 25 via telnet, this is the response:

>>>>>>>>>
421 mxcm01-pc.ad.aruba.it bizsmtp mfA22200r3Uk8nK01 Too many connections, try later.
Connection lost
>>>>>>>>>

It seems that SmarterMail doesn't disconnect the SMTP session after that message and never retries, so the messages remain in the queue forever (or nearly so).


EDIT: another remote SMTP server that causes the same issue:

62.149.157.151
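
For anyone who wants to script the telnet check above, here is a small Python sketch. A 4xx reply such as 421 is a transient failure in SMTP, meaning the sender is expected to try again later. The helper names are hypothetical, and read_banner needs network access, so it is only defined here and not called.

```python
import socket

# First digit of an SMTP reply code and what it means in general terms.
REPLY_CLASSES = {
    "2": "success",
    "3": "intermediate",
    "4": "transient failure (retry later)",
    "5": "permanent failure",
}


def classify_smtp_reply(line):
    """Classify an SMTP reply line by the first digit of its 3-digit code."""
    code = line.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply line: {line!r}")
    return int(code), REPLY_CLASSES.get(code[0], "unknown")


def read_banner(host, port=25, timeout=10.0):
    """Connect to an SMTP server and return the first reply line it sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = conn.recv(1024).decode("ascii", errors="replace")
    lines = data.splitlines()
    return lines[0] if lines else ""


# The reply quoted in the post above.
reply = "421 mxcm01-pc.ad.aruba.it bizsmtp mfA22200r3Uk8nK01 Too many connections, try later."
print(classify_smtp_reply(reply))
```

Something like classify_smtp_reply(read_banner("62.149.157.166")) would repeat the manual test against the server mentioned above.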
0
Gabriele Maoret - SERSIS Replied
Similar issue receiving messages...

1
Tim Uzzanti Replied
Employee Post
Please open a ticket and include your delivery logs so we can evaluate.  We don't think there is an issue based on the number of servers we have been on over the last week fine tuning in preparation for release.
Tim Uzzanti
CEO
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Gabriele Maoret - SERSIS Replied
I think you are right, Tim... in the next few days I will investigate further and open a ticket.
0
Kyle Kerst Replied
Employee Post
Gabriele I took a quick look at your screenshots and noticed all of these pending deliveries are to Yahoo/Hotmail/etc and this could be a clue as to the root cause. Frequently when we see stalled messages to these providers it is indicative of one of the following:

1. Rate-limiting has been applied to your server IP due to the amount of email coming in from your server. 
2. Mail from your server is being rejected due to failed SPF, RDNS, DKIM, etc from the sending domain.
3. Sending IP address is listed on a blacklist or other spam list such as their internal lists. 

If you search your Delivery logs for these recipients what do you see there? If you could check on these items before submitting a ticket this will help us get to the bottom of it much quicker. Thanks, and have a great day!
Kyle Kerst
Technical Support Specialist
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
1
Gabriele Maoret - SERSIS Replied
Hi Kyle, can I send the details to you via PM, or do you prefer that I open a ticket first?

P.S.: The destination SMTP servers aren't Yahoo/Hotmail/etc... Those are the originating SMTP servers from my latest post, which is about the same issue with INCOMING e-mails...

With OUTGOING emails the destinations seem to be some ARUBA S.p.A. SMTP servers, like 62.149.157.166
3
Grady Werner Replied
Employee Post
Not Kyle, but since he hasn't answered yet, I figured I'd chime in.  We've had instances when using PM for troubleshooting issues that conversations get lost, and that hurts everyone.  We really prefer tickets because there's good oversight to ensure stuff doesn't fall through the cracks.  We realize that sometimes the back and forth of PMs is useful, but tickets ensure accountability and have a significantly higher chance of getting your issue resolved.
Grady Werner
SmarterTools Inc.
www.smartertools.com
1
Gabriele Maoret - SERSIS Replied
OK Grady, I'll open a ticket for this


2
Gabriele Maoret - SERSIS Replied
Before opening a ticket, I investigated thoroughly and I may have found the cause...

I found that there are TLS authentication errors in the delivery logs, so I tried disabling the related option in SETTINGS --> PROTOCOLS --> SMTP OUT:

This seems to have solved the issue, but now I think there is something wrong with my SM certificate settings...

I will now go through my configuration in detail, but I would appreciate any suggestions on where to look, so as not to waste time on unnecessary checks...


Thanks in advance to all!
0
Sébastien Riccio Replied
Do you have any TLS errors in delivery.log for these delivery attempts, or is no attempt logged at all?

Is there anything else relevant in the delivery log flow for one of the attempts, if they exist?

Sébastien Riccio
System & Network Admin

0
Gabriele Maoret - SERSIS Replied
Hi Sébastien, this is an example: LOG.txt

As you can see in the file, there are some TLS errors like this:

[2020.06.03] 13:42:22.432 [63750] CMD: STARTTLS
[2020.06.03] 13:42:24.776 [63750] RSP: 220 2.0.0 Ready to start T
[2020.06.03] 13:42:25.510 [63750] Certificate name mismatch. 


The strange thing is that it delays the delivery, but after a while it works...

And it seems to happen only when SM delivers messages to certain SMTP servers, while other servers are OK...

Disabling TLS authentication on outbound SMTP solves the issue, but I think it would be better to keep it enabled if I can do so without issues...

I need to understand whether it's an error in my config, a bug in SM, or an issue with the destination SMTP servers...
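
To see which sessions are affected, the delivery log can be scanned for the error text. This is only a sketch in Python: the line layout (date, time, session number in brackets, message) is inferred from the three lines quoted above and nothing more, and the function name is made up.

```python
import re
from collections import defaultdict

# Layout inferred from the excerpt: [date] time [session] message
LINE_RE = re.compile(
    r"^\[(?P<date>\d{4}\.\d{2}\.\d{2})\] "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<session>\d+)\] "
    r"(?P<text>.*)$"
)

# The three lines quoted in the post above.
SAMPLE = """\
[2020.06.03] 13:42:22.432 [63750] CMD: STARTTLS
[2020.06.03] 13:42:24.776 [63750] RSP: 220 2.0.0 Ready to start T
[2020.06.03] 13:42:25.510 [63750] Certificate name mismatch.
"""


def sessions_with(log_text, needle):
    """Group log lines by session number and keep sessions mentioning needle."""
    sessions = defaultdict(list)
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            sessions[match["session"]].append(match["text"])
    return {
        sid: lines
        for sid, lines in sessions.items()
        if any(needle in text for text in lines)
    }


flagged = sessions_with(SAMPLE, "Certificate name mismatch")
for sid, lines in flagged.items():
    print(sid, "->", len(lines), "lines in session")
```

Running the same function over a whole delivery log and comparing the flagged sessions with the stuck recipients would show whether the mismatch and the delays really go together.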

0
Robert G. Replied
I'm having the same results as Gabriele. I'll open a ticket about this as well. It's even happening between users on my own mail server: mail from user1@domain.com to user2@domain.com is getting delayed 20+ minutes with Status: "Spam Check".
0
Scarab Replied
I was having the same problem in Build 7242 but it was maybe 3 or 4 messages a week, so I never really paid it much attention. After upgrading to Build 7459 I was getting 2000 messages an hour that weren't even attempting delivery to local users!

Gabriele, I could kiss you because turning off "Enable TLS if supported by the remote server" fixed it for us immediately (still took a while for the Spool to catch up on a couple hours of messages that accumulated since 2am when we upgraded)!

We have a commercial certificate for our primary domain and a LetsEncrypt certificate for all our other domains. Never had a problem with our certs in SM before, but sure enough we have tons of the "Certificate name mismatch" in the logs.
0
Sébastien Riccio Replied
Hello, the "Certificate name mismatch" could mean that the remote certificate does not match the remote hostname being contacted, so SM aborts sending the mail over TLS.
If I'm correct, in 7242 it would then retry without TLS.

It looks like the MAPI builds don't retry without TLS.

It may be a side effect of code changes around this fix:
Fixed: Gateways are using TLS, if available, even though they are configured to use no encryption.

That's only my supposition.

Edit: We don't have this issue, but we also use a gateway for relay, so the TLS certificate always matches.

0
Gabriele Maoret - SERSIS Replied
Hi Sebastien, so you think it's a BUG in SmarterMail that if it finds out that the REMOTE certificate has an issue, SmarterMail itself doesn't retry without TLS, am I right?
1
Sébastien Riccio Replied
Hello Gabriele, that's a supposition. I remember that previously, with 7242, every mail was delayed by 5 minutes (if your spool retry settings begin with 5 minutes) because the certificate on our mail gateway had expired.
It was trying TLS and failing because of the certificate, then retrying after 5 minutes without TLS.

The fact that mails are stuck forever in the queue when you enable TLS, and that you also have certificate mismatches in the logs, makes me think that new builds no longer retry without TLS after a failed TLS session. But this would need to be confirmed.

Can you reproduce the issue, check the logs for the stuck messages, and see whether there is a certificate mismatch error again? If so, can you give me the MX it tries to reach when this error appears, so I can check its certificate with an openssl command and confirm that the certificate really mismatches?
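
As an alternative to the openssl command, the same kind of check can be sketched with Python's standard library. probe_starttls connects to an MX, issues STARTTLS with normal certificate and hostname verification, and returns the verification error if there is one; it needs network access, so it is only defined here. name_matches is a deliberately simplified illustration of how a hostname is compared with a certificate name (exact match or one leading wildcard label) and is not what SmarterMail or OpenSSL actually run. All names are hypothetical.

```python
import smtplib
import ssl


def name_matches(hostname, pattern):
    """Simplified check: exact match, or '*.' covering exactly one label."""
    hostname = hostname.lower().rstrip(".")
    pattern = pattern.lower().rstrip(".")
    if pattern.startswith("*."):
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return hostname == pattern


def probe_starttls(host, port=25, timeout=15):
    """Return None if STARTTLS verifies against host, else the error text.

    Requires network access; nothing here calls it automatically.
    """
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            smtp.ehlo()
            smtp.starttls(context=context)
        return None
    except ssl.SSLCertVerificationError as exc:
        return exc.verify_message
    except (ssl.SSLError, smtplib.SMTPException, OSError) as exc:
        return str(exc)


# Offline illustration with placeholder names.
print(name_matches("mx.example.com", "*.example.com"))
print(name_matches("mx.customer.example", "*.example.com"))
```

A mismatch typically shows up when the MX name of a hosted domain points at a provider's server whose certificate only carries the provider's own names.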

Some thought should also go into the question "should SM retry without TLS if the TLS attempt fails?", because some customers or companies you host may require that all mails be transferred using TLS or not transferred at all.

So a per-domain setting for this should be added, something like "Require TLS for outgoing mails", so that no mail from these customers can be transmitted outside without a layer of security...

Well, that's not the point of this thread... Can you check the remote hostname that triggers the certificate issue?





0
Sébastien Riccio Replied
Gabriele,

I've read the latest log you posted here. It doesn't seem related to the certificate mismatch, as the log shows that it still sends the mail over the TLS connection.

However, for an unknown reason the session with the server seems to time out after the DATA command (when the mail content is sent).


[2020.06.03] 13:42:33.214 [63750] CMD: DATA  
[2020.06.03] 13:42:36.026 [63750] RSP: 354 enter mail, end with "." on a line by itself
[2020.06.03] 13:43:36.050 [63750] The smtp session has timed out.
[2020.06.03] 13:43:36.050 [63750] Attempt to ip, '62.149.157.166' success: 'False'

Is it the only destination server with this issue, or do you see the same with other remote MXs?
(This one is in.9netweb.it.)
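
The gap between the 354 response and the timeout line can be measured directly from the timestamps. A minimal Python sketch, assuming the timestamp layout seen in the excerpt above (the helper name is made up):

```python
from datetime import datetime

# Two lines copied from the log excerpt above.
RESPONSE_LINE = '[2020.06.03] 13:42:36.026 [63750] RSP: 354 enter mail, end with "." on a line by itself'
TIMEOUT_LINE = "[2020.06.03] 13:43:36.050 [63750] The smtp session has timed out."


def parse_timestamp(line):
    """Parse the leading '[YYYY.MM.DD] HH:MM:SS.mmm' part of a log line."""
    stamp = line[:25]  # length of '[2020.06.03] 13:42:36.026'
    return datetime.strptime(stamp, "[%Y.%m.%d] %H:%M:%S.%f")


gap = parse_timestamp(TIMEOUT_LINE) - parse_timestamp(RESPONSE_LINE)
print(f"seconds between 354 and timeout: {gap.total_seconds():.3f}")
```

The two lines are almost exactly 60 seconds apart, which looks more like a fixed session timeout firing while the message body was being sent than an immediate rejection by the remote server.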

Kind regards.


0
Gabriele Maoret - SERSIS Replied
No, it's not the only one. This is just an example; there are quite a few others...

The fact is that if I disable TLS for OUTBOUND SMTP the issue disappears immediately...
0
Robert G. Replied
Our issue was due to URIBLs... The average response time was very high. Oddly enough these were just fine on 7459. It only became an issue after upgrading a few days back. 

0
Jade D Replied
I want to add to this thread.

We have a ticket open about this same issue, and there is quite clearly an issue with SmarterMail when using TLS.

We've seen cases where SmarterMail connects to the remote server using TLS and then reports a mismatch even though the certificate and hostname do match.

In other cases, SmarterMail connects via TLS and then simply hangs.

Emails sit in the spool for days with no delivery attempts, and the next attempt date is in the past.

Turning off "Enable TLS if supported by remote server" and restarting the spool service does not resolve the issue.

The only workaround as of now is to restart the mail service which on a busy shared mail server causes a loss in email transmission for active mails which is not ideal.

Keeping "Enable TLS if supported by remote server" turned off results in clients complaining when they send email to Gmail as the email is not encrypted.

There appears to be no other way to restart the spool other than restarting the mail service on the server.
0
Jade D Replied
Here is some data showing how the number of mails in the spool grows continuously until we restart the SmarterMail service.

Red arrows show when the mail service was restarted on the server.
The same is true for all 9 or more mail servers that we manage.

0
Sébastien Riccio Replied
Not a fix, but a workaround in the meantime. Since we don't have this problem, I was asking myself why not:

We use an outgoing gateway, so SmarterMail doesn't communicate with the remote peers directly but only with our outgoing gateway (which is not SM), and there is no TLS issue with it.

Well, almost everything transits through the outgoing gateway. Some NDR/Delivery Success messages don't seem to go through the gateway, but that is another topic.

0
Gabriele Maoret - SERSIS Replied
To me this issue is resolved... Do you see it again?
0
Jade D Replied
@Gabriele

What did you change to get the issue resolved?
0
Gabriele Maoret - SERSIS Replied
I disabled TLS for OUTBOUND SMTP
0
Jade D Replied
The problem with disabling TLS is that emails sent to Gmail are flagged and the recipient is shown a warning.
It makes no sense that we have to compromise security to have a functioning mail server that's capable of delivering email.
0
Sébastien Riccio Replied
Yes, disabling TLS for outbound SMTP is a bad idea nowadays. Many recipient servers now take this into account when scoring e-mails and also display a security warning in some cases.

0
Jade D Replied
You're spot on, Sébastien - we've received complaints from clients where mail has been filtered as spam because it was not delivered via TLS.

We have around 10 servers that are all suffering from the same symptoms. Mails queue for days if TLS is enabled, and the only workaround is to disable TLS support, restart the mail server and wait for the spool to clear.

I've had a ticket open with SmarterTools since last year and there has been no movement on it.

SmarterMail tries to establish a connection with a remote host via TLS and then hangs. No reason, no error, the connection simply sits there.

I've sent SmarterTools logs showing how on some days the mail arrives at the remote host, and on others it does not.
0
Sébastien Riccio Replied
Jade, that is a really painful situation you have there. We're fortunate to use an outgoing gateway that isn't a SmarterMail instance and that handles outbound TLS flawlessly.


0
Sébastien Riccio Replied
Also, the current global state of TLS on incoming mail servers is a bit chaotic. A lot of service providers' mail servers use deprecated TLS versions, and some don't even handle the newer ones that are the current standard.

So if the sending server only accepts TLS 1.2 and 1.3 and the remote server only offers TLS 1.0 and 1.1 (or even older), they can't negotiate and the transaction fails.
If I remember correctly, when sending with TLS fails, SmarterMail falls back (or at least used to fall back) to non-TLS for the next retry. Have you checked your logs to see whether the mail, when it is finally sent, goes over a TLS-enabled session?
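
The negotiation problem described above boils down to the two sides having no protocol version in common. A toy Python sketch of that idea follows; it is not how a real TLS handshake is implemented, and the function name and version labels are chosen only for illustration.

```python
# Protocol versions from oldest to newest.
VERSION_ORDER = ["SSLv3", "TLSv1.0", "TLSv1.1", "TLSv1.2", "TLSv1.3"]


def negotiate_version(sender_versions, receiver_versions):
    """Return the newest version both sides support, or None if there is none."""
    common = set(sender_versions) & set(receiver_versions)
    if not common:
        return None
    return max(common, key=VERSION_ORDER.index)


# The case from the paragraph above: no overlap, so the TLS session fails.
print(negotiate_version({"TLSv1.2", "TLSv1.3"}, {"TLSv1.0", "TLSv1.1"}))

# A sender that still allows older versions finds common ground.
print(negotiate_version({"TLSv1.1", "TLSv1.2"}, {"TLSv1.0", "TLSv1.1"}))
```

The first call returns None and the second returns TLSv1.1, which is why enabling or disabling individual protocol versions on one side can make a delivery problem appear or disappear.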

It can also be that a provider has multiple MXs, some updated with the latest TLS libraries and others not, so the result depends on which one you land on in the round-robin lottery of MXs. That could be the reason why it sometimes works and sometimes doesn't.

All this raises another problem if it retries without TLS after a TLS failure: you can't guarantee to your customers that the mail will be sent over a secure connection, and we have some customers who prefer that the mail not be delivered and receive a bounce instead of transmitting anything in clear text on the wire.

So, given all of this, with our outgoing gateway we can enforce what we want for TLS per source domain or destination domain (Try to use, Do not use, Force use and reject if it's not possible).
That way we have complete control over our outgoing SMTP traffic.
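
To make the idea concrete, here is a rough Python sketch of a per-sender-domain TLS policy with the three modes listed above. It is not our gateway's code or any product's API; the domain names, the policy table and the function names are all invented for the example.

```python
from enum import Enum


class TlsPolicy(Enum):
    TRY = "try"          # use TLS when possible, otherwise send in clear
    NEVER = "never"      # do not use TLS at all
    REQUIRE = "require"  # use TLS or reject the message


# Hypothetical policy table keyed by sender domain.
POLICY_BY_SENDER_DOMAIN = {
    "secure-customer.example": TlsPolicy.REQUIRE,
    "legacy.example": TlsPolicy.NEVER,
}
DEFAULT_POLICY = TlsPolicy.TRY


def policy_for(sender):
    """Look up the policy for the domain part of a sender address."""
    domain = sender.rsplit("@", 1)[-1].lower()
    return POLICY_BY_SENDER_DOMAIN.get(domain, DEFAULT_POLICY)


def decide(sender, tls_ok):
    """Decide what to do, where tls_ok means a TLS session could be set up."""
    policy = policy_for(sender)
    if policy is TlsPolicy.NEVER:
        return "send_plain"
    if tls_ok:
        return "send_tls"
    if policy is TlsPolicy.REQUIRE:
        return "reject"
    return "send_plain"


print(decide("alice@secure-customer.example", tls_ok=False))
print(decide("bob@other.example", tls_ok=False))
```

With this table the first message is rejected (and would bounce) while the second falls back to clear text, which is the distinction between "Force use" and "Try to use".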

Kind regards.

0
Jade D Replied
"
So with all of this, our outgoing gateway we can force per source domain or destination domain, what we want for TLS (Try to use, Do not use, Force use and reject if it's not possible).
That way we have a complete control about our outgoing SMTP stuff."

What are you using as a gateway that allows you to force TLS based on recipient domain?
0
Steve Norton Replied
Jade,

Can you do a Wireshark capture on TCP port 25 for one of the failed attempts, so we can have a look at the TLS client/server hellos and the cipher suite negotiation?
1
Gabriele Maoret - SERSIS Replied
I can confirm the issue is still here.

Yesterday I tested it by enabling "Enable TLS if supported by the remote server" and suddenly got tons of e-mails blocked in the outgoing queue.

Disabling "Enable TLS if supported by the remote server" solved the issue, but it's not a good solution for mail security...

Please SmarterTools, take care of this issue!!!!

It has been here for more than six months and there is still no FIX!!!!
0
Jade D Replied
Hey Steve

I would love to, but running that on a mail server with 1000+ domains may cause issues, and I can't spend an hour or so watching it in the hope that one of our users sends an email to a mail server that previously worked and now doesn't.
0
Sébastien Riccio Replied
Hello Jade D,

We're forcing SSL based on *sender* domain (if our customer doesn't want their e-mail to be delivered without encryption), but with some work you could also base it on destination domain.
For this we use Haraka with some homemade plugins (the Haraka plugin system lets you hook into any event and add your own piece of code to alter the processing).


We also considered (and are still considering) using ZoneMTA, which is a bit like Haraka with some nice features.

Kind regards.

0
Steve Norton Replied
Jade (or anyone actually),
Do you have a destination domain name or IP address of a server that is causing this issue today? I'll see what captures I can do myself.
I also need to know which OS versions people are running.
0
Jade D Replied
Thank you for that info Sébastien - I'll look into that.

Hey Steve,

You can test TLS on www95.cpt1.host-h.net  - 196.40.97.42 
I don't have a mailbox on this remote ISP to give you, but it is one of the servers that we reported to SmarterTools through a support ticket.

Within our email we explained that on the 21st January mails to this server were being delivered, and on the 22nd not.

I then disabled tls on smartermail, restarted the mail service and the mails that were queued for this destination IP were delivered.


0
Steve Norton Replied
Jade,
I've tested ports 25 and 465, I get TLS 1.2 using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384.
This is pretty strong stuff. Do you have that cipher suite in the registry at:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002\
Or if you have it configured by policy, is it listed?
I've run a capture against SM and it uses this combination.
0
Jade D Replied
Hi Steve

Well done bud, you've achieved more in a few hours than what ST support have been able to find.

According to MS, the cipher suite TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 is available from Windows Server 2016 and up


Time to upgrade our version of Windows!
0
Steve Norton Replied
Okay Jade, well that's progress. So you don't have a common TLS 1.2 suite between you. I'll check to see if they accept 1.1.
I trust you have the default 2012 R2 suites.
0
Steve Norton Replied
Jade,
They accept TLS 1.1 with TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P256 which your server supports I believe.
Maybe you need to disable 1.2 support via SM '/Protocols/Security Protocols' and re-enable 'Use TLS if supported'.
0
Jade D Replied
Hi Steve,

Here's a screenshot of IIS Crypto running on the mail server


0
Jade D Replied
Hi Steve

I missed the response below :

Jade,
They accept TLS 1.1 with TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P256 which your server supports I believe.
Maybe you need to disable 1.2 support via SM '/Protocols/Security Protocols' and re-enable 'Use TLS if supported'.

I'm going to get a 2019 Server up and running, install SM on it and report back. It makes sense to upgrade and future-proof rather than disable one set of cipher suites, which may cause issues with other providers.

I'll report back on this thread as soon as possible.
0
Tim Uzzanti Replied
Employee Post
Jade, what version of Windows Server are you using?

Looks like we need some KB's on this.  With some of the larger companies and cloud companies flipping the switch on old TLS, we are going to start seeing different kinds of results. 
