Multiple SmarterMail Crashes Per Day on Build 7459
Problem reported by Scarab - 6/12/2020 at 10:06 AM
Ever since upgrading to Build 7459 we have been experiencing multiple crashes per day (it may be important to note that we installed a critical security patch to .NET 4.5 the same day that we upgraded to Build 7459). We have MailService.exe set to restart automatically in Services.msc, which it does, but it is generally down for 3-5 minutes while it restarts. The Windows Event Log is showing the following errors:


Log Name:      Application
Source:        .NET Runtime
Date:          6/11/2020 9:20:15 AM
Event ID:      1023
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SMARTERMAIL.domain.com
Description:
Application: MailService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an internal error in the .NET Runtime at IP 00007FFF715A59C3 (00007FFF71550000) with exit code 80131506.

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name=".NET Runtime" />
    <EventID Qualifiers="0">1023</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2020-06-11T16:20:15.763733600Z" />
    <EventRecordID>7203</EventRecordID>
    <Channel>Application</Channel>
    <Computer>SMARTERMAIL.domain.com</Computer>
    <Security />
  </System>
  <EventData>
    <Data>Application: MailService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an internal error in the .NET Runtime at IP 00007FFF715A59C3 (00007FFF71550000) with exit code 80131506.
</Data>
  </EventData>
</Event>

Log Name:      Application
Source:        Application Error
Date:          6/11/2020 9:20:18 AM
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SMARTERMAIL.domain.com
Description:
Faulting application name: MailService.exe, version: 100.0.7459.31218, time stamp: 0x5ed83e5d
Faulting module name: clr.dll, version: 4.8.4180.0, time stamp: 0x5e7d21b3
Exception code: 0xc0000005
Fault offset: 0x00000000000559c3
Faulting process id: 0x924
Faulting application start time: 0x01d63f02aa8e761e
Faulting application path: D:\smartermail\Service\MailService.exe
Faulting module path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll
Report Id: cbe0e51f-6ed6-4319-a504-a77048260e6a
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2020-06-11T16:20:18.841783500Z" />
    <EventRecordID>7204</EventRecordID>
    <Channel>Application</Channel>
    <Computer>SMARTERMAIL.domain.com</Computer>
    <Security />
  </System>
  <EventData>
    <Data>MailService.exe</Data>
    <Data>100.0.7459.31218</Data>
    <Data>5ed83e5d</Data>
    <Data>clr.dll</Data>
    <Data>4.8.4180.0</Data>
    <Data>5e7d21b3</Data>
    <Data>c0000005</Data>
    <Data>00000000000559c3</Data>
    <Data>924</Data>
    <Data>01d63f02aa8e761e</Data>
    <Data>D:\smartermail\Service\MailService.exe</Data>
    <Data>C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll</Data>
    <Data>cbe0e51f-6ed6-4319-a504-a77048260e6a</Data>
    <Data>
    </Data>
    <Data>
    </Data>
  </EventData>
</Event>
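
For anyone comparing their own logs: the two events above appear to describe the same crash location. A minimal Python sketch of the arithmetic, assuming the value in parentheses in the Event ID 1023 description is the load address of the faulting module (that interpretation is an assumption, and the function name is made up for illustration):

```python
import re

# Text copied from the Event ID 1023 description above.
RUNTIME_EVENT = (
    "The process was terminated due to an internal error in the .NET Runtime "
    "at IP 00007FFF715A59C3 (00007FFF71550000) with exit code 80131506."
)
# "Fault offset" copied from the Event ID 1000 description above.
FAULT_OFFSET = "0x00000000000559c3"


def offset_in_module(description: str) -> int:
    """Return the IP minus the address in parentheses (assumed module base)."""
    match = re.search(r"at IP ([0-9A-Fa-f]+) \(([0-9A-Fa-f]+)\)", description)
    if match is None:
        raise ValueError("no 'at IP <address> (<base>)' pair found")
    ip, base = (int(group, 16) for group in match.groups())
    return ip - base


computed = offset_in_module(RUNTIME_EVENT)
reported = int(FAULT_OFFSET, 16)
print(hex(computed), hex(reported), computed == reported)
# prints: 0x559c3 0x559c3 True
```

If the two values match on your server as well, both events are pointing at the same spot inside clr.dll rather than at two separate failures.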

25 Replies

0
Tim Uzzanti Replied
Employee Post
We had another customer report the same issue on the same day they installed the Microsoft .NET security update. The number of bug fixes and security updates Microsoft released was record-setting. We have updated our own SM servers and will be monitoring them as well.
Tim Uzzanti CEO SmarterTools Inc. www.smartertools.com
0
Chris Replied
Is it this update? If yes, I will make sure I hold off... 

0
Scarab Replied
Chris,

It was not KB4561608 (although we did install that one at the same time as well). There was specifically a .NET update.

Tim,

I seem to recall this .NET error occurring regularly before with SmarterMail, it having something to do with garbage collection on 64-bit .NET 4.0+. Multiple concurrent garbage collections were causing corruption in memory. It did not occur on 32-bit .NET. I can't remember any more details (like what version) and my Search-Fu on the forums is failing me atm.
0
Tim Uzzanti Replied
Employee Post
We had an IIS crashing issue with MAPI during the BETA.  I'm not aware of anything else.
0
Robert Simpson Replied
I had this same crash today on 7459.  I came here, saw 7468, installed it, and ran windows updates, and am in the process of rebooting the server.

0
Nathan Replied
Robert - Has 7468 made any difference?
0
Robert Simpson Replied
It only happened the one time on 7459.  It hasn't happened since.
0
Scarab Replied
Nathan,

Knock on wood but we've gone two days with Build 7468 installed without a crash.
0
Shane Woodham Replied
I have had this problem for almost 3 weeks now. I loaded updates from Microsoft (which included the KB4552933 .NET update) and loaded the 7459 version, and the crashes started immediately. I have worked with Kyle daily on the problem. Everything works properly until my staff starts to get in at 8 am. Then the crashes start. I just built a fresh new server last night, patched and ready to go. I only loaded SmarterMail. As soon as my staff started to get in, the crashes started again. I noticed the update 7468, so I updated to that version. Within the first 3 minutes smartermail.exe crashed. Brand new server load with nothing but updates and SmarterMail. I am at a loss at this point. I just can't have my main job as IT director be nursing this server all day.
0
Shane Woodham Replied
What I am seeing is the CPU starts to climb and stays like that for about 45 seconds, and then Smartermail.exe fails with a clr.dll error. The service gets restarted on its own, but the webmail will show as not responding. When I restart the service one more time, the webmail will come back up and work for a while.
0
Tim Uzzanti Replied
Employee Post
In regards to Shane's situation, we believe it is related to dynamic memory configuration with Hyper-v on Windows 2012 R2.  We have seen issues with this in the past and it causes OS and .NET Framework level issues. Gigs of memory become allocated on the VM even without SmarterMail running and the memory is unaccountable at the OS level.  We appreciate Shane giving us access to evaluate and we will update when we have more info.
0
Merle Wait Replied
Just for tracking trends...
Am still on SM15; basically same thing started happening on our system as well.... after MS update for one of the .Net frameworks. Basically it looked like to us... W3P (World Wide Web) services.. was having the issue and was affecting Smartermail.
We took another SM update Sunday morning ... the issue doesn't seem as bad (or maybe we haven't had the right mix of events)..
Win2016 Standard

0
Tim Uzzanti Replied
Employee Post
Ronald, do you have a ticket open because that sounds very different.  We try to access windows performance counters for server health information around that area.  Maybe a windows update or something changed recently preventing us from being able to access it?  I have had corrupted performance counter issues before and have had to re-register them.  When they error, they can cause memory leaks as well.  Not saying that this is the issue but that would be the first place I would look diagnosing your server.

Please open a ticket so we can look at it further, but it's very different and unrelated to Shane's server. Shane's server (operating system) is behaving extremely oddly. I got on it today after hearing my guys try to explain things, because I thought they were nuts, but what they said was true.
0
Tim Uzzanti Replied
Employee Post
Ronald, what version of Windows Server are you running?
0
Shane Woodham Replied
Changing the VM memory from dynamic to a static 8GB didn't help my situation. I came in this morning with the server down. I started the service and restarted it so the webmail would come up. The system ran for about 40 min and smartermail.exe crashed.
0
Tim Uzzanti Replied
Employee Post
Shane, we will be back on your machine today. If you notice your memory usage on the machine, there are gigs of memory unaccounted for which progressively got worse until a reboot.  The machine essentially runs out of memory but there is no process including SmarterMail using that memory.  If SmarterMail was the culprit the unaccounted for memory would decrease when SmarterMail crashes but it doesn't.  We are kind of fascinated by what is going on and will be doing Windows dumps today now that memory is static.
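
The reasoning above can be written out as simple arithmetic. A rough Python sketch follows; every figure in it is made up for illustration (none are measurements from Shane's server), and the function name is hypothetical:

```python
from typing import Mapping


def unaccounted_mb(in_use_mb: float, per_process_mb: Mapping[str, float]) -> float:
    """Memory the OS reports as in use minus the total attributed to processes."""
    return in_use_mb - sum(per_process_mb.values())


# Hypothetical snapshot while SmarterMail is running.
before_crash = unaccounted_mb(
    7600, {"MailService.exe": 1000, "everything else": 900}
)
# Hypothetical snapshot just after SmarterMail has crashed.
after_crash = unaccounted_mb(6650, {"everything else": 950})

# If SmarterMail were holding the missing memory, this would be well above zero.
released_by_crash = before_crash - after_crash
print(before_crash, after_crash, released_by_crash)
# prints: 5700 5700 0
```

With numbers shaped like these, the unaccounted-for memory does not move when the service dies, which is the pattern that points away from SmarterMail.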

We also duplicated your data on a new test VM on our machines and everything runs properly.  We were hoping we could reproduce SmarterMail crashing with your data. Things keep leading us to Windows and Hyper-v.
0
Shane Woodham Replied
Thank you for your help with this. I have noticed that at night, when my admins are off their computers, we don't have any errors, but when the employees start getting into work a little after 8am, that is when the errors start. When I loaded the new server the other night, I was very hopeful that the problem would be fixed, but at 8:18 am it started. It's possible that the reason our data is running perfectly on your server is that no one is connecting to it. I hope this helps :)
0
A System Administrator Replied
I just wanted to chime in here to see if this info is helpful for someone. We're running SM b7468 (and b7459 before that) on a Hyper-V guest with Windows Server 2019 Standard and dynamic memory enabled. There are usually less than 100 active users reported in the dashboard.

We're pretty wary of Windows updates so usually wait a month or two before installing them. I noticed we have "2020-05 Cumulative Update for .NET Framework 3.5, 4.7.2 and 4.8 for Windows Server 2019 for x64 (KB4556441)" waiting for install. Maybe this is the update mentioned by Scarab? We have not seen any SM crashes at all (random webmail logouts however starting with b7459).
2
Tim Uzzanti Replied
Employee Post
Shane, 

The issue was related to a user with a parent folder and a subfolder using the same ID. It looks like this occurred via an IMAP client. Could you look into which client user Nancyroc... (not providing the whole username) uses, so we can test whether there are scenarios where that client creates duplicate IDs? It shouldn't.
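
To illustrate the kind of data problem described above, here is a small Python sketch that walks a folder tree and reports any subfolder sharing an ID with its parent. The `Folder` class, the field names, and the sample mailbox are all hypothetical; this is not SmarterMail's actual storage format or an API it exposes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Folder:
    """Hypothetical stand-in for a mailbox folder record."""
    folder_id: int
    name: str
    children: List["Folder"] = field(default_factory=list)


def parent_child_id_clashes(
    folder: Folder, path: str = ""
) -> Iterator[Tuple[str, str, int]]:
    """Yield (parent path, child path, shared ID) for each clash in the tree."""
    here = f"{path}/{folder.name}" if path else folder.name
    for child in folder.children:
        if child.folder_id == folder.folder_id:
            yield here, f"{here}/{child.name}", child.folder_id
        # Keep descending so clashes deeper in the tree are found too.
        yield from parent_child_id_clashes(child, here)


# Made-up mailbox in which "Clients" and its subfolder "2020" share ID 2.
mailbox = Folder(
    1, "Inbox", [Folder(2, "Clients", [Folder(2, "2020")]), Folder(3, "Receipts")]
)
clashes = list(parent_child_id_clashes(mailbox))
print(clashes)
# prints: [('Inbox/Clients', 'Inbox/Clients/2020', 2)]
```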

Please keep the dynamic memory turned off.  If this wasn't a Windows 2012 R2 server we would be less concerned but memory usage was really unexplainable and caused issues debugging and profiling your machine.

At the moment, SmarterMail is using < 3% CPU and about 1 Gig of memory. Technical support is going to keep an eye on your machine a little longer to make sure everything stays stable.
0
Shane Woodham Replied
She is using Outlook 2016.  She said she never uses the webmail so that should be the only mail client she has worked out of.  
0
Scarab Replied
Bumping this as this issue has come back to bite us many-fold on Patch Tuesday. Hoping that there is a solution.

We just upgraded from Build 7503 to Build 7523 using the .EXE and also installed KB4569751 Cumulative Update for .NET Framework 3.5 and 4.8 for Windows Server 1909 on the same evening (so I can't really tell which is to blame atm), and things went from bad to worse (so much worse that SM doesn't really work).

We are now getting about 6 SM crashes per day. Worse, for the first two hours after SM restarts from a crash it is barely functional while it builds up RAM (resulting in delays of more than 2 hours in email delivery). The SmarterMail service starts out at a little more than 1 GB and keeps adding memory until it gets to about 8.5 GB, where it can finally function properly. In the meantime you can have only one of the following, but not all 3:

  • Delivery of Email
  • Mailbox Indexing
  • Anti-Spam & Anti-Virus Checking (we only have Message Sniffer, Null Sender, SpamAssassin and ClamAV running on this server as our Incoming Gateways handle all the other Anti-Spam Checks).
CPU is pretty stable at < 50% (even with 30,000-120,000 messages stacked up in the Spool queue because our SM installation just can't cope with doing mail stuff anymore...although the SM Spool Dashboard can only show a max of about 2,500 msgs in the Spool at a time), but Memory usage will work its way up to 80% before crashing again.
 
We are running Windows Server 2019 on Hyper-V with Dynamic Memory. As we are crashing so frequently at this point, it doesn't matter whether maintenance is Out-of-Band because SM is constantly down anyway. I will disable Dynamic Memory tonight to see if that makes things any better...otherwise we will try to roll back to Build 7503, which still crashed frequently but recovered immediately from crashing and did its job.
0
Robert Simpson Replied
Are you using MAPI?  I had to disable it for now.
0
Scarab Replied
Robert Simpson,

We are not running MAPI. Just basic POP/IMAP/SMTP & Webmail.
0
Tim Uzzanti Replied
Employee Post
We have no known issues that would cause a crash.  Please open a ticket so we can evaluate your server specifically. 
1
Scarab Replied
Tim,

In the end I didn't open a Support Ticket, as I was able to isolate the issues one by one and remedy most of the problems myself, which ultimately seemed to put an end to the frequent .NET crashes referencing clr.dll. They were either environmental issues or data issues. Although we were experiencing fringe issues with the data on Build 7523, they were definitely caused by recent changes, as the data had existed that way in Build 7503 without a problem. I outlined them in a new thread at https://portal.smartertools.com/community/a93514/build-7523-issues-some-with-work-around-resolutions.aspx that includes my workarounds for some of the issues that led to our problems post-upgrade.
