SmarterMail 8747 high memory usage?
Question asked by Jay Altemoos - 3/20/2024 at 1:54 PM
Has anyone had issues with mailservice.exe using large amounts of RAM for no apparent reason? I had to restart our server today because SmarterMail was using 52GB of the 64GB installed on this server. When I restarted the service, the previous mailservice.exe got stuck in Task Manager, and I could not end it without force restarting the server.

After the server reboot, the mail service went back to its normal 10-12GB of usage. Looking through the Average Hardware Usage screen, it appears that the high memory usage started 2 months back and has been averaging 48-52GB for no apparent reason. Previously we were averaging 10-12GB of memory usage.

5 Replies

Jay Dubb Replied
We've seen that sporadically back on 8451 and prior.  RAM and CPU would suddenly spike to near 100% and hang there.  Sometimes the condition would correct on its own, sometimes it would require a service restart.  If we saw a huge memory spike, but CPU would stay below 70-80%, we'd leave it alone and most times RAM would drop mostly back to normal after a few hours.  But if the CPU was holding above 85-90% sustained, we knew from experience it would only get worse by the minute until all CPU cores were pegged at 100%.

Be very, very careful about hard-killing the mailservice.exe process.  We did that a time or two, and ended up corrupting enough JSON files that a couple hundred mailboxes across several domains failed to load, and inbound mail was rejected with 550 not-found status codes. That caused a lot of problems for users who were then automatically dropped from mailing lists.  What made it worse is that the 550 rejections landed many of them on do-not-send-again lists, meaning they could not even re-subscribe.  We had to set up aliases for them so they could re-subscribe with a different address.

With that said, when you stop the SmarterMail process in the Services applet, watch task manager for mailservice.exe to wind down and close.  Be patient.  It can take several minutes... so long in fact, that the script we use to reboot the SM server at 3:00 a.m. after Windows Updates, sends a Stop signal to the SmarterMail process and then 'sleeps' for 5 minutes before triggering the restart, to give mailservice.exe enough time to bleed off memory and gracefully shut down.
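The stop-then-wait approach described above can be sketched in Python. This is only an illustration, not the script Jay Dubb's team uses: instead of a fixed 5-minute sleep, it polls until the process is gone or a timeout expires. The `is_running` callback is hypothetical; on a real server it would have to be implemented by whatever process check you trust (for example, parsing the output of `tasklist`), and the 300-second default simply mirrors the 5 minutes mentioned above.

```python
import time
from typing import Callable


def wait_for_exit(
    is_running: Callable[[], bool],
    timeout_s: float = 300.0,   # mirrors the 5-minute sleep from the post
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_running() returns False.

    Returns True if the process exited before the timeout, False otherwise.
    is_running is a caller-supplied (hypothetical) check for mailservice.exe.
    """
    deadline = clock() + timeout_s
    while is_running():
        if clock() >= deadline:
            return False  # still running: do not reboot or hard-kill blindly
        sleep(poll_s)
    return True
```

A reboot script would send the Stop signal first, call `wait_for_exit(...)`, and trigger the restart only if it returns True. What to do on a False result (wait longer, alert an admin) is a policy decision the sketch leaves open.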
Jay Altemoos Replied
Good day Jay,
Thank you for the feedback. CPU was never an issue on our server, but the memory spike definitely was. When I stopped the SM service, the memory usage of mailservice.exe dropped considerably, but the process stayed stuck in Task Manager, and I could not kill it myself because Windows returned an "Access is denied" error. When the SM service started back up, mailservice.exe was running twice in Task Manager, so I really had no choice but to hard reboot the server, since the stuck mailservice.exe would not allow the server to reboot gracefully. This is definitely new behaviour, and I have not encountered it previously. Hopefully I don't run into that again, and I appreciate the insight.
TOAST.net Replied
We have also had this issue (currently on Build 8797 for a gateway server). It seemed to start around the time that we switched to the .NET 8 version (build 8747). We are currently working with ST support to figure out what's causing it. The series of events is similar to what you described: the server is almost at a standstill, and when we check Task Manager, MailService is using almost all of the memory (97%-99%). We restart the service and that fixes the issue temporarily (after a restart it goes back to about 30%), then after about 5-7 days it happens again. This week I am noticing that the memory usage increases each day. The day after the reboot it was up to 40%-50%. The next day it was 60%-70%. This morning it was running at 79% usage. There must be a memory leak somewhere.
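One rough way to put a number on the day-over-day growth described above is a least-squares line through the daily readings. The Python sketch below is only an illustration: the function names are made up for this example, and the sample values are approximate midpoints of the ranges quoted in the post (30, 45, 65, 79), not measured data. A straight-line fit is also an assumption, since real leaks do not have to grow linearly.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (slope, intercept) for equally spaced samples at x = 0, 1, 2, ..."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_until(samples: Sequence[float], ceiling: float = 100.0) -> Optional[float]:
    """Sampling intervals after the last sample until the fitted line hits ceiling.

    Returns None when usage is flat or falling (no projected crossing).
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    last_x = len(samples) - 1
    return max(0.0, (ceiling - intercept) / slope - last_x)


if __name__ == "__main__":
    # Approximate midpoints of the daily percentages quoted in the post.
    daily_percent = [30.0, 45.0, 65.0, 79.0]
    slope, _ = linear_trend(daily_percent)
    print(f"growth per day: {slope:.1f} percentage points")
    print(f"projected days until 100%: {intervals_until(daily_percent)}")
```

A steadily positive slope across several restarts is consistent with a leak, but it does not prove one; caches that fill up after a restart can produce a similar curve for a while.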

For us, the SmarterMail service also hangs when we try to stop it, but as Jay Dubb said, after a few minutes it finally shuts down.
Jay Dubb Replied
@Toast, that's quite a memory leak if that's what is happening.  Are you running Clam or any add-on antivirus or spam components, or is this pure out-of-the-box SmarterMail with nothing added?

We were never able to get an answer as to why our RAM would suddenly spike. I'm talking about a vertical line going straight up on the graph, as in running at 40 or 50GB, then suddenly jumping to 94GB within 60 seconds.

The problem with diagnosing this kind of issue is that when they ran their JetBrains capture utility, it crushed the server to a dead stop for a very long time (45+ minutes) in the middle of a workday, something that we had to stop doing for obvious reasons.  It took a long time to capture the entire system image (96GB RAM server) and write it to disk while the server was also being crushed by the runaway mail process.

Consequently, with only the original few dumps to go on, no root cause was ever found... and we lived under a constant state of paranoia over when it would happen next.  We set up additional external monitoring to send text alerts to the admin team if RAM use went above 80%.  Since we couldn't do the JetBrains dump, we'd just restart the mail server process to get it back online.
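The alerting rule described above (notify the admin team when RAM use goes above 80%) can be sketched as a small Python check. This is an illustrative sketch, not the monitoring setup Jay Dubb's team actually used: the function name and the `min_consecutive` parameter are assumptions added here so that a single momentary reading does not page anyone, and how the readings are collected and how the text alert is sent are left out.

```python
from typing import Sequence


def should_alert(
    samples: Sequence[float],
    threshold: float = 80.0,   # percent RAM in use, as in the post
    min_consecutive: int = 1,  # assumption: require N readings in a row
) -> bool:
    """True if the most recent min_consecutive readings all exceed threshold."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    recent = list(samples)[-min_consecutive:]
    return len(recent) == min_consecutive and all(s > threshold for s in recent)
```

With `min_consecutive=1` this fires on any reading over 80%, which fits the sudden vertical spikes described earlier. A higher value trades alert speed for fewer false alarms.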

We are REALLY hoping the "new" .NET 8 version doesn't have these problems.
TOAST.net Replied
Jay, we run Clam with it, and that doesn't seem to have any issues that we could see. Resource-wise, Clam stays consistent. Today SM memory usage is at 88% (running on 16GB of RAM).
We had similar struggles when running the JetBrains dotTrace capture for ST Support. Hoping for better results for you on the newer version :)
