17.0.6878 running 50-100% CPU
Problem reported by Chris - 11/6/2018 at 9:40 AM
Resolved
We updated to 17.0.6878 and noticed the SmarterMail service is running hot, anywhere from 50% to 70% and even 100% CPU, to the point that we need to reboot the server to make it stable again. Is anyone else having this issue? Any idea what could be causing this?

22 Replies

0
Thomas Lange Replied
Hi Chris,

Right now, outside our business hours, our SmarterMail v17.0.6878 is running at 45-60% CPU
(SmarterMail Enterprise, 250 users with around 130 currently in use, EAS+EWS add-ons). The operating system is Windows Server 2012 R2 on VMware ESXi 6.5.

I will keep an eye on CPU tomorrow during our business hours. At least we have not encountered performance or stability issues in v17 yet. (By the way: in the past with SM16 we had some CPU / IIS Worker Process utilization and performance issues, but these were solved several SM16 minor releases ago.)

Thomas
0
Chris Replied
This version 17 server has less than 25 users.

Our production version 15 servers have 10K users and run at 20% CPU
0
Thomas Lange Replied
Today during business hours I kept an eye on Process Monitor on our Windows 2012 R2 SmarterMail production server running v17.0.6878: the SmarterMail process CPU was between 45% and 75%, most of the time between 50% and 60%.
Since we have not received any complaints about performance yet, this could be normal.

But if I remember right, CPU usage was lower under the prior v16 of SmarterMail... probably 20-30%.

I have no idea why your smaller v17 server has such high CPU, and performance issues that push it up to 75% or 100% CPU should not be considered OK. Is there any kind of re-indexing of huge email accounts going on at your SM v17 server?
0
Chris Replied
We were running SM16 on this particular server before upgrading to the BETA. It is not busy at all and I never noticed any high CPU usage before. For some reason, it was running at 100% and the entire server was unresponsive, requiring a reboot. To my memory, I have never seen Smartermail do that before in my 15+ years using this product. So something weird is going on with this latest BETA build version. I will see what happens on the next patch.
0
Employee Replied
Employee Post
Hi guys,

In SmarterMail 17.x, we've updated the indexing format. On your initial upgrade to 17.x (from any earlier major release), ALL user accounts will be reindexed in order to utilize this new format. As the reindexing occurs, you will see a temporary spike in CPU. However, please note that a user's experience and the overall server performance should not be negatively impacted, as the indexing thread is prioritized lower than all other threads. In addition, users will still be able to search while this one-time bulk reindex occurs. This is because a user's account will use the old indexing format for searches if the reindexing has not yet been performed for their account. As soon as their account has finished reindexing, it will simply switch to the updated format.
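The per-account fallback described above can be pictured with a small sketch. This is not SmarterMail's code (the product is not written in Python, and its index format is not described in this thread); the `Account` class, its two index fields, and `reindex` are all hypothetical names used only to illustrate "search the old index until the new one is ready, then switch".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Account:
    """Hypothetical account with an old-format and a new-format search index."""
    name: str
    old_index: Dict[str, Set[int]] = field(default_factory=dict)
    # Stays None until the one-time reindex has finished for this account.
    new_index: Optional[Dict[str, Set[int]]] = None

    def search(self, term: str) -> Set[int]:
        # Use the new index once it exists; otherwise fall back to the old one,
        # so searching keeps working while the bulk reindex is still running.
        index = self.new_index if self.new_index is not None else self.old_index
        return index.get(term.lower(), set())


def reindex(account: Account) -> None:
    """Build the new index, then switch over with a single assignment."""
    rebuilt = {term.lower(): set(ids) for term, ids in account.old_index.items()}
    account.new_index = rebuilt
```

In this sketch an account that has not been reindexed yet still answers searches from `old_index`; after `reindex(account)` the same `search` call is served from `new_index` without the caller changing anything.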

This information should have been noted in our BETA threads previously, and I apologize that it was not. I've since added this information right next to the installer link for the BETA release announcement thread (since it doesn't impact just a singular build). The CPU spike is expected and will rectify itself once the reindexing is done. 

I hope this helps!
0
Chris Replied
Hi Andrea,

I actually started with the 17.0.6870 version, and then patched to 17.0.6878. So I think all the re-indexing should have been done a couple of weeks ago. Everything was running fine until the server became unresponsive yesterday. I checked and it was at 100% CPU, so I had to reboot it. It has been stable since, but it is still running at about 70% CPU now.

Now that you mention it, my inbox search is not really working.
0
Tim Uzzanti Replied
Employee Post
Can you look at the reports in SmarterMail and provide the chart of CPU usage before and after? We have a new build out today with more optimizations and fixes. Every build moving forward is small fixes and optimizations until release on Dec 3rd. If we see a dramatic jump for a long period of time, we want to profile your installation. Thanks for testing things out!
Tim Uzzanti CEO SmarterTools Inc. www.smartertools.com
0
Chris Replied
SM 16
10/20 0%
10/21 0%
10/22 0%
10/23 0%
10/24 0%

SM 17 BETA
10/25 38.46%
10/26 63.75%
10/27 63.83%
10/28 63.88%
10/29 63.92%

1
ram Replied
Any update for this issue?

I have the same issue and have not been able to resolve it.
0
Matt Petty Replied
Employee Post
Hello ram,

We've made some significant improvements to our CPU and Memory usage. You can go to
https://www.smartertools.com/smartermail/release-notes/current and search (Ctrl+f) for "efficiency" to see these.

Keep in mind that when you move to the new SmarterMail you might see some lingering high CPU, as we need to re-index the users on the box, and this is done on a low-priority CPU thread. So even though it is using CPU, other things in SmarterMail should get CPU priority.
Matt Petty Senior Software Developer SmarterTools Inc. www.smartertools.com
0
ram Replied
We have installed the latest version, 6948, and the CPU is still high.

Does re-indexing the users on the box take 4 weeks?
0
Tim Uzzanti Replied
Employee Post
No.  If you haven't already opened a ticket with support, please do so.  We can profile and see what might be using the CPU.
Tim Uzzanti CEO SmarterTools Inc. www.smartertools.com
0
echoDreamz Replied
Same for us. CPU stays low most of the time, but we do have spikes into the 90%+ range that last anywhere from a few minutes to 20+ minutes. However, it is significantly better with this build for us.
0
Matt Petty Replied
Employee Post
@echoDreamz,
I wasn't aware you were still getting those, I looked at your server all Friday morning and noticed a spike maybe every 5-10 minutes and lasting for a couple seconds (not 20+ minutes). Those spikes should have resolved ever since we applied the Calendar cleaning and all profiling I've gathered since reinforces this. Is there a period of time where this happens most?

EDIT (2:18): 20 minutes after posting, the highest spike I've seen is 47%. Averaging ~28% CPU. It usually floats anywhere between 20% and 35%.
EDIT (2:24): Saw a spike to 74% for 5 seconds; it settled back down to the average immediately after.
EDIT (2:39): Saw a spike to 88% for ~5-10 seconds; it settled back down to the average immediately after.
Matt Petty Senior Software Developer SmarterTools Inc. www.smartertools.com
0
echoDreamz Replied
Correct, it usually stays low, but we have random spikes. We had a few occurrences this weekend where the web interface stalled completely. It did come back, and since it was the weekend we didn't receive many complaints. It also happened a few times yesterday morning.

I know we have JetBrains development stuff installed on the server, if you give me the instructions for it, I will generate a dump etc. using it when the SM process is spiking.
0
echoDreamz Replied
I am also not sure if this is normal in SM 17 now, but with v16 the handle count for the SM process was usually <10k, sometimes going as high as almost 15k. Now we are seeing almost 30k handles. If it's normal, that's fine; I'm just making sure it is not pointing to something wrong.
0
Matt Petty Replied
Employee Post
We've changed a ton on the backend and things are happening quicker now. Previously, things could actually be slowed down because we had a lot of locking code. Locks are artificial stopping points for an application and usually get put onto things like domain settings, users' mailboxes, etc. They basically give one-at-a-time access to whatever we put them on. SM is way more efficient now and can handle multiple threads hitting resources at the same time with fewer locks. This could actually mean people would see MORE CPU usage, as we now do less waiting and more work. So I'd say it's normal to see the handles being a lot higher, as there are more things happening (especially in your case) and more pieces of data floating around.
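To make the "one-at-a-time access" point concrete, here is a minimal Python sketch of the difference between one broad lock and narrower per-resource locks. It is purely illustrative and is not SmarterMail's code: `CoarseStore`, `PerMailboxStore`, and `hammer` are hypothetical names, and because of Python's global interpreter lock the sketch shows the locking structure only, not a real throughput gain.

```python
import threading
from collections import defaultdict


class CoarseStore:
    """One lock guards every mailbox, so all writers queue behind each other."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = defaultdict(int)

    def deliver(self, mailbox):
        with self._lock:  # every delivery, to any mailbox, waits here
            self._counts[mailbox] += 1

    def count(self, mailbox):
        with self._lock:
            return self._counts[mailbox]


class PerMailboxStore:
    """One lock per mailbox, so writers only wait on the mailbox they touch."""

    def __init__(self, mailboxes):
        self._locks = {m: threading.Lock() for m in mailboxes}
        self._counts = {m: 0 for m in mailboxes}

    def deliver(self, mailbox):
        with self._locks[mailbox]:  # other mailboxes are not blocked
            self._counts[mailbox] += 1

    def count(self, mailbox):
        with self._locks[mailbox]:
            return self._counts[mailbox]


def hammer(store, mailboxes, per_thread=1000):
    """Run two writer threads per mailbox against the given store."""

    def worker(mailbox):
        for _ in range(per_thread):
            store.deliver(mailbox)

    threads = [
        threading.Thread(target=worker, args=(m,))
        for m in mailboxes
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Both stores end up with the same counts after `hammer(...)`; the difference is only in how much the threads have to wait on each other, which is the trade between idle waiting and visible CPU use described above.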

Also, I sent you those instructions.
Matt Petty Senior Software Developer SmarterTools Inc. www.smartertools.com
0
echoDreamz Replied
Sounds good Matt, yeah, I knew you guys removed a lot of the locks. As for the handles, that is absolutely fine as long as it is normal :)

Instructions received, will keep an eye out.
0
Alex Clarke Replied
Working with Rod (from ST) we've identified that the indexing service is causing high CPU load on our servers (Build 6955 (Jan 16, 2019)).

I doubt this is affecting all customers, but ST are aware and working on a fix.

Thanks for the continued help Rod!
0
Damián Dela Huerta Replied
I am having the exact same issue. We recently upgraded from 16.x to 17 (Build 6928) and see the same behavior Chris described in his original post: CPU averages around 5-10%, then, seemingly at random, instantly jumps to 70%, 80%, or 100% until we restart the service.
Twin Vision Studios, Inc.
1
Alex Clarke Replied
I've just replied to your other post about this.

Contact ST for help with identifying possible corrupt .json files - this could be the issue.
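As a rough first pass before (not instead of) contacting support, a generic script can list .json files that fail to parse at all. This is only a sketch under assumptions: the directory to scan is a placeholder you must supply, it should be run against a copy of the data, and a file that parses cleanly can still be "corrupt" in ways only ST can judge. `find_unparseable_json` is a hypothetical helper, not a SmarterTools utility.

```python
import json
from pathlib import Path


def find_unparseable_json(root):
    """Return (path, error message) pairs for .json files under root that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            # utf-8-sig tolerates an optional byte-order mark at the start of the file.
            with path.open("r", encoding="utf-8-sig") as fh:
                json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            # Truncated, empty, or wrongly encoded files all land here.
            bad.append((path, str(exc)))
    return bad
```

Usage would look like `for path, err in find_unparseable_json(ROOT): print(path, err)`, where `ROOT` is a placeholder for wherever your copy of the data lives.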
0
Damián Dela Huerta Replied
thanks Alex!!
Twin Vision Studios, Inc.