Help needed with I/O Wait getting out of control
Problem reported by Rami El-Zein - 3/26/2026 at 2:12 AM
Submitted
Hello, I would appreciate your suggestions on how to resolve high I/O wait that causes SmarterMail to freeze and require a restart at least once or twice a day.
I’m running the latest Linux build on AWS with 16 cores and 32GB of RAM. The root partition is at 33% usage. The server hosts roughly 1800 email accounts, some exceeding 50GB each. Total storage is 11TB, with 3TB on SSD and 8TB on slower HDD.
Within SmarterMail, I’ve configured emails older than 90 days to be moved to slower storage. On a typical day, there are 250 IMAP connections and 100 MAPI connections, with the rest using webmail.
Current settings:
  • Max IMAP connections: 500
  • Max IMAP retrieval threads: 5
  • IMAP retrieval interval: 10 minutes
Outgoing mail is handled by MailChannels.
The I/O wait spikes normally occur once or twice daily. However, yesterday the issue worsened when two users with ~70GB mailboxes started Outlook archiving to reduce their mailbox size. Since then, SmarterMail has needed a restart every 10 minutes.
What configuration changes or best practices would you recommend to mitigate this issue? Moving all storage to SSD is not financially viable.
Thank you in advance.
Douglas Foster Replied
I have a much smaller configuration supporting more users, but none of our users have such large mailboxes.

Before buying SmarterMail, we had some executives on an external Hosted Exchange domain, and we have never felt ready to move them off it. We have even moved a few users onto that environment when their Outlook performance became unacceptable.

However, those problems were always user-level issues, not system-level. Other posts in this forum suggest that the latest release is sensitive to dirty data hidden somewhere in your environment. You should engage support to get the dirty data found and fixed. (Or, if the problem is a bug, to get the bug found and fixed in the next release.)

Rami El-Zein Replied
Thanks Douglas. We decided to upgrade /data2, which was the slower HDD, to gp3 SSD storage, and we will see if that works out on Monday when the load is at its highest. I'll post an update with the results.
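
To compare the I/O wait before and after a storage change like this, a minimal Python sketch along the following lines could log the iowait share from /proc/stat on the Linux host. The function names and the 5-second sampling interval are illustrative assumptions, not anything provided by SmarterMail or AWS.

```python
import time


def read_cpu_times(stat_text):
    """Return the aggregate CPU counters from /proc/stat contents as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are cpu0, cpu1, ...
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already included in user and nice, so sum only the first 8.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample(interval_s=5.0, path="/proc/stat"):
    """Take two readings interval_s seconds apart and return the iowait percentage."""
    with open(path) as f:
        before = read_cpu_times(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = read_cpu_times(f.read())
    return iowait_percent(before, after)
```

Calling sample() on a schedule and writing the result to a log with a timestamp would give a rough before/after picture; it measures the whole host, so it cannot say which volume or process is responsible.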
