IOwait causing SmarterMail Linux to freeze
Problem reported by Rami El-Zein - Today at 2:03 PM
Submitted
Hello,
I’m experiencing an intermittent issue with SmarterMail that I haven’t been able to isolate, and I’d appreciate any guidance from the community.
I’m running the latest SmarterMail Enterprise (version 100.0.9560.29387 – 03/05/2026) with 1,770 users on a Linux-based AWS instance (16 vCPU, 32 GB RAM, and 13 TB gp3 storage).
Outgoing mail is routed through MailChannels.
System storage:
  • / (root): 48 GB total, 35% used
  • /data: 12 TB total, 92% used
The issue:
Once or twice per workday, IOwait suddenly spikes to 70–90% and the load average climbs as high as 50. At that point the server becomes unresponsive until I restart SmarterMail. It feels like a process or user action is getting out of control, but I haven’t been able to identify the root cause or determine which limits I should enforce to prevent it.
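For anyone who wants to compare numbers, here is a minimal sketch of how the IOwait percentage can be sampled directly from /proc/stat. It assumes the standard Linux field order on the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal). The function names are illustrative only and are not part of SmarterMail or any monitoring tool.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            return [int(x) for x in line.split()[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum the first 8 fields only: guest time is already counted in user time.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return iowait %."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Calling sample() in a loop and logging the result with a timestamp would make it possible to line the spikes up against the SmarterMail logs afterwards.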
Usage details:
  • Peak concurrent users: 700+ (webmail, IMAP, MAPI/EWS mix)
  • The issue can also occur at lower load (around half that number)
  • Some mailboxes are quite large (40–60 GB)
Storage tuning:
  • Increased gp3 IOPS from 3,000 → 6,000
  • Increased throughput from 125 MB/s → 375 MB/s
  • Despite this, the issue still occurred twice today
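A rough back-of-the-envelope check on these numbers, as a sketch only: dividing the throughput limit by the IOPS limit gives the average I/O size at which both limits are reached at the same time. Below that size the IOPS limit is hit first, so extra throughput would not help. The 16 KB I/O size is purely an assumption for illustration, not something I have measured, and the units are taken as written above (1 MB = 1000 KB).

```python
def crossover_io_size_kb(iops, throughput_mb_s):
    """Average I/O size (KB) at which the IOPS and throughput limits coincide."""
    return throughput_mb_s * 1000.0 / iops


def max_throughput_mb_s(iops, io_size_kb, throughput_cap_mb_s):
    """Throughput actually reachable for a given average I/O size."""
    return min(iops * io_size_kb / 1000.0, throughput_cap_mb_s)


# Volume limits before and after the tuning described above.
print(crossover_io_size_kb(3000, 125))   # old settings
print(crossover_io_size_kb(6000, 375))   # new settings

# Assumed 16 KB average I/O (hypothetical): the IOPS limit binds first.
print(max_throughput_mb_s(6000, 16, 375))
```

If the real workload is mostly small random I/O, the volume would saturate on IOPS long before reaching 375 MB/s, which might explain why the throughput increase made no visible difference.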
Today’s stats (Reports):
  • Bandwidth:
    • SMTP In: 55 GB
    • SMTP Out: 2.2 GB
    • IMAP: 86.8 GB
    • POP: 3.2 GB
  • Messages:
    • Inbound: 47K
    • Outbound: 7.7K
Sessions:
  • SMTP In:
    • New: 57K
    • Bad Commands: 31K
    • Terminations: 6.3K
  • SMTP Out:
    • New: 2.9K
    • Terminations: 93
  • IMAP:
    • New: 84.8K
    • Bad Commands: 1.1K
    • Terminations: 1.9K
  • POP:
    • New: 5K
    • Bad Commands: 2.2K
    • Terminations: 38
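To put the session counters in perspective, this small sketch computes bad commands per new session for each protocol, using the rounded figures from the report above. It is only a rough indicator: a single session can issue several bad commands, so the ratio is not the share of sessions that misbehaved.

```python
# Rounded counters copied from the Sessions report above.
sessions = {
    "SMTP In": {"new": 57_000, "bad_commands": 31_000},
    "IMAP": {"new": 84_800, "bad_commands": 1_100},
    "POP": {"new": 5_000, "bad_commands": 2_200},
}


def bad_command_ratio(stats):
    """Bad commands per new session, as a percentage."""
    return 100.0 * stats["bad_commands"] / stats["new"]


ratios = {proto: bad_command_ratio(s) for proto, s in sessions.items()}

# List protocols from highest to lowest ratio.
for proto, pct in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{proto}: {pct:.1f} bad commands per 100 new sessions")
```

SMTP In and POP stand out compared with IMAP, which is part of why I am asking about the IDS thresholds.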
Spam:
  • Total inbound spam: ~3.8K (seems reasonable)
IDS settings (current):
  • Bad SMTP Sessions (Fast): Block, 5 min / 10 threshold / 60 min block
  • Bad SMTP Sessions (Slow): Block, 60 min / 25 threshold / 360 min block
  • Bounces: Quarantine, 5 min / 10 threshold / 30 min block
  • DoS: Block, 2 min / 100 threshold / 30 min block
  • Internal Spammer: Block, 10 min / 100 threshold / 60 min block
  • Password Brute Force/IP: Block, 5 min / 200 threshold / 30 min block
  • Password Retrieval Brute: Block, 5 min / 50 threshold / 30 min block
Other notes:
  • Max 3 concurrent migrations allowed (and rarely reached)
  • Indexing settings:
    • Max threads: 5
    • Items per pass: 100
    • Queue delay: 30 seconds
Questions:
  • Has anyone encountered similar IOwait spikes tied to SmarterMail?
  • Could this be related to indexing, large mailboxes, or IMAP behavior?
  • Are there recommended limits (connections, indexing, mailbox size, etc.) to prevent this type of resource spike?
  • Do my IDS thresholds look reasonable, or should they be more aggressive?
Any suggestions on where to start troubleshooting would be greatly appreciated.
Thanks in advance.
