Some thoughts on obtaining success with email filtering, with references to Declude as appropriate.
Everything is about identifying and blocking malicious sources.
If you are having good results with Declude alone, it must be because of the many RBLs that it can use. Declude has no proprietary knowledge of its own.
A malicious source can be a server organization, the mail system operator (SMTP domain), or the end user (From domain or full From address).
If the problem is the server, you want to block all servers from that organization. You don't know all of the IP addresses used by the attacker, but you can use a block rule based on "host name ends with <value>" as well as "SourceIP=<value>". Spammers don't bother to change host names as often as I was told to expect. Declude has built-in capabilities to filter on Reverse DNS name, but it does nothing with HELO. I have used the HELO string from the HDR file, in custom scripts, to offset this deficiency.
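A minimal Python sketch of the "host name ends with" and "SourceIP" checks described above. This is not Declude syntax and not my actual script; the list contents, the function names, and the choice to apply the same suffix test to both the Reverse DNS name and the HELO string are assumptions made for illustration.

```python
import ipaddress

# Hypothetical block lists; real entries would come from your own rule set.
BLOCKED_HOST_SUFFIXES = ["badhost.example", "spam-farm.example"]
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def host_ends_with(host, suffix):
    """True if host equals suffix or is a subdomain of it.

    The match is aligned on a label boundary, so "notbadhost.example"
    does not match the suffix "badhost.example".
    """
    host = host.lower().rstrip(".")
    suffix = suffix.lower().strip(".")
    return host == suffix or host.endswith("." + suffix)


def is_blocked_source(source_ip, rdns_name, helo_name):
    """Check the connecting IP, then the Reverse DNS and HELO names."""
    ip = ipaddress.ip_address(source_ip)
    if any(ip in net for net in BLOCKED_NETWORKS):
        return True
    for name in (rdns_name, helo_name):
        if name and any(host_ends_with(name, s) for s in BLOCKED_HOST_SUFFIXES):
            return True
    return False
```

Matching on a label boundary matters: a plain "ends with" string test would also block unrelated domains that merely share trailing characters with a blocked name.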
If the attack comes from a malicious mail system owner using a legitimate server platform, then you block on MAILFROM, the SMTP address. Declude does this easily.
A significant number of attacks come from legitimate email service providers who have unethical client organizations. For these, you need to be able to block on the From address. Declude has no specific support for filtering on the From address, but again I extract the From address from the HDR file using a custom script. Parsing an email address out of the message file can be very complicated. SmarterMail does a pretty good job, but if it gets confused, it punts by repeating the SMTP MailFrom address on the HDR file line that is supposed to hold the From address. I have written some custom code to parse the file myself, to correct specific scenarios where SmarterMail gets confused, but there are deficiencies in my code as well. I have reported the known parsing failures to SmarterTools, and they are in the queue for future bug fixes.
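To give a feel for the parsing step, here is a rough Python sketch that pulls the From address out of a raw message using the standard library. It reads the message headers directly rather than the SmarterMail HDR file, whose layout is not shown here, and the function name and the fallback behavior are assumptions for illustration. It will not handle every malformed header either.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_from_address(raw_message, smtp_mail_from):
    """Return (address, parsed_ok) for the message's From header.

    If no usable address can be parsed, fall back to the SMTP MAIL FROM
    address and report parsed_ok=False, so the caller knows the value
    is a stand-in and not the real From address.
    """
    msg = message_from_string(raw_message)
    header_value = str(msg.get("From", ""))
    _display_name, address = parseaddr(header_value)
    address = address.strip().lower()
    if "@" in address:
        return address, True
    return smtp_mail_from.strip().lower(), False
```

Returning the flag alongside the address keeps the fallback visible, instead of silently repeating the MailFrom value as if it were the From address.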
One email service provider has a particularly difficult client mix. Some of its clients are sending critical traffic like password resets, and others are sending fake bank account messages. They seem to be doing better at client control in the last 6 months, but I still have them on a short leash: known-good client domains are allowed, known-bad client domains are blocked, and uncategorized client domains are quarantined. Not many products, at any price, can give you this much control.
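The short-leash policy boils down to a three-way decision on the client's From domain. A minimal Python sketch, with made-up domain names and a hypothetical function name:

```python
# Hypothetical client domains seen behind one email service provider.
KNOWN_GOOD = {"payroll-client.example"}
KNOWN_BAD = {"fake-bank.example"}


def disposition(from_address):
    """Return "block", "allow", or "quarantine" for a From address."""
    # Everything after the last "@" is treated as the domain.
    domain = from_address.rpartition("@")[2].strip().lower()
    if domain in KNOWN_BAD:
        return "block"
    if domain in KNOWN_GOOD:
        return "allow"
    # Anything not yet categorized is held for review.
    return "quarantine"
```

The bad list is checked first, so a domain that ends up on both lists by mistake is blocked rather than allowed.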
For content filtering, I rely on a commercial product. For sender authentication, I use custom code to implement a better version of SPF and a crude approximation of DMARC. In both cases, the real goal is to use these checks to identify malicious senders, then add another rule to block that identifier.
Consider that if you identify a malicious organization, you ideally want to block all references to that domain name, including the server HELO name, the server Reverse DNS name, the SMTP address, and the From address. If Declude could parse URLs in the message body, I would also have it check those URLs for blocked domain names. Someone could probably build a custom script to parse the message body for URLs, but that is beyond my skill set, so I leave URL filtering to the commercial spam filter.
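For anyone who wants to try the URL idea, a rough starting point in Python might look like the following. It is only a sketch under generous assumptions: the body text has already been MIME-decoded (no base64 or quoted-printable handling here), only http and https links are considered, and the function name and regular expression are my own inventions, not part of any product.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple pattern: http/https up to whitespace, quote, or bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def blocked_url_hosts(body_text, blocked_domains):
    """Return the set of URL host names that fall under a blocked domain."""
    hits = set()
    for url in URL_PATTERN.findall(body_text):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal); skip it.
            continue
        for domain in blocked_domains:
            domain = domain.lower()
            if host == domain or host.endswith("." + domain):
                hits.add(host)
    return hits
```

A real message body brings obfuscated links, redirectors, and URL shorteners, which is a big part of why a commercial content filter is still the better tool for this job.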
As I was implementing Declude and building my rule set, the text files quickly grew to thousands of entries, and the number of text files was steadily growing as well. That's why I moved metadata into SQL, so that some of my filtering could be done with indexed tables instead of text files, especially since every text file is re-parsed for every message.
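To illustrate the indexed-table idea, here is a small Python sketch using SQLite. The database engine, table name, and column names are assumptions chosen for the example and do not describe my actual schema.

```python
import sqlite3


def open_rules_db(path=":memory:"):
    """Open (or create) a rules database with one indexed lookup table."""
    conn = sqlite3.connect(path)
    # The composite primary key gives an index on (kind, value),
    # so each lookup is an index probe rather than a scan of every rule.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocked_identifier ("
        " kind TEXT NOT NULL,"    # e.g. 'helo', 'rdns', 'mailfrom', 'from_domain'
        " value TEXT NOT NULL,"
        " PRIMARY KEY (kind, value))"
    )
    return conn


def add_rule(conn, kind, value):
    """Store one blocked identifier, ignoring duplicates."""
    conn.execute(
        "INSERT OR IGNORE INTO blocked_identifier (kind, value) VALUES (?, ?)",
        (kind, value.lower()),
    )
    conn.commit()


def is_blocked(conn, kind, value):
    """True if this exact identifier is on the block list."""
    row = conn.execute(
        "SELECT 1 FROM blocked_identifier WHERE kind = ? AND value = ?",
        (kind, value.lower()),
    ).fetchone()
    return row is not None
```

A custom script would call something like `is_blocked(conn, "from_domain", domain)` once per message, instead of re-reading a growing text file each time.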
The beauty of Declude is its customization.