3
Delay unrecognized domains - "Enable spool proc folder" - anti-spam idea?
Question asked by Colin M - 4/22/2015 at 10:36 AM
Answered
I've noticed a spammer tactic that has been working rather well lately: sending spam from new domains before they can be blacklisted. I often receive spam that doesn't hit any blacklists, and by the time I check the blacklists manually the domain is listed! The timestamps confirm that the email was received before the domain was blacklisted.
 
So, what I'd like to do is write a program that works similarly to greylisting, but only locally. That is, for any new domain (e.g. one first seen within the last 15 minutes), delay the email by, say, 15 minutes to give the domain time to be blacklisted *before* any spam checks are applied.
 
I see that SmarterMail has an "Enable spool proc folder" feature, but it is not clear how it works. Does the spool proc folder apply before or after other filters? For example, I have a remote SpamAssassin server that does most of my spam filtering, so I would like the message to be checked by SpamAssassin *after* I move it back into the spool. Is this how it works, or is there another way to accomplish this?
 
Note, I have greylisting enabled already. I'm not positive, but I think the spammers are bypassing it by sending multiple duplicate emails in quick succession, since these spam messages often arrive in duplicates.
 
Thanks,
Colin

12 Replies

0
Scarab Replied
SmarterMail antispam checks are done before the incoming message is moved to the Spool Proc folder. They are done first so that SmarterMail can block the message if it exceeds your Incoming Weight Threshold for those spam checks that are enabled for Incoming SMTP Blocking. After the antispam checks, it moves the message to the Spool Proc folder for additional processing by Declude, MailSniffer, or other third-party applications.
 
To be honest, I can't recall whether incoming messages are sent to the external SpamAssassin server before they are moved to the Spool Proc folder or after they are returned to the primary Spool. I'll have to double-check. If it is after, you can disable the antispam check in SmarterMail and enable it on your external SpamAssassin server. You could then enable your Spool Proc folder, have a Scheduled Task fire a script at 1-minute intervals that moves messages older than XX minutes back to the Spool, and let your external SpamAssassin server do the antispam checks.
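The Scheduled Task script described above could be sketched in Node.js roughly as follows. This is only an illustration under stated assumptions: the function names and both directory arguments are hypothetical, each file is treated independently (a message may consist of more than one file, and this sketch does not pair them), and the file's modification time is used as its age.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Return the names of entries whose age is at least minAgeMs.
// `entries` is an array of { name, mtimeMs } objects.
function selectExpired(entries, nowMs, minAgeMs) {
  return entries
    .filter(function (e) { return nowMs - e.mtimeMs >= minAgeMs; })
    .map(function (e) { return e.name; });
}

// Move every expired file from procDir back to spoolDir.
// Both paths are placeholders supplied by the caller.
function releaseExpired(procDir, spoolDir, minAgeMs) {
  const entries = fs.readdirSync(procDir).map(function (name) {
    const stat = fs.statSync(path.join(procDir, name));
    return { name: name, mtimeMs: stat.mtimeMs };
  });
  const expired = selectExpired(entries, Date.now(), minAgeMs);
  expired.forEach(function (name) {
    fs.renameSync(path.join(procDir, name), path.join(spoolDir, name));
  });
  return expired;
}
```

Run once per minute, a call such as releaseExpired(procDir, spoolDir, 15 * 60 * 1000) would release anything that has waited 15 minutes or longer.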
0
Colin M Replied
Thanks for the reply, Scarab. I tested this by enabling the feature briefly and disabling it after an email popped up. As you say, the .hdr file had the headers for the Incoming SMTP filters but not for SpamAssassin, since it is not an Incoming SMTP filter. So I think my idea will work with the current functionality. I'm not sure of the best way to implement it, not being a .NET programmer... Hoping to find a Node.js plugin that adds inotify-like functionality on Windows.
0
Colin M Replied
Marked As Answer
I've implemented the "local greylisting" feature as described. All email for unrecognized domains is delayed for 15 minutes before the email is processed using the "Enable spool proc folder" option. Email that should not be delayed is only delayed by 100ms (to allow for possible filesystem delays).
 
The program is implemented in Node.js and uses the fs.watch API to detect new files immediately (rather than polling). A local persistent database tracks the domains. Due to Node.js's event-based nature it is extremely resource-efficient. The script can install itself as a service, or you can run it like any other Node process for testing; errors are logged to a logfile. All existing files are processed when the service is restarted.
https://gist.github.com/colinmollenhour/5841afe111b13dc9c648
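The delay decision itself can be illustrated with a small sketch. This is not the gist's code: the function and constant names are hypothetical, an in-memory Map stands in for the persistent database, and "unrecognized" is read the way the original question phrases it (a domain first seen within the last 15 minutes).

```javascript
'use strict';

const NEW_DOMAIN_DELAY_MS = 15 * 60 * 1000; // hold time for unrecognized domains
const KNOWN_DOMAIN_DELAY_MS = 100;          // short grace period for filesystem delays

// `firstSeen` maps a lowercased domain to the time (ms) it was first observed.
// Returns how long a message from `domain` should be held.
function delayFor(domain, firstSeen, nowMs) {
  const key = domain.toLowerCase();
  if (!firstSeen.has(key)) {
    firstSeen.set(key, nowMs); // remember the first sighting
    return NEW_DOMAIN_DELAY_MS;
  }
  const age = nowMs - firstSeen.get(key);
  return age < NEW_DOMAIN_DELAY_MS ? NEW_DOMAIN_DELAY_MS : KNOWN_DOMAIN_DELAY_MS;
}
```

A caller would typically pass the returned value to setTimeout and move the message back to the spool when the timer fires.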
0
Dave Lerner Replied
Hi Colin,

Dude, *EXCELLENT* job with this. I've implemented this on my server and it's working fantastically. You just have to install Node.js and follow your directions to install it as a service.

I will point out, however, that the code is not compatible with subspools. Rather than modify your code (when I have time I'll do that and post it... I don't know Node.js that well right now), I converted my server back to no subspools and it's all working fine.

I'm also curious: when you say "unrecognized domains", do you mean ones that don't match any of these?
secondLevelDomain.match(/\.uk$/) ||
secondLevelDomain.match(/^(com?|org|net|gov|edu|biz)\.[a-z]{2}$/) ||
secondLevelDomain.match(/^[a-z]{2}\.(gov|us)$/)
I'm asking because I've seen some .com domains delayed and I'm wondering about that logic. Anyway, I'll dig through the code some more; I'm sure the answer is right there.

Thanks for the post and the code!
steve
0
Colin M Replied
Thanks, Dave/steve. I'm glad someone found it useful; it sure made a huge difference for me with all of the register-and-throw-away domain spam.

I don't know at what point multiple spools become beneficial, but I'm guessing not until you reach something on the order of thousands of emails per second. In my observation the spool operates so fast that there are never more than a few files in the single spool on my server, which has over 200 users. So while you could modify the script to work with multiple spools, I really doubt there is any benefit. I hope someone corrects me if I'm wrong on this point.

An "unrecognized domain" is any second-level domain that the system has never seen before, e.g. "example.com". Some work is done to ensure that suffixes like ".co.uk" are handled properly (see lines 210-212). I think it would not be very effective if it operated only on TLDs.
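One plausible reading of the patterns quoted earlier in the thread is that they only decide how many labels of the sender's hostname to keep, so a .com domain is delayed simply because it has not been seen before. A minimal sketch of that reading follows; the function name is hypothetical and the gist's actual logic may differ.

```javascript
'use strict';

// Reduce a hostname to the domain that would be tracked.
// The three patterns are the ones quoted in this thread; when the last two
// labels look like a compound suffix (e.g. "co.uk"), a third label is kept.
function trackedDomain(hostname) {
  const labels = hostname.toLowerCase().split('.');
  if (labels.length <= 2) {
    return labels.join('.');
  }
  const lastTwo = labels.slice(-2).join('.');
  const isCompoundSuffix =
    /\.uk$/.test(lastTwo) ||
    /^(com?|org|net|gov|edu|biz)\.[a-z]{2}$/.test(lastTwo) ||
    /^[a-z]{2}\.(gov|us)$/.test(lastTwo);
  return labels.slice(isCompoundSuffix ? -3 : -2).join('.');
}
```

Under these patterns anything ending in ".uk" keeps three labels, so "www.example.uk" would be tracked as "www.example.uk" rather than "example.uk".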

Just an FYI: my "domains.db" file now holds over 173,000 records and is about 10MB. The Node.js process uses ~250MB of memory and was using over 400MB before I restarted it. That's a lot of domains. There should probably be a mechanism to clean up domains that haven't been seen in a long time, to keep the database from growing without bound.
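A cleanup pass of the kind suggested above might look like the sketch below. It assumes the store records a last-seen timestamp per domain, which the thread does not confirm; the function name and the Map-based store are hypothetical stand-ins for the real database.

```javascript
'use strict';

// Remove every domain whose last sighting is older than maxAgeMs.
// `lastSeen` maps domain -> timestamp (ms). Returns the number removed.
function pruneStaleDomains(lastSeen, nowMs, maxAgeMs) {
  let removed = 0;
  for (const [domain, timestamp] of lastSeen) {
    if (nowMs - timestamp > maxAgeMs) {
      lastSeen.delete(domain); // deleting during Map iteration is safe
      removed += 1;
    }
  }
  return removed;
}
```

A pruned domain would be treated as unrecognized again the next time it appears, so the maximum age trades database size against occasional extra delays for rarely seen senders.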
0
Dave Lerner Replied
Hi, Dave is the owner, I'm just a contractor using his account. In any case, yes, that code is awesome. What a great idea you came up with!
I've actually been modifying it all night and have created a new watcher for another SmarterMail project. Really, fantastic idea!

The multiple spool thing is their way of overcoming the Windows limit on the number of files in a folder, but I think you have to hit some ridiculous number to even bump up against that. And if you have that many in your spool, you have big problems anyway. This particular installation is for an ISP with thousands of users, so the spool is pretty busy, and I've seen it fill to several hundred files in just a few minutes. But still, it's not going to hit the folder limit. I reconfigured the server to use a single spool, so your code works perfectly now.

I'm not too worried, yet, lol, about the size of the thing; this is running on a hosted server with tons of resources. But agreed, I will have to put it on my to-do list to write some cleanup code... maybe next year.
0
Dave Lerner Replied
Hi Colin,
I'm still using this Node.js scheme and it's working perfectly. However, I also have a second mail server, a free SmarterMail one that was configured as a backup MX and is now configured for domain forwarding in SM gateway mode. In any case, spammers have recently discovered this second server. I have the antispam set up the same way as on the primary, without Bayesian of course. But without the spam delayer code running, it is subject to the same issues you describe above.
So, I set the whole thing up on that server and couldn't understand why it was simply moving all the messages straight back into the queue. The answer was that on the backup MX box the SM configuration has no users or domains, so the .hdr files all have this as the last line:
containsLocalDeliveries: False
Which causes a failure in our code here:
            if (lines.length > 5 &&
                lines[1].match(/@/) &&
                lines.some(function(line){ return line == 'containsLocalDeliveries: True'; })
....which in turn causes the code to write the files immediately back into the spool.
I modified that bit so it looks like:
            if (lines.length > 5 &&
                lines[1].match(/@/) &&
                // prefix match, so both "True" and "False" qualify
                lines.some(function(line){ return line.indexOf('containsLocalDeliveries:') === 0; })
..and it works fine.
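For reference, here is a self-contained sketch of a header check that accepts either value of the flag by matching on the key prefix. The function name is hypothetical, and the only facts assumed about the .hdr layout are the ones visible in the snippet above: more than five lines, an address on the second line, and a containsLocalDeliveries line.

```javascript
'use strict';

// Decide whether a .hdr file describes a message the delayer should handle.
// Matches the key prefix so that both "True" and "False" qualify.
function isDelayCandidate(hdrText) {
  const lines = hdrText.split(/\r?\n/);
  return lines.length > 5 &&
    /@/.test(lines[1]) &&
    lines.some(function (line) {
      return line.indexOf('containsLocalDeliveries:') === 0;
    });
}
```

On a server that also relays outgoing mail, accepting "False" may delay more than intended, so whether this relaxation is safe depends on the setup.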
 
Just curious about one other thing... did you ever figure out the order of processing in SM? I mean, @Scarab says the antispam (and presumably the RBL) checks are done prior to moving the message into the proc folder. If that's the case, why would delaying the mail make any difference? How would you force SM to "redo" the spam checks?
 
Thanks!
steve
0
Dave Lerner Replied
I guess if you are using a SpamAssassin server this scheme makes more sense, since that is doing its own thing out of band with SmarterMail. I suppose a feature request to SM would be a setting that implements your logic during the RBL process, where admins could configure the delay period, disable it, etc.
0
Colin M Replied
Yes, I am using a separate SpamAssassin server. I don't know exactly which tests occur before and after the spool, but the external SpamAssassin check definitely happens after.

Regarding the 'containsLocalDeliveries: True' check, it has been a while, so I don't remember exactly what I was thinking, but I believe I thought the spool might be used for outgoing email too, and I only wanted to apply the delay to incoming email. Can you confirm that 'containsLocalDeliveries' is not related to outgoing email for non-backup servers?

I haven't had much luck with SmarterTools and feature requests so I personally won't bother with this one, but feel free to present it to them if you wish. I know it's done wonders for my spam filtering with all of these throw-away domains spammers use these days.
0
keith dovale Replied
Hi,
 
We use Declude, which gets the mails from the proc folder. I would like to use this, but I see it will most likely create an issue, as Declude and the delayer will both try to grab the files from the proc folder. Is there a way to get SM to deliver to a folder like "delay", have the delay program scan the files, and then, once the period is over, return them to the proc folder, where Declude can run its own checks and return them to the spool folder?
 
It would be nice to be able to use both.
0
Colin M Replied
That would all depend on whether Declude is configurable, since I don't think SmarterMail is in that regard, except for the "use multiple spool directories" option. I don't know anything about Declude, though. If it works only when configured one way, then you could "misconfigure" it and use the delayer as the intermediate step.
0
Colin M Replied
FYI to anyone still using this: somewhere between SmarterMail 15.0 and 15.7 the .hdr files started receiving a UTF-8 BOM, which broke the string matching used in the script. The script has been updated on the gist with a fix.
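To illustrate the kind of breakage involved: Node.js keeps a leading UTF-8 BOM as the character U+FEFF when decoding a file as 'utf8', so exact comparisons against the first line stop matching. The sketch below shows one common way to handle it; the function name is hypothetical and this is not necessarily how the gist was fixed.

```javascript
'use strict';

// Remove a leading byte order mark (U+FEFF) if one is present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

// The bytes EF BB BF followed by "a", decoded the way readFileSync(..., 'utf8') would.
const decoded = Buffer.from([0xEF, 0xBB, 0xBF, 0x61]).toString('utf8');
const matchesBefore = decoded === 'a';           // false: the BOM is still attached
const matchesAfter = stripBom(decoded) === 'a';  // true once the BOM is removed
```

Applying stripBom to the file contents before splitting into lines keeps the rest of the string matching unchanged.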
