Weird issue with DBLs
Question asked by Andrew Stein - August 28, 2014 at 2:36 PM
I'm using SM 12 on a relatively low-volume server, with my own in-house DNS servers that do not use public DNS servers as forwarders.
The problem is that every few hours the Spamhaus lists stop responding to my queries unless I go in and clear the cache on my DNS server. This is not happening with any other blacklists, including Barracuda, SORBS, etc. My first thought was that they were blocking my IP from doing queries, but if that were the case, clearing the cache on the DNS server shouldn't fix it. Has anyone run into a similar issue, and any thoughts on how to fix it permanently? I don't want to have to remember to clear the cache every few hours, or even run a script, as I suspect that is just a band-aid and not a real solution.

Unless you have a subscription to the RBL service, you are being limited to between 100K and 200K queries per day to that RBL listing service.
If you are using anything other than a PRIVATE DNS server, one that belongs to you and is run by you, then those DNS servers can easily generate millions of RBL queries per day.
Once the number of queries per day from any DNS server's IP address exceeds the threshold set by an RBL, the RBL begins to return weird responses. These are sometimes "non-responses", where the RBL simply does not answer and the query hangs. In other cases, a false positive response is returned and legitimate e-mail is bounced.
SUMMARY:  To prevent these issues:
  • Run your own DNS and query the RBL via your DNS servers
  • Never enable DNS caching in SmarterMail - always allow SmarterMail to request a fresh DNS query.
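For anyone debugging this by hand, here is a minimal Python sketch of how a DNSBL lookup is formed and how the three outcomes described above (listed, not listed, no response) can be told apart. The zone `dnsbl.example.org` and both function names are placeholders of mine, not anything from SmarterMail or a particular RBL. It assumes the usual DNSBL convention that a listed address answers with an A record in 127.0.0.0/8 and an unlisted one returns NXDOMAIN (no answers).

```python
import ipaddress

# By DNSBL convention, "listed" answers are A records inside 127.0.0.0/8.
LISTED_RANGE = ipaddress.IPv4Network("127.0.0.0/8")


def dnsbl_query_name(ip, zone):
    """Build the name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


def classify_answer(answers, timed_out=False):
    """Classify the result of one DNSBL query.

    answers   -- list of A-record strings (empty for NXDOMAIN)
    timed_out -- True when the query hung and no reply arrived
    """
    if timed_out:
        return "no-response"
    if not answers:
        return "not-listed"
    if all(ipaddress.IPv4Address(a) in LISTED_RANGE for a in answers):
        return "listed"
    # Anything outside 127.0.0.0/8 is not a normal DNSBL reply.
    return "unexpected"


# Hypothetical zone name, for illustration only.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# 99.2.0.192.dnsbl.example.org
```

Running the resulting name through nslookup against your own DNS server shows which of the three cases you are actually hitting.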
What DNS resolver are you using? When you say you clear your DNS cache, do you mean you have stale records in your DNS resolver's cache? Don't use the SmarterMail DNS cache option.
Sounds like you may need to force your DNS resolver to honor TTL and/or force a maximum TTL.  If it's not honoring those settings then you have a corrupt DNS install.
I recommend you use the DNS server on the SmarterMail server itself as the resolver.
SM is installed on Windows 2012 Essentials. I'm running the Windows DNS server, and in SM I left the primary and secondary DNS server settings blank so that it will use Windows. The Windows DNS server itself is using the root hints and has no forwarders configured.

Every couple of hours, the average resolve time for Spamhaus goes from ~500 ms to ~150,000 ms and counting. When that happens, queries via nslookup time out. I then fire up the Windows DNS MMC, right-click Cached Lookups, and select Clear Cache, after which it immediately starts resolving again. This is the only domain name this happens to; everything else, including all my other RBLs, resolves fine.
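To catch the moment the resolve time jumps, a small probe along these lines could log how long a single lookup takes. This is only a sketch: `timed_lookup` is a name I made up, and it assumes the machine's system resolver is the Windows DNS server in question. Note that `socket.gethostbyname` has no timeout argument, so a hung query blocks until the OS resolver gives up.

```python
import socket
import time


def timed_lookup(name, resolve=socket.gethostbyname):
    """Resolve `name` once and return (elapsed_ms, result).

    `result` is the address string on success, or the exception raised
    on failure (an NXDOMAIN also surfaces as an exception here).
    """
    start = time.monotonic()
    try:
        result = resolve(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result = exc
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return elapsed_ms, result
```

A fast failure (NXDOMAIN in a few hundred ms) is healthy for an RBL query; it is the slow failures that match the symptom described above.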
My backup MX server, which is set up similarly (albeit on Windows 2003), doesn't seem to have this issue. I'll update this later today and report my results.
I haven't rebooted my server in a while, so I'll try that first. If that doesn't yield results, I'm going to experiment and tell my primary SM to use the other Windows 2003 DNS server and see if it has the same issue.
Rebooting didn't seem to help. I just told SM to use the other Windows server as its primary DNS server and we'll see what happens.
Quick update. Apparently Windows 2011 and 2012 Essentials automatically populate your gateway as a DNS forwarder from time to time. I suspect my problems start occurring shortly after this happens. I removed the forwarder and will see if the problem starts to occur when it reappears. If so, I'll have to run a script periodically that removes all forwarders. What a pain.
Well, that was a bust. Spamhaus stopped resolving and my gateway had not yet reappeared as a forwarder. I configured SM to use the other server for DNS. Unfortunately, the other server is on the far side of a VPN and the router it is using is more limited in the number of connections per device. DNS stopped responding altogether within minutes, so I had to back out of it.
I'm setting up a scheduled task to clear the Windows DNS Server's cache every 2 hours, but that is just a band-aid.   The problem has to be either with my DNS server settings or possibly a firewall issue.   If I figure out a permanent solution, I'll update the thread, but any more suggestions would be appreciated.
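In case it helps anyone with the same stopgap, here is a rough sketch of a slightly smarter version of that task: rather than clearing the cache on a fixed schedule, it clears it only when a probe lookup has been slow. It assumes `dnscmd` (the Windows DNS Server command-line tool) is on the PATH and that the script runs with admin rights on the DNS server. The function name and the 5000 ms threshold are my own guesses, picked to sit between the ~500 ms normal and ~150,000 ms failing figures reported above.

```python
import subprocess

# dnscmd ships with the Windows DNS Server tools; /clearcache flushes
# the server's cached lookups (same effect as Clear Cache in the MMC).
CLEAR_CACHE_CMD = ["dnscmd", "/clearcache"]


def clear_cache_if_slow(elapsed_ms, threshold_ms=5000.0, run=subprocess.run):
    """Clear the DNS server cache only if a probe took longer than threshold_ms.

    Returns True when the clear command was issued, False otherwise.
    `run` is injectable so the logic can be exercised without Windows.
    """
    if elapsed_ms <= threshold_ms:
        return False
    run(CLEAR_CACHE_CMD, check=True)  # raises CalledProcessError on failure
    return True
```

Logging each time it fires would also give a timeline to compare against the DNS server's own event log, which may help narrow down the real cause.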
This may be a last resort, but for testing purposes you could subscribe to a DNS hosting service. If it works well then you could keep it, or hopefully you'll figure out the issue with Windows 2012.

