Discussion: The Uses and Misuses of Sender Authentication (SPF, DKIM, DMARC)
Idea shared by Douglas Foster - 4/23/2020 at 10:35 PM
Proposed

Sender Authentication is not for Blocking Fraud

Sender Authentication is usually assumed to be a tool for blocking messages, because sender authentication failures are assumed to indicate fraud.   My experience is that this assumption is completely false.

I have tried to block email based on SPF FAIL on multiple occasions, and it has been a complete failure which had to be quickly rolled back.   There were two significant problems:

  • SPF failures had negative predictive value for fraud.   Most or all of the failures were legitimate messages that we wanted to receive.
  • The spam filter platforms failed to provide a suitable exception mechanism for correcting false positives.

My work with DKIM indicates similar problems:   Not all messages are signed, some signatures will not validate for reasons that are impossible to determine, and some signatures are extraneous or redundant.    In one example, a forwarded message had two DKIM signatures applied by the forwarder:  one signature was applied as the message entered the forwarding process but was later invalidated by message changes, then a second signature was applied for the same domain, at the end of forwarding, which passed validation when it was received.   IETF says that an invalid signature is equivalent to no signature at all, so a failed signature tells us nothing.
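
The "invalid equals absent" rule can be sketched in a few lines. This is only an illustration, not a DKIM verifier: the `DkimSignature` type, the `authenticated_domains` helper, and the `forwarder.example` domain are hypothetical names, and the `valid` flag is assumed to come from a real verification step performed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DkimSignature:
    domain: str  # the d= domain named in the signature
    valid: bool  # outcome of cryptographic verification (done elsewhere)


def authenticated_domains(signatures):
    """Return the domains that have at least one valid signature.

    Invalid signatures are dropped, as if they had never been present,
    so a broken signature neither helps nor hurts the message.
    """
    return {sig.domain.lower() for sig in signatures if sig.valid}


# The forwarded message described above: two signatures for the same
# domain, the first broken by later changes, the second still valid.
forwarded = [
    DkimSignature("forwarder.example", valid=False),
    DkimSignature("forwarder.example", valid=True),
]
result = authenticated_domains(forwarded)
```

Under these assumptions the forwarded message counts as signed by the forwarder, and a message carrying only the broken signature is treated the same as an unsigned one.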

DMARC is dependent on these key technologies, and therefore not likely to fare much better.   Since I knew that Gmail had a DMARC policy of quarantine, I checked my message stream for harmless advertising from businesses with gmail.com addresses.   As suspected, the advertising houses were happy to take the client’s money and send the message without obtaining a scope ID from Google (if that is even possible.)

An analogy should help illustrate the limitations of sender authentication:  Assume that I borrow a friend’s car, then get stopped for a burned-out tail light.   When the officer compares my license to the car registration, it will be evident that I am not the owner of the car, and I cannot prove that I have the owner’s permission to use the car.   This might mean that I just stole the car and filled the trunk with contraband, but it is not the most likely scenario.    If there are no warrants for my arrest, and no reports that this car was stolen, the officer will write up a warning about the tail light, to share with my friend, and send me on my way.    

Like the borrowed car, the Internet has a lot of senders using borrowed identities.   The most common usage is for mass mailings handled by third parties, but there are other situations as well.   I also find that there are plenty of SPF entries with unrepaired mistakes.   For all of these reasons, a recipient organization cannot block messages simply because they fail sender authentication; it needs more evidence.

While innocent borrowing is common, malicious use of these identities is rare.   This is because Sender Authentication fraud is simply not necessary to the spammers.  To deceive end users, a spammer can use the very effective technique of including a hijacked corporate logo in the message body, then applying a corresponding Friendly Name next to the Message From address.    For extra effect, he may use a similar-but-different domain name in the Message From address.   Email clients do not display the Envelope From sender address at all, so it is irrelevant to the social engineering efforts of the spammer.    Even further, some email clients hide the Message From header address almost completely.   Microsoft Outlook only displays the Friendly Name in the message list view, although it does display the Message From address when the message is opened.

What Sender Authentication Does Well

Sender Authentication can provide confidence that a source is sufficiently identified to permit a particular message to be exempted from one or more filter rules.  Quite simply, Sender Authentication makes whitelisting safe, because it ensures that the whitelisting policy is applied only to messages that really come from the whitelisted source.   Because of the known problems with Sender Authentication, the “source validation” will need to use different criteria for different sources.

  • For a specific business partner sending a specific transaction flow, the Source IP alone may be sufficient.
  • For a trusted source like Proofpoint, a HELO name or Reverse DNS name that can be forward confirmed to the Source IP may be sufficient.
  • For a trusted mass mailer, SPF PASS may be sufficient, even though the Message From may vary from one message to the next.
  • For an untrusted mass mailer, SPF PASS may identify the source, while the decision to allow or block the message may depend on the Message From address.
  • For known forwarding sources with trusted spam filtering, forward-confirmed HELO or Reverse DNS may be sufficient.
  • For unrecognized forwarding sources, DKIM verification of the Message From address may be required, even if the source does not have a DMARC policy.
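
The list above amounts to a table of per-source identification rules. Here is a minimal sketch of that idea, assuming the SPF, DKIM, and forward-confirmation results have already been computed by other tools. The `Evidence` type, the rule helpers, the labels, and all addresses and domains are hypothetical and are not taken from Declude or any other product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    source_ip: str
    confirmed_rdns: Optional[str]  # reverse DNS name, set only if forward-confirmed
    confirmed_helo: Optional[str]  # HELO name, set only if forward-confirmed
    spf_result: str                # "pass", "fail", "none", ...
    envelope_domain: str           # domain of the Envelope From address
    from_domain: str               # domain of the Message From header
    dkim_domains: frozenset = frozenset()  # lowercase domains with a valid signature


def in_domain(host, domain):
    """True if host is the domain itself or a name underneath it."""
    if host is None:
        return False
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def by_ip(ip):
    return lambda e: e.source_ip == ip


def by_confirmed_host(domain):
    return lambda e: (in_domain(e.confirmed_rdns, domain)
                      or in_domain(e.confirmed_helo, domain))


def by_spf_pass(domain):
    return lambda e: (e.spf_result == "pass"
                      and e.envelope_domain.lower() == domain)


def by_dkim_on_from():
    return lambda e: e.from_domain.lower() in e.dkim_domains


# Each source gets the weakest test that is still trustworthy for it.
RULES = [
    ("partner-transactions", by_ip("192.0.2.10")),
    ("filtering-service", by_confirmed_host("filter.example")),
    ("mass-mailer", by_spf_pass("mailer.example")),
    ("dkim-verified-from", by_dkim_on_from()),
]


def identify(evidence, rules=RULES):
    """Return the label of the first rule that matches, or None."""
    for label, matches in rules:
        if matches(evidence):
            return label
    return None
```

The label returned by `identify` is only an identification. Which content filters to bypass for each label is a separate decision, which is why a granular rules engine matters.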

Once these identification rules have been tailored to the message characteristics of senders to be whitelisted, they need to be matched to the specific content filters that are to be bypassed.   Effective email filtering requires a sophisticated rules engine that can handle multiple criteria to produce very granular actions.   Based on my efforts to survey the market, this type of rules engine is rare.

To build these rules, you need to understand your message stream, and that requires log parsing which goes beyond what all-too-many products are prepared to offer.

Log Analysis

Sender Authentication will be based on the five key identity attributes:   Source IP, Reverse DNS, HELO name, Envelope From address, and Message From header address.   In order to build rules based on these parameters, it is necessary to have a log mechanism which captures this information from the incoming mail stream.   Oddly, many of the email filters that I have examined are unable to capture all five of these attributes into the log.   The Message From header and the HELO name are common omissions.

Is SmarterMail a Self-Sufficient Email Filter?

I judged the SmarterMail features for SPF and DMARC to be too limiting – too difficult to configure exceptions, and too difficult to link Sender Authentication rules to Content Filtering rules for source-specific tailoring.   This is offset by its extensibility, and I have been using Declude.

The production versions of SmarterMail hide the Message From address in most contexts.   This has been changed in the recent Beta releases, in response to a request on this forum.

I don’t believe there is any way to extract the Message From address out of the SmarterMail logs.

My Configuration

I have built my Sender Authentication strategy around Declude within a SmarterMail incoming gateway, because Declude provides the rules engine that I have not found elsewhere.   This initial gateway forwards incoming mail to additional spam filters which have strengths in content filtering and message review, but have weak rules engines.   The combination has been very satisfying.

I parse the Declude log into a SQL database to obtain a single record for each incoming message.    The record contains all five identity attributes, the raw and decoded subject text, the Declude test results, and the final Declude disposition.   This has given me visibility into my mail stream that I could never achieve when the Message From was absent.
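
To show the kind of question such a database can answer, here is a small sketch using SQLite from the Python standard library. It is not my actual schema: the table name, column names, and sample rows are hypothetical, and the Declude log parsing itself is left out. The query lists which Message From domains arrive behind each Envelope From domain, which is how a mass mailer's clients become visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE message_log (
        id            INTEGER PRIMARY KEY,
        source_ip     TEXT,
        reverse_dns   TEXT,
        helo_name     TEXT,
        envelope_from TEXT,
        message_from  TEXT,
        subject       TEXT,
        disposition   TEXT
    )
""")

# Hypothetical records, one per message, as a log parser might produce them.
rows = [
    ("192.0.2.10", "mta1.mailer.example", "mta1.mailer.example",
     "bounce@mailer.example", "news@shop-a.example", "Spring sale", "delivered"),
    ("192.0.2.10", "mta1.mailer.example", "mta1.mailer.example",
     "bounce@mailer.example", "news@shop-a.example", "Weekly deals", "delivered"),
    ("192.0.2.11", "mta2.mailer.example", "mta2.mailer.example",
     "bounce@mailer.example", "alerts@bank-b.example", "Password reset", "quarantined"),
]
conn.executemany(
    "INSERT INTO message_log (source_ip, reverse_dns, helo_name, envelope_from,"
    " message_from, subject, disposition) VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def from_domains_by_envelope_domain(connection):
    """Count messages for each (Envelope From domain, Message From domain) pair."""
    query = """
        SELECT lower(substr(envelope_from, instr(envelope_from, '@') + 1)) AS env_domain,
               lower(substr(message_from, instr(message_from, '@') + 1)) AS from_domain,
               COUNT(*) AS messages
        FROM message_log
        GROUP BY env_domain, from_domain
        ORDER BY env_domain, from_domain
    """
    return connection.execute(query).fetchall()


summary = from_domains_by_envelope_domain(conn)
```

Without the `message_from` column this report collapses to a single row per mailer, and the differences between its clients disappear.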

I have used Declude’s external call mechanism to integrate SPF and DKIM checking based on open-source code written by people very close to the IETF standards process.    These extensions provide visibility to all seven SPF results, DKIM checking based on alignment with a specific message header such as FROM, and Forward-Confirm DNS checking for the Reverse DNS and HELO names.    These verification functions provide the information needed to perform conditional whitelisting.
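
For readers unfamiliar with these checks, here is a rough sketch of forward confirmation. It is not the open-source code mentioned above. It uses only the Python standard library, the function names are hypothetical, and the resolver is passed in as a parameter so the logic can be tried without live DNS. The tuple lists the seven SPF result names defined in RFC 7208.

```python
import ipaddress
import socket

# The seven SPF results defined in RFC 7208.
SPF_RESULTS = ("none", "neutral", "pass", "fail",
               "softfail", "temperror", "permerror")


def system_resolve(name):
    """Look up the addresses for a host name using the system resolver."""
    return {info[4][0] for info in socket.getaddrinfo(name, None)}


def forward_confirmed(name, source_ip, resolve=system_resolve):
    """True if `name` (a HELO or reverse DNS name) resolves back to source_ip.

    Lookup failures and malformed addresses count as "not confirmed".
    """
    try:
        wanted = ipaddress.ip_address(source_ip)
        found = {ipaddress.ip_address(addr) for addr in resolve(name)}
    except (OSError, ValueError):
        return False
    return wanted in found


# A stand-in resolver with made-up data, so the example needs no network.
FAKE_DNS = {"mail.example.com": ["192.0.2.25"]}


def fake_resolve(name):
    if name not in FAKE_DNS:
        raise socket.gaierror("name not found")
    return FAKE_DNS[name]


confirmed = forward_confirmed("mail.example.com", "192.0.2.25", fake_resolve)
spoofed = forward_confirmed("mail.example.com", "192.0.2.99", fake_resolve)
```

A name that anyone can claim, such as HELO, only becomes usable for whitelisting after this kind of confirmation ties it to the connecting IP.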

Much of our incoming mail stream originates from relatively few mass mailers.   Most of those platforms have good client control, so I know that while their content may be unwanted, it is not likely to be dangerous.    One mass mailer draws my ire, because it seems to have no client control.   Most of what they send is unwanted and irrelevant to our business purposes, and every week some of it consists of fraudulent password reset messages or fraudulent bank notices.    Yet I cannot block them completely because they also have clients whose messages are highly desired.   Instead, I have configured Declude to whitelist the messages with a From address for known-good clients of this company, block messages from their known-bad clients, and quarantine messages from all of their other clients.    Without Message From in the log files, I could not see that this was even needed.   Even if I had figured this out by inspecting individual messages, my alternative spam filtering products would have been unable to implement the filtering logic necessary to handle this mass mailer.
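
The three-way handling of this mass mailer can be sketched as follows. This is not Declude configuration syntax, only the decision logic written in Python. The domain names, the client lists, and the `disposition` function are hypothetical, and the SPF result is assumed to have been computed already.

```python
from email.utils import parseaddr

MAILER_DOMAIN = "massmailer.example"          # hypothetical Envelope From domain
KNOWN_GOOD_CLIENTS = {"valued-client.example"}
KNOWN_BAD_CLIENTS = {"fraud-client.example"}


def domain_of(address):
    """Return the lowercase domain part of an email address."""
    return address.rpartition("@")[2].lower()


def disposition(spf_result, envelope_from, from_header):
    """Decide what to do with a message claiming to come from the mass mailer."""
    # SPF PASS on the envelope identifies the mailer as the real source.
    if spf_result != "pass" or domain_of(envelope_from) != MAILER_DOMAIN:
        return "other-rules"  # not this mailer, so other rules decide
    # The Message From address identifies which client sent the message.
    _, address = parseaddr(from_header)
    client = domain_of(address)
    if client in KNOWN_BAD_CLIENTS:
        return "block"
    if client in KNOWN_GOOD_CLIENTS:
        return "whitelist"
    return "quarantine"


good = disposition("pass", "bounce@massmailer.example",
                   '"Valued Client" <news@valued-client.example>')
bad = disposition("pass", "bounce@massmailer.example",
                  "Your Bank <alerts@fraud-client.example>")
unknown = disposition("pass", "bounce@massmailer.example",
                      "info@new-client.example")
```

The sketch matches the mailer's envelope domain exactly. A real rule would probably need to allow for per-client or per-campaign subdomains as well.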

Summary

I have asked many vendors how to integrate Sender Authentication, and Sender Authentication exceptions, into an email filtering defense.   I have been largely disappointed by the answers I received.   I hope that the vendors move in the direction that I have described, because all of civilization is at risk from bad email.   We need better defenses.


1 Reply

To clarify, I do a lot of blocking based on sender information.   About 40% of all of my incoming traffic is blocked based on sender information alone (Source IP, Reverse DNS domain, or Sender Domain.)   My general approach is to use content filtering to detect the bad guys, then use sender information to block everything that they send in the future.   Blocking based on known-bad sender attributes is different than Sender Authentication, because blocking known-bad senders does not require validation that the sender identification is legitimate.   It is only known-good senders that need to be validated, so that the "known-good" characteristic can be applied correctly.
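
The asymmetry can be put into a few lines of code. This is a sketch of the reasoning only, with hypothetical list contents and function name: a known-bad identifier is blocked whether or not it is authenticated, while a known-good identifier earns a bypass only when it is authenticated.

```python
KNOWN_BAD_DOMAINS = {"spam-source.example"}   # hypothetical block list
KNOWN_GOOD_DOMAINS = {"partner.example"}      # hypothetical whitelist


def sender_decision(claimed_domain, authenticated):
    """Apply block and whitelist policy to a claimed sender domain."""
    domain = claimed_domain.lower()
    if domain in KNOWN_BAD_DOMAINS:
        # No validation needed: nobody gains by impersonating a blocked sender.
        return "block"
    if domain in KNOWN_GOOD_DOMAINS and authenticated:
        # Validation required: an impostor would love to inherit this bypass.
        return "whitelist"
    # Everything else, including unverified "good" names, gets normal filtering.
    return "content-filter"
```

An unauthenticated message that merely claims to be from `partner.example` falls through to content filtering instead of being trusted.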
