Thank you for the data, Sabatino. Below is my analysis of the strengths and weaknesses of greylisting as currently implemented, and how I would want it improved before I could turn it on. Since you are using it and I am not, it is important to test whether you and other existing users consider the plan to be an improvement you could support.
Design considerations for IP-based Greylisting
IP-based greylisting screens senders based on Source IP and SMTP Mail From address. The first attempt is given a temporary error. A second attempt from the same IP and account is accepted, and future connections for the same IP+Domain pair are then allowed for a period of time, such as a month. The expiration date may be extended if the IP+Domain pair receives regular use.
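As a rough illustration of the mechanism just described, here is a minimal in-memory sketch. The class name, the 30-day lifetime, and the return values are my own assumptions for illustration, not taken from any product. It also omits the minimum retry delay that real implementations usually enforce.

```python
from datetime import datetime, timedelta

# Assumed lifetime; the text only says "such as a month".
PASS_LIFETIME = timedelta(days=30)


class Greylist:
    """Hypothetical sketch of IP-based greylisting state."""

    def __init__(self):
        self.pending = set()   # (ip, full Mail From address) seen once
        self.passed = {}       # (ip, Mail From domain) -> expiration time

    def check(self, ip, mail_from, now):
        """Return 'accept' or 'defer' for one delivery attempt."""
        account = mail_from.lower()
        domain = account.rsplit("@", 1)[-1]
        pair = (ip, domain)

        expires = self.passed.get(pair)
        if expires is not None and now <= expires:
            # Regular use extends the expiration date.
            self.passed[pair] = now + PASS_LIFETIME
            return "accept"
        self.passed.pop(pair, None)  # drop an expired entry, if any

        if (ip, account) in self.pending:
            # Second attempt for the same IP and account: validate the pair.
            self.pending.discard((ip, account))
            self.passed[pair] = now + PASS_LIFETIME
            return "accept"

        self.pending.add((ip, account))
        return "defer"  # caller answers with an SMTP 4xx temporary error
```

With this sketch, a retry from the same IP and account is accepted, and a different account in the same domain from the same IP is then accepted without delay. The same account arriving from a different IP is deferred again, which is the behavior the later sections try to improve.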
Effectiveness
- Greylisting is ineffective if the sending server reliably retransmits after a temporary error. This includes most email hosting services and Email Service Providers (ESPs). Email hosting services typically provide a reliable transport layer for all clients, and ESPs provide reliable transport so that they can be paid for messages delivered.
- Greylisting is effective if the sending server is fully controlled by the spammer. This includes spammer-owned servers and bare-metal hosting services that allow clients access to outbound TCP/25.
Negative Impacts
- Greylisting is destructive if the sending organization retries the connection from a different IP address. A message may be deferred repeatedly, causing unacceptable delivery delays or even abandonment of the delivery attempt. For this reason, if a server organization is expected to behave this way, it must be exempted from greylisting.
- Greylisting is also inefficient if a domain uses multiple IP addresses on the same server farm. This will occur whenever an organization has multiple Internet-facing servers, each with its own IP address, in the same server farm. Greylisting evaluates each IP address independently, so messages from one server may be deferred, even though messages from an adjacent server are being accepted.
- Greylisting may be inconvenient if a legitimate message is abandoned after the first attempt because the sender considers the message unimportant. This is expected to be rare. One possible scenario: DMARC reports are a courtesy service, provided at some expense to the reporting organization. If the reporting organization decides to abandon report delivery after a first failure, the recipient may find this inconvenient.
- Greylisting introduces delays to legitimate mail, unless those IP+Domain pairs have been exempted or previously validated. The amount of delay will depend on the greylisting retry rule on the receiving system and the temporary failure retry interval on the sending system. If a message is time-limited, such as a password reset link, that delay might cause the link to expire. In other cases, the user may be waiting for a specific message, and the delivery delay may affect his productivity.
- Some senders, notably Salesforce.com, use complex domain names in the SMTP Mail From address. Each Mail From domain will be greylisted separately. If an exemption is needed, extracting the list of eligible IP addresses may be difficult because of SPF macros. If an exemption does not exist, many messages will be deferred because the domain names are not re-used.
Positive Impacts
- When greylisting causes an attacker to abandon delivery attempts, the receiving system is spared both overhead and risk. Effective analysis of abandoned delivery attempts could lead to blacklisting, so that the spammer will be blocked even if he begins doing delivery retries.
Mitigating the Known Problems
To prevent one message from facing multiple delays, some method must be used to build a list of server organizations that have variable delivery servers. This process is based on imperfect knowledge, so being dependent on uncertain data is an unresolved risk within the current design.
Having obtained a list of organizations, the list must be converted into a list of address CIDRs. The source for this process is presumably the organization’s SPF record. This data source has its own problems, since some organizations use the macro features of SPF, which make construction of a complete CIDR list difficult.
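To show where the difficulty comes from, here is a small sketch that splits one SPF record into literal networks, terms that need further DNS lookups, and terms that contain macros. The function name and the three-way split are my own assumptions. The sketch does no DNS work, so include, a, mx and redirect terms are only reported, not expanded.

```python
import ipaddress


def spf_to_cidrs(spf_record):
    """Classify the terms of one SPF record (illustrative sketch only)."""
    networks, lookups, macros = [], [], []
    for term in spf_record.split()[1:]:        # skip the "v=spf1" tag
        mechanism = term.lstrip("+-~?")        # drop the qualifier, if any
        name, _, value = mechanism.partition(":")
        if "%{" in mechanism:
            # Macro terms depend on the message being evaluated,
            # so they cannot be turned into a fixed CIDR list.
            macros.append(mechanism)
        elif name.lower() in ("ip4", "ip6") and value:
            networks.append(ipaddress.ip_network(value, strict=False))
        elif mechanism.lower() == "all":
            continue                           # carries no address information
        else:
            lookups.append(mechanism)          # needs recursive DNS expansion
    return networks, lookups, macros


record = ("v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 "
          "include:_spf.example.net exists:%{i}._spf.example.org -all")
nets, lookups, macros = spf_to_cidrs(record)
print([str(n) for n in nets])   # literal networks
print(lookups)                  # terms still to be expanded
print(macros)                   # terms that defeat a static list
```

Anything that lands in the macros list is exactly the part of the record that cannot be flattened in advance.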
When the exemption list is built and maintained by a vendor, the mail system administrator typically has no information about which server organizations are exempted and which are not. Manual additions to the list are difficult, because the administrator must follow the same process of exploding an organization name into a set of CIDRs, and then entering each CIDR into the exceptions list without making any errors or omissions in the process. Because of the difficulty of list additions, most trusted correspondents will still be subjected to the greylisting process, despite their trust level.
Improving the design by using Server name
The first two problems can be eliminated by using a verified server domain, instead of IP address, for greylist validation. (Server name validation is based on forward-confirmed DNS to the source IP.) Reattempts from a second server in the same domain will be considered acceptable, so the risk of floating IP addresses is completely neutralized.
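The forward-confirmed check mentioned above can be sketched as follows. This is an illustration under my own assumptions: the function names are hypothetical, and the resolver is passed in as a parameter so that the logic can be exercised without live DNS.

```python
import ipaddress
import socket


def resolve_addresses(hostname):
    """Return the set of address strings a host name resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return set()   # treat lookup failure as "no addresses"
    return {info[4][0] for info in infos}


def forward_confirmed(hostname, source_ip, resolver=resolve_addresses):
    """True if `hostname` resolves forward to the connecting `source_ip`."""
    source = ipaddress.ip_address(source_ip)
    for addr in resolver(hostname):
        try:
            if ipaddress.ip_address(addr) == source:
                return True
        except ValueError:
            continue   # skip anything that is not a plain IP address
    return False


# Usage with a stand-in resolver instead of live DNS:
fake_dns = {"mail.example.com": {"192.0.2.10", "2001:db8::10"}}
lookup = lambda host: fake_dns.get(host, set())
print(forward_confirmed("mail.example.com", "192.0.2.10", lookup))
print(forward_confirmed("mail.example.com", "192.0.2.99", lookup))
```

The same function serves for either candidate name, whether it comes from the Helo command or from a reverse DNS lookup on the source IP.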
After one message is validated for a particular domain pair, all future messages from the same pair will be accepted without delay, reducing the total volume of deferred-but-acceptable messages.
Exemptions are no longer required for reliability reasons, so there is no need to develop a list of organizations that use floating delivery addresses, and no need to track the IP addresses of those organizations.
Exemptions for convenience and performance are possible. The exemption system could allow exception rules based on the domain pair, or for either of the domains separately. Since list changes are based on server domain names, maintaining an exception list becomes simple, and desirable for trusted business partners.
Because some server names will not be verifiable, greylisting based on IP address will still be needed as a fallback. Unverified server names have a high probability of being spam sources.
When checking the pending validation list, the system should check both pending IPs and pending domains. This ensures that if a transient DNS problem caused IP-based greylisting on a first attempt, any re-attempt will be matched whether the second attempt verifies a host name or not.
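One way to picture the dual lookup is a pending list that stores an IP key for every deferred attempt, plus a domain key whenever the server name was verified. The class and key layout below are hypothetical, chosen only to illustrate the matching rule.

```python
def pending_keys(ip, verified_server_domain, mail_from_domain):
    """Build the lookup keys for one attempt (hypothetical key layout)."""
    keys = [("ip", ip, mail_from_domain)]
    if verified_server_domain:
        keys.append(("domain", verified_server_domain, mail_from_domain))
    return keys


class PendingList:
    """Sketch of a pending-validation list checked by IP and by domain."""

    def __init__(self):
        self.entries = set()

    def record(self, ip, verified_server_domain, mail_from_domain):
        # Store every key we have for a deferred first attempt.
        self.entries.update(
            pending_keys(ip, verified_server_domain, mail_from_domain))

    def matches(self, ip, verified_server_domain, mail_from_domain):
        # A retry matches if either its IP key or its domain key is pending.
        return any(
            key in self.entries
            for key in pending_keys(ip, verified_server_domain, mail_from_domain))


pending = PendingList()
# First attempt: DNS was unavailable, so only the IP is recorded.
pending.record("192.0.2.1", None, "example.org")
# Retry from the same IP, now with a verified name, still matches.
print(pending.matches("192.0.2.1", "mail.example.com", "example.org"))
```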
Considerations for Server name validation
Which name to use?
The Helo name is presumed to indicate the server organization, so it is preferred whenever it can be validated. The Reverse DNS name may indicate the server organization or the ISP organization.
If the Helo and Reverse DNS names are in the same organization, but only the Reverse DNS name verifies, the Helo name is verified implicitly. This is often the case for servers at Outlook.com, but is uncommon elsewhere.
If the Helo name and Reverse DNS names indicate different organizations, greylisting based on verified Reverse DNS means that a validation result may apply to an entire ISP. Since greylisting validation does not disable any other defenses, the risk associated with ISP-level validation seems minimal. But an implementer may do well to allow the system administrator to decide whether to accept validation based on verified Reverse DNS names or not.
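The name-selection rules above can be summarized in a short sketch. All names here are hypothetical, and naive_org is a deliberately crude stand-in (last two labels) for a real organization lookup. The allow_rdns_only flag models the administrator's choice described in the last paragraph.

```python
def naive_org(name):
    """Crude stand-in for an organization lookup: keep the last two labels."""
    return ".".join(name.lower().rstrip(".").split(".")[-2:])


def choose_server_name(helo, rdns, helo_ok, rdns_ok,
                       org_of=naive_org, allow_rdns_only=True):
    """Pick the server name to greylist on, or None to fall back to the IP.

    helo_ok / rdns_ok say whether each name passed forward confirmation.
    """
    if helo and helo_ok:
        return helo                      # preferred: verified Helo name
    if rdns and rdns_ok:
        if helo and org_of(helo) == org_of(rdns):
            return helo                  # Helo verified implicitly
        if allow_rdns_only:
            return rdns                  # may represent a whole ISP
    return None                          # nothing verified: use IP greylisting


print(choose_server_name("mx.example.com", "host1.isp.example.net", False, True))
print(choose_server_name("mx.example.com", "host1.isp.example.net", False, True,
                         allow_rdns_only=False))
```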
What match level to use?
This document has already argued for migrating the server side from the individual machine, indicated by IP address, to at least the server domain. I suggest that using the server organization is preferable because it reduces false positives and unnecessary delays.
Server organization is defined by using the same public-suffix list (PSL) lookup used for DMARC. That list consolidates multiple domains which share a common ownership indicated by a common parent.
Similarly, after a Mail From address completes validation, a set of addresses is exempted from subsequent greylisting delays. The set of exempted addresses could be grouped either by domain or by organization. Grouping by organization will solve the problem caused by the complex Mail From addresses used at Salesforce.com.
While I favor organization grouping for both server and Mail From addresses, the choice can be deferred to the administrators that implement greylisting.
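To make the organization grouping concrete, here is a simplified sketch of the lookup: find the longest matching public suffix and keep one more label to its left. The suffix set below is a tiny made-up sample, not the real PSL. A real implementation would load the full list and also handle its wildcard, exception, and default rules, which this sketch ignores.

```python
# Tiny illustrative sample only; the real public-suffix list is far larger.
SAMPLE_SUFFIXES = {"com", "net", "org", "uk", "co.uk"}


def organizational_domain(name, suffixes=SAMPLE_SUFFIXES):
    """Return the organizational domain for `name`, or None if unknown."""
    labels = name.lower().rstrip(".").split(".")
    # Scanning from the left tries the longest candidate suffix first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None              # the name is itself a public suffix
            return ".".join(labels[i - 1:])
    return None                          # no suffix from the sample matched


print(organizational_domain("a.b.mail.example.com"))
print(organizational_domain("mail.example.co.uk"))
```

Under this grouping, any number of per-message subdomains in a Mail From address collapse to a single organization entry.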
Reporting
The greylisting feature should have enough data collection to document what message volume and which message sources were blocked because of greylisting, as well as the message volume and message sources that were deferred but later accepted after greylisting validation.
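A minimal sketch of the counters this would need, using hypothetical names. It assumes the caller records one event when a message is deferred and another when that message is later accepted, and that the summary is run after pending entries have expired, so anything deferred but never validated counts as blocked.

```python
from collections import Counter


class GreylistReport:
    """Hypothetical per-source counters for greylisting outcomes."""

    def __init__(self):
        self.deferred = Counter()    # source -> messages given a temporary error
        self.validated = Counter()   # source -> deferred messages later accepted

    def record_deferred(self, source):
        self.deferred[source] += 1

    def record_validated(self, source):
        self.validated[source] += 1

    def summary(self):
        # Counter subtraction keeps only sources with a positive remainder,
        # i.e. messages that were deferred and never came back.
        blocked = self.deferred - self.validated
        return {"blocked": dict(blocked), "validated": dict(self.validated)}


report = GreylistReport()
report.record_deferred("198.51.100.5")
report.record_deferred("mail.example.com")
report.record_validated("mail.example.com")
print(report.summary())
```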