SORBS: Accuracy Rates and False Positives

The blacklist SORBS (aka the “Spam and Open Relay Blocking System”) was created in 2002 by Australian Matthew Sullivan. SORBS publishes a main “aggregate zone” (dnsbl.sorbs.net) containing listings meeting a multitude of criteria beyond open relaying mail services. SORBS also publishes multiple other zones meeting various criteria.

As related previously, SORBS appears to be undergoing changes. Some of these changes appear to relate to the fact that the SORBS maintainer has repeatedly taken issue with the methodology used by DNSBL.com to measure accuracy rates and false positive rates.

SORBS has indicated that they have the ability to feed false or different data in response to queries from DNSBL.com. As such, it's unclear if recent query results are indicative of results seen by other users. Because of concerns that SORBS may be attempting to sway the data reported, it's important to share current data and information, so that system administrators can make an educated determination as to whether or not it would be wise to use this DNSBL.

Historical Information

I've been tracking data on the main SORBS zone, dnsbl.sorbs.net, since March, 2007. Here's what I've found.

  • For most of the past fifteen weeks, the DNSBL had an effectiveness rate varying between fifty percent and fifty six percent, week over week. This means that SORBS correctly blocked a piece of spam in my spamtrap about five to six times out of ten.

  • For many weeks, I believe SORBS clearly suffered from significant false positive issues. As measured by my own calculations (see here and here for more info), the false positive rate is in the 7.9% - 11.1% range. This means that if your users sign up for the same kind of mail that I did, then for every one hundred pieces of solicited mail your users signed up for and expected to receive, SORBS is likely to block roughly eight to eleven of them.
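To make the arithmetic behind those two figures concrete, here is a minimal Python sketch. The function names and the counts are made up for illustration; this is not the actual code or data behind DNSBL.com, just one straightforward way to compute rates of this kind.

```python
def _rate(listed: int, total: int) -> float:
    """Fraction of messages whose sending IP was on the blacklist."""
    if total <= 0:
        raise ValueError("total must be a positive message count")
    if not 0 <= listed <= total:
        raise ValueError("listed must be between 0 and total")
    return listed / total


def effectiveness_rate(spam_listed: int, spam_total: int) -> float:
    """Share of spamtrap mail that the list would have blocked."""
    return _rate(spam_listed, spam_total)


def false_positive_rate(solicited_listed: int, solicited_total: int) -> float:
    """Share of requested (solicited) mail that the list would have blocked."""
    return _rate(solicited_listed, solicited_total)


# Hypothetical counts, chosen only to land inside the ranges quoted above.
print(f"effectiveness:   {effectiveness_rate(530, 1000):.1%}")   # 53.0%
print(f"false positives: {false_positive_rate(9, 100):.1%}")     # 9.0%
```

Note that the two rates are measured against different mail streams: effectiveness against spamtrap mail, false positives against mail that was actually requested.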


Recent Data Changes

  • On July 9, 2007, changes were made to SORBS. As you can see from the chart above, around this time (near the start of week 12), the net result is that the effectiveness rate and false positive rates have both significantly declined.

  • Since July 9, 2007, I have not noted any additional false positives from the main SORBS zone. Because SORBS has indicated that they are able to feed false data, it is unclear if the results I am seeing are accurate.

  • Similarly, the effectiveness rate of the main SORBS zone seems to have greatly declined as well. Since July 9, 2007, it is hovering in the 18% range.

There are two possible conclusions to make here:

  • SORBS is somehow able to feed different blacklist data to DNSBL.com than to others. If so, then the historical data I have summarized above is likely to be the most accurate view of SORBS. Or,

  • SORBS has gutted its lists and the poor effectiveness rates I'm now seeing are reflective of how it would likely work for others.

It's hard to say which scenario is the more accurate one, and what future testing will reveal. I'll certainly continue to collect data, but right now, there's an open question of SORBS' effectiveness and false positives.

As of Thursday, July 19, 2007, SORBS changed the default zone mentioned in configuration guidance pages from dnsbl.sorbs.net to a domain not owned by SORBS. As a result, if any SORBS user copies and pastes a configuration snippet from one of the SORBS configuration pages verbatim, the result is that 100% of a site's inbound email will be blocked. My recommendation is to proceed with caution – if you are not sure what you're doing with DNSBL use and mail server configuration, a misstep here will have significant consequences.

SORBS has leveled the following criticisms, presumably as justification for the results published on DNSBL.com. Below is an overview of, and response to, the points raised:

  • SORBS claims that the DNSBL.com email feed data is US-centric. This is true. The domains involved in these hamtraps and spamtraps are "dot com" domains, and have always been hosted in the US. If this means that SORBS is inaccurate as a result, it suggests that SORBS is Australia-centric, and likely will not work as well for those in other countries.

  • SORBS claims that a false positive as defined on DNSBL.com is not what everyone calls a false positive. This is true. I consider a false positive to be a requested message that was blocked. Others have different definitions. I believe the definition used on DNSBL.com to be accurate. I further believe that the most common definition of a false positive as used by regular end users or system administrators is most likely to align with my own.

  • SORBS is unable to verify false positive hits, as DNSBL.com does not provide IP addresses correlating to false positive hits. This is true. If data were provided to any blacklist operator regarding false positives, this would enable the DNSBL to whitewash over the issues by removing the IP addresses reported (and no others). This is similar to why blacklist groups do not provide spamtrap information – they do not want their spamtraps “compromised,” which would allow a bad sender to simply stop sending to spamtraps, but continue spamming elsewhere. Therefore, this information is not provided to any blacklist. (Other list operators have been more understanding.)

  • SORBS claims that the zone “dnsbl.sorbs.net” being queried by DNSBL.com is not the zone used by most users or recommended by SORBS as the main or default zone. This is untrue. It has or had clearly been positioned as the default zone or default recommended configuration choice, and remains the zone first listed, positioned as the “aggregate zone” as of July 20, 2007.

  • SORBS claims that Spamhaus volunteers have (or had) access to the SORBS database and have entered listings in the past to drive significant false positive issues. I am not associated with either SORBS or Spamhaus so I can't speak to this accusation.

  • SORBS claims that the methodology of checking mail against DNSBLs within 15 minutes of receipt is inaccurate. This is untrue. Anyone who uses a DNSBL is enabling their mail server or spam filter to check the mail against the DNSBL within seconds to minutes of receipt. If, as SORBS states, their DNSBL distribution model is such that it suffers from this methodology, then it suggests that it may be slow to respond to real spam trends.

    (10/29/2007 update: At a recent conference, over a beer with a colleague who builds tools to block spam for a living, I was gently chided over this bit of methodology. I was told that I was letting mail get far too old. 15 minutes is a hundred years as far as spam vector measurement is concerned; the vendor in question uses a 60 second interval at maximum. By this logic, I was being too forgiving as far as slowly updating anti-spam blacklists were concerned. This is further at odds with the criticism from SORBS.)

  • SORBS has picked a specific sender as the source for the SORBS false positive rates I report, saying that this sender is a "habitual source of spam." I have no financial interest or any other connection to the sender in question, except that I ordered pillows from them in December, 2006, and was happy with the product and service they provided. As a result, I signed up to receive mail from them, and happily do so. If I used SORBS to reject mail, that mail would not reach me. Additionally, this sender is far from the only source of false positives I found when utilizing the SORBS blacklist.

    (11/09/2007 update: The specific sender is/was Overstock.com. SORBS categorizes Overstock as a spammer. Matthew Sullivan (now known as Michelle Sullivan), in fact, indicated that "1000's of people who receive unsolicited commercial/bulk email from them." There are two additional problems with his characterizations here.

    First, Overstock.com is not listed on ANY OTHER of the approximately 47 blacklists I check, except FIVETEN (which lists many hundreds of potentially legitimate senders, and therefore, is not very useful as a second opinion here.) It's not on any of the lists that commonly do list supposedly-legitimate senders who may have run afoul of spamtraps.

    Second, the last mail I had received from Overstock.com was on May 25, 2007. This is significantly before the July 9th cutoff of my data, and measured false positives were on the rise even with no further mail from Overstock.com in the data set. Incidentally, I have no idea why I've received no mail since. I didn't unsubscribe.)

Additionally, SORBS has made numerous statements questioning the accuracy of data published here, and characterizing this project as something other than honest and transparent. Here's how it works: I have a feed of mail, and I check all mail received for DNSBL hits. I give internet users a live, rolling snapshot of how various lists intersect with my mail streams. That's all there is to it. I leave it to you, the reader, to decide if I've been honest and clear at every step of this process, and as always, I welcome your feedback.
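For readers who have not seen what "checking mail for DNSBL hits" involves, here is a minimal Python sketch of the standard IPv4 lookup convention: reverse the octets of the sending IP address, append the zone name, and ask DNS for an A record. The function names are hypothetical, the resolver is passed in so the logic can be tried without touching the network, and this is not the tooling actually used for DNSBL.com.

```python
import ipaddress
import socket
from typing import Callable


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up, e.g. 192.0.2.1 -> 1.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if not a valid IPv4 address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if the zone answers with an address in 127.0.0.0/8."""
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such record (or a lookup failure): treat the sender as not listed.
        return False
    # DNSBLs conventionally answer with 127.x.y.z codes for listed addresses.
    return answer.startswith("127.")


# Offline demonstration with a stub resolver that "lists" exactly one address.
def stub_resolver(name: str) -> str:
    if name == "1.2.0.192.dnsbl.sorbs.net":
        return "127.0.0.2"
    raise socket.gaierror("no such record")


print(dnsbl_query_name("192.0.2.1", "dnsbl.sorbs.net"))
print(is_listed("192.0.2.1", "dnsbl.sorbs.net", resolve=stub_resolver))
print(is_listed("192.0.2.99", "dnsbl.sorbs.net", resolve=stub_resolver))
```

Tallying the True and False results across a stream of spamtrap mail and a stream of requested mail is all that is needed to produce figures like the ones reported here.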

(11/18/2007 Update: Added the phrase "that if your users sign up for the same kind of mail that I did" above to clarify false positive comments.)

Status of dnsbl.radparker.com: NOT A BLACKLIST

For a time, SORBS was found to be inaccurately referring to “dnsbl.radparker.com” on the mail server configuration pages over on the SORBS blacklist website. This appears to have been done in retaliation for DNSBL.com publishing data on the effectiveness of the SORBS blacklist. (Both domains are owned by me.)

The real problem was for potential SORBS users: if they followed the instructions verbatim, they ended up rejecting 100% of their inbound mail. Sadly, I've seen traffic that implies this has happened to some degree.

If you're going to use the SORBS blacklist, be very careful to make sure you've implemented it correctly. Both this, and SORBS' claim that dnsbl.sorbs.net is an unsafe zone to use, suggest that the SORBS list may not be a wise starting point for those looking to simply, safely block spam.

More information on SORBS can be found here.

There has never been a blacklist with a zone name of dnsbl.radparker.com -- and if you type that into the DNSBL section of your mail server config, you will break your ability to receive inbound mail.
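One way to catch this kind of mistake before it costs you mail is to sanity-check a zone name before putting it into production. The Python sketch below relies on a convention later written down in RFC 5782: an IPv4 DNSBL should always list the test address 127.0.0.2 and should never list 127.0.0.1. Whether any particular list follows that convention is an assumption you would need to confirm, and the function name is hypothetical.

```python
import socket
from typing import Callable


def zone_passes_sanity_check(
        zone: str,
        resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """True if the zone lists 127.0.0.2 (test entry) but not 127.0.0.1."""

    def answers(ip: str) -> bool:
        # Standard DNSBL query name: reversed octets followed by the zone.
        name = ".".join(reversed(ip.split("."))) + "." + zone
        try:
            resolve(name)
        except socket.gaierror:
            return False
        return True

    return answers("127.0.0.2") and not answers("127.0.0.1")


# Stub resolvers standing in for three kinds of zone (no network needed).
def well_behaved(name: str) -> str:
    if name.startswith("2.0.0.127."):
        return "127.0.0.2"
    raise socket.gaierror("no such record")


def answers_everything(name: str) -> str:
    # A zone that returns an address for every query, so every sender looks listed.
    return "192.0.2.10"


def answers_nothing(name: str) -> str:
    raise socket.gaierror("no such record")


for label, stub in [("well-behaved list", well_behaved),
                    ("answers everything", answers_everything),
                    ("answers nothing", answers_nothing)]:
    print(label, zone_passes_sanity_check("dnsbl.example", resolve=stub))
```

A zone that answers every query fails the check, which is the situation that leads a mail server to reject all inbound mail. A zone that answers nothing also fails, which is a hint that the name is mistyped or the list is empty.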

SORBS: Changes?

This thread on the news.admin.net-abuse.email newsgroup suggests that changes are afoot at SORBS. Apparently the entire spam.dnsbl.sorbs.net and dul.dnsbl.sorbs.net zones have been emptied out.

Newsgroup participant Ian Manners reports that SORBS maintainer Matthew Sullivan posted about this to the SORBS-Announce mailing list:

Some of you have noticed that the DUHL (dul.dnsbl.sorbs.net) and Spam (*.spam.dnsbl.sorbs.net) zones are empty on the Rsync and DNS server. This is quite deliberate and I apologise for not posting a message about it previously.

Ian goes on to explain that he believes Matthew is "taking a break due to emotional and SORBS stress and has emptied those databases that need his constant attention."

From review of the data I track on the main SORBS DNSBL zone (dnsbl.sorbs.net), it's clear that changes started taking place on July 9th. Since then, the false positive rate has dropped to zero, and the effectiveness rate has also plummeted (from around 47% effectiveness on recent 7 day averages, down to 14-19% effectiveness on recent daily averages). Based on the fact that dnsbl.sorbs.net is a combined zone, I would suspect that this is in line with the SORBS maintainer removing the significant amounts of data provided by the "spam" and "DUHL" zones.

Over the past thirteen weeks, my measured effectiveness of the main SORBS zone had been on a slight but steady decline, from 56% effective in March to 47% effective for the week before the data changes took place. False positives (measured against the solicited mail I sign up for, of course) varied slightly but were generally around 9% week over week.
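Since I'm quoting both 7 day averages and daily averages, it's worth showing how the two can differ right after a sudden change. Here is a minimal Python sketch with made-up daily counts. It assumes the 7 day figure is total hits divided by total messages over the trailing window, which is one reasonable definition and not necessarily the exact one used on DNSBL.com.

```python
from collections import deque
from typing import Iterable, List, Tuple


def trailing_rates(daily_counts: Iterable[Tuple[int, int]],
                   window: int = 7) -> List[float]:
    """For each day, hits / messages summed over the last `window` days."""
    recent = deque(maxlen=window)  # oldest day drops off automatically
    rates = []
    for hits, total in daily_counts:
        recent.append((hits, total))
        window_hits = sum(h for h, _ in recent)
        window_total = sum(t for _, t in recent)
        rates.append(window_hits / window_total if window_total else 0.0)
    return rates


# Hypothetical data: seven days at 50% effectiveness, then three days at 15%.
days = [(50, 100)] * 7 + [(15, 100)] * 3
for day, rate in enumerate(trailing_rates(days), start=1):
    print(f"day {day:2d}: trailing 7 day rate {rate:.0%}")
```

In this made-up series the daily rate falls to 15% on day eight, but three days later the trailing 7 day rate is still 35%. A weekly average takes a full week to catch up with an abrupt change, which is why daily figures are the better guide immediately after one.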

I've been reading all I can on the topic of SORBS lately (and engaging in occasionally-heated discussions with Matthew, learning more on how he runs things), with the intent of writing a more detailed review. I'm not quite ready to do that yet, and these changes suggest that I should wait and see what happens, first. Stay tuned for more on this topic in the future.

Changes at UCEPROTECT

Whatever your opinion of UCEPROTECT, hold on to your hat, as things are apparently about to change.

This posting to the USENET newsgroup news.admin.net-abuse.email indicates that Johann Steigenberger is no longer involved with UCEPROTECT. Going forward, Claus v. Wolfhausen has indicated that he is in charge of the lists.

At first there was some concern that this post wasn’t true, that it was a deception. I’ve spoken to Claus via email, and that, along with other information, leads me to believe that this is in fact true and correct. (I’ve met neither individual in person, so I suppose this could be a giant hoax, but I’ve got no reason to believe so at this time.)

Claus indicates that UCEPROTECT will no longer list for backscatter and sender verification callouts. These two listing criteria were controversial and I am told that they resulted in numerous complaints of false positives relating to UCEPROTECT. The data behind listings based on these criteria is being repurposed into a new blacklist at www.backscatterer.org.

He went on to say that, due to his intervention, UCEPROTECT has ceased publishing the controversial “anonymous” APEWS blacklist data, and that he is unsure if UCEPROTECT will again publish the APEWS data in the future.

APEWS, an “anonymous” list widely thought to be created as a replacement for the defunct SPEWS, has been regularly criticized by respected anti-spam advocates such as Steve Linford of Spamhaus and Suresh Ramasubramanian of ISP Outblaze. Controversy includes listing policies considered to be broad and inaccurate, and contact/removal policies perceived as cruel to listees (by deflecting all contact away from the blacklist and toward public discussion forums where listees are often subject to abuse from unrelated parties).

I have yet to write and post reviews of UCEPROTECT or APEWS for dnsbl.com. Look for this in the future.