Friday, October 26, 2007

Status of blackhole.securitysage.com: DOWN?

The RHSBL (right-hand-side blacklist) blackhole.securitysage.com appears to have been created by Jeffrey Posluns and to have been available since at least August 2004.

I received a report today indicating that a mail administrator has been unable to reliably query the blackhole.securitysage.com blacklist zone. With the help of my friends, I was able to confirm this issue.

It looks to be a DNS issue. What we see from here is that the zone blackhole.securitysage.com is delegated to nameserver blackhole.securitysage.com. The two DNS "glue entries" for the zone are servers that aren't configured to be authoritative for the zone, so no results are returned. Ultimately, this points toward a DNS configuration issue with this domain and/or sub-domain.

The popular anti-spam filter SpamAssassin has been tracking this issue since at least October 8, 2007. On October 17th, SpamAssassin decided to remove support for this blacklist (implemented in the DNS_FROM_SECURITYSAGE rule), due to the ongoing issues with accessing this blacklist.

As a result of this ongoing issue, I recommend against using the blackhole.securitysage.com blacklist. If you continue to check against this blacklist, queries are likely to time out, which could delay the receipt of inbound mail. Use of this list while this issue persists is likely to provide no blocking or filtering benefit.
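Here's a minimal sketch of how a mail filter might guard against a dead blacklist zone: it builds the right-hand-side query name and gives up after a short timeout instead of stalling delivery. The function names, the two-second default, and the "any 127.x answer means listed" rule are assumptions for illustration, not anything published by Security Sage or SpamAssassin.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def rhsbl_query_name(sender_domain, zone):
    """Build the hostname queried for a right-hand-side blacklist lookup."""
    return f"{sender_domain.strip('.').lower()}.{zone.strip('.')}"


def rhsbl_listed(sender_domain, zone, timeout=2.0, resolver=socket.gethostbyname):
    """Return True if listed, False if not listed, None if the lookup timed out.

    `resolver` is any callable taking a hostname and returning an IPv4 string;
    it is injectable so the logic can be tried without touching the network.
    """
    name = rhsbl_query_name(sender_domain, zone)
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(resolver, name)
    try:
        answer = future.result(timeout=timeout)
    except FutureTimeout:
        return None   # unresponsive zone: treat as "no opinion", keep mail moving
    except socket.gaierror:
        return False  # name did not resolve: treat the domain as not listed
    finally:
        pool.shutdown(wait=False)  # never block on a hung lookup
    return answer.startswith("127.")
```

Calling `rhsbl_listed("example.com", "blackhole.securitysage.com")` would query `example.com.blackhole.securitysage.com`. A `None` result is the signal that the zone itself is the problem, which is a good reason to stop querying it.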

I, and others, have contacted Security Sage and Mr. Posluns, making him aware of the issue and asking for more information. I'll be sure to update this page with more information as I have it.

11/03/2007 update: I've seen no response to my email to Mr. Posluns, nor to a friend's email to Security Sage's support address. I emailed that support address today, and my attempt bounced. The error message suggested an SPF failure. Since I publish a working SPF record, and given other information in the bounce, I believe the rejection is in error. I guess that means either nobody's home, or they don't want anyone to contact them.

Thursday, October 18, 2007

Expanded Spamtraps and Hamtraps

As always, I'm looking to maintain and improve the accuracy of the data behind the reports over at the Blacklist Statistics Center here at DNSBL Resource. Here's a quick overview of a couple of recent improvements I've made.

Just this week, I've turned up an additional spamtrap feed. This data is based on a set of domains that were no longer routed, but are found on many spammer lists. Not sure how much this will change the spamtrap data, but we will see. It's always good to mix things up.

What this means: I'm broadening my view of the net, expanding the inbound spam feed to keep this data from becoming biased. If my spamtrap feeds were small enough that they overlapped significantly with one list's traps, but not another's, it could bias results in favor of that list. I don't think this is happening currently, but I'm going to continue to change things up periodically to proactively prevent it from ever happening.

On the hamtrap front, I'm now periodically testing to see if various blacklists are blocking large webmail provider outbound mail servers. So far, I'm checking AOL, Hotmail, Yahoo, and Gmail. I don't have a complete view of what all of the outbound IP addresses are for each site; only AOL seems to publish a comprehensive list. I've determined what I can based on headers from real mail that I've sent and/or received over the past week or so. Feel free to contact me if you have pointers to official, published information from the bigger sites.
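A rough sketch of that kind of periodic check, assuming the usual IP-based DNSBL convention of reversing the IPv4 octets and appending the list's zone. The zone `dnsbl.example`, the provider labels, and the sample addresses (documentation ranges, not real provider IPs) are all made up, and `is_listed` stands in for whatever DNS lookup you actually use.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def blocked_providers(provider_ips, zone, is_listed):
    """Map each provider to the outbound IPs that `is_listed` reports as listed.

    `is_listed` is a caller-supplied function: it takes a query name and
    returns a bool. In real use it would perform a DNS A lookup.
    """
    report = {}
    for provider, ips in provider_ips.items():
        hits = [ip for ip in ips if is_listed(dnsbl_query_name(ip, zone))]
        if hits:
            report[provider] = hits
    return report


# Hypothetical sample data; a set lookup stands in for the DNS query.
providers = {
    "webmail-a": ["192.0.2.10", "192.0.2.11"],
    "webmail-b": ["198.51.100.5"],
}
fake_listed = {"10.2.0.192.dnsbl.example"}
print(blocked_providers(providers, "dnsbl.example", lambda name: name in fake_listed))
```

Any provider that shows up in the report is one whose users could have trouble reaching people who reject mail based on that list.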

What this means: If a blacklist blocks all mail from various Yahoo IP addresses, and you have friends who use Yahoo, they're going to have trouble emailing you. If that's the case, this is going to generate significant false positives. It certainly would generate false positives for me; all of my friends seem to use one of those four webmail providers.

Some blacklists might not like that I now make, and publish, this measurement. It's true that some ISP outbound mail servers send spam sometimes, and it's true that those ISP outbound mail servers might be appropriately listed on a given blacklist. But it's also true that even though those servers might send some spam, they also send quite a bit of legitimate mail, and avoiding false positives in that situation becomes near impossible.

Ultimately, it's up to you, as a potential user of a given blacklist, to decide if that risk of false positives is acceptable. In some cases, it is acceptable. In other cases, it may not be.

Also, this makes the hamtrap measurements more likely to reflect real, one-to-one email, in addition to the newsletter and (non-spam) list mail already being tracked. I think this is a good thing.

(More good stuff is on the way...stay tuned!)

Sunday, October 14, 2007

PSBL: Easy On, Easy Off

The Passive Spam Block List, or PSBL (psbl.surriel.com), is a spamtrap-driven anti-spam blacklist that has been around since at least June, 2003. It was created by Rik van Riel, who explains on the PSBL website that “the idea is that 99% of the hosts that send me spam never send me legitimate email, but that people whose mail server was used by spammers should still be able to send me email.”

The passive nature of the list means that there's no probing or poking of remote servers on the internet (which tends to make ISPs very angry and was a significant issue back in the days of testing for open relays). It also means that there is no debate or argument with listees. As the PSBL website states, “Want to remove your mail server from PSBL? Go ahead.” No need for lawsuit threats, arguments over why listing is denied, or anything of the sort. Anyone can remove any entry for any reason.

Sounds scary, doesn't it? In theory, bad guys could game the system, and rob PSBL of its ability to stop spam. Thankfully, the data shows that this isn't something to worry about. PSBL is a pretty neat tool that can help system administrators filter or reject spam in a way that makes it very easy to prevent false positives. And even though it doesn't take a line as hard as Spamhaus or Spamcop, it manages to block some spam that they do not.

Success Rates
PSBL's success rate seems to greatly vary from week to week. Over the past ninety days, its overall effective rate is 41.4% against the spam hitting my spamtraps. Over the past thirty days, it has been 36.5% effective against spam.

False Positives
False positives are often non-zero, but generally very low. For the past eleven weeks, they have been consistently under 1%. I suspect that this is due to the “easy on, easy off” removal policy: if anyone trying to send you mail receives a bounce message back from you referring to the PSBL website, it's very easy for them to have their sending IP address removed from the list.

For the most up-to-date numbers, visit the PSBL page in the Blacklist Statistics Center.

Additive Numbers
Even though PSBL catches a lower amount of spam (on its own) than some other more well-known blacklists, it manages to catch some spam that those other lists do not. To determine this, I took the last thirty days worth of results, and looked for intersection and overlap between PSBL and other blacklists.

What I found is that about 9% of successful PSBL hits against spam stopped spam from IP addresses not found on Spamhaus ZEN. When compared against Spamcop, the numbers were even higher -- about 13% of successful PSBL hits stopped spam from IP addresses not listed on Spamcop.

This suggests to me that PSBL would be an excellent blacklist to configure second or third in your mail server configuration. The 9% of PSBL hits from IP addresses not found on Spamhaus ZEN won't lead to a straight 9% boost in spam filtering effectiveness, because the lists are different sizes and that 9% is a share of PSBL's hits, not of all spam. But, if your data is like mine, you're likely to receive a boost of 3% or more.
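One way to read those numbers is as a back-of-the-envelope calculation: if PSBL hits 36.5% of spam over thirty days and about 9% of those hits are not on Spamhaus ZEN, the added coverage is roughly 0.365 × 0.09 ≈ 0.033, or a bit over 3% of all spam. The sketch below shows how that kind of overlap could be computed from per-message source IPs. The function name, the data layout, and the sample addresses are assumptions for illustration, not the actual tooling behind these statistics.

```python
def overlap_report(spam_ips, primary, secondary):
    """Measure what a secondary list adds on top of a primary list.

    spam_ips:  one source IP per spam message received
    primary:   set of IPs listed on the first list checked
    secondary: set of IPs listed on the second list checked
    """
    total = len(spam_ips)
    secondary_hits = [ip for ip in spam_ips if ip in secondary]
    unique_hits = [ip for ip in secondary_hits if ip not in primary]
    return {
        # share of all spam the secondary list catches on its own
        "secondary_rate": len(secondary_hits) / total if total else 0.0,
        # share of the secondary list's hits that the primary list misses
        "unique_share": len(unique_hits) / len(secondary_hits) if secondary_hits else 0.0,
        # extra share of all spam blocked by adding the secondary list
        "added_rate": len(unique_hits) / total if total else 0.0,
    }


# Made-up sample: ten spam messages from documentation-range addresses.
spam = [f"192.0.2.{n}" for n in range(1, 11)]
primary_list = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.5", "192.0.2.6"}
secondary_list = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
print(overlap_report(spam, primary_list, secondary_list))
```

The `added_rate` figure is the one that matters when deciding whether a second or third list is worth the extra DNS queries.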

Conclusion: I recommend PSBL. It helps to block spam that some other lists could miss, and it has friendly anti-false positive policies that make any revealed blocking issues easy to resolve.

The usual caveats apply here: This data illustrates how my own mail streams intersect with PSBL. Your mileage may vary, and I strongly recommend that you test and review results against your own mail streams.

Saturday, October 13, 2007

The Fiveten Blacklist: Not Accurate

“Fiveten” (blackholes.five-ten-sg.com) is a combination anti-spam blacklist run by Carl Byington, publishing under the name of “510 Software Group.” This blacklist has been available since at least February, 2001.

It has a multitude of criteria for listings. As of this writing, the website lists the following current criteria:

  • Individual spam sources: “These are generally taken from spam samples that have arrived here, and from discussions on news.admin.net-abuse.email.”
  • “Bulk mailers that don't require closed loop confirmed opt-in from all their customers, or that have allowed known spammers to become clients.”
  • “Networks that provide services to spammers.”
  • Web servers running software vulnerable to spam relay, such as FormMail.
  • Open relaying mail servers.
  • “Free mail providers.” One assumes this relates to sites like Yahoo or Hotmail.
  • “Systems that send virus notifications (klez, sobig, etc) to the supposed sender.” In other words, a specific type of backscatter.
  • “Systems that have delivered challenge-response” messages to Carl's mail server. Yet another type of backscatter.
  • “Systems that are owned by organizations that latently violate the TCPA.” This refers to what most would call phone spammers: entities that Carl is aware of sending pre-recorded telephone message solicitations. (In other words, not email related.)

I've been tracking the effectiveness of the Fiveten blacklist going back to March, 2007. It and Spamcop were two blacklists where I had little data about their current effectiveness. I was intensely curious as to what Fiveten targeted and how well it succeeded at stopping spam.

Over the years, I've answered a lot of questions from a lot of companies trying to figure out how to do the right thing with regard to list management and application of abuse prevention best practices. One of the recurring themes in the many emails I receive is blacklisting. I'm blacklisted! What do I do? How do I get un-blacklisted? How do I prevent myself from being blacklisted? Interestingly, one of the blacklists I'm most frequently asked about is Fiveten. Why is that?

Well, after tracking the effectiveness of Fiveten for many months, I've figured out why: Fiveten is inexact and inaccurate. It blocks only a so-so level of spam, and, on a percentage basis, it tends to block more non-spam than spam.

The chart above shows the thirteen-week average effectiveness as measured against my spamtrap and hamtrap mail sources. Fiveten has an approximately 40% success rate with regard to filtering spam. However, it gets it wrong a staggering 44% of the time (approximately) with regard to non-spam.

Analysis of the raw data suggests to me that Fiveten's poor (high) false positive rate is primarily due to Fiveten's listing of “bulk mailers that don't require closed loop confirmation opt-in from all their customers.” As a result, Fiveten has thousands of senders listed that have never sent spam; they are listed specifically because they choose not to utilize double opt-in. This means that Fiveten is effectively a tool that blocks “things the maintainer doesn't like,” which is a wholly different criterion than blocking spam. Against my own data, it appears that there is no direct correlation between spam and the blacklist maintainer's choices for listing criteria.

There's nothing wrong with making a blacklist that requires that any sender not utilizing double opt-in be listed. It's fair to ask how accurate such a list would be, or is. Is there a correlation between lack of confirmed opt-in and spam? Double opt-in, or confirmed opt-in, is a practice that I have strongly promoted for many years. Indeed, I've designed and built a number of confirmed opt-in systems myself over the years, and continue to promote it to this day. However, ISPs generally do not block mail from senders only because they don't utilize double opt-in. What do they know that Fiveten doesn't know?

It's perfectly acceptable to create and publish a blacklist that operates on specific, arbitrary criteria. Blacklist operators clearly have the right to block any email, or any sender, even if only because their email messages might contain the letter “T” or the number “7.” Blacklists are opinions, and I support a blacklist publisher's right to define whatever listing criteria they feel appropriate. But, how does arbitrary relate to accuracy? What if there was a blacklist that listed any IP address containing the number 7? I'm in a good position to test exactly how well a blacklist like that might work. Since March, over 1.6 million email messages (a combination of spam and non-spam) have crossed my tracking mechanism, and I've saved the IP address (and other data) for each. So, it's actually pretty easy for me to comb through that data and measure the effectiveness of this type of hypothetical, clearly arbitrary (and some would add, silly) blacklist.

After a few minutes of coding and data compilation, here's what I've come up with: The “Luckyseven” blacklist. As the name suggests, any mail server “lucky” enough to have an IP address containing the number 7 is listed. When comparing Luckyseven to Fiveten, it is approximately 10 percentage points more accurate against spam (50% vs. 40%), and slightly less inaccurate against non-spam (43% vs. 44%).
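For the curious, a rule like this takes only a few lines to express. The following is a sketch of the idea, not the actual code used for the numbers above; the function names and the sample addresses (documentation ranges) are made up, and the rates it prints describe only that toy sample.

```python
def luckyseven_listed(ip):
    """Hypothetical 'Luckyseven' rule: list any IP address containing a 7."""
    return "7" in ip


def hit_rate(ips, is_listed):
    """Share of messages whose source IP the rule would block."""
    if not ips:
        return 0.0
    return sum(1 for ip in ips if is_listed(ip)) / len(ips)


# Made-up samples standing in for spamtrap and hamtrap source IPs.
spam_ips = ["192.0.2.7", "198.51.100.17", "203.0.113.5", "192.0.2.20"]
ham_ips = ["198.51.100.70", "203.0.113.8"]

print("spam blocked:", hit_rate(spam_ips, luckyseven_listed))
print("non-spam blocked:", hit_rate(ham_ips, luckyseven_listed))
```

The same `hit_rate` function works for any listing rule, which is the point: run it against spam to get the success rate, and against non-spam to get the false positive rate.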

I think this exercise suggests that arbitrary listing criteria not based on direct correlation to spam can result in a blacklist that doesn't target spam accurately or successfully.

Any ISP who uses this list is going to block a lot of mail that their users actually desire to receive. As just a sampling, using Fiveten means rejecting various email messages from Microsoft, multiple public radio newsletters (from different radio stations in different states), travel notifications and newsletters from Expedia and Hotwire, lots of other newsletters and news updates from various newspapers and TV shows, and even the newsletter from my favorite pizza place back in my home town of Minneapolis.

Could any of these senders have list management issues? Could any of them be spammers, or be engaging in bad acts warranting blacklisting? Potentially, yes. I know nothing about the practices of any of these entities listed. But, I do know that even the “good guys” can go off the rails once in a while and end up on a blacklist. But it seems unlikely that this is the case with all of them. To me, this is further indication that Fiveten is unsuitable for use as a spam blocking mechanism.

For the most up-to-date Fiveten accuracy data available from DNSBL Resource, visit the Fiveten data page at the Blacklist Statistics Center.

(Please note: The Luckyseven list is fake; an exercise; do not use it for spam filtering.)

Friday, October 12, 2007

Spamhaus ZEN: The DNSBL Resource Review

Spamhaus ZEN is a composite blacklist run by the Spamhaus Project. This UK-based organization was created in 1998 by Steve Linford, and is maintained by a group of employees spread across the globe.

Spamhaus runs a number of different spam-blocking lists. These include:

  • SBL (Spamhaus Block List), which aims to block verified spam sources, spam gangs, and supporters of spam. This list is manually operated, in that every listing is the result of a volunteer deciding that a given IP address or network block merits listing.

  • XBL (Exploits Block List), which aims to block infected computers, open proxies, and the like. Data for this list is supplied by (or supplemented by) outside sources, such as the CBL (Composite Blocking List), meaning that if you use the XBL to filter or reject mail, you do not need to also use the CBL.

  • PBL (Policy Block List), which aims to reject mail from machines that are not meant to be mail servers, ones that would not normally send mail. This includes end user computers on dynamic internet connections (dialup, cable modems, DSL), unassigned IP addresses, web servers, etc. The data from this list is compiled by Spamhaus based on their personal observations, and also from information provided from various internet service providers who choose to cooperate in attempts to help reduce spam delivery effectiveness.

  • ZEN (zone: zen.spamhaus.org) is a combination of all of the above lists. If you are using the ZEN list, you do not need to also use the other lists individually.

Zone Choices and Accuracy Rates
The Spamhaus zones seem to work best when used in combination. SBL alone captures very little spam, as it is very focused and manually maintained. XBL and PBL do most of the “heavy lifting,” providing the most spam-blocking value. When used separately, XBL blocks on average around 50% of spam, and PBL often blocks more than 60% of spam. Combined, with overlap accounted for, and the addition of the SBL, the resulting ZEN list zone regularly blocks more than 80% of spam on a weekly averaged basis, and its effectiveness seems to be slowly trending upward. In short, ZEN is a very accurate list, and I find it to be an excellent tool to help reduce the amount of spam received.
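As a sanity check on those figures, here is the inclusion-exclusion arithmetic for two overlapping lists. It assumes, purely for illustration, that XBL and PBL hits are statistically independent; the real overlap has to be measured, and independence is not something the data above establishes. Under that assumption, 50% and 60% combine to 0.5 + 0.6 - (0.5 × 0.6) = 0.8, which happens to line up with the "more than 80%" observed for ZEN.

```python
def combined_rate(rate_a, rate_b, overlap):
    """Inclusion-exclusion: share of spam caught by at least one of two lists.

    `overlap` is the share of spam caught by both lists.
    """
    return rate_a + rate_b - overlap


def independent_overlap(rate_a, rate_b):
    """Overlap under the (assumed, unverified) independence of the two lists."""
    return rate_a * rate_b


xbl_rate, pbl_rate = 0.50, 0.60  # approximate standalone rates quoted above
estimate = combined_rate(xbl_rate, pbl_rate, independent_overlap(xbl_rate, pbl_rate))
print(round(estimate, 3))
```

If the measured overlap is larger than the independent case, the combined rate comes out lower; if it is smaller, the combined rate is higher.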

For weekly updated average accuracy and failure rates for Spamhaus ZEN, visit the Blacklist Statistics Center and click on zen.spamhaus.org.

False Positives
Spamhaus, like other blacklists, has been known to escalate listings to include corporate mail servers, networks, or other resources in order to nudge ISPs, companies, and organizations to change policies to make their resources less favorable to spammers. When this happens, if a personal correspondent, or a properly run mailing list, sends from an IP address found on this escalated listing, they would not be able to send mail to users of the Spamhaus SBL or ZEN lists, even though those senders may not have sent spam. I don't have significant data on how often this happens, but I suspect it to be rare. Additionally, this is rarely, if ever, represented in the DNSBL Resource data, due to the relatively small sizes of the spamtrap and hamtrap address pools.

There's an inherent risk with any sort of third-party reputation system in that you're relying on the third party to determine for you what mail to accept and what mail to filter or reject. I would always recommend that, before using Spamhaus ZEN or any other blacklist, you test and investigate on your own, to make sure that you are comfortable with the blacklist provider's policies.

Data Access
Access to the Spamhaus lists is generally free for hobby users and small businesses. However, some users find their ability to query the lists blocked if they are deemed by Spamhaus to be draining on their freely-provided resources. Because of this, sites with more than a handful of users would be well advised to reach out to Spamhaus regarding data feed licensing.

Second Stage Filtering
Spamhaus recommends using the SBL for second stage filtering, effectively creating a “URI SBL.” This allows you to filter or block mail that contains references to websites that resolve to IP addresses blacklisted on the SBL. DNSBL Resource does not currently incorporate this second stage filtering, and has no data regarding its effectiveness. However, this is planned for a future project.
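To make the idea concrete, here is a minimal sketch of second stage filtering: pull the hostnames out of URLs in a message body, resolve each to an IP address, and check that address against a blacklist. The `resolve` and `ip_listed` hooks are hypothetical caller-supplied functions (for example, a DNS lookup and an SBL query), the URL pattern is deliberately simple, and none of this is Spamhaus's own implementation.

```python
import re
from urllib.parse import urlsplit

# Simple pattern for http(s) URLs; real mail needs more careful parsing.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def hosts_in_body(body):
    """Return the distinct hostnames referenced by http(s) URLs, in order."""
    hosts = []
    for url in URL_PATTERN.findall(body):
        try:
            host = urlsplit(url.rstrip(".,;:!?)")).hostname
        except ValueError:
            continue  # skip URLs the parser rejects
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def body_hits(body, resolve, ip_listed):
    """Return hosts in the body whose resolved IP address is blacklisted.

    resolve(host) -> IPv4 string or None; ip_listed(ip) -> bool.
    """
    hits = []
    for host in hosts_in_body(body):
        ip = resolve(host)
        if ip is not None and ip_listed(ip):
            hits.append(host)
    return hits


# Made-up example using documentation-range addresses.
sample = "Buy now at http://Shop.example.com/deal or read https://ok.example.net/page."
dns = {"shop.example.com": "192.0.2.5", "ok.example.net": "198.51.100.9"}
print(body_hits(sample, dns.get, lambda ip: ip in {"192.0.2.5"}))
```

A message with any hit could then be filtered or rejected, depending on how aggressive you want the second stage to be.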

Controversy

Spamhaus, like most other blacklists, is not entirely free from controversy. They are currently embroiled in an ongoing legal battle with a company called E360. As Spamhaus is not based in the US, there are varying opinions regarding whether or not Spamhaus is subject to any judgments or actions under US law, not to mention whether or not the underlying actions have merit. I'm not a lawyer, so I can't speak to this. I can tell you, however, that I do not find that this ongoing issue stops me from utilizing Spamhaus ZEN for my own personal spam filtering. If you'd like to learn more about the E360 matter, Mickey Chandler's spam lawsuit site at SpamSuite.com is a great place to start.

Google reveals a great many links to articles, posts, and comments from people who have various negative opinions of Spamhaus. Sometimes these opinions are based on a lack of understanding of how spam filtering works. Sometimes these are based on dissatisfaction resulting from disagreement over what constitutes spam or non-spam. In particular, many entities purposely listed on Spamhaus are likely to be unhappy about the fact, and some subset of this negative commentary is likely to have been actively spread by spammers. I personally suspect very little of it is likely to be accurate.