spamassassin and comment spam


I have the blog set up (like most others, I suspect) to email me when a new comment is pending moderation. The main reason, of course, is comment spam. I get lots of comment spam, but it's easy to ignore.

But I also have SpamAssassin set up to dump any email that scores 5 or higher (I think). And SA learns. Put two and two together: after a while SA apparently came to the conclusion that any email generated by comments on the weblog was spam. Now I don't get notified of any comments at all. I'll have to dig into the SA config and figure out which rule is triggering this, or how to override it. I'm not even sure how it's telling them apart, since the trackback notifications are still arriving, and they have a similar message pattern.

You know, I'd be pissed at SA if it weren't pretty much on the mark in its decision. Hopefully we'll eventually have SA connected to something like Cyc so that it knows a bit more about the world. Now wouldn't that be a treat.

Categories: technology
Posted by diego on January 29 2005 at 12:41 PM | TrackBack (0)
Comments (please see the comments & trackback policy).

It's most likely the Bayesian filtering. Other rules are dynamic in some way (RBLs and their derivatives), but it's unlikely that they're the problem if you are still getting trackback notifications from the same system. You're probably best off just whitelisting the address they come from; something like "whitelist_from_rcvd address@domain.com domain.com" in local.cf should do it. I guess I'd better actually email you this or you'll never see it :)
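As a rough sketch, the local.cf entry might look something like the fragment below. The address and relay domain are placeholders, not values from this blog. The second argument is meant to match the reverse-DNS domain of the relay that handed the message to your own mail server, so if the notifications are generated locally and never pass through an external relay, this rule may not match and a different approach would be needed.

```
# local.cf (location varies by install)
# Placeholder values: substitute the real From address of the
# notification emails and the domain of the relay that delivers them.
whitelist_from_rcvd blog-notify@example.com example.com
```

Restart spamd (or whatever invokes SpamAssassin) after editing so the change is picked up.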

Posted by: Matthew Walker at January 30, 2005 1:02 AM

Thanks Matthew! Actually, I do check regularly to see which comments have to be approved. :) It's a pain though.

I have to check the source email address, but I am pretty sure that it's detecting it based on something else... the source address for the messages is the sender of the comment (with the return path set to a different, internal user name). What's really strange is that I checked the history, and up until messages stopped arriving they were regularly getting negative scores...

Posted by: Diego at January 30, 2005 9:17 AM

My mention of the source address was about how to overcome the score from the Bayesian filter (which I suspect is tipping it over the 5 points) while not losing the usefulness of Bayesian scoring on other email. The reason they would have been negative before is probably that the Bayesian filter previously rated them as very unlikely to be spam, which contributes negative points, and now rates them at the opposite end of the scale, which contributes large positive points.

The other common alternative to whitelisting is to send the email to a separate "secret" email account that does not have SpamAssassin enabled but has an obscure and unreferenced email address so it won't be guessed by spammers.

Posted by: Matthew Walker at January 30, 2005 12:21 PM
Copyright © Diego Doval 2002-2007.