Die Spam!

*** Note: I don’t guarantee that this will remove all of your spam, so please take care in implementing any of this. My methods might not be the best for your application. I assume that you have some basic working knowledge of the concepts that follow. Have fun! ***

I’ve been really unhappy with the amount of spam reaching my email address, so I decided to tackle it with the help of spamassassin. Through trial and error I think I’ve finally figured it out. I use mutt to read mail directly on the server, but with cron you could adapt some of this for clients that don’t read mail directly on the server. This article assumes you have spamassassin and procmail already set up and configured for your account; if you need help, the manuals for both are quite good. First, I set up procmail to run mail through spamassassin and send everything it marks as spam to an mbox called Spam:

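#----------
# Spamassassin
#----------
# Pipe messages smaller than 256000 bytes through spamassassin
:0fw: spamassassin.lock
* < 256000
| spamassassin

# File anything spamassassin tagged as spam in the Spam mbox
:0:
* ^X-Spam-Status: Yes
Spam

# Restore the leading F if it was dropped from the From_ line
:0
* ^^rom[ ]
{
LOG="*** Dropped F off From_ header! Fixing up. "

:0 fhw
| sed -e '1s/^/F/'
}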
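
If you want to check a single saved message by hand, the same test the recipe applies can be sketched in shell. This is only an illustration: is_flagged_spam is a name I made up, and it assumes the message has already been through spamassassin so the X-Spam-Status header is present.

```shell
#!/bin/sh
# is_flagged_spam (hypothetical helper): read one message on stdin and
# succeed if its header block contains "X-Spam-Status: Yes", the same
# condition the procmail recipe above tests.
is_flagged_spam() {
    # sed prints lines up to and including the first blank line (the end
    # of the headers) and then quits, so the body is never inspected.
    # grep -i mirrors procmail's case-insensitive matching.
    sed '/^$/q' | grep -qi '^X-Spam-Status: Yes'
}
```

Something like `is_flagged_spam < message.txt && echo "tagged as spam"` would then report on one message, where message.txt is just a placeholder file name.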

I want to have 3 classes of mail: spam as I set up above, normal mail, and possible spam, which I set up below. I do this by telling procmail to send mail addressed to aliases that aren’t mine, such as blahblah@mydomain.com, to an mbox called possible-spam. The recipe lists the aliases I know I use for my domain and catches everything else. I’m not the best when it comes to procmail recipes, but this is what I got:

# Not addressed to (or relayed for) my alias, and not from my own domain
:0:
* ! ^To:.*myalias@mydomain\.com
* ! ^Cc:.*myalias@mydomain\.com
* ! ^From.*mydomain\.com
* ! ^Received:.*myalias@mydomain\.com
possible-spam
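
To see whether the recipes are actually sorting mail, it helps to count what has landed in each mbox. Here is a minimal sketch, relying on the fact that every message in an mbox starts with a "From " separator line. The function names are mine, and the location of possible-spam under ~/mail is an assumption based on where the Spam mbox lives.

```shell
#!/bin/sh
# count_messages (hypothetical helper): print the number of messages in
# an mbox file, or 0 if the file does not exist.
count_messages() {
    if [ -f "$1" ]; then
        # Each message begins with a "From " line; grep -c counts them.
        grep -c '^From ' "$1"
    else
        echo 0
    fi
}

# mbox_summary (hypothetical helper): print "name: N" for every mbox
# path given as an argument.
mbox_summary() {
    for box in "$@"; do
        echo "$(basename "$box"): $(count_messages "$box")"
    done
}
```

Running `mbox_summary ~/mail/Spam ~/mail/possible-spam` after a day or two of mail should give a rough idea of how much each recipe is catching.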

So now we have 3 mboxes: one that contains my regular mail, one that has possible spam, and one that has the spam that spamassassin has marked as such. Now you ask, how does spamassassin know what is spam? Well, you have to train it with sa-learn. There are many ways to do this, but I think I’ve found a method that works for me. A good way to start is telling spamassassin what good mail looks like; a bunch of saved messages lying around usually works for this. We are going to tell spamassassin that all these messages are good and not spam, like this:

sa-learn --showdots --mbox --ham /path/to/good/mail

Ok, now we need a system to promote mail to the spam folder, to later mark as spam. You’ll find that right off the bat you’ll still be getting spam, but stick with it! It won’t happen overnight. Moving on: upon checking your mail in the 3 mboxes you’ll find both good mail and spam. Sort these accordingly, moving good mail to your inbox and spam to the Spam mbox. Be sure to check your Spam mbox every once in a while to ensure that you’re not missing any good mail. Now that we have the mail sorted and put in the right places, it’s time to tell spamassassin to learn that this mail sucks and should be treated as such. Once again we’ll use sa-learn to do this:

sa-learn --showdots --mbox --spam ~/mail/Spam
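
To check how much training has actually registered, sa-learn can dump its Bayes counters with `sa-learn --dump magic`. By default spamassassin does not use the Bayes rules until at least 200 spam and 200 ham messages have been learned (the bayes_min_spam_num and bayes_min_ham_num settings), so these counters are worth watching. The sketch below pulls the two numbers out of that dump. The function name is mine, and it assumes the usual layout where the count is the third column of the "non-token data: nspam" and "non-token data: nham" lines; check your own output before relying on it.

```shell
#!/bin/sh
# bayes_counts (hypothetical helper): read `sa-learn --dump magic`
# output on stdin and print how many spam and ham messages have been
# learned so far.
bayes_counts() {
    # Assumes the count sits in column 3 of the nspam / nham lines.
    awk '/non-token data: nspam/ { spam = $3 }
         /non-token data: nham/  { ham = $3 }
         END { printf "spam=%d ham=%d\n", spam, ham }'
}
```

Usage would be `sa-learn --dump magic | bayes_counts`, run as the same user whose mail is being filtered.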

So now this mail has been fed to the Bayesian filter. This too does not happen overnight; it may take quite a few rounds of training before this kind of spam gets caught, so be patient. You’re thinking: this sucks, I don’t want to do this every time I have spam, and I can’t remember those commands. Usually I would say stop being a wuss, but instead I’ve written a ridiculously simple script called diespam to do this for you. It learns the spam from the Spam mbox, then removes the mbox file and recreates it. I’m not sure this is the best way, but eh, it works for me:

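#!/bin/bash
# Learn everything in the Spam mbox; stop here if sa-learn fails
sa-learn --showdots --mbox --spam ~/mail/Spam || exit 1
# Empty the mbox by removing and recreating it
rm ~/mail/Spam
touch ~/mail/Spam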
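
If deleting the mbox outright feels too risky, one alternative is to move it aside under a dated name after learning, so a misfiled good message can still be dug out later. This is only a sketch of that idea, not a replacement I have battle-tested: the function name, the SA_LEARN variable, and the archive directory are all made up for illustration, and it does no locking against procmail delivering mail while it runs.

```shell
#!/bin/sh
# SA_LEARN (hypothetical variable) lets you swap in another command for
# a dry run; it defaults to the real sa-learn.
SA_LEARN=${SA_LEARN:-sa-learn}

# learn_and_archive MBOX ARCHIVE_DIR (hypothetical helper)
# Learn MBOX as spam, move it into ARCHIVE_DIR under a dated name, and
# recreate an empty MBOX. Nothing is moved if learning fails.
learn_and_archive() {
    mbox=$1
    archive_dir=$2
    [ -s "$mbox" ] || return 0                      # missing or empty: nothing to learn
    "$SA_LEARN" --mbox --spam "$mbox" || return 1   # keep the mbox if learning fails
    mkdir -p "$archive_dir" || return 1
    mv "$mbox" "$archive_dir/Spam-$(date +%Y%m%d-%H%M%S)" || return 1
    touch "$mbox"
}
```

Calling `learn_and_archive ~/mail/Spam ~/mail/spam-archive` would do one round; the archive directory is a placeholder and needs cleaning out by hand now and then.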

Ok, so if you don’t check your mail directly on your server and don’t want to log in, you could just add that script to your crontab to run at a given time. Just be sure to check first that there isn’t any good mail slipping through the cracks; you might remove the “rm ~/mail/Spam” and “touch ~/mail/Spam” lines and delete your mail manually. The following runs it daily at 3:30am; you might want to do it weekly, you pick:

# For a per-user crontab (crontab -e); /etc/crontab would also need a user field before the command
30 3 * * * /path/to/diespam

I think that’s pretty much it. I hope I didn’t leave anything out. Feel free to comment with suggestions and questions; I’ll do my best to reply with a solution.

Categories: Geek.
