Fight Spam best practice ...


... from my point of view.




Goals


  • Receive no spam, or at least very little of it
  • Never block ham
  • Don't let spammers occupy all your CPU cycles
  • Never become a spammer yourself

Architecture


We use Postfix for sending and receiving mail, amavisd-new with SpamAssassin and ClamAV for content inspection and virus scanning, and Cyrus imapd for local delivery (any other local delivery agent would work too).

We block spam during the SMTP dialog. We don't want to accept the spam and then have to decide what to do with it. This has advantages and disadvantages.
The advantage is that if we classify a legitimate mail as spam, the sender gets informed, because the sending server sees our rejection and reports it. If you accept the mail first and then drop it because you think it is spam, you cannot inform the sender. And if you send a "delivery failure" mail afterwards, you become a spammer yourself, because spam usually carries a forged sender address (backscatter).
The disadvantage is that someone could attempt a denial of service attack against your server by sending loads of mail, since every accepted connection costs scanning time. I have never encountered that, but a high traffic site should take precautions.
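
The argument above rests on the three SMTP reply classes and on who ends up responsible for the mail in each case. The following is only a minimal sketch of that reasoning; the function name and the returned strings are made up for illustration and are not part of Postfix or any library.

```python
def responsibility_after_reply(reply_code):
    """Say who has to deal with the mail after the final SMTP reply.

    Hypothetical helper for illustration only.
    """
    if 200 <= reply_code < 300:
        # We accepted the mail: from now on we must deliver it or bounce it.
        return "receiver must deliver or bounce"
    if 400 <= reply_code < 500:
        # Temporary failure: the sending server keeps the mail and retries.
        return "sender keeps mail and retries"
    if 500 <= reply_code < 600:
        # Permanent failure: the sending server informs its own user.
        return "sender keeps mail and notifies its user"
    raise ValueError("not a final SMTP reply code: %r" % reply_code)


print(responsibility_after_reply(250))
print(responsibility_after_reply(554))
```

Rejecting with a 5xx reply inside the dialog keeps the job of notifying the (real) sender on the sending side, which is why no backscatter is produced.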

We want to save CPU cycles, so we apply cheap low level filters before expensive high level filters like SpamAssassin. We use one good DNSBL to make a first decision (more discussion later).

I always disliked greylisting because it delays mail, although it definitely helps against spam. I recently discovered a similar method that does not delay mail delivery and is really easy to set up. It is called Fake MX. More on that later.


Postfix


dnsbl


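First I set up a DNSBL from http://www.spamhaus.org for a first decision. Add

maps_rbl_domains = sbl-xbl.spamhaus.org

to your /etc/postfix/main.cf. Note that maps_rbl_domains only has an effect together with reject_maps_rbl in the smtpd restrictions, and that both are obsolete since Postfix 2.1. On a current Postfix use reject_rbl_client in smtpd_recipient_restrictions instead. Spamhaus has also replaced the sbl-xbl zone with the combined zen.spamhaus.org zone.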

This setting is surely the most discussed point. If you use a DNSBL that sometimes lists the good guys, you get false positives: you block legitimate email. That is really not good. For Spamhaus I know of only one incident where this happened. At that time I was thinking about commenting out this line in Postfix, but overall I feel good with Spamhaus.
The main benefit of this setting is that it saves CPU cycles. I get around 200 ham mails per day but more than 2000 attempts to send spam to my server. At work we have more than 70000 attempts a day, and you can eliminate more than 80% of them with this method.
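
To see why a DNSBL check is so cheap, it helps to look at what the lookup actually is: a single DNS query for the client's IPv4 address with its octets reversed, prepended to the list's zone name. The sketch below shows that under stated assumptions: the function names are my own, 192.0.2.99 is a documentation address, and treating 127.255.255.x answers as "not listed" reflects my understanding of how Spamhaus signals query errors, so check their documentation before relying on it.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip, zone):
    """Build the DNS name that is looked up for an IPv4 client in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.rstrip(".")


def is_listed(client_ip, zone):
    """Return True if the zone lists the address (needs network access)."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(client_ip, zone))
    except OSError:
        # No A record (NXDOMAIN) or lookup failure: treat as not listed.
        return False
    # Listings are answered with 127.0.0.x style addresses; 127.255.255.x
    # is assumed here to be an error code rather than a listing.
    return answer.startswith("127.") and not answer.startswith("127.255.255.")


print(dnsbl_query_name("192.0.2.99", "sbl-xbl.spamhaus.org"))
# prints 99.2.0.192.sbl-xbl.spamhaus.org
```

Postfix performs this lookup itself; the code is only meant to show how little work is involved compared with a full content scan.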

reject unknown users


You should also avoid accepting mail for unknown recipients. Return a permanent error if the user is unknown. This is the default in Postfix. You might think that it is good to hide your users and handle mail to unknown users as if they existed. I think this is not a good idea. You will attract more and more spammers. We switched from a policy that accepted mail for unknown users to the default Postfix policy of returning "unknown user", and the connection rate from spammers dropped to 25% of what it had been.

proxy filtering spam


Next, we never accept a mail only to drop it later. So we scan the mail while receiving it and then either return an error if we don't want to accept it, or return OK and take it. In Postfix this is called an "smtp proxy filter" (before-queue content filter). The setting in /etc/postfix/master.cf:

smtp inet n - n - 20 smtpd -o smtpd_proxy_filter=127.0.0.1:10024

Postfix asks the service at localhost:10024 whether the mail shall be accepted. We will plug amavisd-new in there. amavisd-new passes accepted mail on to a second smtpd listener (127.0.0.1:10025 by default), which also needs an entry in master.cf.
So why not take the mail and scan it afterwards? As soon as you accept a mail, you are responsible for getting it to its final recipient. If you find out later that the mail is spam, the mail RFCs say that you have to send a "non delivery notification" if you cannot (or don't want to) deliver it. But sender addresses in spam are usually forged, so you would create backscatter spam yourself.
To be able to use the proxy filter we have to reduce spam with low level filters first, or we will not be able to scan mail at the necessary speed.

Amavisd-new


amavisd-new has to listen to port 10024. So we have to set:

$inet_socket_port = 10024;

in /etc/amavisd.conf. I also like to see the spam score in mail headers in every mail. To get this set:

$sa_tag_level_deflt = undef;

Set

$sa_local_tests_only = 0;

to make use of other DNSBLs, razor2 and other spam identification services. Finally set

$final_spam_destiny = D_REJECT;

to make amavisd reject spam during Postfix's proxy filtering process. Also uncomment the 'ClamAV-clamd' section to get a virus scanner on board.

Run amavisd as a daemon.

Clamav


Install ClamAV and configure it to run as a daemon. Run freshclam as a daemon too, to keep the virus database fresh.

Spamassassin


default settings


The default settings should enable the Bayes filter, RBL checks and razor2. If not, set:

use_bayes 1
bayes_auto_learn 1
skip_rbl_checks 0
use_razor2 1

in /etc/mail/spamassassin/local.cf. Check http://wiki.apache.org/spamassassin/NetTestFirewallIssues to see if you have to adapt firewall rules for razor2. You should also think about installing a caching nameserver on the scanning machine to make RBL checks faster. BIND is a good caching nameserver (some say other nameservers are better, but for this scenario I consider BIND the best choice).

bayesian learning


With these settings the Bayesian filter learns from incoming mail automatically. Since SpamAssassin is called by amavisd-new, the Bayesian database lives in ~amavisd-user/.spamassassin (on SUSE, for example, the amavisd user is called vscan). Check that the files there have the right permissions so that the amavisd user can write to them.

You can also teach the filter spam and ham manually. The program is called sa-learn. On my installation, where I use Cyrus, I enabled automatic learning of mails placed in "spamlearn" or "hamlearn" folders. I use the following script with cron:

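#!/bin/bash
# Feed mails that users moved into their "spamlearn" / "hamlearn" folders to sa-learn.
# Cyrus stores each message as a file named "<number>.", hence the "*." glob.

SPOOL=/var/spool/imap/user
DBPATH=/var/spool/amavis/.spamassassin/

# -print0 / read -d '' keeps folder names with spaces intact
find "$SPOOL" -type d -name spamlearn -print0 | while IFS= read -r -d '' dir ; do
  echo "Spamlearn from $dir"
  sa-learn --spam --dbpath "$DBPATH" "$dir"/*.
done

find "$SPOOL" -type d -name hamlearn -print0 | while IFS= read -r -d '' dir ; do
  echo "Hamlearn from $dir"
  sa-learn --ham --dbpath "$DBPATH" "$dir"/*.
done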


Check that the files in /var/spool/amavis/.spamassassin all belong to the amavisd user (vscan in my case) so that amavisd can use the filter. Also check that BAYES_XX rules appear in the spam status header generated by SpamAssassin (X-Spam-Status).
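
If you want to automate the header check, something like the following can pull the BAYES_XX rule names out of a message. It is only a sketch: the sample message is invented, the header layout mimics what amavisd-new typically writes, and bayes_rules is a name I made up.

```python
import re
from email import message_from_string

# Invented sample message with a folded X-Spam-Status header.
SAMPLE = """\
From: alice@example.com
To: bob@example.com
Subject: test
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=6.31
\ttests=[BAYES_00=-1.9, HTML_MESSAGE=0.001]

body
"""


def bayes_rules(raw_message):
    """Return the BAYES_XX rule names found in the X-Spam-Status header."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return re.findall(r"\bBAYES_\d+\b", status)


print(bayes_rules(SAMPLE))
# prints ['BAYES_00']
```

An empty list for freshly scanned mail suggests that the Bayes database is not being used, for example because of wrong file ownership or because it has not yet learned enough messages.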

run sa-update


Also update the SpamAssassin rules automatically. Create a crontab entry like this one:

00 5 * * * /usr/bin/sa-update && (/etc/init.d/spamd stop ; /etc/init.d/amavis stop ; sleep 5 ; /etc/init.d/spamd start ; /etc/init.d/amavis start)

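Spam patterns change from time to time and the SpamAssassin people adapt the rules. sa-update fetches the current rules for your SpamAssassin installation. It only exits with success when new rules were actually installed, so the daemons are restarted only when needed.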

Fake MX Records


At the moment I am trying out Fake MX records. The effect can be similar to greylisting, but it is easy to configure and does not delay mail delivery. The only problem: you must have 2 IP addresses that reject (not drop!) connections to port 25. The idea is to create 3 MX records in your DNS setup for your mail domain. The ones with the highest and lowest preference are the ones that reject connections to port 25, and the one with the middle preference is your mail server:

MX 10 fake1.mydomain.com.
MX 20 mailserver.mydomain.com.
MX 30 fake2.mydomain.com.

The idea: malware sometimes brings its own SMTP MTA and tries to deliver mail itself, but these MTAs are not full featured and may give up if the MX with the lowest number does not accept mail (the greylisting principle). On the other hand, some malware assumes that the MX with the highest number is a backup MX and may be badly configured, so it goes there directly. I hope I can reduce the load of incoming requests by about 80% with that. Then I can probably switch off greylisting in Postfix and let SpamAssassin decide.
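
The behaviour described above can be played through with a small simulation. This is a sketch only: the three tuples mirror the MX records from the text, the "accepts" flag stands for whether port 25 is open on that host, and the three client functions are simplified models I made up, not real MTA code.

```python
# (preference, host, accepts_mail): a model of the three MX records above
MX_RECORDS = [
    (10, "fake1.mydomain.com", False),
    (20, "mailserver.mydomain.com", True),
    (30, "fake2.mydomain.com", False),
]


def compliant_mta(records):
    """Try hosts in ascending preference until one accepts, like a real MTA."""
    for _pref, host, accepts in sorted(records):
        if accepts:
            return host
    return None


def lowest_only_client(records):
    """Naive sender that tries only the lowest-numbered MX and then gives up."""
    _pref, host, accepts = min(records)
    return host if accepts else None


def highest_only_client(records):
    """Naive sender that goes straight to the presumed backup MX."""
    _pref, host, accepts = max(records)
    return host if accepts else None


print(compliant_mta(MX_RECORDS))        # prints mailserver.mydomain.com
print(lowest_only_client(MX_RECORDS))   # prints None
print(highest_only_client(MX_RECORDS))  # prints None
```

A standards-following MTA falls through to the real server immediately because the fake hosts reject the connection, which is why rejecting instead of dropping matters: a dropped connection would force it to wait for a timeout first.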

How to get more spam


Once you have set up everything and want to get some real spam to check that it all works, you should do 2 things: Enable receiving mail for *@your-domain.com. This can be done with a catch-all entry in the virtual alias map (virtual_alias_maps) in Postfix. Then post some messages on crowded forums or mailing lists and leave a fake email address at your domain in the signature. Be patient. Spam will come some day :-)