Apache SpamAssassin is an extensible email filter that is used to identify spam. Once identified, the mail can then be optionally tagged as spam for later filtering. It provides a command line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules allowing Apache SpamAssassin to be used in a wide variety of email systems.
Install SpamAssassin on Debian
#apt-get install spamassassin spamc
The spamassassin package can also be integrated with a Mail Transfer Agent such as Postfix.
Preparation
By default, SpamAssassin installed from the Debian repository runs as root and is not started automatically. To avoid running it as root, we are going to create a dedicated user and group for spamassassin.
#groupadd -g 5001 spamd
#useradd -u 5001 -g spamd -s /sbin/nologin -d /var/lib/spamassassin spamd
#mkdir /var/lib/spamassassin
#chown spamd:spamd /var/lib/spamassassin
Next, we need to change some settings in /etc/default/spamassassin. Make sure the file contains the following values:
ENABLED=1
SAHOME="/var/lib/spamassassin/"
OPTIONS="--create-prefs --max-children 5 --username spamd --helper-home-dir ${SAHOME} -s ${SAHOME}spamd.log"
PIDFILE="${SAHOME}spamd.pid"
We are going to run the spamd daemon as user spamd, with its own home directory (/var/lib/spamassassin/), and have it write its log to /var/lib/spamassassin/spamd.log
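Since the values above reference ${SAHOME}, it can help to see what they expand to before starting the daemon. The following is a minimal sketch, assuming the defaults file contains only plain shell variable assignments as shown above; show_spamd_defaults is a hypothetical helper name, not something shipped with the package.

```shell
#!/bin/sh
# show_spamd_defaults is a hypothetical helper for this walkthrough.
# It prints the expanded values from a defaults file made of shell assignments.
show_spamd_defaults() {
    defaults_file="${1:-/etc/default/spamassassin}"
    (
        # Source in a subshell so the variables do not leak into the caller.
        . "$defaults_file" || exit 1
        printf 'ENABLED=%s\n' "$ENABLED"
        printf 'SAHOME=%s\n' "$SAHOME"
        printf 'OPTIONS=%s\n' "$OPTIONS"
        printf 'PIDFILE=%s\n' "$PIDFILE"
    )
}
```

Calling show_spamd_defaults with no argument reads /etc/default/spamassassin; with the values above, the PIDFILE line should come out as /var/lib/spamassassin/spamd.pid.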
SpamAssassin Configuration
We need to give spamassassin some rules. The default settings are quite fine, but you might want to tweak them a bit. So let’s edit /etc/spamassassin/local.cf.
#vi /etc/spamassassin/local.cf
Modify the file so that it looks like this:
rewrite_header Subject [***** SPAM _SCORE_ *****]
required_score 2.0
# To be able to use _SCORE_ we need report_safe set to 0
# If this option is set to 0, incoming spam is only modified by adding some "X-Spam-" headers and no changes will be made to the body.
report_safe 0
# Enable the Bayes system
use_bayes 1
use_bayes_rules 1
# Enable Bayes auto-learning
bayes_auto_learn 1
# Enable or disable network checks
skip_rbl_checks 0
use_razor2 0
use_dcc 0
use_pyzor 0
With these settings, spamd rewrites the subject of an email to [***** SPAM _SCORE_ *****], where _SCORE_ is the score spamassassin assigned to the email after running its tests, but only if that score is greater than or equal to 2.0. Emails with a score lower than 2.0 keep their original subject.
To be able to use the _SCORE_ in the rewrite_header directive, we need to set report_safe to 0.
In the next section, we tell spamassassin to use the Bayes classifier and to improve itself by auto-learning from the messages it analyses.
In the last section, we disable collaborative networks such as Pyzor, Razor2 and DCC. These networks keep an up-to-date catalogue of checksums of known spam messages. They might be interesting to use, but I am not going to use them here, as I found that spamassassin already takes long enough to process spam using only its own rules.
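To make the threshold rule concrete, here is a small sketch of the comparison described above: a message is tagged when its score is greater than or equal to required_score. This only mimics the rule for illustration; spamassassin performs the comparison internally. The helper names required_score_of and is_tagged are hypothetical.

```shell
#!/bin/sh
# Illustration only: mimics the "score >= required_score" rule.
# required_score_of and is_tagged are hypothetical helper names.

required_score_of() {
    # Print the last required_score value found in a local.cf-style file.
    # Falls back to 5.0, SpamAssassin's default, when the option is absent.
    awk '$1 == "required_score" { value = $2 }
         END { print (value == "" ? "5.0" : value) }' "$1"
}

is_tagged() {
    # Exit status 0 when score ($1) is greater than or equal to threshold ($2).
    awk -v score="$1" -v required="$2" \
        'BEGIN { exit !(score + 0 >= required + 0) }'
}
```

For example, with the local.cf above, is_tagged 2.0 "$(required_score_of /etc/spamassassin/local.cf)" succeeds, while a score of 1.9 does not reach the threshold.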
Start spamassassin using the following command
#/etc/init.d/spamassassin start
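Once the daemon is running, one way to check that it answers is to feed it a message containing the GTUBE test string, which SpamAssassin is designed to score as spam. The sketch below assumes spamd is listening on its default local address; the helper name gtube_message and the example.com addresses are placeholders.

```shell
#!/bin/sh
# gtube_message is a hypothetical helper; the addresses are placeholders.
# It prints a minimal message whose body is the GTUBE test string.
gtube_message() {
    cat <<'EOF'
From: sender@example.com
To: recipient@example.com
Subject: GTUBE test

XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
EOF
}

# Only talk to spamd when the spamc client is installed.
if command -v spamc >/dev/null 2>&1; then
    # -c prints "score/threshold" and makes spamc exit with status 1
    # when the message is classified as spam.
    gtube_message | spamc -c || echo "test message was classified as spam"
fi
```

A result of 0/0 from spamc -c usually means the message could not be processed, for example because spamd is not reachable.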
Configuring Postfix to Call SpamAssassin
spamassassin will be invoked only after postfix has accepted and queued the email.
To tell postfix to use spamassassin, we are going to edit /etc/postfix/master.cf
#vi /etc/postfix/master.cf
Change the following line
smtp inet n - - - - smtpd
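to (the second line must start with whitespace so that postfix treats it as a continuation of the first)
smtp inet n - - - - smtpd
  -o content_filter=spamassassin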
Then, at the end of the master.cf file, add the following lines (the second and third lines must be indented)
spamassassin unix - n n - - pipe
  user=spamd argv=/usr/bin/spamc -f -e
  /usr/sbin/sendmail -oi -f ${sender} ${recipient}
Save and exit the file
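In master.cf, a line that starts with whitespace continues the previous entry, so a missing indent on the -o line is an easy mistake to make. The following is a rough sanity check, not a full parser; check_master_cf is a hypothetical helper name.

```shell
#!/bin/sh
# check_master_cf is a hypothetical helper for this walkthrough.
# It only looks for the two things edited above; it is not a full parser.
check_master_cf() {
    master_cf="${1:-/etc/postfix/master.cf}"

    # The -o option must be a continuation line, i.e. start with whitespace.
    grep -Eq '^[[:space:]]+-o[[:space:]]+content_filter=spamassassin' "$master_cf" || {
        echo "missing or unindented: -o content_filter=spamassassin" >&2
        return 1
    }

    # The filter name must match a pipe service defined in the same file.
    grep -Eq '^spamassassin[[:space:]]+unix[[:space:]].*[[:space:]]pipe[[:space:]]*$' "$master_cf" || {
        echo "missing service entry: spamassassin unix ... pipe" >&2
        return 1
    }

    echo "master.cf looks consistent"
}
```

Run check_master_cf with no argument to inspect /etc/postfix/master.cf, or pass another path to check a copy of the file.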
That’s it: our spam filter is set up. We need to reload the postfix settings and everything should be ready.
#/etc/init.d/postfix reload