Spamassassin can learn from Exchange before 2007

I found a better way of teaching Spamassassin from Exchange. But it works trough IMAP. And Exchange 2007 doesn’t allow imap to access to public folders any more 🙁

http://sstern.ccim.com/2006/07/14/training-sitewide-spam-filters/

here is a copy-paste :

How does one enable end-user training of a site-wide Bayesian spam filter for SpamAssassin when the users are reading mail through Microsoft Exchange and the filtering takes place on several Linux MX servers?

We have created two public folders, should-be-spam and should-be-ham. We created an exchange user, spamiam, that has full rights to these folders. End-users move misclassified mail from their inbox or junk-mail folder into the appropriate should-be public folder.

At the top of every hour, this script is run on the one MX server:

/usr/local/scripts/get_ham_spam
#! /bin/sh
rm -f /var/spool/mail/spamiam
touch /var/spool/mail/spamiam
chown spamiam:mail /var/spool/mail/spamiam
su spamiam -c 'fetchmail -a -K -f
/usr/local/scripts/spamiam.fetchmailrc -r "Public Folders/should-
be-spam"'
cat /var/spool/mail/spamiam >> /var/www/html/spamstuff/should-be-spam
sa-learn --spam --mbox /var/www/html/spamstuff/should-be-spam
rm -f /var/spool/mail/spamiam
touch /var/spool/mail/spamiam
chown spamiam:mail /var/spool/mail/spamiam
su spamiam -c 'fetchmail -a -K -f
/usr/local/scripts/spamiam.fetchmailrc -r "Public Folders/should-
be-ham"'
cat /var/spool/mail/spamiam >> /var/www/html/spamstuff/should-be-ham
sa-learn --ham --mbox /var/www/html/spamstuff/should-be-ham

/usr/local/scripts/spamiam.fetchmailrc
poll exchange.xxxx.com
proto imap
user spamiam
password xxxxxxxxx
is spamiam here

At 15 past each hour, the two other mail servers use wget to grab the
should-be files to their local /tmp and run sa-learn.

get-ham-spam
#! /bin/sh
cd /tmp
rm -f should-be-spam should-be-ham
wget -q http://xxx.xxx.com/spamstuff/should-be-spam
wget -q http://xxx.xxx.com/spamstuff/should-be-ham
sa-learn --spam --mbox should-be-spam
sa-learn --ham --mbox should-be-ham

The files are included in logrotate on the source server, so they get zero’d every Sunday
morning.

Leave a Reply

Your email address will not be published. Required fields are marked *