[Mimedefang] Re: SPAM/HAM Trap

Yizhar Hurwitz yizhar at mail.com
Sat May 19 12:41:13 EDT 2007


Hi.


> From: "Daniel Aquino" <mr.danielaquino at gmail.com>
> Subject: [Mimedefang] SPAM/HAM Trap
> I would like to add some scripting to mimedefang to create copies of
> spam/ham so I could collect a nice sized database to perform bayes
> training on...

> Would it be simpler for me to just use a mua?
Yes, unless you have a better option.

> Any way I have a feeling adding a custom function to make a mbox
> formated copy of the email in mimedefang-filter and calling it just
> after spam assassin runs would be trivial...

The general idea is OK, but:

* First and foremost, you should understand some issues related to email archiving:

The privacy of your email users - you should coordinate such actions with the company manager(s),
and I also recommend informing all the users about it.

The disk space (and other resources) required.

The amount and size of the expected mail - do you plan to scavenge the archive and remove old items?


Now to some implementation tips:

* You should read the SpamAssassin documentation about Bayes training:
http://wiki.apache.org/spamassassin/BayesInSpamAssassin
http://wiki.apache.org/spamassassin/BayesFaq
It is recommended to train Bayes on your actual, current email traffic,
not on a historic private or public corpus.
However, if you keep a copy of the email bodies, you can do some mistake-based learning
(retraining only on the messages that were misclassified).


* Don't use the mbox format.
It is better to set up a local IMAP server on the mail relay (or another) machine,
using either the newer mbx format (UW IMAP), maildir, or Cyrus IMAP.


* No special scripting is required; this simple command will do the job.
In filter_end:
add_recipient('what@ever');
Or better:
add_recipient('journal@localhost') if $JournalEnabled;

You can set up different journal addresses for spam, probable_spam, mail, etc.,
for example spam@localhost and ham@localhost.
You need to know how and where to call add_recipient in mimedefang-filter to do this.
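
To make the per-class journal idea concrete, here is a minimal sketch of how it might look in mimedefang-filter. It is only an illustration: journal_address and the $JournalEnabled switch are hypothetical names, the addresses are written with "@" where the list archive shows " at ", and the two-way spam/ham split on the SpamAssassin required score is an assumption. If your filter_end already calls spam_assassin_check, reuse its result instead of calling it a second time.

```perl
# Sketch for mimedefang-filter. add_recipient(), message_rejected() and
# spam_assassin_check() are supplied by MIMEDefang when the filter runs.

our $JournalEnabled = 1;   # hypothetical switch for turning journaling on/off

# Hypothetical helper: pick a journal mailbox from the SpamAssassin result.
sub journal_address {
    my ($hits, $req) = @_;
    return $hits >= $req ? 'spam@localhost' : 'ham@localhost';
}

sub filter_end {
    my ($entity) = @_;
    return if message_rejected();
    return unless $JournalEnabled;

    my ($hits, $req, $names, $report) = spam_assassin_check();
    return unless defined $hits;   # no score available: skip journaling

    # Deliver an extra copy to the matching journal mailbox.
    add_recipient(journal_address($hits, $req));
}
```

The journal mailboxes must exist on the machine that receives the copies, and the extra recipient sees the message exactly as it is delivered to the real recipients.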

You can also send all emails to the same journal@localhost address,
then use a delivery filter (procmail, Cyrus Sieve, etc.) to sort the messages into different folders using the X-SpamScore header.
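
The sorting decision such a delivery filter has to make can be sketched in a few lines of Perl. This is an assumption-laden illustration, not a replacement for procmail or Sieve: it assumes the X-SpamScore header value begins with a plain number, and both the folder_for_message name and the 5.0 threshold are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical threshold; match it to your SpamAssassin required score.
my $SPAM_THRESHOLD = 5.0;

# Return 'spam', 'ham', or 'unscored' for a raw message string.
# Assumes the header looks like "X-SpamScore: 7.3" (number first).
sub folder_for_message {
    my ($raw) = @_;

    # Look only at the header block (everything before the first blank line),
    # so that a quoted header inside the body is never matched.
    my ($headers) = split /\r?\n\r?\n/, $raw, 2;

    if (defined $headers
        && $headers =~ /^X-SpamScore:\s*(-?\d+(?:\.\d+)?)/mi) {
        return $1 >= $SPAM_THRESHOLD ? 'spam' : 'ham';
    }
    return 'unscored';
}
```

A wrapper script could read one message from standard input, call folder_for_message on it, and append the message to the matching maildir or IMAP folder.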

* Another option is to quarantine all the email (or whatever subset) you wish to keep,
using action_quarantine_entire_message.
But for this, you will need an interface for managing the quarantine.

* There are other options, such as keeping the email bodies in a database.
As far as I know, this is what CanIt does, at least for blocked emails.
This method requires a complete solution with an interface, a database, etc.


Any comments?

Yizhar Hurwitz
http://yizhar.mvps.org




