[Mimedefang] Re: SPAM/HAM Trap

Daniel Aquino mr.danielaquino at gmail.com
Mon May 21 09:29:21 EDT 2007


> > Would it be simpler for me to just use a mua?
> Yes, unless you have a better option.

I think I meant to say MDA, not MUA...

> * First and foremost, you should understand some issues related to email archiving:
>
> The privacy of your email clients - you should coordinate such actions with the company manager(s),
> and I also recommend informing all the users about it.

Well, I'm not going to read the emails... I just want to collect some
detected spam/ham to train Bayes. I do have automatic training
enabled, but doesn't Bayes need a kick start?

> The disk space (and other resources) required.

I have about 30 GB, and the system only takes about 3 of that...

> The amount and size of expected mails, do you plan to scavenge it and
> remove old items?

Yes, it would only be temporary... I could write a cron job to train
on the spam/ham folders and then delete the messages...
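
A minimal sketch of such a cron job, assuming the trapped mail is delivered
one message per file into two folders. The paths /var/spool/trap/spam and
/var/spool/trap/ham and the SA_LEARN override are hypothetical names used for
illustration; sa-learn with --spam or --ham is the standard SpamAssassin
training command. Each message is deleted only after sa-learn reports
success, so a failed run leaves the mail in place for the next attempt.

```shell
#!/bin/sh
# Sketch of a Bayes training job for cron.
# SPAM_DIR, HAM_DIR and SA_LEARN are hypothetical defaults; override as needed.
SPAM_DIR="${SPAM_DIR:-/var/spool/trap/spam}"
HAM_DIR="${HAM_DIR:-/var/spool/trap/ham}"
SA_LEARN="${SA_LEARN:-sa-learn}"

# train_folder CLASS DIR
# Feed every file under DIR to sa-learn as CLASS (spam or ham) and
# delete each message only after sa-learn exited successfully.
train_folder() {
    class=$1
    dir=$2
    # Skip quietly when the folder does not exist yet.
    [ -d "$dir" ] || return 0
    find "$dir" -type f | while IFS= read -r msg; do
        # One sa-learn call per message is slow but keeps failures isolated.
        if "$SA_LEARN" "--$class" "$msg" >/dev/null; then
            rm -f -- "$msg"
        fi
    done
}

train_folder spam "$SPAM_DIR"
train_folder ham "$HAM_DIR"
```

The script would have to run as the user whose Bayes database the filter
actually reads, otherwise the training ends up in a different database.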

> It is recommended to train bayes against the actual and current email traffic,
> not against historic private or public corpus.

So does automatic learning work even before Bayes has 200 emails?

How can I verify that the Bayes training is taking place?
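
One way to check, sketched here under the assumption that sa-learn is run as
the user who owns the Bayes database: `sa-learn --dump magic` prints the
database counters, including nspam and nham, the numbers of spam and ham
messages learned so far. If those numbers grow between runs, training is
taking place. The helper name bayes_counts is hypothetical.

```shell
#!/bin/sh
# bayes_counts: read `sa-learn --dump magic` output on stdin and print
# the learned spam and ham message counts (third column of the dump).
bayes_counts() {
    awk '/non-token data: nspam$/ { print "nspam=" $3 }
         /non-token data: nham$/  { print "nham=" $3 }'
}

# Usage:
#   sa-learn --dump magic | bayes_counts
```

Comparing the two counters with the 200-message figure mentioned above shows
how far the kick start has come.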

> * No special scripting is required, this simple command will do the job:
> in filter_end:
> add_recipient('what at ever');
> Or better:
> add_recipient('journal at localhost') if $JournalEnabled;
> You can set up different journal addresses for spam, probable_spam, and mail,
> for example: spam at localhost and also ham at localhost.
> You should know how and where to use the add_recipient command in mimedefang-filter to do it.

That's a great and simple idea!!

> You can also send all emails to the same journal at localhost address,
> then use a delivery filter (procmail, Cyrus Sieve, etc.) to sort the messages into
> different folders using the X-SpamScore header.

Wouldn't the multi-user approach (separate spam and ham mailboxes) be easier?

> * Another option is to quarantine all (or whatever) email you wish to keep,
> using action_quarantine_entire_message.
> But for this, you will need to have an interface for managing the quarantine.

But then I'd have to sort through each message and figure out what
it is, and wouldn't it get mixed up with possible virus quarantines?

> * There are other options, such as keeping the email bodies in a database.
> This is, as far as I know, what CanIt does, at least for the blocked emails.
> This method requires a complete solution with interface, db, etc...

Yes, but what does this solve? All I want to do is trap ham/spam and
then train Bayes on it...

> Any comments?

How was that? :]


