[Mimedefang] black-/grey- listing using a moving average of scores per source IP

Chris Stromsoe cbs at cts.ucla.edu
Tue Sep 4 02:32:26 EDT 2007


I'm looking into doing some minimal reputation based black-/grey- listing 
using a real-time moving average of spamassassin spam scores.

My idea in rough form is that mimedefang-filter will do all of its normal 
scanning, run any whitelists or other blacklists up front, run 
spamassassin, and come up with a spam score for a message.

I'll push the source IP, a timestamp, and the score to a central database 
using a stored procedure to insert the row and return an average score for 
that IP.  The score that gets inserted will be a 50/50 blend of the new 
score and the prior insert from that IP, to flatten spikes.  I'm not set on 
50%; it's just somewhere to start.  This will happen for every message, 
whether I accept it or not.
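
To make the weighting concrete, here is a minimal sketch in Python (for 
illustration only; the real logic would live in the stored procedure).  The 
names smooth and record are made up for this example, and the sample scores 
are invented.

```python
def smooth(prior, new, weight=0.5):
    """Blend the new score with the prior stored score for the same IP.

    `weight` is the share given to the prior stored score (0.5 here,
    matching the 50% starting point). With no prior row the new score
    is stored unchanged.
    """
    if prior is None:
        return new
    return prior * weight + new * (1.0 - weight)


def record(history, new):
    """Store the smoothed score and return the average of all stored scores.

    `history` stands in for the rows stored for one source IP, oldest first.
    """
    prior = history[-1] if history else None
    history.append(smooth(prior, new))
    return sum(history) / len(history)


# One relay sends three low-scoring messages and one spike of 20.0.
history = []
averages = []
for raw_score in (2.0, 2.0, 20.0, 2.0):
    averages.append(record(history, raw_score))

print(history)   # [2.0, 2.0, 11.0, 6.5]
print(averages)  # [2.0, 2.0, 5.0, 5.375]
```

The spike of 20.0 is stored as 11.0, and it still pulls the next stored 
value up to 6.5, so a single bad message is damped but not forgotten 
immediately.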

If the average score returned for the relay is ham, I'll continue with 
other processing.  If the average score for the relay is spam, I'll 4xx if 
the incoming message is ham or 5xx if it's spam.  A separate process will 
remove stale entries from the database every hour.
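
The decision itself could look roughly like the following Python sketch.  
The threshold of 5.0, the function name decide, and the returned strings 
are assumptions for illustration; the actual filter would use whatever spam 
threshold is already configured and the usual MIMEDefang actions.

```python
SPAM_THRESHOLD = 5.0  # hypothetical cutoff; scores at or above it count as spam


def decide(relay_average, message_score, threshold=SPAM_THRESHOLD):
    """Map the relay's average score and this message's score to an action.

    Returns "continue" when the relay's average looks like ham,
    "tempfail" (4xx) when the relay looks spammy but this message is ham,
    and "reject" (5xx) when both the relay and the message look like spam.
    """
    if relay_average < threshold:
        return "continue"
    if message_score < threshold:
        return "tempfail"
    return "reject"


print(decide(2.0, 9.0))  # continue
print(decide(6.0, 1.0))  # tempfail
print(decide(6.0, 9.0))  # reject
```

Note that a spammy message from a relay with a ham average falls through to 
the normal processing, which would handle it on its own score.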

Before I start doing any testing, I was wondering if anybody else has done 
anything similar and already has numbers (both of the "it works and 
dropped mail load XX percent" and of the "X emails processed with Y 
transactions per minute on Z hardware with load under 2" variety).

Thanks.


-Chris
-------------- next part --------------


create table scores (
  addr inet not null,
  score real not null,
  timestamp bigint not null
);


---
--- generate timestamps using
--- perl -e 'use Time::HiRes; printf "%s\n", Time::HiRes::time * 100000'
---
--- select insertscore('10.11.12.13', 5.0, 118887789440771);
---

CREATE OR REPLACE FUNCTION insertscore (inet, real, bigint) RETURNS real
    AS '
DECLARE
        _tmp REAL;
BEGIN

        -- Most recent stored score for this address, if any.
        SELECT MAX(score) INTO _tmp FROM scores
        WHERE addr = $1 AND timestamp =
                (SELECT MAX(timestamp) FROM scores WHERE addr = $1);

        IF _tmp IS NULL THEN
                _tmp := $2;
        ELSE
                _tmp := _tmp * 0.5 + $2 * 0.5;
        END IF;

        INSERT INTO scores
                (addr, score, timestamp)
                VALUES
                ($1, _tmp, $3);

        SELECT AVG(score) INTO _tmp FROM scores
        WHERE addr = $1;

        RETURN _tmp;
END;
'
    LANGUAGE plpgsql;




