[Mimedefang] IP Reputation data collection (announcement, Internet draft)
David F. Skoll
dfs at roaringpenguin.com
Fri Apr 30 14:58:27 EDT 2010
[Kevin's original message did not appear on the list because it was in
HTML and the list censor disapproved...]
Kevin A. McGrail wrote:
> I think adding more events are needed to be considered for the initial
> draft. There is also the potential need for additional information on a
> report that should be considered. So not all of these are EVENTS.
Right. The important goals of this design are:
Goal 1) Bandwidth efficiency. We receive a LOT of
reports... something like 350 IP events per second, and our goal is to
scale up to support at least a few thousand per second. Every byte counts.
Goal 2) Simplicity. It's easy to know what to do with 16 gigabytes of
data when each data point is "<something> happened." It's a lot
harder to know what to do with several hundred million
filenames... how do you distill useful information?
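As a rough illustration of why Goal 1 matters, here is a back-of-the-envelope sketch in Python. The 5-byte record (one byte of event type plus a 4-byte IPv4 address) is purely an assumption for the arithmetic, not the wire format from the draft; `pack_event`, `bytes_per_day` and the event code `1` are hypothetical names.

```python
import ipaddress
import struct

SECONDS_PER_DAY = 86400

def pack_event(event_type: int, ip: str) -> bytes:
    """Pack one event as 1 byte of type + 4 bytes of IPv4 address.

    This layout is an assumption for illustration only.
    """
    addr = ipaddress.IPv4Address(ip)
    # "!" = network byte order, "B" = unsigned byte, "4s" = 4 raw bytes
    return struct.pack("!B4s", event_type, addr.packed)

def bytes_per_day(events_per_second: int, record_size: int) -> int:
    """Payload volume per day, ignoring any packet or transport overhead."""
    return events_per_second * record_size * SECONDS_PER_DAY

record = pack_event(1, "192.0.2.1")
print(len(record))                       # 5
print(bytes_per_day(350, len(record)))   # 151200000
print(bytes_per_day(350, 1))             # 30240000: cost of ONE extra byte per event
```

Under these assumptions, each extra byte per event costs about 30 MB a day at 350 events per second, and proportionally more at a few thousand per second.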
Dave O'Neill addressed your questions... I'd like to add my perspective:
> 1 - including the product / version used for auto-ham/spam and the
> automated score & threshold of a spam
We don't want that in every packet (see Bandwidth Efficiency) and it's
not clear what we'd do with the information anyway (Simplicity).
> 2 - including virii/malware as a note
Do you mean the virus name? What would we do with the information?
> 3 - dangerous attachments and a filename
Same comment as (2)
> 4 - dangerous content
What is "dangerous content"? What's dangerous to a Windoze user might
not be dangerous to me. :)
I guess I should add Goal 3, which is to handle (reasonably) objective
events only. Yes, spam vs. non-spam is subjective, but the other
events are all very clear-cut: A recipient is either valid or is not.
A machine either passed greylisting or it did not.
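To show what "<something> happened" data buys you, here is a minimal sketch of tallying objective events per IP address. The event names and the `invalid_recipient_ratio` metric are hypothetical, chosen only to mirror the valid/invalid recipient example above; they are not taken from the draft.

```python
from collections import Counter, defaultdict

def tally(events):
    """Count events per IP. `events` is an iterable of (ip, event_name) pairs."""
    counts = defaultdict(Counter)
    for ip, event in events:
        counts[ip][event] += 1
    return counts

def invalid_recipient_ratio(counter):
    """Fraction of recipient events that were invalid (0.0 if none were seen)."""
    valid = counter["VALID_RECIPIENT"]      # hypothetical event name
    invalid = counter["INVALID_RECIPIENT"]  # hypothetical event name
    total = valid + invalid
    return invalid / total if total else 0.0

events = [
    ("192.0.2.1", "INVALID_RECIPIENT"),
    ("192.0.2.1", "INVALID_RECIPIENT"),
    ("192.0.2.1", "INVALID_RECIPIENT"),
    ("192.0.2.1", "VALID_RECIPIENT"),
    ("198.51.100.7", "VALID_RECIPIENT"),
]
counts = tally(events)
print(invalid_recipient_ratio(counts["192.0.2.1"]))     # 0.75
print(invalid_recipient_ratio(counts["198.51.100.7"]))  # 0.0
```

Because every data point is a clear-cut yes/no event, the aggregation is just counting; nothing has to be parsed or interpreted.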
> 5 - reverse DNS failures
These are not objective events. A DNS failure could be because of a
transient network problem.
> 6 - improper HELO/EHLO statements
That's a good one. We should probably add that.
> 7 - invalid MX records
Since we're collecting IP reputation data, "MX records" don't come into play.
> I liked that in #3 the REPUTATION database is not specific to
> indexing by IPv4 or IPv6.
Err... we'd better fix that. This proposal is (currently) strictly an IP
address reputation protocol.
> The system should be extensible to report
> more data such as the email address of the sender or recipient, the
> subject of the email, etc.
See Goals (1) and (2).
> In the same way, #2 Introduction, specifically talks about IP based
> lists. You might want to broaden that to keep people in a broad mindset.
Nope. It's specifically an IP reputation system. I don't want to expand
it to other kinds of reputation.
> The use of port 6568 could be expanded to stated something like unless
> the AGGREGATOR utilizes an alternate port or something. I have other
> listeners on 6568 already, for example.
*tsk* :-) IANA gave us that port. :) (but it's a SHOULD, so you have
an escape hatch.)
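On the client side, the "SHOULD" escape hatch might look like the sketch below: default to port 6568, but let the operator override it. The function name and the host `aggregator.example.org` are hypothetical, and nothing here is taken from the draft beyond the port number.

```python
DEFAULT_AGGREGATOR_PORT = 6568  # the IANA-assigned port mentioned above

def aggregator_endpoint(host, port=None):
    """Return (host, port), using the default port unless one is given."""
    if port is None:
        port = DEFAULT_AGGREGATOR_PORT
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return (host, port)

print(aggregator_endpoint("aggregator.example.org"))        # ('aggregator.example.org', 6568)
print(aggregator_endpoint("aggregator.example.org", 7000))  # ('aggregator.example.org', 7000)
```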
> 4.2 would be best organized into 4.2.0 for reserved, 4.2.1 for
> GREYLISTED, etc. so that all event types have a clear report
> restriction. Then 4.2 should be restrictions for all events like IPv4
OK. Though with only 8 event types, that seems like a bit of overkill.