[Mimedefang] Cleaning up antivirus integration

Thu May 24 10:55:39 EDT 2007

On Wed, May 02, 2007 at 03:42:09PM -0400, Dave O'Neill wrote:
> I'm about to begin some cleanup work on the antivirus integration within

Good! Unfortunately this thread started during my holiday, and I kept
it in the archive for a while. Hope I'm not too late.

> 1) what AV engines do you actually use with MD?  Some of the AV
>    integration code in mimedefang.pl looks fairly stale, so if a

We currently use the ClamAV daemon, the F-Prot daemon, and Sophos via
the sophie daemon. We're looking at NOD32 and F-Secure (as a
replacement for the soon-to-be-disabled F-Prot). MIMEDefang currently
doesn't support NOD32; if we get it working, I'll send you (or the
list) a howto.

> 2) what sort of API would you like to see for the restructured code?
>    I'm currently thinking of something like:
> 
>         # In your filter
>         use vars qw( $VS );
> 
>         # In filter_initialize()
>         $VS = Email::VirusScan->new({
>            
>             # the engines to use, and their configurations
>             engines => {
>                 'ClamAV::Daemon' => {
>                     socket_name => '/var/spool/MIMEDefang/clamd.sock'
>                 },
>                 'FProtD' => {
>                     host => '127.0.0.1',
>                     port => 10200,
>                 },
>             },
> 
>             # the order to use the engines in
>             order => [ 'FProtD', 'ClamAV::Daemon' ],
>         });
> 
>         # And, later, in filter_end()
>         my $result = $VS->scan_path( "$CWD/Work" );
>         if( $result->is_virus ) {
>             my @viruses = $result->get_virus_names();
>             # ... 
>         }

Can this handle the "zip module failure" of the ClamAV daemon with a
fallback to clamscan (or would that be ClamAV::Cmdline?), or
would that be implemented from within Email::VirusScan::ClamAV::Daemon?
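
To make the question concrete, here is a minimal sketch of the fallback
idea done outside the engine classes. Everything in it is hypothetical:
scan_with_fallback, the code-ref engines, and the 'clean'/'virus'/'error'
status strings are my own stand-ins, not part of the proposed
Email::VirusScan API.

```perl
use strict;
use warnings;

# Hypothetical wrapper: try each engine in turn and fall back to the next
# one only when the current engine reports an error instead of a verdict.
# Each engine is a code ref returning a hash ref with a 'status' key.
sub scan_with_fallback {
    my ($path, @engines) = @_;
    my $last = { status => 'error', detail => 'no engines configured' };
    for my $engine (@engines) {
        # A die inside the engine is treated the same as an error result.
        $last = eval { $engine->($path) }
            || +{ status => 'error', detail => $@ || 'engine returned nothing' };
        return $last unless $last->{status} eq 'error';
    }
    return $last;    # every engine failed; the caller may want to tempfail
}

# Stand-ins for a daemon scan that chokes on an archive and a
# command-line scan that succeeds.
my $daemon  = sub { return { status => 'error', detail => 'zip module failure' } };
my $cmdline = sub { return { status => 'clean' } };

my $result = scan_with_fallback('/tmp/Work', $daemon, $cmdline);
print "status: $result->{status}\n";
```

Doing it this way keeps the daemon and command-line engines independent;
putting the fallback inside the daemon class would hide it from the
filter author, which may or may not be what you want.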

>    Email::VirusScan->scan() or ->scan_path() would iterate over all the
>    configured backend engines and invoke the equivalent method.  The
>    results of all scans would be returned as a container object that can
>    be queried for overall status (->is_virus, etc), or for the
>    information about individual scan results ( so that you can see which
>    scanner got a hit, the name of the infected file, etc).

As others already suggested, bailing out early on a hit would be nice,
but the option to continue would also be needed.
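
A rough sketch of what I mean by having both options. The stop_after
option, the scan_all function and the result layout are assumptions of
mine for illustration, not anything from the proposal.

```perl
use strict;
use warnings;

# Hypothetical scan loop: run the engines in the given order and stop
# once stop_after engines have reported a virus. A stop_after of 0 (or
# leaving it out) means "run every engine".
sub scan_all {
    my ($path, $engines, %opt) = @_;
    my $stop_after = $opt{stop_after} || 0;
    my @results;
    my $hits = 0;
    for my $name (@{ $opt{order} }) {
        my $res = $engines->{$name}->($path);
        push @results, { engine => $name, %$res };
        $hits++ if $res->{status} eq 'virus';
        last if $stop_after && $hits >= $stop_after;
    }
    return \@results;
}

# Stub engines standing in for real scanners.
my %engines = (
    A => sub { return { status => 'virus', name => 'Example.Virus' } },
    B => sub { return { status => 'clean' } },
    C => sub { return { status => 'virus', name => 'Example.Virus' } },
);

my $first_hit = scan_all('/tmp/Work', \%engines, order => [qw(A B C)], stop_after => 1);
my $everyone  = scan_all('/tmp/Work', \%engines, order => [qw(A B C)]);
printf "stop_after=1 ran %d engine(s), unlimited ran %d\n",
    scalar @$first_hit, scalar @$everyone;
```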

I have already rewritten the virus scanning logic in our current filter.
The current setup is slightly complicated...

On the outgoing servers, we stop at the first virus hit and give a 5xx
error.

On the incoming servers, it is far more tricky. We currently run all
scanners, although we might decide to stop after 2 hits in the future.
By running all scanners we can compare how effective they are. (FYI:
ClamAV usually wins this, even without counting the phishes.)

Tempfails are usually ignored, unless all scanners tempfail, in which
case the mail is tempfailed.

Obviously if no scanner finds anything, the mail is accepted.

If some scanner found something, and the list of recipients contains
both people who want virus scanning, and those who chose NOT to use
virus scanning, then the recipients with virus scanning are removed
and the mail gets a header: "X-Virus: found virus $VirusName".
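
The recipient split can be sketched as below. The preference hash and
the split_recipients function are hypothetical; in a real MIMEDefang
filter the dropped recipients would be removed with delete_recipient()
and the header added with action_add_header().

```perl
use strict;
use warnings;

# $wants_scanning maps recipient => true/false (a hypothetical
# preference lookup). Returns the recipients to drop from the infected
# mail and the recipients who still receive it.
sub split_recipients {
    my ($wants_scanning, @recipients) = @_;
    my (@drop, @keep);
    for my $rcpt (@recipients) {
        if ($wants_scanning->{$rcpt}) { push @drop, $rcpt }
        else                          { push @keep, $rcpt }
    }
    return (\@drop, \@keep);
}

my %prefs = ('<a@example.com>' => 1, '<b@example.com>' => 0);
my ($drop, $keep) = split_recipients(\%prefs, sort keys %prefs);

# Only a mixed recipient list takes this path.
if (@$drop && @$keep) {
    print "drop: @$drop\n";
    print "keep: @$keep (add the X-Virus header)\n";
}
```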

If every recipient wants virus scanning, and the virus found has
a name on the list of "does not fake return address" viruses,
then the mail is rejected (action_bounce, 5xx error code). This
list of virus names currently contains EICAR, jokes, hoaxes, and
office macro viruses.

If only one scanner found a virus, then we tempfail the message. This
gives us a chance to detect false positives, which are not as uncommon
as you'd hope. If any other scanner so much as makes a peep (for
example, some scanners have an option to say "likely a new type of
virus" or "contains suspicious code" or some such), then we don't
tempfail.

If more than one scanner detects a virus (or one scanner has a
definitive positive and another scanner thinks it's suspicious),
then we discard the mail.
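
Put together, the tempfail/accept/discard part of the incoming policy
looks roughly like this. It is a simplified sketch: the status strings
and incoming_action are made up for illustration, and the recipient
split and the bounce list are left out. The last branch (only
"suspicious" verdicts, no definitive hit) is not something I described
above; accepting there is an assumption.

```perl
use strict;
use warnings;

# Each result is assumed to look like { engine => ..., status => ... },
# with status one of 'clean', 'virus', 'suspicious' or 'tempfail'.
sub incoming_action {
    my (@results) = @_;
    my %count = (clean => 0, virus => 0, suspicious => 0, tempfail => 0);
    $count{ $_->{status} }++ for @results;

    # Tempfails only matter when every scanner tempfailed.
    return 'tempfail' if @results && $count{tempfail} == @results;

    # Nobody found anything: accept.
    return 'accept' if !$count{virus} && !$count{suspicious};

    # Two definitive hits, or one definitive hit plus a suspicion: discard.
    return 'discard'
        if $count{virus} >= 1 && $count{virus} + $count{suspicious} >= 2;

    # A lone hit with no second opinion: tempfail and look at it later.
    return 'tempfail' if $count{virus} == 1;

    # Only "suspicious" verdicts: assumed to be accepted.
    return 'accept';
}

my @results = (
    { engine => 'clamd',  status => 'virus' },
    { engine => 'fprotd', status => 'suspicious' },
    { engine => 'sophie', status => 'tempfail' },
);
print incoming_action(@results), "\n";
```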

Oh, and finally, if ClamAV detects phishing mail, we disregard that,
treat it like a clean email as far as ClamAV goes, and add some magic
to increase the SpamAssassin score.
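
The phishing special case could look something like this. The /phish/i
pattern, the 5-point bonus, the split_phishing function and the
signature name in the example are all illustrative assumptions, not our
actual values.

```perl
use strict;
use warnings;

# For ClamAV engines only: strip phishing detections from the list of
# virus names and return a spam-score bonus instead. Other engines'
# results pass through untouched.
sub split_phishing {
    my ($engine, @names) = @_;
    return (0, @names) unless $engine =~ /^clam/i;
    my @viruses = grep { !/phish/i } @names;
    my $bonus   = @names > @viruses ? 5 : 0;    # assumed bonus value
    return ($bonus, @viruses);
}

my ($bonus, @viruses) =
    split_phishing('ClamAV::Daemon', 'Phishing.Heuristics.Email.SpoofedDomain');
printf "bonus=%d viruses=%d\n", $bonus, scalar @viruses;
```

With no virus names left, the message counts as clean for ClamAV and
the bonus would be fed into the spam scoring step.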

I believe most of the above can be done by your suggested API, as long
as you provide options to abort or continue scanning in case of a
detected virus.

-- 
Jan-Pieter Cornet <johnpc at xs4all.nl>
!! Disclamer: The addressee of this email is not the intended recipient. !!
!! This is only a test of the echelon and data retention systems. Please !!
!! archive this message indefinitely to allow verification of the logs.  !!