[Mimedefang] How to parse pdf files or pass them to spamassassin
    Dianne Skoll 
    dfs at roaringpenguin.com
       
    Fri May 29 10:02:52 EDT 2015
    
    
  
On Fri, 29 May 2015 15:38:33 +0200
Benoit Panizzon <benoit.panizzon at imp.ch> wrote:
> => Extract text from PDF and pass it to spamassassin to match
> blacklisted URI's within the PDF.
There is a program called pdftotext, which on Debian systems is part
of the poppler-utils package.  I'm sure it's packaged in most Linux distros.
So I'm thinking you could run the PDF through that, add a text/plain part
to INPUTMSG with MIME-tools, and pass that to SpamAssassin.  You wouldn't
actually modify the original message; you'd just temporarily add the
text/plain part.  Something like this:
1) Convert the PDFs to text and add the text as text/plain attachments
   using MIME-tools methods.
2) Rename ./INPUTMSG to ./INPUTMSG.ORIG
3) Write out the modified message to ./INPUTMSG
4) Call SpamAssassin
5) Rename ./INPUTMSG.ORIG to ./INPUTMSG
I haven't tried this, but it seems that it should work.
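The five steps above can be sketched as a standalone script. This is only an illustration in Python using the standard email package, not a MIMEDefang filter (a real filter would be Perl and would use MIME-tools), and it is as untested against a live setup as the outline itself. The names scan_with_pdf_text, pdf_to_text, and the scan callback are hypothetical; the sketch assumes pdftotext is on the PATH and that the working directory contains INPUTMSG.

```python
import os
import subprocess
import tempfile
from email import message_from_binary_file, policy


def pdf_to_text(pdf_bytes):
    """Run pdftotext (poppler-utils) on the PDF bytes; '-' sends text to stdout."""
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        tmp.flush()
        out = subprocess.run(["pdftotext", tmp.name, "-"],
                             stdout=subprocess.PIPE, check=True)
    return out.stdout.decode("utf-8", errors="replace")


def scan_with_pdf_text(workdir, scan, extract=pdf_to_text):
    """Temporarily add PDF text to INPUTMSG, call scan(path), then restore.

    scan is a hypothetical callback standing in for the SpamAssassin call.
    """
    inputmsg = os.path.join(workdir, "INPUTMSG")
    backup = inputmsg + ".ORIG"

    with open(inputmsg, "rb") as fh:
        msg = message_from_binary_file(fh, policy=policy.default)

    # Step 1: convert every PDF part to text.
    texts = [extract(part.get_payload(decode=True))
             for part in msg.walk()
             if part.get_content_type() == "application/pdf"]
    if not texts:
        return scan(inputmsg)  # nothing to add; scan the message as it is

    # Step 1 (continued): attach each extracted text as a text/plain part.
    for text in texts:
        msg.add_attachment(text, filename="pdf-text.txt")

    # Step 2: keep the original message out of the way.
    os.rename(inputmsg, backup)
    try:
        # Step 3: write the modified copy where the scanner expects it.
        with open(inputmsg, "wb") as fh:
            fh.write(msg.as_bytes())
        # Step 4: call the scanner.
        return scan(inputmsg)
    finally:
        # Step 5: always put the original message back.
        os.replace(backup, inputmsg)
```

The try/finally is there so that the original INPUTMSG is restored even if the scan raises, which matters because the rest of the filter still expects the unmodified message.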
Regards,
Dianne
    
    