[Mimedefang] How to parse pdf files or pass them to spamassassin
Dianne Skoll
dfs at roaringpenguin.com
Fri May 29 10:02:52 EDT 2015
On Fri, 29 May 2015 15:38:33 +0200
Benoit Panizzon <benoit.panizzon at imp.ch> wrote:
> => Extract text from PDF and pass it to spamassassin to match
> blacklisted URIs within the PDF.
There is a program called pdftotext, which on Debian systems is part
of the poppler-utils package. I'm sure it's packaged in most Linux distros.
So I'm thinking you could run the PDF through that, add a text/plain part
to INPUTMSG with MIME-tools, and pass the result to SpamAssassin. You wouldn't
actually modify the original message; you'd just temporarily add the
text/plain part. Something like this:
1) Convert PDFs to text and add the text as attachments with MIME-tools
methods.
2) Rename ./INPUTMSG to ./INPUTMSG.ORIG
3) Write out the modified message to ./INPUTMSG
4) Call SpamAssassin
5) Rename ./INPUTMSG.ORIG to ./INPUTMSG
I haven't tried this, but it seems that it should work.
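For illustration, the five steps above might look roughly like the sketch
below. It is written in Python with the standard email package rather than
as a MIMEDefang filter (which would be Perl with MIME-tools), and it has not
been run against a real MIMEDefang installation. The function names
pdf_to_text, build_scan_copy and scan_with_extra_text, the "pdf-text.txt"
attachment name, and the scan callback that stands in for the SpamAssassin
call are all assumptions made for the example. The only external command
used is pdftotext, with "-" as the output file so the text goes to stdout.

```python
import os
import subprocess
import tempfile
from email import message_from_bytes, policy
from email.message import EmailMessage


def pdf_to_text(pdf_bytes):
    """Run pdftotext (poppler-utils) on PDF bytes and return the text."""
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        tmp.flush()
        # "-" as the output file makes pdftotext write to stdout.
        result = subprocess.run(["pdftotext", tmp.name, "-"],
                                capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def build_scan_copy(raw_message, extract=pdf_to_text):
    """Step 1: return message bytes with a text/plain part added per PDF."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # Collect the text first so the message is not modified while walking it.
    texts = [extract(part.get_payload(decode=True))
             for part in msg.walk()
             if part.get_content_type() == "application/pdf"]
    for text in texts:
        # Hypothetical attachment name; any name would do.
        msg.add_attachment(text, subtype="plain", filename="pdf-text.txt")
    return msg.as_bytes()


def scan_with_extra_text(workdir, scan, extract=pdf_to_text):
    """Steps 2-5: swap in the modified INPUTMSG, scan it, restore the original.

    `scan` is a hypothetical callable that takes the path to INPUTMSG and
    returns whatever the spam check returns.
    """
    inputmsg = os.path.join(workdir, "INPUTMSG")
    backup = inputmsg + ".ORIG"
    with open(inputmsg, "rb") as fh:
        modified = build_scan_copy(fh.read(), extract)
    os.replace(inputmsg, backup)           # step 2: keep the original aside
    try:
        with open(inputmsg, "wb") as fh:   # step 3: write the modified copy
            fh.write(modified)
        return scan(inputmsg)              # step 4: run the spam check
    finally:
        os.replace(backup, inputmsg)       # step 5: always restore the original
```

The restore sits in a finally block so that the original INPUTMSG comes back
even if the scan fails, which keeps the "don't modify the original message"
property from the steps above.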
Regards,
Dianne