[Mimedefang] strip invalid IMG tag

David F. Skoll dfs at roaringpenguin.com
Mon Nov 29 09:29:23 EST 2004


On Mon, 29 Nov 2004, Marco Supino wrote:

> any html IMG tag which is NOT JPG, JPEG, JPE, GIF, BMP, PNG or SWF will
> be replaced with ------, for example, the following IMG tag:

You cannot tell from the file extension (or the lack of one) whether
an IMG tag is valid.  The following tag:

> <IMG SRC="http://www.12345.com/123/6813978.41.2394">

might be perfectly valid if fetching the URL returns a proper image and
MIME type.
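
To make the point concrete, here is a minimal sketch in Python (chosen
only for illustration; it is not part of MIMEDefang).  It contrasts an
extension-only test with a test on the Content-Type the server reports.
The function names and the extension list are assumptions taken from the
quoted proposal, and fetch_content_type needs network access, so treat
it as a sketch rather than something to drop into a filter.

```python
import posixpath
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Extensions from the quoted proposal (hypothetical policy, not a standard).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".jpe", ".gif", ".bmp", ".png", ".swf"}


def has_image_extension(url):
    """Extension-only test: looks at the URL path and nothing else."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1].lower() in IMAGE_EXTENSIONS


def is_image_content_type(content_type):
    """True if a Content-Type header value names an image/* media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=..." before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("image/")


def fetch_content_type(url, timeout=5):
    """Ask the server for the Content-Type of url with a HEAD request."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")
```

The URL from the example fails has_image_extension, yet
is_image_content_type(fetch_content_type(url)) could still be true if
the server answers with an image type.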

You might want to take a look at Anomy::HTMLCleaner from CPAN, but it's
very buggy and probably needs a lot of work.  Alternatively, look at
writing something with HTML::Parser.
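
As a rough idea of what the parser-based approach looks like, here is a
sketch that uses Python's standard html.parser module in place of the
Perl HTML::Parser module.  It is an illustration under assumptions, not
a MIMEDefang filter: ImgRewriter, rewrite_img_tags and the keep_img
callback are hypothetical names, and the policy that decides which tags
to keep is left to the caller.  Rejected IMG tags become "------" as in
the quoted proposal; everything else is passed through.

```python
from html.parser import HTMLParser


class ImgRewriter(HTMLParser):
    """Copy HTML through, replacing IMG tags rejected by keep_img."""

    def __init__(self, keep_img):
        # convert_charrefs=False so entities reach the handlers unchanged.
        super().__init__(convert_charrefs=False)
        self.keep_img = keep_img  # callable: src (str or None) -> bool
        self.out = []

    def _emit_start(self, tag, attrs):
        # Tag and attribute names arrive lowercased from the parser.
        if tag == "img" and not self.keep_img(dict(attrs).get("src")):
            self.out.append("------")
        else:
            # Keep the tag exactly as it appeared in the input.
            self.out.append(self.get_starttag_text())

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img ... />; no separate end tag.
        self._emit_start(tag, attrs)

    def handle_endtag(self, tag):
        # The original case of the end tag is not preserved.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def rewrite_img_tags(html, keep_img):
    """Return html with every IMG tag that keep_img rejects replaced."""
    parser = ImgRewriter(keep_img)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

A caller would pass its own policy, for example
rewrite_img_tags(body, lambda src: bool(src) and src.startswith("cid:")),
where the cid: rule is only an example.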

Regards,

David.


