[Mimedefang] Illegal Chars in Email Addresses

Chris Myers chris at by-design.net
Tue Jan 20 14:21:50 EST 2004


----- Original Message ----- 
From: "David F. Skoll" <dfs at roaringpenguin.com>
To: <mimedefang at lists.roaringpenguin.com>
Sent: Tuesday, January 20, 2004 11:18 AM
Subject: Re: [Mimedefang] Illegal Chars in Email Addresses


> On Tue, 20 Jan 2004, Ben Kamen wrote:
>
> > Hey, what's the quicklist of illegal chars for email addresses?
>
> Be careful...  Upon reading RFC 2822, you'll see there are virtually
> no illegal characters in email addresses.
>
> The following address, for example, is legal according to the RFC:
>
> "&@^#*&^#@&^!@@@     !-_-+  + +@^$%"@roaringpenguin.com

There's actually a bit of disagreement between RFC2821 (SMTP) and RFC2822
(Internet Message Format).  RFC2821 is a bit stricter, and says in section
4.1.2:

   Systems MUST NOT define mailboxes in such a way as to require the use
   in SMTP of non-ASCII characters (octets with the high order bit set
   to one) or ASCII "control characters" (decimal value 0-31 and 127).
   These characters MUST NOT be used in MAIL or RCPT commands or other
   commands that require mailbox names.

So, in Perl syntax, an acceptable sender must satisfy:

    $Sender !~ /[\x00-\x1f\x7f-\xff]/
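
For anyone who wants to experiment with the same test outside Perl, here is a
rough Python sketch.  The name has_forbidden_octets is made up for this
example, and it assumes the address is available as raw bytes exactly as it
arrived in the MAIL or RCPT command.

```python
import re

# Octets that RFC 2821 section 4.1.2 says MUST NOT appear in mailbox names:
# ASCII control characters (0-31 and 127) and anything with the high bit set.
_FORBIDDEN = re.compile(rb"[\x00-\x1f\x7f-\xff]")


def has_forbidden_octets(address: bytes) -> bool:
    """Return True if the raw address contains a control or non-ASCII octet.

    Hypothetical helper for illustration; not part of MIMEDefang.
    """
    return _FORBIDDEN.search(address) is not None
```

Working on bytes rather than decoded text keeps the check tied to the octet
values the RFC talks about, independent of any character-set guessing.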

The same section also says:

   To promote interoperability and consistent with long-standing
   guidance about conservative use of the DNS in naming and applications
   (e.g., see section 2.3.1 of the base DNS document, RFC1035 [22]),
   characters outside the set of alphas, digits, and hyphen MUST NOT
   appear in domain name labels for SMTP clients or servers.  In
   particular, the underscore character is not permitted.  SMTP servers
   that receive a command in which invalid character codes have been
   employed, and for which there are no other reasons for rejection,
   MUST reject that command with a 501 response.
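
The label rule quoted above could be checked roughly as follows.  This is only
a sketch: domain_labels_ok is a hypothetical name, and it deliberately ignores
address literals such as [192.0.2.1], label length limits, and the placement
of hyphens within a label, none of which the quoted paragraph covers.

```python
import re

# One label: letters, digits, and hyphens only, per the quoted paragraph.
_LABEL = re.compile(r"[A-Za-z0-9-]+")


def domain_labels_ok(domain: str) -> bool:
    """Return True if every dot-separated label uses only alphas, digits, hyphen.

    Hypothetical helper for illustration. Empty labels (as in "a..b") fail
    because the pattern requires at least one character.
    """
    return all(_LABEL.fullmatch(label) for label in domain.split("."))
```

A caller acting as an SMTP server would presumably answer a failed check with
the 501 response the RFC text calls for.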

As David points out, it gets really dicey to do any form of enforcement
after that.

In particular, virtually anything inside double-quote (") characters is fair
game, and a backslash (\) escapes the character that follows it, preventing the
"normal" interpretation (in other words, \@ and \" are treated much differently
than @ and ").  Interestingly, RFC2821 also states that ALL quoted strings in the
local-part are identical for comparison purposes, so "foo"@erewhon.com and
"bar"@erewhon.com are "the same"!

I consider unquoted spaces to be fair game for blocking.  The most common
forms of trash addresses I see are "user at domain.com<WHITESPACE>" and
"user at domain.com\r" (\r = carriage return).
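
One way to spot unquoted whitespace while still respecting the quoting and
backslash rules described earlier is a small scanner like the one below.  It
is a sketch under assumptions: has_unquoted_whitespace is a hypothetical name,
and the scanner tracks only double quotes and backslash escapes rather than
implementing the full RFC grammar.

```python
def has_unquoted_whitespace(address: str) -> bool:
    """Return True if whitespace appears outside a double-quoted string.

    Hypothetical helper for illustration. A character preceded by a
    backslash is treated as escaped and skipped, inside or outside quotes.
    """
    in_quotes = False
    escaped = False
    for ch in address:
        if escaped:
            # This character was escaped by the previous backslash.
            escaped = False
            continue
        if ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch.isspace() and not in_quotes:
            return True
    return False
```

Both trash forms mentioned above (a trailing space and a trailing \r) are
caught, while whitespace inside a quoted local-part is left alone.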

See RFC2821 section 4.1.2 for more details.

Chris Myers
Networks By Design



