rfc2822 April 2001
4. Obsolete Syntax
Earlier versions of this standard allowed for different (usually more
liberal) syntax than is allowed in this version. Also, there have
been syntactic elements used in messages on the Internet whose
interpretation has never been documented. Though some of these
syntactic forms MUST NOT be generated according to the grammar in
section 3, they MUST be accepted and parsed by a conformant receiver.
This section documents many of these syntactic elements. Taking the
grammar in section 3 and adding the definitions presented in this
section will result in the grammar to use for interpretation of
messages.
Note: This section identifies syntactic forms that any implementation
MUST reasonably interpret. However, there are certainly Internet
messages which do not conform to even the additional syntax given in
this section. The fact that a particular form does not appear in any
section of this document is not justification for computer programs
to crash or for malformed data to be irretrievably lost by any
implementation. To repeat an example, though this document requires
lines in messages to be no longer than 998 characters, silently
discarding the 999th and subsequent characters in a line without
warning would still be bad behavior for an implementation. It is up
to the implementation to deal with messages robustly.
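As an illustration of the robustness point above, the following is a
minimal sketch of a receiver that reports over-long lines instead of
truncating them. The function name check_line_lengths and the warning
format are hypothetical; only the 998-character limit comes from the
text.

```python
MAX_LINE_LENGTH = 998  # limit stated in the text, not counting the CRLF


def check_line_lengths(raw: bytes):
    """Split a message on CRLF and report over-long lines.

    Returns (lines, warnings). Over-long lines are kept whole; the
    caller decides what to do with the warnings.
    """
    lines = raw.split(b"\r\n")
    warnings = []
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_LENGTH:
            warnings.append(
                f"line {number}: {len(line)} characters exceeds {MAX_LINE_LENGTH}"
            )
    return lines, warnings
```

Joining the returned lines with CRLF reproduces the input exactly, so
no data is lost even when a warning is raised.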
One important difference between the obsolete (interpreting) and the
current (generating) syntax is that in structured header field bodies
(i.e., between the colon and the CRLF of any structured header
field), white space characters, including folding white space, and
comments can be freely inserted between any syntactic tokens. This
allows many complex forms that have proven difficult for some
implementations to parse.
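To illustrate what accepting white space and comments between any
tokens can involve, here is a simplified sketch of a tokenizer for a
structured header field body. It is not the full grammar: the name
tokenize_structured is hypothetical, domain literals in square
brackets are not treated specially, and the set of special characters
is an assumption based on the usual definition of "specials".

```python
SPECIALS = set('()<>[]:;@\\,."')  # assumed set of special characters
WSP = " \t\r\n"                   # white space, including folded line breaks


def tokenize_structured(body: str) -> list:
    """Return the tokens of a structured field body, dropping white
    space and (possibly nested) comments found between them."""
    tokens = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in WSP:
            i += 1
        elif ch == "(":
            # Skip a comment, honoring nesting and backslash escapes.
            depth = 0
            while i < n:
                if body[i] == "\\":
                    i += 1  # the escaped character is skipped below
                elif body[i] == "(":
                    depth += 1
                elif body[i] == ")":
                    depth -= 1
                i += 1
                if depth == 0:
                    break
        elif ch == '"':
            # Keep a quoted string as one token, contents untouched.
            start = i
            i += 1
            while i < n and body[i] != '"':
                if body[i] == "\\":
                    i += 1
                i += 1
            i += 1  # consume the closing quote
            tokens.append(body[start:i])
        elif ch in SPECIALS:
            tokens.append(ch)
            i += 1
        else:
            # Everything else up to white space or a special is one atom.
            start = i
            while i < n and body[i] not in WSP and body[i] not in SPECIALS:
                i += 1
            tokens.append(body[start:i])
    return tokens
```

For example, the body "jdoe (work) @ example (a (nested) one) ."
followed by a folded line containing "com" yields the tokens jdoe, @,
example, ".", and com.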
Another key difference between the obsolete and the current syntax is
that the rule in section 3.2.3 regarding lines composed entirely of
white space in comments and folding white space does not apply. See
the discussion of folding white space in section 4.2 below.
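A receiver therefore has to cope with folded header fields in which a
continuation line holds nothing but white space. The sketch below
assumes the usual definition of unfolding (removing any CRLF that is
immediately followed by a space or tab); the function names are
hypothetical.

```python
import re

# A CRLF immediately followed by a space or tab marks a fold.
FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove fold points, keeping the white space that followed them."""
    return FOLD.sub("", field)


def has_whitespace_only_line(field: str) -> bool:
    """Report whether any continuation line consists only of white space."""
    continuation_lines = field.split("\r\n")[1:]
    return any(line != "" and line.strip(" \t") == "" for line in continuation_lines)
```

Unfolding treats a white-space-only continuation line like any other
fold, so such input can be interpreted without special handling.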
Finally, certain characters that were formerly allowed in messages
appear in this section. The NUL character (ASCII value 0) was once
allowed, but is no longer permitted, for compatibility reasons. CR and LF were
allowed to appear in messages other than as CRLF; this use is also
shown here.
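As a rough illustration, a receiver might locate these characters
before deciding how to interpret a message. The following sketch works
on raw bytes; the function name and the shape of the returned report
are hypothetical.

```python
import re

NUL = re.compile(rb"\x00")
BARE_CR = re.compile(rb"\r(?!\n)")   # CR not followed by LF
BARE_LF = re.compile(rb"(?<!\r)\n")  # LF not preceded by CR


def find_obsolete_characters(raw: bytes) -> dict:
    """Return the byte offsets of NUL, bare CR, and bare LF in a message."""
    return {
        "NUL": [m.start() for m in NUL.finditer(raw)],
        "bare CR": [m.start() for m in BARE_CR.finditer(raw)],
        "bare LF": [m.start() for m in BARE_LF.finditer(raw)],
    }
```

The function only reports offsets; whether to accept, repair, or flag
the message is left to the caller.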
Other differences in syntax and semantics are noted in the following
sections.