usefor-article-10 April 2003
2.4.2. Syntax adapted from Email and MIME
Much of the syntax of Netnews Articles is based on the corresponding
syntax defined in [RFC 2822] or in the MIME specifications [RFC 2045]
et seq., which are deemed to have been incorporated into this standard
as required. However, there are differences arising from special
requirements of Netnews and from the fact that [RFC 2822] includes
much syntax described as "obsolete" (which is excluded from this
standard, as detailed below).
NOTE: Netnews parsers historically have been much less
permissive than Email parsers, and this is reflected in the
modifications referred to, and in some further specific rules.
The following syntactic rules therefore supersede the corresponding
rules given in [RFC 2822] and [RFC 2045].
unstructured = 1*( [FWS] ( utext / encoded-word ) ) [FWS]
[The one rule might not seem much of a difference, but there are likely
to be others brought into here later as part of this reorganization. So
the existing structure is left alone for now.]
Observe, in contradistinction to [RFC 2822], that an unstructured
header MUST contain at least one non-whitespace character (see also
remarks about empty headers in 4.2.6).
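As an illustration only, the requirement that an unstructured header
contain at least one non-whitespace character can be sketched in
Python. The function name is_valid_unstructured is hypothetical, and
the sketch assumes that utext is as defined in [RFC 2822] without
obs-utext. Since an encoded-word consists solely of printable US-ASCII
characters, it is matched here character by character as utext; the
internal structure of encoded-words is not checked.

```python
import re

# Assumption: utext as in RFC 2822 without obs-utext, i.e.
# NO-WS-CTL (%d1-8, %d11, %d12, %d14-31, %d127) or %d33-126.
_UTEXT = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x7f]"
_WSP = r"[ \t]"

# Unfolded form of 1*( [FWS] utext ) [FWS]: each utext character may be
# preceded by white space, and white space may trail the last one.
_UNSTRUCTURED = re.compile(rf"(?:{_WSP}*{_UTEXT})+{_WSP}*")


def is_valid_unstructured(body: str) -> bool:
    """Return True if body matches the unstructured rule sketched above."""
    # Unfold: remove any CRLF that is immediately followed by SP or HTAB.
    unfolded = re.sub(r"\r\n(?=[ \t])", "", body)
    return _UNSTRUCTURED.fullmatch(unfolded) is not None
```

Under these assumptions, an empty body or one consisting only of white
space is rejected, whereas [RFC 2822] alone would permit it.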
Wherever in this standard the syntax is stated to be taken from [RFC
2822], it is to be understood as the syntax defined by [RFC 2822]
after making the above change(s), but NOT including any syntax
defined in section 4 ("Obsolete syntax") of [RFC 2822]. Software
compliant with this standard MUST NOT generate any of the syntactic
forms defined in that Obsolete Syntax, although it MAY accept such
syntactic forms. Certain syntax from the MIME specifications [RFC
2045] et seq. is also considered a part of this standard (see 6.21).
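The asymmetry between generating and accepting obsolete syntax can be
sketched as follows. The example uses one obsolete form from section 4
of [RFC 2822], namely white space between a header name and the colon.
The function names parse_header_line and format_header_line and the
accept_obsolete flag are hypothetical and are not defined by this
standard; the sketch handles a single unfolded header line only.

```python
import re

# field-name characters per RFC 2822: %d33-57 / %d59-126 (no SP, no colon).
_NAME = r"[\x21-\x39\x3b-\x7e]+"

# Strict form: the colon follows the header name immediately.
_STRICT = re.compile(rf"({_NAME}):[ \t]*(.*)", re.DOTALL)
# Lenient form: also tolerates the obsolete white space before the colon.
_LENIENT = re.compile(rf"({_NAME})[ \t]*:[ \t]*(.*)", re.DOTALL)


def parse_header_line(line: str, accept_obsolete: bool = True):
    """Split one unfolded header line into (name, body)."""
    pattern = _LENIENT if accept_obsolete else _STRICT
    match = pattern.fullmatch(line)
    if match is None:
        raise ValueError(f"not a valid header line: {line!r}")
    return match.group(1), match.group(2).strip()


def format_header_line(name: str, body: str) -> str:
    """Emit a header line; the obsolete form is never produced."""
    return f"{name}: {body}"
```

A reader built this way MAY accept "Subject : hello", while anything it
writes out again takes the form "Subject: hello".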
#Diff to first older
--- ../usefor-article-09/Syntax_adapted_from_Email_and_MIME.out February 2003
+++ ../usefor-article-10/Syntax_adapted_from_Email_and_MIME.out April 2003
@@ -3,76 +3,22 @@
Much of the syntax of Netnews Articles is based on the corresponding
syntax defined in [RFC 2822] or in the MIME specifications [RFC 2045]
et seq, which are deemed to have been incorporated into this standard
- as required. However, there are some important differences arising
- from the fact that [RFC 2822] does not recognize anything beyond US-
- ASCII characters, that it does not recognize the MIME headers [RFC
- 2045], and that it includes much syntax described as "obsolete"
- (which is excluded from this standard, as detailed below).
+ as required. However, there are some differences arising from some
+ special requirements of Netnews and the fact that [RFC 2822] includes
+ much syntax described as "obsolete" (which is excluded from this
+ standard, as detailed below).
NOTE: Netnews parsers historically have been much less
permissive than Email parsers, and this is reflected in the
modifications referred to, and in some further specific rules.
- The following syntactic rules therefore supersede the corresponding
- rules given in [RFC 2822] and [RFC 2045], thus allowing UTF-8
- characters [RFC 2279] to appear in certain contexts (the five rules
- beginning with "strict-" reflect the corresponding original rules
- from [RFC 2822]).
- UTF8-2 = %xC2-DF UTF8-tail
- UTF8-3 = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2(UTF8-tail) /
- %xED %x80-9F UTF8-tail / %xEE-EF 2(UTF8-tail)
- UTF8-4 = %xF0 %x90-BF 2(UTF8-tail) / %xF1-F7 3(UTF8-tail)
- UTF8-5 = %xF8 %x88-BF 3(UTF8-tail) / %xF9-FB 4(UTF8-tail)
- UTF8-6 = %xFC %x84-BF 4(UTF8-tail) / %xFD 5(UTF8-tail)
- UTF8-tail = %x80-BF
- UTF8-xtra-char = UTF8-2 / UTF8-3 / UTF8-4 / UTF8-5 / UTF8-6
- text = %d1-9 / ; all UTF-8 characters except
- %d11-12 / ; US-ASCII NUL, CR and LF
- %d14-127 /
- UTF8-xtra-char
- ctext = NO-WS-CTL / ; all of <text> except
- %d33-39 / ; SP, HTAB, "(", ")"
- %d42-91 / ; "\" and DEL
- %d93-126 /
- UTF8-xtra-char
- qtext = NO-WS-CTL / ; all of <text> except
- %d33 / ; SP, HTAB, "\" DQUOTE
- %d35-91 / ; and DEL
- %d93-126 /
- UTF8-xtra-char
- utext = NO-WS-CTL / ; Non white space controls
- %d33-126 / ; The rest of UTF-8
- UTF8-xtra-char
- strict-text = %d1-9 / ; text restricted to
- %d11-12 / ; US-ASCII
- %d14-127
- strict-qtext = NO-WS-CTL / ; qtext restricted to
- %d33 / ; US-ASCII
- %d35-91 /
- %d93-126
- strict-quoted-pair
- = "\" strict-text
- strict-qcontent = strict-qtext / strict-quoted-pair
- strict-quoted-string
- = [CFWS] DQUOTE
- *( [FWS] strict-qcontent ) [FWS]
- DQUOTE [CFWS]
- unstructured = 1*( [FWS] utext ) [FWS]
+ The following syntactic rules therefore supersede the corresponding
+ rules given in [RFC 2822] and [RFC 2045].
- The syntax for UTF8-xtra-char excludes those redundant sequences of
- octets which cannot occur in UTF-8, as defined by [RFC 2279], either
- because they would not be the shortest possible encodings of some UCS
- character [ISO/IEC 10646], or they would represent one of the
- characters D800 through DFFF, disallowed in UCS because of their
- surrogate use in the UTF-16 encoding. These sequences MUST NOT be
- generated by posting agents. Where they occur inadvertently, they
- SHOULD be passed on untouched by other agents, but attempts to
- interpret them as malformed UTF-8 MUST NOT be made. However, if there
- is reason to suppose they are representations of some other character
- set they MAY, as suggested in section 4.4.1, be interpreted as such.
- The syntax also includes, for completeness, the cases UTF8-5 and
- UTF8-6 which cannot, in fact, arise in [UNICODE 3.2] (though they
- might conceivably arise in some future extension).
+ unstructured = 1*( [FWS] ( utext / encoded-word ) ) [FWS]
+[The one rule might not seem much of a difference, but there are likely
+to be others brought into here later as part of this reorganization. So
+the existing structure is left alone for now.]
Observe, in contradistinction to [RFC 2822], that an unstructured
header MUST contain at least one non-whitespace character (see also
@@ -80,10 +26,10 @@
Wherever in this standard the syntax is stated to be taken from [RFC
2822], it is to be understood as the syntax defined by [RFC 2822]
- after making the above changes, but NOT including any syntax defined
- in section 4 ("Obsolete syntax") of [RFC 2822]. Software compliant
- with this standard MUST NOT generate any of the syntactic forms
- defined in that Obsolete Syntax, although it MAY accept such
+ after making the above change(s), but NOT including any syntax
+ defined in section 4 ("Obsolete syntax") of [RFC 2822]. Software
+ compliant with this standard MUST NOT generate any of the syntactic
+ forms defined in that Obsolete Syntax, although it MAY accept such
syntactic forms. Certain syntax from the MIME specifications [RFC
2045] et seq is also considered a part of this standard (see 6.21).