usefor-article-09 February 2003
4.4.2. Character Sets within Article Bodies
Within article bodies, characters are represented as octets according
to the encoding scheme implied by any Content-Transfer-Encoding- and
Content-Type-headers [RFC 2045]. In the absence of such headers,
reading agents cannot be relied upon to display correctly more than
the US-ASCII characters, though they MUST display at least those.
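As an illustration only, a reading agent might apply the rule above roughly as follows. This is a minimal sketch in Python using the standard email package; the function name decode_body is hypothetical, and the choice to show undeclared non-ASCII octets as U+FFFD is an assumption of the sketch, not a requirement of this draft.

```python
from email import message_from_bytes


def decode_body(raw_article: bytes) -> str:
    """Decode a single-part article body according to its MIME headers."""
    msg = message_from_bytes(raw_article)
    # get_payload(decode=True) undoes the Content-Transfer-Encoding
    # (base64 or quoted-printable) and returns the body octets.
    # It returns None for multipart articles, which this sketch ignores.
    octets = msg.get_payload(decode=True) or b""
    # get_content_charset() yields the charset parameter of the
    # Content-Type header, or None if the header or parameter is absent.
    charset = msg.get_content_charset()
    if charset is None:
        # No declaration: only US-ASCII can be relied upon.
        # Assumption: other octets are shown as U+FFFD, not guessed at.
        return octets.decode("ascii", errors="replace")
    try:
        return octets.decode(charset, errors="replace")
    except LookupError:
        # Charset name unknown to this agent: keep the US-ASCII subset.
        return octets.decode("ascii", errors="replace")
```

With this sketch, an article carrying "Content-Type: text/plain; charset=iso-8859-1" and a quoted-printable body is decoded as Latin 1, while an article without those headers has only its US-ASCII characters displayed faithfully.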
NOTE: The use of non-ASCII characters in the absence of an
appropriate Content-Type-header is not compliant with this
standard. Nevertheless such usage has been seen in some
hierarchies, and it would be reasonable for reading agents to
make an informed "guess" when confronted with that situation,
and in particular it would be wise at least to test whether they
were in the form of valid UTF-8 (see also the suggestion for
such a test in 4.4.1).
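The "guess" described in the note above could be sketched as follows. This is one possible approach in Python, not necessarily the test suggested in 4.4.1: it relies on Python's strict UTF-8 decoder to reject malformed sequences. The function name guess_decode and the fallback to ISO-8859-1 are assumptions made for the example.

```python
def guess_decode(octets: bytes, fallback: str = "iso-8859-1") -> tuple[str, str]:
    """Guess the charset of an undeclared body; return (text, charset)."""
    try:
        # Pure US-ASCII needs no guessing at all.
        return octets.decode("ascii"), "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding fails on any octet sequence that is not
        # well-formed UTF-8, which makes a false positive unlikely.
        return octets.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to a locally configured 8-bit charset.
        return octets.decode(fallback, errors="replace"), fallback
```

A reading agent using such a guess would do well to tell the user that the charset was inferred rather than declared.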
NOTE: It is not expected that reading agents will necessarily be
able to present characters in all possible character sets. For
example, a reading agent might be able to present only the
ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it Ought
to present undisplayable characters using some distinctive
glyph, or by exhibiting a suitable warning.
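The behaviour suggested in the note above might look like this for an agent limited to Latin 1. This is a sketch only: the function name present_latin1 and the use of the inverted question mark (U+00BF) as the distinctive glyph are assumptions, and the returned count is there so that the agent can also exhibit a warning.

```python
def present_latin1(text: str, marker: str = "\u00bf") -> tuple[str, int]:
    """Replace characters outside ISO-8859-1 with a distinctive glyph.

    Returns the presentable text and the number of replacements made.
    """
    shown = []
    replaced = 0
    for ch in text:
        # ISO-8859-1 covers exactly the code points 0 to 255.
        if ord(ch) < 256:
            shown.append(ch)
        else:
            shown.append(marker)
            replaced += 1
    return "".join(shown), replaced
```

A nonzero count tells the agent that some characters could not be presented, so it can warn the user instead of silently altering the article.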
Followup agents MUST be careful to apply appropriate encodings to the
outbound followup. A followup to an article containing non-ASCII
material is very likely to contain non-ASCII material itself.
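One way a followup agent could choose and declare an encoding for the outbound body is sketched below in Python. The function name build_followup_body is hypothetical, and preferring US-ASCII where it suffices and UTF-8 otherwise is an assumption of this example rather than a rule stated here. The Content-Transfer-Encoding is left to the library's own heuristics.

```python
from email.message import EmailMessage


def build_followup_body(text: str) -> EmailMessage:
    """Attach text to a new message with matching MIME headers."""
    try:
        text.encode("ascii")
        charset = "us-ascii"
    except UnicodeEncodeError:
        # Quoted non-ASCII material forces a wider charset.
        # Assumption: UTF-8 is acceptable for the outbound followup.
        charset = "utf-8"
    msg = EmailMessage()
    # set_content() writes Content-Type (with the charset parameter)
    # and a suitable Content-Transfer-Encoding header.
    msg.set_content(text, charset=charset)
    return msg
```

Because the charset is chosen from the text actually being sent, quoting a single non-ASCII character from the precursor is enough to produce a properly labelled followup.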
#Diff to first older
--- ../usefor-article-08/Character_Sets_within_Article_Bodies.out August 2002
+++ ../usefor-article-09/Character_Sets_within_Article_Bodies.out February 2003
@@ -6,9 +6,12 @@
reading agents cannot be relied upon to display correctly more than
the US-ASCII characters, though they MUST display at least those.
- NOTE: Observe that reading agents are not forbidden to "guess"
- when confronted with unannounced non-ASCII characters, and in
- particular it would be reasonable at least to test whether they
+ NOTE: The use of non-ASCII characters in the absence of an
+ appropriate Content-Type-header is not compliant with this
+ standard. Nevertheless such usage has been seen in some
+ hierarchies, and it would be reasonable for reading agents to
+ make an informed "guess" when confronted with that situation,
+ and in particular it would be wise at least to test whether they
were in the form of valid UTF-8 (see also the suggestion for
such a test in 4.4.1).