usefor-article-05 July 2001
4.4.2. Character Sets within Article Bodies
Within article bodies, characters are represented as octets according
to the encoding scheme implied by any Content-Transfer-Encoding and
Content-Type headers [RFC 2045]. In the absence of such headers,
reading agents cannot be relied upon to display correctly more than
the US-ASCII characters.
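
As an illustration only, the decoding step described above might be sketched as follows. The function name decode_body and its parameters are hypothetical; the sketch assumes the caller has already extracted the charset parameter and the Content-Transfer-Encoding value from the headers, and it does not handle malformed base64 input.

```python
import base64
import quopri
from typing import Optional


def decode_body(raw: bytes, charset: Optional[str], cte: Optional[str]) -> str:
    """Turn body octets into text using the declared MIME parameters.

    Hypothetical helper: `charset` and `cte` are None when the
    corresponding header is absent from the article.
    """
    encoding = (cte or "7bit").lower()
    if encoding == "base64":
        raw = base64.b64decode(raw)
    elif encoding == "quoted-printable":
        raw = quopri.decodestring(raw)
    # "7bit", "8bit" and "binary" carry the octets unchanged.

    if charset is None:
        # No headers: only US-ASCII can be relied upon, so every other
        # octet is replaced by U+FFFD rather than guessed at.
        return raw.decode("ascii", errors="replace")
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        # Charset unknown to this agent: fall back to US-ASCII.
        return raw.decode("ascii", errors="replace")
```

For example, decode_body(b"caf=E9", "iso-8859-1", "quoted-printable") yields "café", whereas the same word sent as raw UTF-8 octets with no headers comes back with its non-ASCII octets replaced.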
NOTE: Observe that reading agents are not forbidden to "guess" the
character set, or to interpret the body as UTF-8 regardless; the
latter would be the simplest course for them to take.
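
A minimal sketch of the "interpret as UTF-8 regardless" course, for a body that arrives without MIME headers, is given here. The name guess_decode is hypothetical, and the fallback to US-ASCII with replacement is one possible policy among several, not one required by this section.

```python
def guess_decode(raw: bytes) -> str:
    """Decode a header-less body, trying UTF-8 before giving up.

    Hypothetical helper: pure US-ASCII bodies are valid UTF-8 and
    decode unchanged, so the first attempt covers both cases.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: keep the US-ASCII characters and mark
        # every other octet with U+FFFD.
        return raw.decode("ascii", errors="replace")
```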
NOTE: It is not expected that reading agents will necessarily be
able to present characters in all possible character sets,
although they MUST be able to present all US-ASCII characters.
For example, a reading agent might be able to present only the
ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it
Ought to present undisplayable characters using some distinctive
glyph, or by exhibiting a suitable warning. Older reading agents
that do not understand MIME headers or UTF-8 should be able to
display bodies in US-ASCII (with some loss of human
comprehensibility) except possibly when the Content-Transfer-
Encoding is "8bit".
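
The Latin 1 example in the note above could look roughly like the following sketch. The function name present_latin1_only and the choice of the inverted question mark as the distinctive glyph are assumptions made for illustration. Python's "latin-1" codec is used as the test of displayability, which treats every code point up to U+00FF as presentable.

```python
from typing import Tuple


def present_latin1_only(text: str, glyph: str = "\u00bf") -> Tuple[str, int]:
    """Substitute a distinctive glyph for characters outside Latin 1.

    Hypothetical helper: returns the presentable text together with
    the number of substitutions, so that the reading agent can also
    exhibit a warning when the count is non-zero.
    """
    shown = []
    replaced = 0
    for ch in text:
        try:
            ch.encode("latin-1")
            shown.append(ch)
        except UnicodeEncodeError:
            # Character cannot be presented by this agent.
            shown.append(glyph)
            replaced += 1
    return "".join(shown), replaced
```

A caller might display the first element of the result and, if the second is greater than zero, add a line noting how many characters could not be shown.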
Followup agents MUST be careful to apply appropriate encodings to the
outbound followup. A followup to an article containing non-ASCII
material is very likely to contain non-ASCII material itself.
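
One way a followup agent might choose its outbound encoding is sketched below. The helper name choose_body_encoding is hypothetical, and the policy shown (US-ASCII with "7bit" when possible, otherwise UTF-8 with "8bit") is an assumption for illustration; this section requires only that the encoding applied be appropriate.

```python
from typing import Dict, Tuple


def choose_body_encoding(body: str) -> Tuple[bytes, Dict[str, str]]:
    """Pick body octets and matching MIME headers for a followup.

    Hypothetical helper: the quoted text of the precursor is assumed
    to be part of `body` already, so any non-ASCII material it
    contains is taken into account.
    """
    headers = {"MIME-Version": "1.0"}
    try:
        octets = body.encode("ascii")
        headers["Content-Type"] = "text/plain; charset=us-ascii"
        headers["Content-Transfer-Encoding"] = "7bit"
    except UnicodeEncodeError:
        # Non-ASCII material present: declare it explicitly so that
        # reading agents need not guess.
        octets = body.encode("utf-8")
        headers["Content-Type"] = "text/plain; charset=utf-8"
        headers["Content-Transfer-Encoding"] = "8bit"
    return octets, headers
```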