usefor-article-04 April 2001
4.4.2. Character Sets within Article Bodies
Within article bodies, characters are represented as octets according
to the encoding scheme implied by any Content-Transfer-Encoding and
Content-Type headers [RFC 2045]. In the absence of such headers,
reading agents cannot be relied upon to display correctly more than
the US-ASCII characters.
NOTE: Observe that reading agents are not forbidden to "guess",
or to interpret as UTF-8 regardless, which would be the simplest
course for them to take.
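As a rough illustration of the paragraph and note above, the following Python sketch shows one way a reading agent might turn body octets into characters. The function name decode_body and the fallback order (strict UTF-8 first, then US-ASCII with replacement) are assumptions made for this example; the draft permits guessing but does not prescribe any particular strategy.

```python
import base64
import quopri
from email.message import Message
from typing import Optional


def decode_body(octets: bytes,
                content_type: Optional[str] = None,
                transfer_encoding: Optional[str] = None) -> str:
    """Hypothetical helper: decode article body octets into text."""
    # Step 1: undo the Content-Transfer-Encoding, if one was given.
    cte = (transfer_encoding or "7bit").strip().lower()
    if cte == "base64":
        octets = base64.b64decode(octets)
    elif cte == "quoted-printable":
        octets = quopri.decodestring(octets)
    # "7bit", "8bit" and "binary" leave the octets unchanged.

    # Step 2: look for a charset parameter in the Content-Type header.
    charset = None
    if content_type:
        header = Message()
        header["Content-Type"] = content_type
        charset = header.get_content_charset()

    if charset:
        try:
            # Undecodable octets become U+FFFD rather than raising.
            return octets.decode(charset, errors="replace")
        except LookupError:
            pass  # Unknown charset name: fall through to guessing.

    # Step 3 (assumption): no usable declaration, so "guess" UTF-8,
    # and otherwise rely only on the US-ASCII characters.
    try:
        return octets.decode("utf-8")
    except UnicodeDecodeError:
        return octets.decode("ascii", errors="replace")
```

With no headers at all, pure US-ASCII octets always decode unchanged; anything else depends on the guess, which is why the text says such agents "cannot be relied upon" beyond US-ASCII.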
NOTE: It is not expected that reading agents will necessarily be
able to present characters in all possible character sets,
although they MUST be able to present all US-ASCII characters.
For example, a reading agent might be able to present only the
ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it
Ought to present undisplayable characters using some distinctive
glyph, or by exhibiting a suitable warning. Older reading agents
that do not understand MIME headers or UTF-8 should be able to
display bodies in US-ASCII (with some loss of human
comprehensibility) except possibly when the Content-Transfer-
Encoding is "8bit".
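The ISO-8859-1 example in the note above might be sketched as follows. This is only an illustration: the function name present_latin1, the choice of the inverted question mark as the "distinctive glyph", and the returned count (which an agent could use to decide whether to show a warning) are all assumptions of this sketch, not requirements of the draft.

```python
from typing import Tuple


def present_latin1(text: str, glyph: str = "\u00bf") -> Tuple[str, int]:
    """Hypothetical helper for an agent limited to ISO-8859-1.

    Returns the presentable text and the number of characters that
    had to be replaced by the distinctive glyph.
    """
    shown = []
    replaced = 0
    for ch in text:
        # ISO-8859-1 covers exactly the code points U+0000..U+00FF.
        if ord(ch) <= 0xFF:
            shown.append(ch)
        else:
            shown.append(glyph)
            replaced += 1
    return "".join(shown), replaced
```

A caller could present the first element of the result and, when the second is non-zero, additionally exhibit a warning that some characters could not be displayed.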
Followup agents MUST be careful to apply appropriate encodings to the
outbound followup. A followup to an article containing non-ASCII
material is very likely to contain non-ASCII material itself.
#Diff to first older
--- ../usefor-article-03/Character_Sets_within_Article_Bodies.out February 2000
+++ ../usefor-article-04/Character_Sets_within_Article_Bodies.out April 2001
@@ -1,20 +1,21 @@
4.4.2. Character Sets within Article Bodies
- Within article bodies, the CES and CCS implied by any Content-
- Transfer-Encoding and Content-Type headers [RFC 2045] SHOULD be
- applied by reading agents. In the absence of such headers, reading
- agents cannot be relied upon to display correctly more than the US-
- ASCII characters.
-[Observe that reading agents are not forbidden to "guess", or to
-interpret as UTF-8 regardless, which would be the simplest course for
-them to take.]
+ Within article bodies, characters are represented as octets according
+ to the encoding scheme implied by any Content-Transfer-Encoding and
+ Content-Type headers [RFC 2045]. In the absence of such headers,
+ reading agents cannot be relied upon to display correctly more than
+ the US-ASCII characters.
+
+ NOTE: Observe that reading agents are not forbidden to "guess",
+ or to interpret as UTF-8 regardless, which would be the simplest
+ course for them to take.
NOTE: It is not expected that reading agents will necessarily be
able to present characters in all possible character sets,
although they MUST be able to present all US-ASCII characters.
For example, a reading agent might be able to present only the
ISO-8859-1 (Latin 1) characters [ISO 8859], in which case it
- SHOULD present undisplayable characters using some distinctive
+ Ought to present undisplayable characters using some distinctive
glyph, or by exhibiting a suitable warning. Older reading agents
that do not understand Mime headers or UTF-8 should be able to
display bodies in US-ASCII (with some loss of human