s-o-1036 June 1994

4.1. Overall Syntax

The overall syntax of a news article is:

INTERNET DRAFT to be        NEWS                    sec. 4.1


     article         = 1*header separator body
     header          = start-line *continuation
     start-line      = header-name ":" space [ nonblank-text ] eol
     continuation    = space nonblank-text eol
     header-name     = 1*name-character *( "-" 1*name-character )
     name-character  = letter / digit
     letter          = <ASCII letter A-Z or a-z>
     digit           = <ASCII digit 0-9>
     separator       = eol
     body            = *( [ nonblank-text / space ] eol )
     eol             = <EOL>
     nonblank-text   = [ space ] text-character *( space-or-text )
     text-character  = <any ASCII character except NUL (ASCII 0),
                         HT (ASCII 9), LF (ASCII 10), CR (ASCII 13),
                         or blank (ASCII 32)>
     space           = 1*( <HT (ASCII 9)> / <blank (ASCII 32)> )
     space-or-text   = space / text-character
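
The header rules above can be transcribed fairly directly into regular
expressions.  The following is a minimal sketch, not part of the draft:
the function name classify_header_line and its return labels are
hypothetical, lines are assumed to arrive with their EOL already
removed, and the additional restrictions of section 4.2.3 are not
checked.

```python
import re

# Character classes transcribed from the grammar (ASCII only).
TEXT_CHAR = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x7f]"  # no NUL, HT, LF, CR, blank
SPACE_OR_TEXT = r"[\x01-\x09\x0b\x0c\x0e-\x7f]"       # text-character, HT or blank
SPACE = r"[\t ]+"                                     # 1*( HT / blank )
NONBLANK_TEXT = rf"[\t ]*{TEXT_CHAR}{SPACE_OR_TEXT}*"
HEADER_NAME = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"

START_LINE = re.compile(rf"{HEADER_NAME}:{SPACE}(?:{NONBLANK_TEXT})?")
CONTINUATION = re.compile(rf"{SPACE}{NONBLANK_TEXT}")


def classify_header_line(line: str) -> str:
    """Classify one line (EOL already removed) from the header section.

    The labels returned here are illustrative names, not terms
    defined by the draft.
    """
    if line == "":
        return "separator"
    if START_LINE.fullmatch(line):
        return "start-line"
    if CONTINUATION.fullmatch(line):
        return "continuation"
    return "invalid"
```

Under this sketch a line holding only white space is reported as
"invalid": it is neither a separator (which must be truly empty) nor
a continuation (which must contain some text).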

An article consists of some headers followed by a body.  An
empty line separates the two.  The headers contain structured
information about the article and its transmission.  A header
begins with a header name identifying it, and can be
continued onto subsequent lines by beginning the continuation
line(s) with white space.  (Note that section 4.2.3 adds some
restrictions to the header syntax indicated here.)  The body
is largely-unstructured text significant only to the poster
and the readers.

     NOTE: Terminology here follows the current custom
     in the news community, rather than the MAIL
     convention of (sometimes) referring to what is here
     called a "header" as a "header field" or "field".

Note that the separator line must be truly empty, not just a
line containing white space.  Further empty lines following
it are part of the body, as are empty lines at the end of
the article.
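
As an illustration of the separator rule, the sketch below splits an
article at the first truly empty line and returns the body untouched.
It is only a sketch under stated assumptions: the function name
split_article is hypothetical, and <EOL> is assumed to be a single LF
character, which the grammar itself does not specify.

```python
def split_article(article: str) -> tuple[list[str], str]:
    """Split an article into unfolded headers and a verbatim body.

    Assumes LF as the line terminator (an assumption of this sketch).
    """
    # The last header's EOL followed directly by another EOL marks the
    # separator; a line holding only white space never matches this.
    idx = article.find("\n\n")
    if idx < 0:
        raise ValueError("no truly empty separator line found")

    headers: list[str] = []
    for line in article[:idx].split("\n"):
        if line[:1] in (" ", "\t"):
            if not headers:
                raise ValueError("continuation line before any header")
            # Keep the continuation attached to the header it extends.
            headers[-1] += "\n" + line
        else:
            headers.append(line)

    # Everything after the separator is body, including leading and
    # trailing empty lines and any trailing white space.
    body = article[idx + 2:]
    return headers, body
```

Because the body is taken as a slice rather than rebuilt line by line,
empty lines at its start and end survive unchanged.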

     NOTE: Some systems make no distinction between
     empty lines and lines consisting entirely of white
     space; indeed, some systems cannot represent
     entirely empty lines.  The grammar's requirement
     that header continuation lines contain some
     printable text is meant to ensure that the
     empty/space distinction cannot confuse
     identification of the separator line.

     NOTE: It is tempting to authorize posting agents
     to strip empty lines at the beginning and end of
     the body, but such empty lines could possibly be
     part of a preformatted document.

Implementors are warned that trailing white space, whether
alone on the line or not, MAY be significant in the body,
notably in early versions of the "uuencode" encoding for
binary data.  Trailing white space MUST be preserved unless
the article is known to have originated within a cooperating
subnet that avoids using significant trailing white space,
and SHOULD be preserved regardless.  Posters SHOULD avoid
using conventions or encodings which make trailing white
space significant; for encoding of binary data, MIME's
"base64" encoding is recommended.  Implementors are warned
that ISO C implementations are not required to preserve
trailing white space, and special precautions may be
necessary in implementations which do not.
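
To illustrate why base64 is the safer choice, the sketch below encodes
binary data with the Python standard library and checks that no
encoded line ends in white space, so the data survives software that
strips it.  The helper names encode_for_body and
has_trailing_white_space are hypothetical, and LF is assumed as the
line terminator.

```python
import base64


def encode_for_body(data: bytes) -> str:
    """Encode binary data as base64 text in lines of at most 76 characters."""
    return base64.encodebytes(data).decode("ascii")


def has_trailing_white_space(text: str) -> bool:
    """Report whether any line of text ends in a blank or HT."""
    return any(line.endswith((" ", "\t")) for line in text.split("\n"))


def strip_trailing_white_space(text: str) -> str:
    """Simulate a relayer that discards trailing blanks and HTs."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


data = bytes(range(256))
encoded = encode_for_body(data)

# base64 output never relies on trailing white space, so stripping
# it leaves the encoded text, and hence the decoded data, unchanged.
stripped = strip_trailing_white_space(encoded)
decoded = base64.decodebytes(stripped.encode("ascii"))
```

A check like has_trailing_white_space could also be run over an
outgoing body to warn a poster before significant trailing white
space is sent.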

     NOTE: Unfortunately, the signature-delimiter
     convention (described in section 4.3.2) does use
     significant trailing white space.  It's too late to
     fix this; there is work underway on defining an
     organized signature convention as part of MIME,
     which is a preferable solution in the long run.

Posters are warned that some very old relayer software
misbehaves when the first non-empty line of an article body
begins with white space.



Documents were processed to this format by Forrest J. Cavalier III