*--*  10-14-94  -  01:47:22  *--*



Msg#: 4672 *bbs.tbbs*
10-08-94 11:12:21
From: NEWS
  To: ALL
Subj: NETWORK SPECIFICATIONS: RFC 822
From: alan@papaioea.manawatu.planet.co.nz (Alan Brown)
Newsgroups: comp.bbs.tbbs,comp.bbs.misc,comp.bbs.majorbbs,alt.bbs,
  alt.bbs.allsysop,alt.bbs.amiga.excelsior,alt.bbs.cnet,alt.bbs.first-class,
  alt.bbs.metal,alt.bbs.pcboard,alt.bbs.renegade,alt.bbs.searchlight,
  alt.bbs.wildcat,alt.bbs.watergate
Subject: Network specifications: RFC 822
Date: 8 Oct 1994 20:23:27 +1300
Organization: PlaNet (Manawatu) Palmerston North, New Zealand
Reply-To: /dev/null

news.admin.misc removed from headers, most bbs groups added, as this is 
relevant to all.

Hopefully several authors will read this and following messages and
either stop falsely claiming Usenet compatibility or modify their
product so it really is compatible.

All the RFCs need to be read together, as the later ones build on the 
earlier ones.
 
====================

     RFC #  822

     Obsoletes:  RFC #733  (NIC #41952)












                        STANDARD FOR THE FORMAT OF

                        ARPA INTERNET TEXT MESSAGES





                              August 13, 1982






                                Revised by

                             David H. Crocker


                      Dept. of Electrical Engineering
                 University of Delaware, Newark, DE  19711
                      Network:  DCrocker @ UDel-Relay













 
     Standard for ARPA Internet Text Messages


                             TABLE OF CONTENTS


     PREFACE ....................................................   ii

     1.  INTRODUCTION ...........................................    1

         1.1.  Scope ............................................    1
         1.2.  Communication Framework ..........................    2

     2.  NOTATIONAL CONVENTIONS .................................    3

     3.  LEXICAL ANALYSIS OF MESSAGES ...........................    5

         3.1.  General Description ..............................    5
         3.2.  Header Field Definitions .........................    9
         3.3.  Lexical Tokens ...................................   10
         3.4.  Clarifications ..................................   11

     4.  MESSAGE SPECIFICATION ..................................   17

         4.1.  Syntax ...........................................   17
         4.2.  Forwarding .......................................   19
         4.3.  Trace Fields .....................................   20
         4.4.  Originator Fields ................................   21
         4.5.  Receiver Fields ..................................   23
         4.6.  Reference Fields .................................   23
         4.7.  Other Fields .....................................   24

     5.  DATE AND TIME SPECIFICATION ............................   26

         5.1.  Syntax ...........................................   26
         5.2.  Semantics ........................................   26

     6.  ADDRESS SPECIFICATION ..................................   27

         6.1.  Syntax ...........................................   27
         6.2.  Semantics ........................................   27
         6.3.  Reserved Address .................................   33

     7.  BIBLIOGRAPHY ...........................................   34


                             APPENDIX

     A.  EXAMPLES ...............................................   36
     B.  SIMPLE FIELD PARSING ...................................   40
     C.  DIFFERENCES FROM RFC #733 ..............................   41
     D.  ALPHABETICAL LISTING OF SYNTAX RULES ...................   44


     August 13, 1982               - i -                      RFC #822



 
     Standard for ARPA Internet Text Messages


                                  PREFACE


          By 1977, the Arpanet employed several informal standards for
     the  text  messages (mail) sent among its host computers.  It was
     felt necessary to codify these practices and  provide  for  those
     features  that  seemed  imminent.   The result of that effort was
     Request for Comments (RFC) #733, "Standard for the Format of ARPA
     Network Text Messages", by Crocker, Vittal, Pogran, and Henderson.
     The specification attempted to avoid major  changes  in  existing
     software, while permitting several new features.

          This document revises the specifications  in  RFC  #733,  in
     order  to  serve  the  needs  of the larger and more complex ARPA
     Internet.  Some of RFC #733's features failed  to  gain  adequate
     acceptance.   In  order to simplify the standard and the software
     that follows it, these features have been removed.   A  different
     addressing  scheme  is  used, to handle the case of inter-network
     mail; and the concept of re-transmission has been introduced.

          This specification is intended for use in the ARPA Internet.
     However, an attempt has been made to free it of any dependence on
     that environment, so that it can be applied to other network text
     message systems.

          The specification of RFC #733 took place over the course  of
     one  year, using the ARPANET mail environment, itself, to provide
     an on-going forum for discussing the capabilities to be included.
     More  than  twenty individuals, from across the country, partici-
     pated in  the  original  discussion.   The  development  of  this
     revised specification has, similarly, utilized network mail-based
     group discussion.  Both specification efforts  greatly  benefited
     from the comments and ideas of the participants.

          The syntax of the standard,  in  RFC  #733,  was  originally
     specified  in  the  Backus-Naur Form (BNF) meta-language.  Ken L.
     Harrenstien, of SRI International, was responsible for  re-coding
     the  BNF  into  an  augmented  BNF  that makes the representation
     smaller and easier to understand.











     August 13, 1982              - ii -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     1.  INTRODUCTION

     1.1.  SCOPE

          This standard specifies a syntax for text messages that  are
     sent  among  computer  users, within the framework of "electronic
     mail".  The standard supersedes  the  one  specified  in  ARPANET
     Request  for Comments #733, "Standard for the Format of ARPA Net-
     work Text Messages".

          In this context, messages are viewed as having  an  envelope
     and  contents.   The  envelope  contains  whatever information is
     needed to accomplish transmission  and  delivery.   The  contents
     compose  the object to be delivered to the recipient.  This stan-
     dard applies only to the format and some of the semantics of mes-
     sage  contents.   It contains no specification of the information
     in the envelope.

          However, some message systems may use information  from  the
     contents  to create the envelope.  It is intended that this stan-
     dard facilitate the acquisition of such information by programs.

          Some message systems may  store  messages  in  formats  that
     differ  from the one specified in this standard.  This specifica-
     tion is intended strictly as a definition of what message content
     format is to be passed BETWEEN hosts.

     Note:  This standard is NOT intended to dictate the internal for-
            mats  used  by sites, the specific message system features
            that they are expected to support, or any of  the  charac-
            teristics  of  user interface programs that create or read
            messages.

          A distinction should be made between what the  specification
     REQUIRES  and  what  it ALLOWS.  Messages can be made complex and
     rich with formally-structured components of information or can be
     kept small and simple, with a minimum of such information.  Also,
     the standard simplifies the interpretation  of  differing  visual
     formats  in  messages;  only  the  visual  aspect of a message is
     affected and not the interpretation  of  information  within  it.
     Implementors may choose to retain such visual distinctions.

          The formal definition is divided into four levels.  The bot-
     tom level describes the meta-notation used in this document.  The
     second level describes basic lexical analyzers that  feed  tokens
     to  higher-level  parsers.   Next is an overall specification for
     messages; it permits distinguishing individual fields.   Finally,
     there is a definition of the contents of several structured fields.



     August 13, 1982               - 1 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     1.2.  COMMUNICATION FRAMEWORK

          Messages consist of lines of text.   No  special  provisions
     are  made for encoding drawings, facsimile, speech, or structured
     text.  No significant consideration has been given  to  questions
     of  data  compression  or to transmission and storage efficiency,
     and the standard tends to be free with the number  of  bits  con-
     sumed.   For  example,  field  names  are specified as free text,
     rather than special terse codes.

          A general "memo" framework is used.  That is, a message con-
     sists of some information in a rigid format, followed by the main
     part of the message, with a format that is not specified in  this
     document.   The  syntax of several fields of the rigidly-formatted
     ("headers") section is defined in  this  specification;  some  of
     these fields must be included in all messages.

          The syntax  that  distinguishes  between  header  fields  is
     specified  separately  from  the  internal  syntax for particular
     fields.  This separation is intended to allow simple  parsers  to
     operate on the general structure of messages, without concern for
     the detailed structure of individual header fields.   Appendix  B
     is provided to facilitate construction of these parsers.

          In addition to the fields specified in this document, it  is
     expected  that  other fields will gain common use.  As necessary,
     the specifications for these "extension-fields" will be published
     through  the same mechanism used to publish this document.  Users
     may also  wish  to  extend  the  set  of  fields  that  they  use
     privately.  Such "user-defined fields" are permitted.

          The framework severely constrains document tone and  appear-
     ance and is primarily useful for most intra-organization communi-
     cations and  well-structured   inter-organization  communication.
     It  also  can  be used for some types of inter-process communica-
     tion, such as simple file transfer and remote job entry.  A  more
     robust  framework might allow for multi-font, multi-color, multi-
     dimension encoding of information.  A  less  robust  one,  as  is
     present  in  most  single-machine  message  systems,  would  more
     severely constrain the ability to add fields and the decision  to
     include specific fields.  In contrast with paper-based communication,
     it is interesting to note that the RECEIVER  of  a  message
     can   exercise  an  extraordinary  amount  of  control  over  the
     message's appearance.  The amount of actual control available  to
     message  receivers  is  contingent upon the capabilities of their
     individual message systems.





     August 13, 1982               - 2 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     2.  NOTATIONAL CONVENTIONS

          This specification uses an augmented Backus-Naur Form  (BNF)
     notation.  The differences from standard BNF involve naming rules
     and indicating repetition and "local" alternatives.

     2.1.  RULE NAMING

          Angle brackets ("<", ">") are not  used,  in  general.   The
     name  of  a rule is simply the name itself, rather than "<name>".
     Quotation-marks enclose literal text (which may be  upper  and/or
     lower  case).   Certain  basic  rules  are  in uppercase, such as
     SPACE, TAB, CRLF, DIGIT, ALPHA, etc.  Angle brackets are used  in
     rule  definitions,  and  in  the rest of this  document, whenever
     their presence will facilitate discerning the use of rule names.

     2.2.  RULE1 / RULE2:  ALTERNATIVES

          Elements separated by slash ("/") are alternatives.   Therefore
     "foo / bar" will accept foo or bar.

     2.3.  (RULE1 RULE2):  LOCAL ALTERNATIVES

          Elements enclosed in parentheses are  treated  as  a  single
     element.   Thus,  "(elem  (foo  /  bar)  elem)"  allows the token
     sequences "elem foo elem" and "elem bar elem".

     2.4.  *RULE:  REPETITION

          The character "*" preceding an element indicates repetition.
     The full form is:

                              <l>*<m>element

     indicating at least <l> and at most <m> occurrences  of  element.
     Default values are 0 and infinity so that "*(element)" allows any
     number, including zero; "1*element" requires at  least  one;  and
     "1*2element" allows one or two.

     2.5.  [RULE]:  OPTIONAL

          Square brackets enclose optional elements; "[foo  bar]"   is
     equivalent to "*1(foo bar)".

     2.6.  NRULE:  SPECIFIC REPETITION

          "<n>(element)" is equivalent to "<n>*<n>(element)"; that is,
     exactly  <n>  occurrences  of (element). Thus 2DIGIT is a 2-digit
     number, and 3ALPHA is a string of three alphabetic characters.
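
     The repetition prefixes of sections 2.4 through 2.6 can be read
     mechanically.  The following is a minimal sketch in Python, not
     part of the specification; the function name parse_repeat and its
     return shape are assumptions made for illustration.  It returns
     the lower bound, the upper bound (None standing for "infinity")
     and the element text.

```python
import re

def parse_repeat(rule):
    """Return (minimum, maximum, element) for one augmented-BNF item.

    maximum is None when the upper bound defaults to infinity.
    parse_repeat is a hypothetical helper, not defined by the standard.
    """
    # [foo bar] is equivalent to *1(foo bar)
    if rule.startswith("[") and rule.endswith("]"):
        return 0, 1, "(" + rule[1:-1] + ")"
    # <l>*<m>element, where both bounds are optional
    m = re.match(r"^(\d*)\*(\d*)(.+)$", rule)
    if m:
        low = int(m.group(1)) if m.group(1) else 0
        high = int(m.group(2)) if m.group(2) else None
        return low, high, m.group(3)
    # <n>element is equivalent to <n>*<n>element
    m = re.match(r"^(\d+)(.+)$", rule)
    if m:
        n = int(m.group(1))
        return n, n, m.group(2)
    # A bare element occurs exactly once
    return 1, 1, rule
```

     For instance, parse_repeat("1*2element") gives the bounds 1 and 2,
     and parse_repeat("2DIGIT") gives exactly 2.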


     August 13, 1982               - 3 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     2.7.  #RULE:  LISTS

          A construct "#" is defined, similar to "*", as follows:

                              <l>#<m>element

     indicating at least <l> and at most <m> elements, each  separated
     by  one  or more commas (","). This makes the usual form of lists
     very easy; a rule such as '(element *("," element))' can be shown
     as  "1#element".   Wherever this construct is used, null elements
     are allowed, but do not  contribute  to  the  count  of  elements
     present.   That  is,  "(element),,(element)"  is  permitted,  but
     counts as only two elements.  Therefore, where at least one  ele-
     ment  is required, at least one non-null element must be present.
     Default values are 0 and infinity so that "#(element)" allows any
     number,  including  zero;  "1#element" requires at least one; and
     "1#2element" allows one or two.

     2.8.  ; COMMENTS

          A semi-colon, set off some distance to  the  right  of  rule
     text,  starts  a comment that continues to the end of line.  This
     is a simple way of including useful notes in  parallel  with  the
     specifications.


























     August 13, 1982               - 4 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     3.  LEXICAL ANALYSIS OF MESSAGES

     3.1.  GENERAL DESCRIPTION

          A message consists of header fields and, optionally, a body.
     The  body  is simply a sequence of lines containing ASCII charac-
     ters.  It is separated from the headers by a null line  (i.e.,  a
     line with nothing preceding the CRLF).

     3.1.1.  LONG HEADER FIELDS

        Each header field can be viewed as a single, logical  line  of
        ASCII  characters,  comprising  a field-name and a field-body.
        For convenience, the field-body  portion  of  this  conceptual
        entity  can be split into a multiple-line representation; this
        is called "folding".  The general rule is that wherever  there
        may  be  linear-white-space  (NOT  simply  LWSP-chars), a CRLF
        immediately followed by AT LEAST one LWSP-char may instead  be
        inserted.  Thus, the single line

            To:  "Joe & J. Harvey" <ddd @Org>, JJV @ BBN

        can be represented as:

            To:  "Joe & J. Harvey" <ddd @ Org>,
                    JJV@BBN

        and

            To:  "Joe & J. Harvey"
                            <ddd@ Org>, JJV
             @BBN

        and

            To:  "Joe &
             J. Harvey" <ddd @ Org>, JJV @ BBN

             The process of moving  from  this  folded   multiple-line
        representation  of a header field to its single line representation
        is called "unfolding".  Unfolding  is  accomplished  by
        regarding   CRLF   immediately  followed  by  a  LWSP-char  as
        equivalent to the LWSP-char.
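
        Unfolding as just described amounts to one substitution.  The
        sketch below is an illustration in Python, not part of the
        standard; the name unfold is an assumption.  It deletes each
        CRLF that is immediately followed by a SPACE or HTAB and keeps
        the LWSP-char itself.

```python
import re

def unfold(raw):
    """Unfold header text: drop every CRLF that is immediately
    followed by a LWSP-char (SPACE or HTAB), keeping that LWSP-char.

    unfold is a hypothetical helper name used only for illustration.
    """
    return re.sub(r"\r\n(?=[ \t])", "", raw)
```

        A CRLF that is not followed by a LWSP-char is left alone, so
        it still terminates the field.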

        Note:  While the standard  permits  folding  wherever  linear-
               white-space is permitted, it is recommended that struc-
               tured fields, such as those containing addresses, limit
               folding  to higher-level syntactic breaks.  For address
               fields, it  is  recommended  that  such  folding  occur


     August 13, 1982               - 5 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


               between addresses, after the separating comma.

     3.1.2.  STRUCTURE OF HEADER FIELDS

        Once a field has been unfolded, it may be viewed as being com-
        posed of a field-name followed by a colon (":"), followed by a
        field-body, and  terminated  by  a  carriage-return/line-feed.
        The  field-name must be composed of printable ASCII characters
        (i.e., characters that  have  values  between  33.  and  126.,
        decimal, except colon).  The field-body may be composed of any
        ASCII characters, except CR or LF.  (While CR and/or LF may be
        present  in the actual text, they are removed by the action of
        unfolding the field.)
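
        The structure of an unfolded field can be checked with a few
        lines of code.  This is a minimal sketch in Python under the
        rules of this section; the name split_field and the choice of
        ValueError for rejected input are assumptions, not anything
        the standard prescribes.

```python
def split_field(line):
    """Split one unfolded header line into (field-name, field-body).

    split_field is a hypothetical helper.  It enforces the rules of
    section 3.1.2: the name is printable ASCII (33..126) without a
    colon, and the body holds no CR or LF once unfolded.
    """
    name, colon, body = line.partition(":")   # first colon ends the name
    if not colon:
        raise ValueError("no colon in header field")
    if not name or any(not 33 <= ord(c) <= 126 for c in name):
        raise ValueError("illegal field-name: %r" % name)
    if body.endswith("\r\n"):                 # strip the terminating CRLF
        body = body[:-2]
    if "\r" in body or "\n" in body:
        raise ValueError("field-body contains CR or LF; unfold it first")
    return name, body
```

        Only the first colon separates name from body, so a body such
        as " Re: lunch" keeps its own colon.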

        Certain field-bodies of headers may be  interpreted  according
        to  an  internal  syntax  that some systems may wish to parse.
        These  fields  are  called  "structured   fields".    Examples
        include  fields containing dates and addresses.  Other fields,
        such as "Subject"  and  "Comments",  are  regarded  simply  as
        strings of text.

        Note:  Any field which has a field-body  that  is  defined  as
               other  than  simply <text> is to be treated as a struc-
               tured field.

               Field-names, unstructured field bodies  and  structured
               field bodies each are scanned by their own, independent
               "lexical" analyzers.

     3.1.3.  UNSTRUCTURED FIELD BODIES

        For some fields, such as "Subject" and "Comments",  no  struc-
        turing  is assumed, and they are treated simply as <text>s, as
        in the message body.  Rules of folding apply to these  fields,
        so  that  such  field  bodies  which occupy several lines must
        therefore have the second and successive lines indented by  at
        least one LWSP-char.

     3.1.4.  STRUCTURED FIELD BODIES

        To aid in the creation and reading of structured  fields,  the
        free  insertion   of linear-white-space (which permits folding
        by inclusion of CRLFs)  is  allowed  between  lexical  tokens.
        Rather  than  obscuring  the  syntax  specifications for these
        structured fields with explicit syntax for this  linear-white-
        space, the existence of another "lexical" analyzer is assumed.
        This analyzer does not apply  for  unstructured  field  bodies
        that  are  simply  strings  of  text, as described above.  The
        analyzer provides  an  interpretation  of  the  unfolded  text


     August 13, 1982               - 6 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


        composing  the body of the field as a sequence of lexical sym-
        bols.

        These symbols are:

                     -  individual special characters
                     -  quoted-strings
                     -  domain-literals
                     -  comments
                     -  atoms

        The first four of these symbols  are  self-delimiting.   Atoms
        are not; they are delimited by the self-delimiting symbols and
        by  linear-white-space.   For  the  purposes  of  regenerating
        sequences  of  atoms  and quoted-strings, exactly one SPACE is
        assumed to exist, and should be used, between them.  (Also, in
        the "Clarifications" section on "White Space", below, note the
        rules about treatment of multiple contiguous LWSP-chars.)

        So, for example, the folded body of an address field

            ":sysmail"@  Some-Group. Some-Org,
            Muhammed.(I am  the greatest) Ali @(the)Vegas.WBA



























     August 13, 1982               - 7 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


        is analyzed into the following lexical symbols and types:

                    :sysmail              quoted string
                    @                     special
                    Some-Group            atom
                    .                     special
                    Some-Org              atom
                    ,                     special
                    Muhammed              atom
                    .                     special
                    (I am  the greatest)  comment
                    Ali                   atom
                    @                     special
                    (the)                 comment
                    Vegas                 atom
                    .                     special
                    WBA                   atom

        The canonical representations for the data in these  addresses
        are the following strings:

                        ":sysmail"@Some-Group.Some-Org

        and

                            Muhammed.Ali@Vegas.WBA

        Note:  For purposes of display, and when passing  such  struc-
               tured information to other systems, such as mail proto-
               col  services,  there  must  be  NO  linear-white-space
               between  <word>s  that are separated by period (".") or
               at-sign ("@") and exactly one SPACE between  all  other
               <word>s.  Also, headers should be in a folded form.
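
        The lexical analysis of a structured field body can be
        sketched as a small scanner.  The Python below is an
        illustration only, assuming the body has already been
        unfolded; the names tokenize, canonical_forms and _delimited
        are hypothetical.  Quoted-strings, domain-literals and
        comments are returned without their delimiters, and null list
        elements are simply dropped.

```python
SPECIALS = set('()<>@,;:\\".[]')
WORDS = ("atom", "quoted-string")

def _delimited(body, i, close, opener=None):
    """Scan from index i to the matching close character.
    Returns (index after the close, text between the delimiters)."""
    depth, start = 1, i
    while i < len(body):
        c = body[i]
        if c == "\\":                  # quoted-pair: skip the quoted char
            i += 2
            continue
        if c == opener:                # only comments nest
            depth += 1
        elif c == close:
            depth -= 1
            if depth == 0:
                return i + 1, body[start:i]
        i += 1
    raise ValueError("unterminated token, expected %r" % close)

def tokenize(body):
    """Return a list of (type, text) lexical symbols."""
    tokens, i = [], 0
    while i < len(body):
        c = body[i]
        if c in " \t":                 # white space only delimits
            i += 1
        elif c == '"':
            i, text = _delimited(body, i + 1, '"')
            tokens.append(("quoted-string", text))
        elif c == "[":
            i, text = _delimited(body, i + 1, "]")
            tokens.append(("domain-literal", text))
        elif c == "(":
            i, text = _delimited(body, i + 1, ")", "(")
            tokens.append(("comment", text))
        elif c in SPECIALS:
            tokens.append(("special", c))
            i += 1
        else:                          # atom: no specials, SPACE or CTLs
            j = i
            while (j < len(body) and body[j] not in SPECIALS
                   and 32 < ord(body[j]) < 127):
                j += 1
            if j == i:
                raise ValueError("unexpected character %r" % c)
            tokens.append(("atom", body[i:j]))
            i = j
    return tokens

def canonical_forms(body):
    """Rebuild each comma-separated element: comments dropped, no space
    around specials, exactly one SPACE between adjacent words."""
    forms, current, prev = [], "", None
    for kind, text in tokenize(body):
        if kind == "comment":
            continue
        if (kind, text) == ("special", ","):
            forms.append(current)
            current, prev = "", None
            continue
        if kind in WORDS and prev in WORDS:
            current += " "
        if kind == "quoted-string":
            text = '"' + text + '"'
        elif kind == "domain-literal":
            text = "[" + text + "]"
        current += text
        prev = kind
    forms.append(current)
    return [f for f in forms if f]     # null elements are ignored
```

        Applied to the address field body of this section, tokenize
        yields fifteen symbols and canonical_forms yields the two
        canonical strings given above.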

















     August 13, 1982               - 8 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     3.2.  HEADER FIELD DEFINITIONS

          These rules show a field meta-syntax, without regard for the
     particular  type  or internal syntax.  Their purpose is to permit
     detection of fields; also, they present to  higher-level  parsers
     an image of each field as fitting on one line.

     field       =  field-name ":" [ field-body ] CRLF

     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">

     field-body  =  field-body-contents
                    [CRLF LWSP-char field-body]

     field-body-contents =
                   <the ASCII characters making up the field-body, as
                    defined in the following sections, and consisting
                    of combinations of atom, quoted-string, and
                    specials tokens, or else consisting of texts>































     August 13, 1982               - 9 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     3.3.  LEXICAL TOKENS

          The following rules are used to define an underlying lexical
     analyzer,  which  feeds  tokens to higher level parsers.  See the
     ANSI references, in the Bibliography.

                                                 ; (  Octal, Decimal.)
     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
     ALPHA       =  <any ASCII alphabetic character>
                                                 ; (101-132, 65.- 90.)
                                                 ; (141-172, 97.-122.)
     DIGIT       =  <any ASCII decimal digit>    ; ( 60- 71, 48.- 57.)
     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
                     character and DEL>          ; (    177,     127.)
     CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
     <">         =  <ASCII quote mark>           ; (     42,      34.)
     CRLF        =  CR LF

     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE

     linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
                                                 ; CRLF => folding

     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
                 /  "." / "[" / "]"              ;  within a word.

     delimiters  =  specials / linear-white-space / comment

     text        =  <any CHAR, including bare    ; => atoms, specials,
                     CR & bare LF, but NOT       ;  comments and
                     including CRLF>             ;  quoted-strings are
                                                 ;  NOT recognized.

     atom        =  1*<any CHAR except specials, SPACE and CTLs>

     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
                                                 ;   quoted chars.

     qtext       =  <any CHAR excepting <">,     ; => may be folded
                     "\" & CR, and including
                     linear-white-space>

     domain-literal =  "[" *(dtext / quoted-pair) "]"




     August 13, 1982              - 10 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     dtext       =  <any CHAR excluding "[",     ; => may be folded
                     "]", "\" & CR, & including
                     linear-white-space>

     comment     =  "(" *(ctext / quoted-pair / comment) ")"

     ctext       =  <any CHAR excluding "(",     ; => may be folded
                     ")", "\" & CR, & including
                     linear-white-space>

     quoted-pair =  "\" CHAR                     ; may quote any char

     phrase      =  1*word                       ; Sequence of words

     word        =  atom / quoted-string
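
     The character classes above translate directly into predicates.
     The following Python is a sketch for illustration, not part of
     the standard; the function names is_char, is_ctl and is_atom are
     assumptions.

```python
SPECIALS = set('()<>@,;:\\".[]')

def is_char(c):
    """CHAR: any ASCII character, 0..127 decimal."""
    return ord(c) <= 127

def is_ctl(c):
    """CTL: ASCII control characters 0..31 decimal, and DEL (127)."""
    return ord(c) <= 31 or ord(c) == 127

def is_atom(s):
    """atom = 1*<any CHAR except specials, SPACE and CTLs>"""
    return len(s) >= 1 and all(
        is_char(c) and c not in SPECIALS and c != " " and not is_ctl(c)
        for c in s
    )
```

     Note that HTAB is a CTL, so it can never appear inside an atom,
     and that a period is a special, so "a.b" is three tokens rather
     than one atom.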


     3.4.  CLARIFICATIONS

     3.4.1.  QUOTING

        Some characters are reserved for special interpretation,  such
        as  delimiting lexical tokens.  To permit use of these charac-
        ters as uninterpreted data, a quoting mechanism  is  provided.
        To quote a character, precede it with a backslash ("\").

        This mechanism is not fully general.  Characters may be quoted
        only  within  a subset of the lexical constructs.  In particu-
        lar, quoting is limited to use within:

                             -  quoted-string
                             -  domain-literal
                             -  comment

        Within these constructs, quoting is REQUIRED for  CR  and  "\"
        and for the character(s) that delimit the token (e.g., "(" and
        ")" for a comment).  However, quoting  is  PERMITTED  for  any
        character.

        Note:  In particular, quoting is NOT permitted  within  atoms.
               For  example  when  the local-part of an addr-spec must
               contain a special character, a quoted  string  must  be
               used.  Therefore, a specification such as:

                            Full\ Name@Domain

               is not legal and must be specified as:

                            "Full Name"@Domain


     August 13, 1982              - 11 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


     3.4.2.  WHITE SPACE

        Note:  In structured field bodies, multiple linear space ASCII
               characters  (namely  HTABs  and  SPACEs) are treated as
               single spaces and may freely surround any  symbol.   In
               all header fields, the only place in which at least one
               LWSP-char is REQUIRED is at the beginning of  continua-
               tion lines in a folded field.

        When passing text to processes  that  do  not  interpret  text
        according to this standard (e.g., mail protocol servers), then
        NO linear-white-space characters should occur between a period
        (".") or at-sign ("@") and a <word>.  Exactly ONE SPACE should
        be used in place of arbitrary linear-white-space  and  comment
        sequences.

        Note:  Within systems conforming to this standard, wherever  a
               member of the list of delimiters is allowed, LWSP-chars
               may also occur before and/or after it.

        Writers of  mail-sending  (i.e.,  header-generating)  programs
        should realize that there is no network-wide definition of the
        effect of ASCII HT (horizontal-tab) characters on the  appear-
        ance  of  text  at another network host; therefore, the use of
        tabs in message headers, though permitted, is discouraged.

     3.4.3.  COMMENTS

        A comment is a set of ASCII characters, which is  enclosed  in
        matching  parentheses  and which is not within a quoted-string.
        The comment construct permits message originators to add  text
        which  will  be  useful  for  human readers, but which will be
        ignored by the formal semantics.  Comments should be  retained
        while  the  message  is subject to interpretation according to
        this standard.  However, comments  must  NOT  be  included  in
        other  cases,  such  as  during  protocol  exchanges with mail
        servers.

        Comments nest, so that if an unquoted left parenthesis  occurs
        in  a  comment  string,  there  must  also be a matching right
        parenthesis.  When a comment acts as the delimiter  between  a
        sequence of two lexical symbols, such as two atoms, it is lex-
        ically equivalent with a single SPACE,  for  the  purposes  of
        regenerating  the  sequence, such as when passing the sequence
        onto a mail protocol server.  Comments are  detected  as  such
        only within field-bodies of structured fields.
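
        Removing comments before handing a field to a mail protocol
        server can be sketched as follows.  This Python is an
        illustration only; the name strip_comments is an assumption.
        Each outermost comment is replaced by a single SPACE, nesting
        and quoted-pairs are honoured, and parentheses inside a
        quoted-string are left alone.  As a simplification, it does
        not treat domain-literals specially.

```python
def strip_comments(body):
    """Replace each top-level comment in an unfolded structured field
    body with one SPACE.  strip_comments is a hypothetical helper."""
    out, depth, in_quotes, i = [], 0, False, 0
    while i < len(body):
        c = body[i]
        # A quoted-pair is only meaningful inside a comment or a
        # quoted-string; keep it verbatim unless it is in a comment.
        if c == "\\" and (depth or in_quotes) and i + 1 < len(body):
            if not depth:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth:                      # inside a comment: track nesting
            if c == "(":
                depth += 1
            elif c == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")
        elif in_quotes:                # inside a quoted-string: copy
            out.append(c)
            if c == '"':
                in_quotes = False
        elif c == '"':
            in_quotes = True
            out.append(c)
        elif c == "(":
            depth = 1
        else:
            out.append(c)
        i += 1
    if depth:
        raise ValueError("unbalanced parenthesis in comment")
    return "".join(out)
```

        The result may contain runs of spaces; these carry the same
        meaning as a single SPACE in a structured field.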

        If a comment is to be "folded" onto multiple lines,  then  the
        syntax  for  folding  must  be  adhered to.  (See the "Lexical


     August 13, 1982              - 12 -                      RFC #822


 
     Standard for ARPA Internet Text Messages


        Analysis of Messages" section on "Folding Long Header  Fields"
        above,  and  the  section on "Case Independence" below.)  Note
        that  the  official  semantics  therefore  do  not  "see"  any
        unquoted CRLFs that are in comments, although particular pars-
        ing programs may wish to note their presence.  For these  pro-
        grams,  it would be reasonable to interpret a "CRLF LWSP-char"
        as being a CRLF that is part of the comment; i.e., the CRLF is
        kept  and  the  LWSP-char is discarded.  Quoted CRLFs (i.e., a
        backslash followed by a CR followed by a  LF)  still  must  be
        followed by at least one LWSP-char.
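
        The nesting rule above can be illustrated with a short  Python
        sketch.  It is not part of this standard; the function name is
        hypothetical, and the input is assumed to be an  already
        unfolded body of a structured field.  Each outermost comment is
        replaced by a single SPACE, parentheses inside a quoted-string
        are left alone, and a backslash quotes the next character.

```python
def strip_comments(body: str) -> str:
    """Replace each outermost comment in a field body with one SPACE."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string, parentheses are plain data
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # quoted-pair: the next character is data, never a delimiter
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and depth > 0 and ch == ")":
            depth -= 1
            if depth == 0:
                out.append(" ")  # a comment is equivalent to a single SPACE
        elif depth == 0:
            out.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced parentheses in comment")
    return "".join(out)
```

        For example, strip_comments("a (outer (inner) text) b") keeps
        the two original spaces and adds one for the removed comment.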

     3.4.4.  DELIMITING AND QUOTING CHARACTERS

        The quote character (backslash) and  characters  that  delimit
        syntactic  units  are not, generally, to be taken as data that
        are part of the delimited or quoted unit(s).   In  particular,
        the   quotation-marks   that   define   a  quoted-string,  the
        parentheses that define  a  comment  and  the  backslash  that
        quotes  a  following  character  are  NOT  part of the quoted-
        string, comment or quoted character.  A quotation-mark that is
        to  be  part  of  a quoted-string, a parenthesis that is to be
        part of a comment and a backslash that is to be part of either
        must  each be preceded by the quote-character backslash ("\").
        Note that the syntax allows any character to be quoted  within
        a  quoted-string  or  comment; however only certain characters
        MUST be quoted to be included as data.  These  characters  are
        the  ones that are not part of the alternate text group (i.e.,
        ctext or qtext).

        The one exception to this rule  is  that  a  single  SPACE  is
        assumed  to  exist  between  contiguous words in a phrase, and
        this interpretation is independent of  the  actual  number  of
        LWSP-chars  that  the  creator  places  between the words.  To
        include more than one SPACE, the creator must make  the  LWSP-
        chars be part of a quoted-string.

        Quotation marks that delimit a quoted string  and  backslashes
        that  quote  the  following character should NOT accompany the
        quoted-string when the string is passed to processes  that  do
        not interpret data according to this specification (e.g., mail
        protocol servers).
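
        As an illustration only, the following Python  sketch  removes
        the  delimiting quotation-marks and quoting backslashes from a
        quoted-string before it is handed to a process that  does  not
        interpret  this  specification, and adds them back when a value
        is to be placed in a header.  The function names are hypotheti-
        cal; quote() assumes that only the quotation-mark, the
        backslash and CR need quoting, since those are the characters
        excluded from qtext.

```python
def unquote(quoted: str) -> str:
    """Return the data carried by a quoted-string, without its delimiters."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string")
    inner = quoted[1:-1]
    out = []
    i = 0
    while i < len(inner):
        if inner[i] == "\\":
            if i + 1 >= len(inner):
                raise ValueError("backslash with nothing to quote")
            out.append(inner[i + 1])  # keep the quoted character, drop the backslash
            i += 2
        else:
            out.append(inner[i])
            i += 1
    return "".join(out)


def quote(data: str) -> str:
    """Wrap data in quotation-marks, quoting the characters that require it."""
    escaped = (
        data.replace("\\", "\\\\")  # backslash first, so added ones are not doubled
        .replace('"', '\\"')
        .replace("\r", "\\\r")
    )
    return '"' + escaped + '"'
```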

     3.4.5.  QUOTED-STRINGS

        Where permitted (i.e., in words in structured fields)  quoted-
        strings  are  treated  as a single symbol.  That is, a quoted-
        string is equivalent to an atom, syntactically.  If a  quoted-
        string  is to be "folded" onto multiple lines, then the syntax
        for folding must be adhered to.  (See the "Lexical Analysis of
        Messages"  section  on "Folding Long Header Fields" above, and
        the section on "Case  Independence"  below.)   Therefore,  the
        official  semantics  do  not  "see" any bare CRLFs that are in
        quoted-strings; however particular parsing programs  may  wish
        to  note  their presence.  For such programs, it would be rea-
        sonable to interpret a "CRLF LWSP-char" as being a CRLF  which
        is  part  of the quoted-string; i.e., the CRLF is kept and the
        LWSP-char is discarded.  Quoted CRLFs (i.e., a backslash  fol-
        lowed  by  a CR followed by a LF) are also subject to rules of
        folding, but the presence of the quoting character (backslash)
        explicitly  indicates  that  the  CRLF  is  data to the quoted
        string.  Stripping off the first following LWSP-char  is  also
        appropriate when parsing quoted CRLFs.

     3.4.6.  BRACKETING CHARACTERS

        There is one type of bracket which must occur in matched pairs
        and may have pairs nested within each other:

            o   Parentheses ("(" and ")") are used  to  indicate  com-
                ments.

        There are three types of brackets which must occur in  matched
        pairs, and which may NOT be nested:

            o   Colon/semi-colon (":" and ";") are  used  in  address
                specifications  to indicate that the included list of
                addresses is to be treated as a group.

            o   Angle brackets ("<" and ">")  are  generally  used  to
                indicate  the  presence  of  one machine-usable reference
                (e.g., delimiting mailboxes), possibly  including
                source-routing to the machine.

            o   Square brackets ("[" and "]") are used to indicate the
                presence  of  a  domain-literal, which the appropriate
                name-domain  is  to  use  directly,  bypassing  normal
                name-resolution mechanisms.

     3.4.7.  CASE INDEPENDENCE

        Except as noted, alphabetic strings may be represented in  any
        combination of upper and lower case.  The only syntactic units
        which require preservation of case information are:

                    -  text
                    -  qtext
                    -  dtext
                    -  ctext
                    -  quoted-pair
                    -  local-part, except "Postmaster"

        When matching any other syntactic unit, case is to be ignored.
        For  example, the field-names "From", "FROM", "from", and even
        "FroM" are semantically equal and should all be treated ident-
        ically.

        When generating these units, any mix of upper and  lower  case
        alphabetic  characters  may  be  used.  The case shown in this
        specification is suggested for message-creating processes.

        Note:  The reserved local-part address unit, "Postmaster",  is
               an  exception.   When  the  value "Postmaster" is being
               interpreted, it must be  accepted  in  any  mixture  of
               case, including "POSTMASTER", and "postmaster".
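
        A minimal Python sketch of these matching rules follows.   The
        function names are hypothetical and are not defined by this
        standard.  Field-names are compared without regard to case,
        while local-parts are compared exactly, except for the
        reserved value "Postmaster".

```python
def same_field_name(a: str, b: str) -> bool:
    """Field-names are matched without regard to case."""
    return a.lower() == b.lower()


def is_postmaster(local_part: str) -> bool:
    """The reserved local-part is accepted in any mixture of case."""
    return local_part.lower() == "postmaster"


def same_local_part(a: str, b: str) -> bool:
    """Local-parts preserve case, with "Postmaster" as the one exception."""
    if is_postmaster(a) and is_postmaster(b):
        return True
    return a == b
```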

     3.4.8.  FOLDING LONG HEADER FIELDS

        Each header field may be represented on exactly one line  con-
        sisting  of the name of the field and its body, and terminated
        by a CRLF; this is what the parser sees.  For readability, the
        field-body  portion of long header fields may be "folded" onto
        multiple lines of the actual field.  "Long" is commonly inter-
        preted  to  mean greater than 65 or 72 characters.  The former
        length serves as a limit, when the message is to be viewed  on
        most  simple terminals which use simple display software; how-
        ever, the limit is not imposed by this standard.

        Note:  Some display software often can selectively fold lines,
               to  suit  the display terminal.  In such cases, sender-
               provided  folding  can  interfere  with   the   display
               software.
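
        Under this standard a CRLF that is immediately followed  by  a
        LWSP-char (SPACE or horizontal tab) is equivalent to the LWSP-
        char alone.  A parser can therefore recover the one-line  form
        of each field as in the Python sketch below.  The function
        name is hypothetical, and the input is assumed to hold only the
        header portion of a message, with CRLF line endings.

```python
import re


def unfold(raw_headers: str) -> list:
    """Return one string per header field, with folding removed."""
    # Drop every CRLF that is followed by SPACE or TAB; the LWSP-char stays.
    unfolded = re.sub(r"\r\n(?=[ \t])", "", raw_headers)
    return [line for line in unfolded.split("\r\n") if line]
```

        For example, "Subject: one\r\n two\r\n" yields the single field
        "Subject: one two".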

     3.4.9.  BACKSPACE CHARACTERS

        ASCII BS characters (Backspace, decimal 8) may be included  in
        texts and quoted-strings to effect overstriking.  However, any
        use of backspaces which effects an overstrike to the  left  of
        the beginning of the text or quoted-string is prohibited.

     3.4.10.  NETWORK-SPECIFIC TRANSFORMATIONS

        During transmission through heterogeneous networks, it may  be
        necessary  to  force data to conform to a network's local con-
        ventions.  For example, it may be required that a CR  be  fol-
        lowed  either by LF, making a CRLF, or by <null>, if the CR is
        to stand alone.  Such transformations are  reversed  when  the
        message exits that network.

        When  crossing  network  boundaries,  the  message  should  be
        treated  as  passing  through  two  modules.   It  enters  the
        first module carrying whatever  network-specific  transforma-
        tions  were  necessary  to  permit   migration   through   the
        "current" network.  It then passes through these modules:

            o   Transformation Reversal

                The "current" network's idiosyncrasies are removed and
                the  message  is returned to the canonical form speci-
                fied in this standard.

            o   Transformation

                The "next" network's local idiosyncrasies are  imposed
                on the message.

                                ------------------
                    From   ==>  | Remove Net-A   |
                    Net-A       | idiosyncrasies |
                                ------------------
                                       ||
                                       \/
                                  Conformance
                                  with standard
                                       ||
                                       \/
                                ------------------
                                | Impose Net-B   |  ==>  To
                                | idiosyncrasies |       Net-B
                                ------------------
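
        The two modules can be sketched in Python as a pair  of  func-
        tions applied in sequence.  Both networks here are hypotheti-
        cal: "Net-A" is assumed to store lines ended by a bare LF, and
        "Net-B" is assumed to end lines with a bare CR.  Neither  con-
        vention is taken from this standard; only the canonical  CRLF
        form in the middle is.

```python
def remove_net_a(text: str) -> str:
    """Transformation reversal: bare LF (assumed Net-A form) back to CRLF."""
    # Normalise first so that an existing CRLF is not turned into CR CR LF.
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def impose_net_b(text: str) -> str:
    """Transformation: canonical CRLF to bare CR (assumed Net-B form)."""
    return text.replace("\r\n", "\r")


def gateway(text: str) -> str:
    """Pass a message from Net-A to Net-B by way of the canonical form."""
    canonical = remove_net_a(text)
    return impose_net_b(canonical)
```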

     4.  MESSAGE SPECIFICATION

     4.1.  SYNTAX

     Note:  Due to an artifact of the notational conventions, the syn-
            tax  indicates  that, when present, some fields must be in
            a particular order.  Header fields  are  NOT  required  to
            occur  in  any  particular  order, except that the message
            body must occur AFTER  the  headers.   It  is  recommended
            that,  if  present,  headers be sent in the order "Return-
            Path", "Received", "Date",  "From",  "Subject",  "Sender",
            "To", "cc", etc.

            This specification permits multiple  occurrences  of  most
            fields.   Except  as  noted,  their  interpretation is not
            specified here, and their use is discouraged.

          The following syntax for the bodies of various fields should
     be  thought  of  as  describing  each field body as a single long
     string (or line).  The "Lexical Analysis of Message"  section  on
     "Long  Header Fields", above, indicates how such long strings can
     be represented on more than one line in  the  actual  transmitted
     message.

     message     =  fields *( CRLF *text )       ; Everything after
                                                 ;  first null line
                                                 ;  is message body

     fields      =    dates                      ; Creation time,
                      source                     ;  author id & one
                    1*destination                ;  address required
                     *optional-field             ;  others optional

     source      = [  trace ]                    ; net traversals
                      originator                 ; original mail
                   [  resent ]                   ; forwarded

     trace       =    return                     ; path to sender
                    1*received                   ; receipt tags

     return      =  "Return-path" ":" route-addr ; return address

     received    =  "Received"    ":"            ; one per relay
                       ["from" domain]           ; sending host
                       ["by"   domain]           ; receiving host
                       ["via"  atom]             ; physical path
                      *("with" atom)             ; link/mail protocol
                       ["id"   msg-id]           ; receiver msg id
                       ["for"  addr-spec]        ; initial form
                        ";"    date-time         ; time received

     originator  =   authentic                   ; authenticated addr
                   [ "Reply-To"   ":" 1#address]

     authentic   =   "From"       ":"   mailbox  ; Single author
                 / ( "Sender"     ":"   mailbox  ; Actual submittor
                     "From"       ":" 1#mailbox) ; Multiple authors
                                                 ;  or not sender

     resent      =   resent-authentic
                   [ "Resent-Reply-To"  ":" 1#address]

     resent-authentic =
                     "Resent-From"      ":"   mailbox
                 / ( "Resent-Sender"    ":"   mailbox
                     "Resent-From"      ":" 1#mailbox  )

     dates       =   orig-date                   ; Original
                   [ resent-date ]               ; Forwarded

     orig-date   =  "Date"        ":"   date-time

     resent-date =  "Resent-Date" ":"   date-time

     destination =  "To"          ":" 1#address  ; Primary
                 /  "Resent-To"   ":" 1#address
                 /  "cc"          ":" 1#address  ; Secondary
                 /  "Resent-cc"   ":" 1#address
                 /  "bcc"         ":"  #address  ; Blind carbon
                 /  "Resent-bcc"  ":"  #address

     optional-field =
                 /  "Message-ID"        ":"   msg-id
                 /  "Resent-Message-ID" ":"   msg-id
                 /  "In-Reply-To"       ":"  *(phrase / msg-id)
                 /  "References"        ":"  *(phrase / msg-id)
                 /  "Keywords"          ":"  #phrase
                 /  "Subject"           ":"  *text
                 /  "Comments"          ":"  *text
                 /  "Encrypted"         ":" 1#2word
                 /  extension-field              ; To be defined
                 /  user-defined-field           ; May be pre-empted

     msg-id      =  "<" addr-spec ">"            ; Unique message id

     extension-field =
                   <Any field which is defined in a document
                    published as a formal extension to this
                    specification; none will have names beginning
                    with the string "X-">

     user-defined-field =
                   <Any field which has not been defined
                    in this specification or published as an
                    extension to this specification; names for
                    such fields must be unique and may be
                    pre-empted by published extensions>
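
     As a rough illustration of the "fields" rule above,  the  Python
     sketch  below  reports  which  required parts are absent from a
     list of field names: a "Date" field,  a  "From"  field  (needed
     by both  alternatives  of  "authentic"),  and  at  least  one
     destination field.  The function name is hypothetical, and  the
     check looks only at field names, not at field bodies.

```python
DESTINATION_FIELDS = {
    "to", "cc", "bcc", "resent-to", "resent-cc", "resent-bcc",
}


def missing_required(field_names) -> list:
    """List the required parts of the 'fields' rule that are not present."""
    names = {name.lower() for name in field_names}  # field-names ignore case
    problems = []
    if "date" not in names:
        problems.append("Date")
    if "from" not in names:
        problems.append("From")
    if not names & DESTINATION_FIELDS:
        problems.append("destination")
    return problems
```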

     4.2.  FORWARDING

          Some systems permit mail recipients to  forward  a  message,
     retaining  the original headers, by adding some new fields.  This
     standard supports such a service, through the "Resent-" prefix to
     field names.

          Whenever the string "Resent-" begins a field name, the field
     has  the  same  semantics as a field whose name does not have the
     prefix.  However, the message is assumed to have  been  forwarded
     by  an original recipient who attached the "Resent-" field.  This
     new field is treated as being more recent  than  the  equivalent,
     original  field.   For  example, the "Resent-From" field indicates
     the person that forwarded the message, whereas the  "From"  field
     indicates the original author.

          Use of such precedence  information  depends  upon  partici-
     pants'  communication needs.  For example, this standard does not
     dictate when a "Resent-From:" address should receive replies,  in
     lieu of sending them to the "From:" address.

     Note:  In general, the "Resent-" fields should be treated as con-
            taining  a  set  of information that is independent of the
            set of original fields.  Information for  one  set  should
            not  automatically be taken from the other.  The interpre-
            tation of multiple "Resent-" fields, of the same type,  is
            undefined.

          In the remainder of this specification, occurrences of legal
     "Resent-"  fields  are  treated  identically  to  occurrences  of
     fields whose names do not contain this prefix.

     4.3.  TRACE FIELDS

          Trace information is used to provide an audit trail of  mes-
     sage  handling.   In  addition,  it indicates a route back to the
     sender of the message.

          The list of known "via" and  "with"  values  is  registered
     with  the  Network  Information  Center, SRI International, Menlo
     Park, California.

     4.3.1.  RETURN-PATH

        This field  is  added  by  the  final  transport  system  that
        delivers  the message to its recipient.  The field is intended
        to contain definitive information about the address and  route
        back to the message's originator.

        Note:  The "Reply-To" field is added  by  the  originator  and
               serves  to  direct  replies,  whereas the "Return-Path"
               field is used to identify a path back to  the  origina-
               tor.

        While the syntax  indicates  that  a  route  specification  is
        optional,  every attempt should be made to provide that infor-
        mation in this field.

     4.3.2.  RECEIVED

        A copy of this field is added by each transport  service  that
        relays the message.  The information in the field can be quite
        useful for tracing transport problems.

        The names of the sending  and  receiving  hosts  and  time-of-
        receipt may be specified.  The "via" parameter may be used, to
        indicate what physical mechanism the message  was  sent  over,
        such  as  Arpanet or Phonenet, and the "with" parameter may be
        used to indicate the mail-,  or  connection-,  level  protocol
        that  was  used,  such  as  the  SMTP  mail protocol, or X.25
        transport protocol.

        Note:  Several "with" parameters may  be  included,  to  fully
               specify the set of protocols that were used.

        Some transport services queue mail; the internal message iden-
        tifier that is assigned to the message may be noted, using the
        "id" parameter.  When the  sending  host  uses  a  destination
        address specification that the receiving host reinterprets, by
        expansion or transformation, the receiving host  may  wish  to
        record  the original specification, using the "for" parameter.
        For example, when a copy of mail is sent to the  member  of  a
        distribution  list,  this  parameter may be used to record the
        original address that was used to specify the list.

     4.4.  ORIGINATOR FIELDS

          The standard allows only a subset of the combinations possi-
     ble  with the From, Sender, Reply-To, Resent-From, Resent-Sender,
     and Resent-Reply-To fields.  The limitation is intentional.

     4.4.1.  FROM / RESENT-FROM

        This field contains the identity of the person(s)  who  wished
        this  message to be sent.  The message-creation process should
        default this field  to  be  a  single,  authenticated  machine
        address,  indicating  the  AGENT  (person,  system or process)
        entering the message.  If this is not done, the "Sender" field
        MUST  be  present.  If the "From" field IS defaulted this way,
        the "Sender" field is  optional  and  is  redundant  with  the
        "From"  field.   In  all  cases, addresses in the "From" field
        must be machine-usable (addr-specs) and may not contain  named
        lists (groups).

     4.4.2.  SENDER / RESENT-SENDER

        This field contains the authenticated identity  of  the  AGENT
        (person,  system  or  process)  that sends the message.  It is
        intended for use when the sender is not the author of the mes-
        sage,  or  to  indicate  who among a group of authors actually
        sent the message.  If the contents of the "Sender" field would
        be  completely  redundant  with  the  "From"  field,  then the
        "Sender" field need not be present and its use is  discouraged
        (though  still legal).  In particular, the "Sender" field MUST
        be present if it is NOT the same as the "From" field.

        The Sender mailbox  specification  includes  a  word  sequence
        which  must correspond to a specific agent (i.e., a human user
        or a computer program) rather than a standard  address.   This
-More-        indicates  the  expectation  that  the field will identify the
        single AGENT (person,  system,  or  process)  responsible  for
        sending  the mail and not simply include the name of a mailbox
        from which the mail was sent.  For example in the  case  of  a
        shared login name, the name, by itself, would not be adequate.
        The local-part address unit, which refers to  this  agent,  is
        expected to be a computer system term, and not (for example) a
        generalized person reference which can  be  used  outside  the
        network text message context.

        Since the critical function served by the  "Sender"  field  is
        identification  of  the agent responsible for sending mail and
        since computer programs cannot be held accountable  for  their
        behavior, it is strongly recommended that when a computer pro-
        gram generates a message, the HUMAN  who  is  responsible  for
        that program be referenced as part of the "Sender" field mail-
        box specification.

     4.4.3.  REPLY-TO / RESENT-REPLY-TO

        This field provides a general  mechanism  for  indicating  any
        mailbox(es)  to which responses are to be sent.  Three typical
        uses for this feature can  be  distinguished.   In  the  first
        case,  the  author(s)  may  not  have  regular  machine-based
        mailboxes and therefore wish(es) to indicate an alternate machine
        address.   In  the  second case, an author may wish additional
        persons to be made aware of, or responsible for,  replies.   A
        somewhat  different  use  may be of some help to "text message
        teleconferencing" groups equipped with automatic  distribution
        services:   include the address of that service in the "Reply-
        To" field of all messages  submitted  to  the  teleconference;
        then  participants  can  "reply"  to conference submissions to
        guarantee the correct distribution of any submission of  their
        own.

        Note:  The "Return-Path" field is added by the mail  transport
               service,  at the time of final delivery.  It is intended
               to identify a path back to the originator of  the  mes-
               sage.   The  "Reply-To"  field  is added by the message
               originator and is intended to direct replies.

     4.4.4.  AUTOMATIC USE OF FROM / SENDER / REPLY-TO

        For systems which automatically  generate  address  lists  for
        replies to messages, the following recommendations are made:

            o   The "Sender" field mailbox should be sent  notices  of
                any  problems in transport or delivery of the original
                messages.  If there is no  "Sender"  field,  then  the
                "From" field mailbox should be used.

            o   The  "Sender"  field  mailbox  should  NEVER  be  used
                automatically, in a recipient's reply message.

            o   If the "Reply-To" field exists, then the reply  should
                go to the addresses indicated in that field and not to
                the address(es) indicated in the "From" field.

            o   If there is a "From" field, but no  "Reply-To"  field,
                the  reply should be sent to the address(es) indicated
                in the "From" field.

        Sometimes, a recipient may actually wish to  communicate  with
        the  person  that  initiated  the  message  transfer.  In such
        cases, it is reasonable to use the "Sender" address.

        This recommendation is intended  only  for  automated  use  of
        originator-fields  and is not intended to suggest that replies
        may not also be sent to other recipients of messages.   It  is
        up  to  the  respective  mail-handling programs to decide what
        additional facilities will be provided.
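
        The recommendations for automated replies can be sketched  in
        Python as below.  The function names and the dictionary  lay-
        out (lower-cased field names mapped to lists of addresses)  are
        assumptions made for the illustration.  The "Sender"  mailbox
        is used only for problem notices, never for the reply itself.

```python
def reply_recipients(fields: dict) -> list:
    """Addresses an automatically generated reply should go to."""
    if fields.get("reply-to"):
        return list(fields["reply-to"])   # "Reply-To" overrides "From"
    return list(fields.get("from", []))   # otherwise reply to "From"


def problem_notice_recipients(fields: dict) -> list:
    """Addresses that should receive transport or delivery problem notices."""
    if fields.get("sender"):
        return list(fields["sender"])     # "Sender" first
    return list(fields.get("from", []))   # "From" when there is no "Sender"
```

        With fields holding a "from" of ["author@example.org"] and  a
        "sender" of ["secretary@example.org"] (both made up), the reply
        goes to the author and problem notices go to the secretary.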

        Examples are provided in Appendix A.

     4.5.  RECEIVER FIELDS

     4.5.1.  TO / RESENT-TO

        This field contains the identity of the primary recipients  of
        the message.

     4.5.2.  CC / RESENT-CC

        This field contains the identity of  the  secondary  (informa-
        tional) recipients of the message.

     4.5.3.  BCC / RESENT-BCC

        This field contains the identity of additional  recipients  of
        the  message.   The contents of this field are not included in
        copies of the message sent to the primary and secondary  reci-
        pients.   Some  systems  may choose to include the text of the
        "Bcc" field only in the author(s)'s  copy,  while  others  may
        also include it in the text sent to all those indicated in the
        "Bcc" list.

     4.6.  REFERENCE FIELDS

     4.6.1.  MESSAGE-ID / RESENT-MESSAGE-ID

             This field contains a unique identifier  (the  local-part
        address  unit)  which  refers to THIS version of THIS message.
        The uniqueness of the message identifier is guaranteed by  the
        host  which  generates  it.  This identifier is intended to be
        machine readable and not necessarily meaningful to humans.   A
        message  identifier pertains to exactly one instantiation of a
        particular message; subsequent revisions to the message should
        each receive new message identifiers.
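
        One possible way for a host to build such  an  identifier  is
        sketched below in Python.  The scheme (time, process id and a
        counter joined by periods) and the  function  name  are
        assumptions for illustration, not a method prescribed by this
        standard.  The result has the msg-id form: "<" addr-spec ">".

```python
import itertools
import os
import time

_sequence = itertools.count()  # distinguishes ids made within one second


def make_msg_id(host: str) -> str:
    """Build a msg-id whose local-part is unique on the generating host."""
    local_part = f"{int(time.time())}.{os.getpid()}.{next(_sequence)}"
    return f"<{local_part}@{host}>"
```

        The host name passed in (for example "host.example") is  what
        makes the identifier unique across hosts.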

     4.6.2.  IN-REPLY-TO

             The contents of this field identify  previous  correspon-
        dence  which this message answers.  Note that if message iden-
        tifiers are used in this  field,  they  must  use  the  msg-id
        specification format.

     4.6.3.  REFERENCES

             The contents of this field identify other  correspondence
        which  this message references.  Note that if message identif-
        iers are used, they must use the msg-id specification format.

     4.6.4.  KEYWORDS

             This field contains keywords  or  phrases,  separated  by
        commas.

     4.7.  OTHER FIELDS

     4.7.1.  SUBJECT

             This is intended to provide a summary,  or  indicate  the
        nature, of the message.

     4.7.2.  COMMENTS

             Permits adding text comments  onto  the  message  without
        disturbing the contents of the message's body.

     4.7.3.  ENCRYPTED

             Sometimes,  data  encryption  is  used  to  increase  the
        privacy  of  message  contents.   If the body of a message has
        been encrypted, to keep its contents private, the  "Encrypted"
        field  can be used to note the fact and to indicate the nature
        of the encryption.  The first <word> parameter  indicates  the
        software  used  to  encrypt the body, and the second, optional
        <word> is intended to  aid  the  recipient  in  selecting  the
        proper  decryption  key.   This  code word may be viewed as an
        index to a table of keys held by the recipient.

        Note:  Unfortunately, headers must contain envelope,  as  well
               as  contents,  information.  Consequently, it is neces-
               sary that they remain unencrypted, so that  mail  tran-
               sport   services   may   access   them.   Since  names,
               addresses, and "Subject"  field  contents  may  contain
               sensitive  information,  this  requirement limits total
               message privacy.

             Names of encryption software are registered with the Net-
        work  Information Center, SRI International, Menlo Park, Cali-
        fornia.

     4.7.4.  EXTENSION-FIELD

             A limited number of common fields have  been  defined  in
        this  document.   As  network mail requirements dictate, addi-
        tional fields may be standardized.   To  provide  user-defined
        fields  with  a  measure  of  safety,  in name selection, such
        extension-fields will never have names  that  begin  with  the
        string "X-".

             Names of Extension-fields are registered with the Network
        Information Center, SRI International, Menlo Park, California.

     4.7.5.  USER-DEFINED-FIELD

             Individual users of network mail are free to  define  and
        use  additional  header  fields.   Such fields must have names
        which are not already used in the current specification or  in
        any definitions of extension-fields, and the overall syntax of
        these user-defined-fields must conform to this specification's
        rules   for   delimiting  and  folding  fields.   Due  to  the
        extension-field  publishing  process,  the  name  of  a  user-
        defined-field may be pre-empted.

        Note:  The prefatory string "X-" will never  be  used  in  the
               names  of Extension-fields.  This provides user-defined
               fields with a protected set of names.

     5.  DATE AND TIME SPECIFICATION

     5.1.  SYNTAX

     date-time   =  [ day "," ] date time        ; dd mm yy
                                                 ;  hh:mm:ss zzz

     day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
                 /  "Fri"  / "Sat" /  "Sun"

     date        =  1*2DIGIT month 2DIGIT        ; day month year
                                                 ;  e.g. 20 Jun 82

     month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
                 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
                 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"

     time        =  hour zone                    ; ANSI and Military

     hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
                                                 ; 00:00:00 - 23:59:59

     zone        =  "UT"  / "GMT"                ; Universal Time
                                                 ; North American : UT
                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
                 /  1ALPHA                       ; Military: Z = UT;
                                                 ;  A:-1; (J not used)
                                                 ;  M:-12; N:+1; Y:+12
                 / ( ("+" / "-") 4DIGIT )        ; Local differential
                                                 ;  hours+min. (HHMM)

     5.2.  SEMANTICS

          If included, day-of-week must be the day implied by the date
     specification.
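
     The consistency rule above can be sketched in Python.  This is  a
     minimal illustration, not part of the standard:  the  helper name
     day_matches_date and the reading of the two-digit  year  as  19yy
     are assumptions.

```python
import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def day_matches_date(day, dd, month, yy, century=1900):
    """Return True if <day> is the weekday implied by "dd month yy".

    The century is an assumption; the grammar only carries two digits.
    """
    date = datetime.date(century + yy, MONTHS.index(month) + 1, dd)
    # date.weekday() counts Monday as 0, which matches the order of DAYS.
    return DAYS[date.weekday()] == day


if __name__ == "__main__":
    print(day_matches_date("Fri", 13, "Aug", 82))
```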

          Time zone may be indicated in several ways.  "UT" is Univer-
     sal  Time  (formerly called "Greenwich Mean Time"); "GMT" is per-
     mitted as a reference to Universal Time.  The  military  standard
     uses  a  single  character for each zone.  "Z" is Universal Time.
     "A" indicates one hour earlier, and "M" indicates 12  hours  ear-
     lier;  "N"  is  one  hour  later, and "Y" is 12 hours later.  The
     letter "J" is not used.  The other remaining two forms are  taken
     from ANSI standard X3.51-1975.  One allows explicit indication of
     the amount of offset from UT; the other uses  common  3-character
     strings for indicating time zones in North America.
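
     The zone forms above can be reduced to an offset from UT.  The
     following Python sketch follows the text of this section as
     written.  The function name zone_offset_minutes is hypothetical,
     and the offsets for military letters other than Z, A, M, N and Y
     are an assumption (one hour per letter, skipping J).  Note that
     later Internet standards (e.g., RFC 1123) state that the military
     offsets given here are reversed in sign, so treat those values
     with caution.

```python
NAMED_ZONES = {
    "UT": 0, "GMT": 0,
    "EST": -5 * 60, "EDT": -4 * 60,
    "CST": -6 * 60, "CDT": -5 * 60,
    "MST": -7 * 60, "MDT": -6 * 60,
    "PST": -8 * 60, "PDT": -7 * 60,
}


def zone_offset_minutes(zone):
    """Return the offset from UT, in minutes, for a <zone> token."""
    zone = zone.strip()
    upper = zone.upper()
    if upper in NAMED_ZONES:
        return NAMED_ZONES[upper]
    # Local differential: a sign followed by four digits (HHMM).
    if len(zone) == 5 and zone[0] in "+-" and zone[1:].isdigit():
        minutes = int(zone[1:3]) * 60 + int(zone[3:5])
        return minutes if zone[0] == "+" else -minutes
    # Military single-letter zones, signs as given in this section.
    if len(upper) == 1 and "A" <= upper <= "Z" and upper != "J":
        if upper == "Z":
            return 0
        if upper < "J":                                # A..I -> -1..-9
            return -(ord(upper) - ord("A") + 1) * 60
        if upper <= "M":                               # K..M -> -10..-12
            return -(ord(upper) - ord("A")) * 60
        return (ord(upper) - ord("N") + 1) * 60        # N..Y -> +1..+12
    raise ValueError("unrecognized zone: %r" % zone)


if __name__ == "__main__":
    for z in ("UT", "EST", "PDT", "+1300", "Z"):
        print(z, zone_offset_minutes(z))
```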


     6.  ADDRESS SPECIFICATION

     6.1.  SYNTAX

     address     =  mailbox                      ; one addressee
                 /  group                        ; named list

     group       =  phrase ":" [#mailbox] ";"

     mailbox     =  addr-spec                    ; simple address
                 /  phrase route-addr            ; name & addr-spec

     route-addr  =  "<" [route] addr-spec ">"

     route       =  1#("@" domain) ":"           ; path-relative

     addr-spec   =  local-part "@" domain        ; global address

     local-part  =  word *("." word)             ; uninterpreted
                                                 ; case-preserved

     domain      =  sub-domain *("." sub-domain)

     sub-domain  =  domain-ref / domain-literal

     domain-ref  =  atom                         ; symbolic reference
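
     As a rough illustration of how the rules above decompose  a
     <mailbox>, the Python sketch below splits one into its phrase,
     route, local-part and domain.  The name split_mailbox is
     hypothetical, and the sketch is deliberately incomplete:  it
     assumes comments have already been removed and that no quoted
     string in the phrase contains "<".

```python
def split_mailbox(mailbox):
    """Split a <mailbox> into (phrase, route, local_part, domain)."""
    text = mailbox.strip()
    phrase, route = "", ""
    if text.endswith(">") and "<" in text:
        # "phrase route-addr" form: text before the first "<" is the phrase.
        phrase, _, text = text[:-1].partition("<")
        phrase = phrase.strip()
        if text.startswith("@"):
            # Optional route, 1#("@" domain) ":", precedes the addr-spec.
            route, _, text = text.partition(":")
    # The domain cannot contain "@", so split at the right-most one.
    local_part, at, domain = text.rpartition("@")
    if not at or not local_part.strip() or not domain.strip():
        raise ValueError("not an addr-spec: %r" % mailbox)
    return phrase, route, local_part.strip(), domain.strip()


if __name__ == "__main__":
    print(split_mailbox("Alfred Neuman <Neuman@BBN-TENEXA>"))
    print(split_mailbox("Neuman@BBN-TENEXA"))
```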

     6.2.  SEMANTICS

          A mailbox receives mail.  It is a  conceptual  entity  which
     does  not necessarily pertain to file storage.  For example, some
     sites may choose to print mail on their line printer and  deliver
     the output to the addressee's desk.

          A mailbox specification comprises a person, system  or  pro-
     cess name reference, a domain-dependent string, and a name-domain
     reference.  The name reference is optional and is usually used to
     indicate  the  human name of a recipient.  The name-domain refer-
     ence specifies a sequence of sub-domains.   The  domain-dependent
     string is uninterpreted, except by the final sub-domain; the rest
     of the mail service merely transmits it as a literal string.

     6.2.1.  DOMAINS

        A name-domain is a set of registered (mail)  names.   A  name-
        domain  specification  resolves  to  a subordinate name-domain
        specification  or  to  a  terminal  domain-dependent   string.
        Hence,  domain  specification  is  extensible,  permitting any
        number of registration levels.


        Name-domains model a global, logical, hierarchical  addressing
        scheme.   The  model is logical, in that an address specifica-
        tion is related to name registration and  is  not  necessarily
        tied  to  transmission  path.   The  model's  hierarchy  is  a
        directed graph, called an in-tree, such that there is a single
        path  from  the root of the tree to any node in the hierarchy.
        If more than one path actually exists, they are considered  to
        be different addresses.

        The root node is common to all addresses; consequently, it  is
        not  referenced.   Its  children  constitute "top-level" name-
        domains.  Usually, a service has access to its own full domain
        specification and to the names of all top-level name-domains.

        The "top" of the domain addressing hierarchy -- a child of the
        root  --  is  indicated  by  the right-most field, in a domain
        specification.  Its child is specified to the left, its  child
        to the left, and so on.
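
        The right-to-left ordering described above can  be  shown  by
        listing the sub-domains of a domain from the top  level  down.
        This Python fragment is only a sketch; domain_levels is  a
        hypothetical name, and a bracketed domain-literal is kept as
        a single unit even though it contains periods.

```python
import re

# A sub-domain is either a bracketed domain-literal or a run of
# characters containing no period or bracket.
_SUB_DOMAIN = re.compile(r"\[[^\]]*\]|[^.\[\]]+")


def domain_levels(domain):
    """Return the sub-domains ordered from the top level downward."""
    return list(reversed(_SUB_DOMAIN.findall(domain)))


if __name__ == "__main__":
    print(domain_levels("registry-A.registry-1.organization-X"))
```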

        Some groups provide formal registration services;  these  con-
        stitute   name-domains   that  are  independent  logically  of
        specific machines.  In addition, networks and machines  impli-
        citly  compose name-domains, since their membership usually is
        registered in name tables.

        In the case of formal registration, an organization implements
        a  (distributed)  data base which provides an address-to-route
        mapping service for addresses of the form:

                         person@registry.organization

        Note that "organization" is a logical  entity,  separate  from
        any particular communication network.

        A mechanism for accessing "organization" is universally avail-
        able.   That mechanism, in turn, seeks an instantiation of the
        registry; its location is not indicated in the address specif-
        ication.   It  is assumed that the system which operates under
        the name "organization" knows how to find a subordinate regis-
        try.  The registry will then use the "person" string to deter-
        mine where to send the mail specification.

        The latter,  network-oriented  case  permits  simple,  direct,
        attachment-related address specification, such as:

                              user@host.network

        Once the network is accessed, it is expected  that  a  message
        will  go  directly  to the host and that the host will resolve
        the user name, placing the message in the user's mailbox.

     6.2.2.  ABBREVIATED DOMAIN SPECIFICATION

        Since any number of  levels  is  possible  within  the  domain
        hierarchy,  specification  of  a  fully  qualified address can
        become inconvenient.  This standard permits abbreviated domain
        specification, in a special case:

            For the address of  the  sender,  call  the  left-most
            sub-domain  Level  N.   In a header address, if all of
            the sub-domains above (i.e., to the right of) Level  N
            are  the same as those of the sender, then they do not
            have to appear in the specification.   Otherwise,  the
            address must be fully qualified.

            This feature is subject  to  approval  by  local  sub-
            domains.   Individual  sub-domains  may  require their
            member systems, which originate mail, to provide  full
            domain  specification only.  When permitted, abbrevia-
            tions may be present  only  while  the  message  stays
            within the sub-domain of the sender.

            Use of this mechanism requires the sender's sub-domain
            to reserve the names of all top-level domains, so that
            full specifications can be distinguished from abbrevi-
            ated specifications.

        For example, if a sender's address is:

                 sender@registry-A.registry-1.organization-X

        and one recipient's address is:

                recipient@registry-B.registry-1.organization-X

        and another's is:

                recipient@registry-C.registry-2.organization-X

        then ".registry-1.organization-X" need not be specified in the
        message,  but  "registry-C.registry-2"  DOES  have  to  be
        specified.  That is, the first two addresses may  be  abbrevi-
        ated, but the third address must be fully specified.
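
        The abbreviation rule and the example above can be sketched in
        Python as follows.  The function names and the top_level_names
        argument are hypothetical.  The latter stands for the reserved
        list of top-level domain names that the rule requires in order
        to tell full specifications from abbreviated ones.

```python
def parent_of(sender_domain):
    """Sub-domains above (to the right of) the sender's Level N."""
    _, _, parent = sender_domain.partition(".")
    return parent


def abbreviate(domain, sender_domain):
    """Drop the part of <domain> shared with the sender, if permitted."""
    parent = parent_of(sender_domain)
    suffix = "." + parent
    if parent and domain.endswith(suffix):
        return domain[:-len(suffix)]
    return domain            # otherwise it must stay fully qualified


def expand(domain, sender_domain, top_level_names):
    """Restore a full specification before a domain boundary is crossed."""
    rightmost = domain.rsplit(".", 1)[-1]
    if rightmost in top_level_names:
        return domain        # already ends with a top-level name-domain
    parent = parent_of(sender_domain)
    return domain + "." + parent if parent else domain


if __name__ == "__main__":
    sender = "registry-A.registry-1.organization-X"
    for d in ("registry-B.registry-1.organization-X",
              "registry-C.registry-2.organization-X"):
        print(d, "->", abbreviate(d, sender))
```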

        When a message crosses a domain boundary, all  addresses  must
        be  specified  in  the  full format, ending with the top-level
        name-domain in the right-most field.  It is the responsibility
        of  mail  forwarding services to ensure that addresses conform
        with this requirement.  In the case of abbreviated  addresses,
        the  relaying  service must make the necessary expansions.  It
        should be noted that it often is difficult for such a  service
        to locate all occurrences of address abbreviations.  For exam-
        ple, it will not be possible to find such abbreviations within
        the  body  of  the  message.   The "Return-Path" field can aid
        recipients in recovering from these errors.

        Note:  When passing any portion of an addr-spec onto a process
               which  does  not interpret data according to this stan-
               dard (e.g., mail protocol servers), there  must  be  NO
               LWSP-chars  preceding  or  following the at-sign or any
               delimiting period ("."), such as  shown  in  the  above
               examples,   and   only  ONE  SPACE  between  contiguous
               <word>s.

     6.2.3.  DOMAIN TERMS

        A domain-ref must be THE official name of a registry, network,
        or  host.   It  is  a  symbolic  reference, within a name sub-
        domain.  At times, it is necessary to bypass standard  mechan-
        isms  for  resolving  such  references,  using  more primitive
        information, such as a network host address  rather  than  its
        associated host name.

        To permit such references, this standard provides the  domain-
        literal  construct.   Its contents must conform with the needs
        of the sub-domain in which it is interpreted.

        Domain-literals which refer to domains within the ARPA  Inter-
        net  specify  32-bit  Internet addresses, in four 8-bit fields
        noted in decimal, as described in Request for  Comments  #820,
        "Assigned Numbers."  For example:

                                 [10.0.3.19]

        Note:  THE USE OF DOMAIN-LITERALS IS STRONGLY DISCOURAGED.  It
               is  permitted  only  as  a means of bypassing temporary
               system limitations, such as name tables which  are  not
               complete.
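
        For illustration, a domain-literal of the ARPA Internet  form
        shown above can be converted to its 32-bit value as  follows.
        This is a sketch only; parse_domain_literal is a hypothetical
        name, and other sub-domains may give domain-literals  a  quite
        different interpretation.

```python
def parse_domain_literal(literal):
    """Convert "[a.b.c.d]" into a 32-bit integer Internet address."""
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("not a domain-literal: %r" % literal)
    fields = literal[1:-1].split(".")
    if len(fields) != 4 or not all(f.isdigit() for f in fields):
        raise ValueError("expected four decimal fields: %r" % literal)
    octets = [int(f) for f in fields]
    if any(o > 255 for o in octets):
        raise ValueError("each field must fit in 8 bits: %r" % literal)
    value = 0
    for o in octets:
        value = (value << 8) | o     # most significant field first
    return value


if __name__ == "__main__":
    print(hex(parse_domain_literal("[10.0.3.19]")))
```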

        The names of "top-level" domains, and  the  names  of  domains
        under them in the ARPA Internet, are registered with the  Net-
        work Information Center, SRI International, Menlo Park,  Cali-
        fornia.

     6.2.4.  DOMAIN-DEPENDENT LOCAL STRING

        The local-part of an  addr-spec  in  a  mailbox  specification
        (i.e.,  the  host's  name for the mailbox) is understood to be
        whatever the receiving mail protocol server allows.  For exam-
        ple,  some systems do not understand mailbox references of the
        form "P. D. Q. Bach", but others do.

        This specification treats periods (".") as lexical separators.
        Hence,  their  presence  in  local-parts which are not quoted-
        strings, is detected.   However,  such  occurrences  carry  NO
        semantics.  That is, if a local-part has periods within it, an
        address parser will divide the local-part into several tokens,
        but  the  sequence  of  tokens will be treated as one uninter-
        preted unit.  The sequence  will  be  re-assembled,  when  the
        address is passed outside of the system such as to a mail pro-
        tocol service.
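
        The tokenizing and re-assembly described above might look like
        the following Python sketch.  The names local_part_words  and
        _TOKEN are hypothetical, and the pattern only approximates the
        lexical rules (control characters are not checked).

```python
import re

# One token: a quoted-string, an atom, or a period.  Leading white
# space before each token is skipped.
_TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[^\s."()<>@,;:\\\[\]]+|\.)')


def local_part_words(local_part):
    """Return the <word>s of a local-part: word *("." word)."""
    text = local_part.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    words, dots = tokens[0::2], tokens[1::2]
    # Words and periods must alternate, starting and ending with a word.
    if len(tokens) % 2 == 0 or "." in words or any(d != "." for d in dots):
        raise ValueError('expected word *("." word): %r' % local_part)
    return words


if __name__ == "__main__":
    # Re-assembly: the periods carry no semantics, so simply rejoin.
    print(".".join(local_part_words("First.Last")))
```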

        For example, the address:

                           First.Last@Registry.Org

        is legal and does not require the local-part to be  surrounded
        with  quotation-marks.   (However,  "First  Last" DOES require
        quoting.)  The local-part of the address, when passed  outside
        of  the  mail  system,  within  the  Registry.Org  domain,  is
        "First.Last", again without quotation marks.

     6.2.5.  BALANCING LOCAL-PART AND DOMAIN

        In some cases, the boundary between local-part and domain  can
        be  flexible.  The local-part may be a simple string, which is
        used for the final determination of the  recipient's  mailbox.
        All  other  levels  of  reference  are, therefore, part of the
        domain.

        For some systems, in the case of abbreviated reference to  the
        local  and  subordinate  sub-domains,  it  may  be possible to
        specify only one reference within the domain  part  and  place
        the  other,  subordinate  name-domain  references  within  the
        local-part.  This would appear as:

                        mailbox.sub1.sub2@this-domain

        Such a specification would be acceptable  to  address  parsers
        which  conform  to  RFC  #733,  but  do not support this newer
        Internet standard.  While contrary to the intent of this stan-
        dard, the form is legal.

        Also, some sub-domains have a specification syntax which  does
        not conform to this standard.  For example:

                      sub-net.mailbox@sub-domain.domain

        uses a different parsing  sequence  for  local-part  than  for
        domain.

        Note:  As a rule,  the  domain  specification  should  contain
               fields  which  are  encoded  according to the syntax of
               this standard and which contain  generally-standardized
               information.   The local-part specification should con-
               tain only that portion of the  address  which  deviates
               from the form or intention of the domain field.

     6.2.6.  MULTIPLE MAILBOXES

        An individual may have several mailboxes and wish  to  receive
        mail  at  whatever  mailbox  is  convenient  for the sender to
        access.  This standard does not provide a means of  specifying
        "any member of" a list of mailboxes.

        A set of individuals may wish to receive mail as a single unit
        (i.e.,  a  distribution  list).  The <group> construct permits
        specification of such a list.  Recipient mailboxes are  speci-
        fied  within  the  bracketed  part (":" - ";").  A copy of the
        transmitted message is to be  sent  to  each  mailbox  listed.
        This  standard  does  not  permit  recursive  specification of
        groups within groups.

        While a list must be named, it is not required that  the  con-
        tents  of  the  list be included.  In this case, the <address>
        serves only as an indication of group distribution  and  would
        appear in the form:

                                    name:;
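
        A simplified reading of the <group> construct, including  the
        empty form just shown, can be sketched in Python.  The  name
        parse_group is hypothetical.  The sketch assumes that comments
        have been removed and that no quoted string contains a  comma,
        colon or semicolon.

```python
def parse_group(text):
    """Split "phrase : [#mailbox] ;" into (name, [mailbox, ...])."""
    name, colon, rest = text.strip().partition(":")
    rest = rest.rstrip()
    if not colon or not name.strip() or not rest.endswith(";"):
        raise ValueError("not a group: %r" % text)
    members = rest[:-1]
    if ";" in members:
        # A second ";" would mean a group inside a group.
        raise ValueError("groups may not be nested: %r" % text)
    # The "#" list form permits null elements, so skip empty entries.
    mailboxes = [m.strip() for m in members.split(",") if m.strip()]
    return name.strip(), mailboxes


if __name__ == "__main__":
    print(parse_group("name:;"))
```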

        Some mail  services  may  provide  a  group-list  distribution
        facility,  accepting  a single mailbox reference, expanding it
        to the full distribution list, and relaying the  mail  to  the
        list's  members.   This standard provides no additional syntax
        for indicating such a  service.   Using  the  <group>  address
        alternative,  while listing one mailbox in it, can mean either
        that the mailbox reference will be expanded to a list or  that
        there is a group with one member.

     6.2.7.  EXPLICIT PATH SPECIFICATION

        At times, a  message  originator  may  wish  to  indicate  the
        transmission  path  that  a  message  should  follow.  This is
        called source routing.  The normal addressing scheme, used  in
        an  addr-spec,  is  carefully separated from such information;
        the <route> portion of a route-addr is provided for such occa-
        sions.  It specifies the sequence of hosts and/or transmission
        services that are  to  be  traversed.   Both  domain-refs  and
        domain-literals may be used.

        Note:  The use of source routing is discouraged.   Unless  the
               sender has special need of path restriction, the choice
               of transmission route should be left to the mail  tran-
               sport service.

     6.3.  RESERVED ADDRESS

          It often is necessary to send mail to a site, without  know-
     ing  any  of its valid addresses.  For example, there may be mail
     system dysfunctions, or a user may wish to find  out  a  person's
     correct address, at that site.

          This standard specifies a single, reserved  mailbox  address
     (local-part)  which  is  to  be valid at each site.  Mail sent to
     that address is to be routed to  a  person  responsible  for  the
     site's mail system or to a person with responsibility for general
     site operation.  The name of the reserved local-part address is:

                                Postmaster

     so that "Postmaster@domain" is required to be valid.

     Note:  This reserved local-part must be  matched  without  sensi-
            tivity to alphabetic case, so that "POSTMASTER", "postmas-
            ter", and even "poStmASteR" are to be accepted.
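
     A receiving system might test for the reserved  local-part  along
     the lines of this Python sketch.  The name is_postmaster is hypo-
     thetical, and the sketch assumes a plain local-part@domain string
     with no comments or route.

```python
def is_postmaster(addr_spec):
    """Return True if the local-part is the reserved name "Postmaster"."""
    local_part, at, _domain = addr_spec.rpartition("@")
    if not at:
        local_part = addr_spec      # a bare local-part was supplied
    # The match must ignore alphabetic case.
    return local_part.strip().lower() == "postmaster"


if __name__ == "__main__":
    print(is_postmaster("poStmASteR@domain"))
```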












     7.  BIBLIOGRAPHY


     ANSI.  "USA Standard Code  for  Information  Interchange,"  X3.4.
        American  National Standards Institute: New York (1968).  Also
        in:  Feinler, E.  and J. Postel, eds., "ARPANET Protocol Hand-
        book", NIC 7104.

     ANSI.  "Representations of Universal Time, Local  Time  Differen-
        tials,  and United States Time Zone References for Information
        Interchange," X3.51-1975.  American National Standards  Insti-
        tute:  New York (1975).

     Bemer, R.W., "Time and the Computer."  In:  Interface  Age  (Feb.
        1979).

     Bennett, C.J.  "JNT Mail Protocol".  Joint Network Team,  Ruther-
        ford and Appleton Laboratory:  Didcot, England.

     Bhushan, A.K., Pogran, K.T., Tomlinson,  R.S.,  and  White,  J.E.
        "Standardizing  Network  Mail  Headers,"   ARPANET Request for
        Comments No. 561, Network Information Center  No.  18516;  SRI
        International:  Menlo Park (September 1973).

     Birrell, A.D., Levin, R.,  Needham,  R.M.,  and  Schroeder,  M.D.
        "Grapevine:  An Exercise in Distributed Computing," Communica-
        tions of the ACM 25, 4 (April 1982), 260-274.

     Crocker,  D.H.,  Vittal,  J.J.,  Pogran,  K.T.,  Henderson,  D.A.
        "Standard  for  the  Format  of  ARPA  Network Text Messages,"
        ARPANET Request for  Comments  No.  733,  Network  Information
        Center  No.  41952.   SRI International:  Menlo Park (November
        1977).

     Feinler, E.J. and Postel, J.B.  ARPANET Protocol  Handbook,  Net-
        work  Information  Center  No.  7104   (NTIS AD A003890).  SRI
        International:  Menlo Park (April 1976).

     Harary, F.   "Graph  Theory".   Addison-Wesley:   Reading,  Mass.
        (1969).

     Levin, R. and Schroeder, M.  "Transport  of  Electronic  Messages
        through  a  Network,"   TeleInformatics  79, pp. 29-33.  North
        Holland (1979).  Also  as  Xerox  Palo  Alto  Research  Center
        Technical Report CSL-79-4.

     Myer, T.H. and Henderson, D.A.  "Message Transmission  Protocol,"
        ARPANET  Request  for  Comments,  No. 680, Network Information
        Center No. 32116.  SRI International:  Menlo Park (1975).


     NBS.  "Specification of Message Format for Computer Based Message
        Systems, Recommended Federal Information Processing Standard."
        National  Bureau   of   Standards:    Gaithersburg,   Maryland
        (October 1981).

     NIC.  Internet Protocol Transition Workbook.  Network Information
        Center,   SRI-International,  Menlo  Park,  California  (March
        1982).

     Oppen, D.C. and Dalal, Y.K.  "The Clearinghouse:  A Decentralized
        Agent  for  Locating  Named  Objects in a Distributed Environ-
        ment," OPD-T8103.  Xerox Office Products Division:  Palo Alto,
        CA. (October 1981).

     Postel, J.B.  "Assigned Numbers,"  ARPANET Request for  Comments,
        No. 820.  SRI International:  Menlo Park (August 1982).

     Postel, J.B.  "Simple Mail Transfer  Protocol,"  ARPANET  Request
        for Comments, No. 821.  SRI International:  Menlo Park (August
        1982).

     Shoch, J.F.  "Internetwork naming, addressing  and  routing,"  in
        Proc. 17th IEEE Computer Society International Conference, pp.
        72-79, Sept. 1978, IEEE Cat. No. 78 CH 1388-8C.

     Su, Z. and Postel, J.  "The Domain Naming Convention for Internet
        User  Applications,"  ARPANET  Request  for Comments, No. 819.
        SRI International:  Menlo Park (August 1982).












                                 APPENDIX


     A.  EXAMPLES

     A.1.  ADDRESSES

     A.1.1.  Alfred Neuman <Neuman@BBN-TENEXA>

     A.1.2.  Neuman@BBN-TENEXA

             These two "Alfred Neuman" examples have identical  seman-
        tics, as far as the operation of the local host's mail sending
        (distribution) program (also sometimes  called  its  "mailer")
        and  the remote host's mail protocol server are concerned.  In
        the first example, the  "Alfred  Neuman"  is  ignored  by  the
        mailer,  as "Neuman@BBN-TENEXA" completely specifies the reci-
        pient.  The second example contains  no  superfluous  informa-
        tion,  and,  again,  "Neuman@BBN-TENEXA" is the intended reci-
        pient.

        Note:  When the message crosses name-domain  boundaries,  then
               these specifications must be changed, so as to indicate
               the remainder of the hierarchy, starting with  the  top
               level.

     A.1.3.  "George, Ted" <Shared@Group.Arpanet>

             This form might be used to indicate that a single mailbox
        is  shared  by several users.  The quoted string is ignored by
        the originating host's mailer, because  "Shared@Group.Arpanet"
        completely specifies the destination mailbox.

     A.1.4.  Wilt . (the  Stilt) Chamberlain@NBA.US

             The "(the  Stilt)" is a comment, which is NOT included in
        the  destination  mailbox  address  handed  to the originating
        system's mailer.  The local-part of the address is the  string
        "Wilt.Chamberlain", with NO space between the first and second
        words.

     A.1.5.  Address Lists

     Gourmets:  Pompous Person <WhoZiWhatZit@Cordon-Bleu>,
                Childs@WGBH.Boston, Galloping Gourmet@
                ANT.Down-Under (Australian National Television),
                Cheapie@Discount-Liquors;,
       Cruisers:  Port@Portugal, Jones@SEA;,
         Another@Somewhere.SomeOrg


        This group list example points out the use of comments and the
        mixing of addresses and groups.
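
        Comment removal, as used in example A.1.4 and in the list
        above, can be sketched in Python.  The name strip_comments is
        hypothetical.  The sketch honors nested comments, quoted
        strings and backslash quoting, but performs no other
        validation.

```python
def strip_comments(text):
    """Remove parenthesized comments that lie outside quoted strings."""
    out, depth, in_quotes, i = [], 0, False, 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash quotes the next character; keep the pair
            # unless it sits inside a comment.
            if depth == 0:
                out.append(text[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            in_quotes = ch != '"'       # a closing quote ends the string
        elif ch == "(":
            depth += 1                  # comments may nest
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
            in_quotes = ch == '"'
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_comments("Wilt . (the  Stilt) Chamberlain@NBA.US"))
```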

     A.2.  ORIGINATOR ITEMS

     A.2.1.  Author-sent

             George Jones logs into his host  as  "Jones".   He  sends
        mail himself.

            From:  Jones@Group.Org

        or

            From:  George Jones <Jones@Group.Org>

     A.2.2.  Secretary-sent

             George Jones logs in as Jones on his  host.   His  secre-
        tary,  who logs in as Secy, sends mail for him.  Replies to the
        mail should go to George.

            From:    George Jones <Jones@Group>
            Sender:  Secy@Other-Group

     A.2.3.  Secretary-sent, for user of shared directory

             George Jones' secretary sends mail  for  George.  Replies
        should go to George.

            From:     George Jones<Shared@Group.Org>
            Sender:   Secy@Other-Group

        Note that there need not be a space between  "Jones"  and  the
        "<",  but  adding a space enhances readability (as is the case
        in other examples).

     A.2.4.  Committee activity, with one author

             George is a member of a committee.  He wishes to have any
        replies to his message go to all committee members.

            From:     George Jones <Jones@Host.Net>
            Sender:   Jones@Host
            Reply-To: The Committee: Jones@Host.Net,
                                     Smith@Other.Org,
                                     Doe@Somewhere-Else;

        Note  that  if  George  had  not  included  himself   in   the
        enumeration  of  The  Committee,  he  would not have gotten an
        implicit reply; the presence of the  "Reply-to"  field  SUPER-
        SEDES the sending of a reply to the person named in the "From"
        field.
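
        The precedence noted above can be expressed as a small Python
        sketch.  The name reply_targets and the dictionary  layout
        (lower-cased field-names mapped to field-bodies) are assump-
        tions made for illustration.

```python
def reply_targets(headers):
    """Return the field-body that a reply should be addressed to."""
    reply_to = headers.get("reply-to", "").strip()
    if reply_to:
        return reply_to              # "Reply-To" supersedes "From"
    return headers.get("from", "").strip()


if __name__ == "__main__":
    print(reply_targets({
        "from": "George Jones <Jones@Host.Net>",
        "reply-to": "The Committee: Jones@Host.Net, Smith@Other.Org, "
                    "Doe@Somewhere-Else;",
    }))
```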

     A.2.5.  Secretary acting as full agent of author

             George Jones asks his secretary  (Secy@Host)  to  send  a
        message for him in his capacity as Group.  He wants his secre-
        tary to handle all replies.

            From:     George Jones <Group@Host>
            Sender:   Secy@Host
            Reply-To: Secy@Host

     A.2.6.  Agent for user without online mailbox

             A friend  of  George's,  Sarah,  is  visiting.   George's
        secretary  sends  some  mail to a friend of Sarah in computer-
        land.  Replies should go to George, whose mailbox is Jones  at
        Registry.

            From:     Sarah Friendly <Secy@Registry>
            Sender:   Secy-Name <Secy@Registry>
            Reply-To: Jones@Registry

     A.2.7.  Agent for member of a committee

             George's secretary sends out a message which was authored
        jointly by all the members of a committee.  Note that the name
        of the committee cannot be specified, since <group> names  are
        not permitted in the From field.

            From:   Jones@Host,
                    Smith@Other-Host,
                    Doe@Somewhere-Else
            Sender: Secy@SHost



     A.3.  COMPLETE HEADERS

     A.3.1.  Minimum required

     Date:     26 Aug 76 14:29 EDT       Date:     26 Aug 76 14:29 EDT
     From:     Jones@Registry.Org   or   From:     Jones@Registry.Org
     Bcc:                                To:       Smith@Registry.Org

        Note that the "Bcc" field may be empty, while the  "To"  field
        is required to have at least one address.

     A.3.2.  Using some of the additional fields

     Date:     26 Aug 76 14:30 EDT
     From:     George Jones<Group@Host>
     Sender:   Secy@SHOST
     To:       "Al Neuman"@Mad-Host,
               Sam.Irving@Other-Host
     Message-ID:  <some.string@SHOST>

     A.3.3.  About as complex as you're going to get

     Date     :  27 Aug 76 09:32 PDT
     From     :  Ken Davis <KDavis@This-Host.This-net>
     Subject  :  Re: The Syntax in the RFC
     Sender   :  KSecy@Other-Host
     Reply-To :  Sam.Irving@Reg.Organization
     To       :  George Jones <Group@Some-Reg.An-Org>,
                 Al.Neuman@MAD.Publisher
     cc       :  Important folk:
                   Tom Softwood <Balsa@Tree.Root>,
                   "Sam Irving"@Other-Host;,
                 Standard Distribution:
                   /main/davis/people/standard@Other-Host,
                   "<Jones>standard.dist.3"@Tops-20-Host;
     Comment  :  Sam is away on business. He asked me to handle
                 his mail for him.  He'll be able to provide  a
                 more  accurate  explanation  when  he  returns
                 next week.
     In-Reply-To: <some.string@DBM.Group>, George's message
     X-Special-action:  This is a sample of user-defined field-
                 names.  There could also be a field-name
                 "Special-action", but its name might later be
                 preempted
     Message-ID: <4231.629.XYzi-What@Other-Host>






     B.  SIMPLE FIELD PARSING

          Some mail-reading software systems may wish to perform  only
     minimal  processing,  ignoring  the internal syntax of structured
     field-bodies and treating them the  same  as  unstructured-field-
     bodies.  Such software will need only to distinguish:

         o   Header fields from the message body,

         o   Beginnings of fields from lines which continue fields,

         o   Field-names from field-contents.

          The abbreviated set of syntactic rules  which  follows  will
     suffice  for  this  purpose.  It describes a limited view of mes-
     sages and is a subset of the syntactic rules provided in the main
     part of this specification.  One small exception is that the con-
     tents of field-bodies consist only of text:

     B.1.  SYNTAX


     message         =   *field *(CRLF *text)

     field           =    field-name ":" [field-body] CRLF

     field-name      =  1*<any CHAR, excluding CTLs, SPACE, and ":">

     field-body      =   *text [CRLF LWSP-char field-body]


     B.2.  SEMANTICS

          Headers occur before the message body and are terminated  by
     a null line (i.e., two contiguous CRLFs).

          A line which continues a header field begins with a SPACE or
     HTAB  character,  while  a  line  beginning a field starts with a
     printable character which is not a colon.

          A field-name consists of one or  more  printable  characters
     (excluding  colon,  space, and control-characters).  A field-name
     MUST be contained on one line.  Upper and lower case are not dis-
     tinguished when comparing field-names.
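
     The rules above are enough to write a minimal header parser.  The
     Python sketch below is one possible reading, not a reference
     implementation.  The names parse_simple and get_field are hypo-
     thetical.  It assumes CRLF line endings and at least one header
     field, and it tolerates white space between a field-name and  its
     colon, as seen in example A.3.3.

```python
def parse_simple(message):
    """Split a message into ([(field_name, field_body), ...], body)."""
    # Headers end at the first null line, i.e. two contiguous CRLFs.
    head, _, body = message.partition("\r\n\r\n")
    fields = []
    for line in head.split("\r\n"):
        if line[:1] in (" ", "\t"):
            # A continuation line: unfold it onto the previous field.
            if not fields:
                raise ValueError("continuation line before any field")
            name, value = fields[-1]
            fields[-1] = (name, value + line)
        elif line:
            name, colon, value = line.partition(":")
            name = name.rstrip()
            # Field-names exclude SPACE, control characters and ":".
            if not colon or not name or any(
                    c <= " " or c == "\x7f" for c in name):
                raise ValueError("bad header line: %r" % line)
            fields.append((name, value))
    return fields, body


def get_field(fields, wanted):
    """Return the first field-body whose name matches, ignoring case."""
    for name, value in fields:
        if name.lower() == wanted.lower():
            return value
    return None


if __name__ == "__main__":
    sample = ("Date: 26 Aug 76 14:29 EDT\r\n"
              "From: Jones@Registry.Org\r\n"
              "To: Smith@Registry.Org\r\n"
              "\r\n"
              "Text of the message.\r\n")
    print(parse_simple(sample))
```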







     C.  DIFFERENCES FROM RFC #733

          The following summarizes the differences between this  stan-
     dard  and the one specified in Arpanet Request for Comments #733,
     "Standard for the Format of ARPA  Network  Text  Messages".   The
     differences  are  listed  in the order of their occurrence in the
     current specification.

     C.1.  FIELD DEFINITIONS

     C.1.1.  FIELD NAMES

        These now must be a sequence of  printable  characters.   They
        may not contain any LWSP-chars.

     C.2.  LEXICAL TOKENS

     C.2.1.  SPECIALS

        The characters period ("."), left-square  bracket  ("["),  and
        right-square  bracket ("]") have been added.  For presentation
        purposes, and when passing a specification to  a  system  that
        does  not conform to this standard, periods are to be contigu-
        ous with their surrounding lexical tokens.   No  linear-white-
        space  is  permitted  between them.  The presence of one LWSP-
        char between other tokens is still directed.

     C.2.2.  ATOM

        Atoms may not contain SPACE.

     C.2.3.  SPECIAL TEXT

        ctext and qtext have had backslash ("\") added to the list  of
        prohibited characters.

     C.2.4.  DOMAINS

        The lexical tokens  <domain-literal>  and  <dtext>  have  been
        added.

     C.3.  MESSAGE SPECIFICATION

     C.3.1.  TRACE

        The "Return-path:" and "Received:" fields have been specified.







     C.3.2.  FROM

        The "From" field must contain machine-usable addresses  (addr-
        spec).   Multiple  addresses may be specified, but named-lists
        (groups) may not.

     C.3.3.  RESENT

        The meta-construct of prefacing field names  with  the  string
        "Resent-"  has been added, to indicate that a message has been
        forwarded by an intermediate recipient.

     C.3.4.  DESTINATION

        A message must contain at least one destination address field.
        "To" and "CC" are required to contain at least one address.

     C.3.5.  IN-REPLY-TO

        The field-body is no longer a comma-separated list, although a
        sequence is still permitted.

     C.3.6.  REFERENCE

        The field-body is no longer a comma-separated list, although a
        sequence is still permitted.

     C.3.7.  ENCRYPTED

        A field has been specified that permits  senders  to  indicate
        that the body of a message has been encrypted.

     C.3.8.  EXTENSION-FIELD

        Extension fields are prohibited from beginning with the  char-
        acters "X-".

     C.4.  DATE AND TIME SPECIFICATION

     C.4.1.  SIMPLIFICATION

        Fewer optional forms are permitted  and  the  list  of  three-
        letter time zones has been shortened.

     C.5.  ADDRESS SPECIFICATION

     C.5.1.  ADDRESS

        The use of quoted-string, and the ":"-atom-":" construct, have
        been  removed.   An  address  now  is  either a single mailbox
        reference or is a named list of addresses.  The  latter  indi-
        cates a group distribution.

     C.5.2.  GROUPS

        Group lists are now required to have a name.   Group  lists
        may not be nested.

     C.5.3.  MAILBOX

        A mailbox specification  may  indicate  a  person's  name,  as
        before.   Such  a  named  list  no longer may specify multiple
        mailboxes and may not be nested.

     C.5.4.  ROUTE ADDRESSING

        Addresses now are taken to be absolute, global specifications,
        independent  of transmission paths.  The <route> construct has
        been provided, to permit explicit specification  of  transmis-
        sion  path.   RFC  #733's  use  of multiple at-signs ("@") was
        intended as a general syntax  for  indicating  routing  and/or
        hierarchical addressing.  The current standard separates these
        specifications and only one at-sign is permitted.

     C.5.5.  AT-SIGN

        The string " at " no longer is used as an  address  delimiter.
        Only at-sign ("@") serves the function.

     C.5.6.  DOMAINS

        Hierarchical, logical name-domains have been added.

     C.6.  RESERVED ADDRESS

     The local-part "Postmaster" has been reserved, so that users  can
     be guaranteed at least one valid address at a site.












     D.  ALPHABETICAL LISTING OF SYNTAX RULES

     address     =  mailbox                      ; one addressee
                 /  group                        ; named list
     addr-spec   =  local-part "@" domain        ; global address
     ALPHA       =  <any ASCII alphabetic character>
                                                 ; (101-132, 65.- 90.)
                                                 ; (141-172, 97.-122.)
     atom        =  1*<any CHAR except specials, SPACE and CTLs>
     authentic   =   "From"       ":"   mailbox  ; Single author
                 / ( "Sender"     ":"   mailbox  ; Actual submittor
                     "From"       ":" 1#mailbox) ; Multiple authors
                                                 ;  or not sender
     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
     comment     =  "(" *(ctext / quoted-pair / comment) ")"
     CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
     CRLF        =  CR LF
     ctext       =  <any CHAR excluding "(",     ; => may be folded
                     ")", "\" & CR, & including
                     linear-white-space>
     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
                     character and DEL>          ; (    177,     127.)
     date        =  1*2DIGIT month 2DIGIT        ; day month year
                                                 ;  e.g. 20 Jun 82
     dates       =   orig-date                   ; Original
                   [ resent-date ]               ; Forwarded
     date-time   =  [ day "," ] date time        ; dd mm yy
                                                 ;  hh:mm:ss zzz
     day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
                 /  "Fri"  / "Sat" /  "Sun"
     delimiters  =  specials / linear-white-space / comment
     destination =  "To"          ":" 1#address  ; Primary
                 /  "Resent-To"   ":" 1#address
                 /  "cc"          ":" 1#address  ; Secondary
                 /  "Resent-cc"   ":" 1#address
                 /  "bcc"         ":"  #address  ; Blind carbon
                 /  "Resent-bcc"  ":"  #address
     DIGIT       =  <any ASCII decimal digit>    ; ( 60- 71, 48.- 57.)
     domain      =  sub-domain *("." sub-domain)
     domain-literal =  "[" *(dtext / quoted-pair) "]"
     domain-ref  =  atom                         ; symbolic reference
     dtext       =  <any CHAR excluding "[",     ; => may be folded
                     "]", "\" & CR, & including
                     linear-white-space>
     extension-field =
                   <Any field which is defined in a document
                    published as a formal extension to this
                    specification; none will have names beginning
                    with the string "X-">
     field       =  field-name ":" [ field-body ] CRLF
     fields      =    dates                      ; Creation time,
                      source                     ;  author id & one
                    1*destination                ;  address required
                     *optional-field             ;  others optional
     field-body  =  field-body-contents
                    [CRLF LWSP-char field-body]
     field-body-contents =
                   <the ASCII characters making up the field-body, as
                    defined in the following sections, and consisting
                    of combinations of atom, quoted-string, and
                    specials tokens, or else consisting of texts>
     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
     group       =  phrase ":" [#mailbox] ";"
     hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
                                                 ; 00:00:00 - 23:59:59
     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
     linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
                                                 ; CRLF => folding
     local-part  =  word *("." word)             ; uninterpreted
                                                 ; case-preserved
     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
     mailbox     =  addr-spec                    ; simple address
                 /  phrase route-addr            ; name & addr-spec
     message     =  fields *( CRLF *text )       ; Everything after
                                                 ;  first null line
                                                 ;  is message body
     month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
                 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
                 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
     msg-id      =  "<" addr-spec ">"            ; Unique message id
     optional-field =
                 /  "Message-ID"        ":"   msg-id
                 /  "Resent-Message-ID" ":"   msg-id
                 /  "In-Reply-To"       ":"  *(phrase / msg-id)
                 /  "References"        ":"  *(phrase / msg-id)
                 /  "Keywords"          ":"  #phrase
                 /  "Subject"           ":"  *text
                 /  "Comments"          ":"  *text
                 /  "Encrypted"         ":" 1#2word
                 /  extension-field              ; To be defined
                 /  user-defined-field           ; May be pre-empted
     orig-date   =  "Date"        ":"   date-time
     originator  =   authentic                   ; authenticated addr
                   [ "Reply-To"   ":" 1#address] )
     phrase      =  1*word                       ; Sequence of words
     qtext       =  <any CHAR excepting <">,     ; => may be folded
                     "\" & CR, and including
                     linear-white-space>
     quoted-pair =  "\" CHAR                     ; may quote any char
     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
                                                 ;   quoted chars.
     received    =  "Received"    ":"            ; one per relay
                       ["from" domain]           ; sending host
                       ["by"   domain]           ; receiving host
                       ["via"  atom]             ; physical path
                      *("with" atom)             ; link/mail protocol
                       ["id"   msg-id]           ; receiver msg id
                       ["for"  addr-spec]        ; initial form
                        ";"    date-time         ; time received

     resent      =   resent-authentic
                   [ "Resent-Reply-To"  ":" 1#address] )
     resent-authentic =
                 =   "Resent-From"      ":"   mailbox
                 / ( "Resent-Sender"    ":"   mailbox
                     "Resent-From"      ":" 1#mailbox  )
     resent-date =  "Resent-Date" ":"   date-time
     return      =  "Return-path" ":" route-addr ; return address
     route       =  1#("@" domain) ":"           ; path-relative
     route-addr  =  "<" [route] addr-spec ">"
     source      = [  trace ]                    ; net traversals
                      originator                 ; original mail
                   [  resent ]                   ; forwarded
     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
                 /  "." / "[" / "]"              ;  within a word.
     sub-domain  =  domain-ref / domain-literal
     text        =  <any CHAR, including bare    ; => atoms, specials,
                     CR & bare LF, but NOT       ;  comments and
                     including CRLF>             ;  quoted-strings are
                                                 ;  NOT recognized.
     time        =  hour zone                    ; ANSI and Military
     trace       =    return                     ; path to sender
                    1*received                   ; receipt tags
     user-defined-field =
                   <Any field which has not been defined
                    in this specification or published as an
                    extension to this specification; names for
                    such fields must be unique and may be
                    pre-empted by published extensions>
     word        =  atom / quoted-string
     zone        =  "UT"  / "GMT"                ; Universal Time
                                                 ; North American : UT
                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
                 /  1ALPHA                       ; Military: Z = UT;
     <">         =  <ASCII quote mark>           ; (     42,      34.)
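
     As an illustration only, the date-time, hour and zone rules above
     can be turned into a small parser.  The sketch below is in Python
     and makes several assumptions that are not in the  specification:
     the name parse_date_time is invented, two-digit years are read as
     19xx, comments and folding inside the field are not handled,  and
     only the zones listed in this copy of the table are accepted (for
     the military form, only "Z").

```python
import re
from datetime import datetime, timedelta, timezone

# Offsets in hours from Universal Time, as given in the zone rule.
ZONE_HOURS = {
    "UT": 0, "GMT": 0,
    "EST": -5, "EDT": -4,
    "CST": -6, "CDT": -5,
    "MST": -7, "MDT": -6,
    "PST": -8, "PDT": -7,
    "Z": 0,  # the only military zone spelled out in this copy
}

DAYS = "Mon|Tue|Wed|Thu|Fri|Sat|Sun"
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# date-time = [ day "," ] 1*2DIGIT month 2DIGIT hour zone
DATE_TIME = re.compile(
    r"^(?:(?:" + DAYS + r")\s*,\s*)?"
    r"(\d{1,2})\s+(" + "|".join(MONTHS) + r")\s+(\d{2})\s+"
    r"(\d{2}):(\d{2})(?::(\d{2}))?\s+([A-Z]+)$"
)


def parse_date_time(text):
    """Parse a date-time value into a timezone-aware datetime."""
    match = DATE_TIME.match(text.strip())
    if match is None:
        raise ValueError("not a date-time: %r" % text)
    dd, mon, yy, hh, mi, ss, zone = match.groups()
    if zone not in ZONE_HOURS:
        raise ValueError("zone not in the table: %r" % zone)
    tz = timezone(timedelta(hours=ZONE_HOURS[zone]))
    # Assumption: a two-digit year means 19xx.
    return datetime(1900 + int(yy), MONTHS.index(mon) + 1, int(dd),
                    int(hh), int(mi), int(ss or 0), tzinfo=tz)
```

     For example, parse_date_time("Tue, 18 Feb 86 13:07:36 EST") gives
     a value five hours behind Universal Time.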












-- 
alan@manawatu.planet.co.nz==alan@manawatu.gen.nz~~brown_a@kosmos.wcc.govt.nz
Manawatu Internet Services,   "We should grant power over our affairs only to
Box 678, Palmerston North,     those who are reluctant to hold it and then only
New Zealand +64 25 480-204     under conditions that increase the reluctance."


Msg#: 4711 *bbs.tbbs*
10-08-94 11:15:40
From: NEWS
  To: ALL
Subj: RE: NETWORK SPECIFICATIONS: RFC 976
From: alan@papaioea.manawatu.planet.co.nz (Alan Brown)
Newsgroups:
comp.bbs.tbbs,comp.bbs.misc,comp.bbs.majorbbs,alt.bbs,alt.bbs.allsysop,alt.bbs.a
miga.excelsior,alt.bbs.cnet,alt.bbs.first-class,alt.bbs.metal,alt.bbs.pcboard,al
t.bbs.renegade,alt.bbs.searchlight,alt.bbs.wildcat,alt.bbs.watergate
Subject: Re: Network specifications: RFC 976
Date: 8 Oct 1994 20:25:29 +1300
Organization: PlaNet (Manawatu) Palmerston North, New Zealand
Reply-To: /dev/null



Network Working Group                                     Mark R. Horton
Request for Comments: 976                              Bell Laboratories
                                                           February 1986

                 UUCP Mail Interchange Format Standard

Status of This Memo

   In response to the need for maintenance of current information about
   the status and progress of various projects in the ARPA-Internet
   community, this RFC is issued for the benefit of community members.
   The information contained in this document is accurate as of the date
   of publication, but is subject to change. Subsequent RFCs will
   reflect such changes.

   This document defines the standard format for the transmission of
   mail messages between machines in the UUCP Project.  It does not
   address the format for storage of messages on one machine, nor the
   lower level transport mechanisms used to get the data from one
   machine to the next.  It represents a standard for conformance by
   hosts in the UUCP zone.  Distribution of this memo is unlimited.

1.  Introduction

   This document is intended to define the standard format for the
   transmission of mail messages between machines in the UUCP Project.
   It does not address the format for storage of messages on one
   machine, nor the lower level transport mechanisms used to get the
   data from one machine to the next.  We assume remote execution of the
   rmail command (or equivalent) as the UUCP network primitive
   operation.

   The general philosophy is that, if we were to invent a new standard,
   we would make ourselves incompatible with existing systems.  There
   are already too many (incompatible) standards in the world, resulting
   in ambiguities such as a!b@c.d which is parsed a!(b@c.d) in the old
   UUCP world, and (a!b)@c.d in the Internet world.  (Neither standard
   allows parentheses, and in adding them we would be compatible with
   neither.  There would also be serious problems with the shell and
   with the UUCP transport mechanism.)

   Having an established, well documented, and extensible family of
   standards already defined by the ARPA community, we choose to adopt
   these standards for the UUCP zone as well.  (The UUCP zone is that
   subset of the community connected by UUCP which chooses to register
   with the UUCP project.  It represents an administrative entity.)
   While the actual transport mechanism is up to the two hosts to
   arrange, and might include UUCP, SMTP, MMDF, or some other facility,
   we adopt RFC-920 (domains) and RFC-822 (mail format) as UUCP zone
   standards.  All mail transmitted between systems should conform to

Horton                                                          [Page 1]



RFC 976                                                    February 1986
UUCP Mail Interchange Format Standard


   those two standards.  In addition, should the ARPA community change
   these standards at a later time, we intend to change our standards to
   remain compatible with theirs, given a reasonable time to upgrade
   software.

   This document specifies an interpretation of RFC-822 and RFC-920 in
   the UUCP world.  It shows how the envelope should be encoded, and how
   UUCP routing is accomplished in an environment of mixed
   implementations.

2.  Basics

   Messages can be divided into two parts: the envelope and the message.
   The envelope contains information needed by the mail transport
   services, and the message contains information useful to the sender
   and receiver.  The message is divided into the header and the body.
   Sometimes an intermediate host will add to the message (e.g. a
   Received line) but, except in the case of a gateway which must
   translate formats, it is not expected that intermediate hosts will
   change the message itself.  In the UUCP world, the envelope consists
   of the "destination addresses" (normally represented as the argument
   or arguments to the rmail command) and the "source path" (normally
   represented in one or more lines at the beginning of the message
   beginning either "From " or ">From ", sometimes called "From_
   lines".)  The RFC-822 header lines (including "From:" and "To:") are
   part of the message, as is the text of the message body itself.

   UUCP uses short host names, such as "ucbvax", at and below the
   transport layer.  We refer to these names as "6 letter names",
   because all implementations of UUCP consider at least the first 6
   letters significant.  (Some consider the first 7 or the first 14
   significant, but we must use the lowest common denominator.) UUCP
   names may be longer than 6 characters, but all such names must be
   unique in their first 6 letters.  RFC-920 domain names, such as
   "ucbvax.Berkeley.EDU", are called "domain names." The two names are
   different.  Upper and lower case are usually considered different in
   6 letter names, but are considered equivalent in domain names.  Names
   such as "ucbvax.UUCP", consisting of a 6 letter name followed by
   ".UUCP", previously were domain style references to a host with a
   given 6 letter name.  Such names are being phased out in favor of
   organizational domain names such as "ucbvax.Berkeley.EDU".
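
   As a rough illustration of the "unique in their first 6 letters"
   rule, the Python sketch below reports host names that collide once
   truncated.  The function name six_letter_conflicts is invented here,
   and the comparison is case-sensitive, following the remark that case
   is usually significant in 6 letter names.

```python
def six_letter_conflicts(names):
    """Return pairs of distinct host names that share their first 6 letters."""
    first_seen = {}
    conflicts = []
    for name in names:
        key = name[:6]  # only the first 6 letters are guaranteed significant
        if key in first_seen and first_seen[key] != name:
            conflicts.append((first_seen[key], name))
        else:
            first_seen.setdefault(key, name)
    return conflicts
```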










2.1  Hybrid Addresses

   There are (among others) two major kinds of mailing address syntax
   used in the UUCP world.  The a!b!c!user ("bang path") syntax is used
   by older UUCP software to explicitly route mail to the destination.
   The user@domain ("domain") syntax is used in conformance to RFC-822.
   Under most circumstances, it is possible to look at a given address
   and determine which sort of address it is.  However, a hybrid address
   with a ! to the left of an @, such as a!b@c.d, is ambiguous: it could
   be interpreted as (a!b)@c.d or a!(b@c.d).  Both interpretations can
   be useful.  The first interpretation is required by RFC-822, the
   second is a de-facto standard in the UUCP software.

   Because of the confusion surrounding hybrid addresses, we recommend
   that all transport layer software avoid the use of hybrid addresses
   at all times.  A pure bang syntax can be used to disambiguate, being
   written c.d!a!b in the first case above, and a!c.d!b in the second.
   We recommend that all implementations use this "bang domain" syntax
   unless they are sure of what is running on the next machine.

   In conformance with RFC-822 and the AT&T Message Transfer
   Architecture, we recommend that any host that accepts hybrid
   addresses apply the (a!b)@c.d interpretation.
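
   The two readings of a hybrid address, and the pure bang form that
   removes the ambiguity, can be sketched in Python as follows.  The
   function name hybrid_to_bang and its uucp_style flag are invented
   for this illustration; the default follows the RFC-822 reading
   recommended above.

```python
def hybrid_to_bang(address, uucp_style=False):
    """Rewrite a hybrid address such as a!b@c.d as a pure bang path."""
    left, at, domain = address.rpartition("@")
    if not at or "!" not in left:
        raise ValueError("not a hybrid address: %r" % address)
    if uucp_style:
        # a!(b@c.d): travel along the bang route first, then go to the domain.
        route, _, user = left.rpartition("!")
        return "%s!%s!%s" % (route, domain, user)
    # (a!b)@c.d: go to the domain first, which then handles a!b.
    return "%s!%s" % (domain, left)
```

   With the address a!b@c.d, the default call yields c.d!a!b and the
   uucp_style call yields a!c.d!b, matching the two cases in the text.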

2.2  Transport

   Since SMTP is not available to much of the UUCP domain, we define the
   method to be used for "remote execution" based transport mechanisms.
   The command to be "remotely executed" should read

      rmail user@domain ...

   with the message on the standard input of the command.  The
   "user@domain" argument must conform to RFC-920 and RFC-822.  More
   than one address argument is allowed, in order to save transmission
   costs for multiple recipients of the same message.

   An alternative form that may be used is

      rmail domain!user

   where "domain" contains at least one period and no !'s.  This is to
   be interpreted exactly the same as user@domain, and can be used to
   transport a message across old UUCP hosts without fear that they
   might change the address.  The "user" string can contain any
   characters except "@".  This character is forbidden because it is
   unknown what an intermediate host might do to it. (It is also
   recommended that the "%" character be avoided, since some hosts treat
   "%" as a synonym for "@".) However, to route across hosts that don't
   understand domains, the following is possible

      rmail a!b!c!domain!user

   A "domain" can be distinguished from a 6 letter UUCP site name
   because a domain will contain at least one period.  (In the case of
   single level domains with no periods, a period should be added to the
   end, e.g. Mark.Horton@att becomes "att.!Mark.Horton".  A translator
   from ! to @ format should remove a trailing dot at the end of the
   domain, if one is present.) We don't expect this to happen, except
   for local networks using addresses like "user@host".
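
   The translation between user@domain and domain!user, including the
   trailing period for single level domains, might look like the
   following Python sketch.  The function names at_to_bang and
   bang_to_at are invented for this illustration, and only the simple
   two-part forms are handled (no a!b!c! route prefix).

```python
def at_to_bang(address):
    """Convert user@domain to domain!user."""
    user, at, domain = address.rpartition("@")
    if not at or not user or not domain or "@" in user:
        raise ValueError("not a user@domain address: %r" % address)
    if "." not in domain:
        domain += "."  # mark a single level domain so it is not taken for a host
    return "%s!%s" % (domain, user)


def bang_to_at(address):
    """Convert domain!user to user@domain."""
    domain, bang, user = address.partition("!")
    if not bang or not user or "." not in domain:
        raise ValueError("not a domain!user address: %r" % address)
    if domain.endswith("."):
        domain = domain[:-1]  # remove the trailing dot added by at_to_bang
    return "%s@%s" % (user, domain)
```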

   A simple implementation can always generate domain!user syntax
   (rather than user@domain) since it is safe to assume that gateways
   are class 3 (classes are explained in section 2.5).

2.3  Batch SMTP

   Standard conforming implementations may optionally support a protocol
   called "Batch SMTP".  SMTP (Simple Mail Transfer Protocol) is the
   ARPA community standard mail transfer protocol (RFC-821). It is also
   used on BITNET and Mailnet.  While SMTP was designed to be
   interactive, it is possible to batch up a series of commands and send
   them off to a remote machine for batch execution.  This is used on
   BITNET, and is appropriate for UUCP.  One advantage to BSMTP is that
   the UNIX shell does not get involved in the interpretation of
   messages, so it becomes possible to include special characters such
   as space and parentheses in electronic messages.  (Such characters
   are expected to be popular in X.400 addresses.)

   To support BSMTP on UNIX, a conforming host should arrange that mail
   to the user "b-smtp" is interpreted as Batch SMTP commands.  (We use
   b-smtp instead of bsmtp because bsmtp might conflict with a login
   name.) Since many mail systems treat lines consisting of a single
   period as an "end of file" flag, and since SMTP uses the period as a
   required end of file flag, and to strip off headers, we put an extra
   "#" at the beginning of each BSMTP line.  On a sendmail system, an
   easy way to implement this is to include the alias

      b-smtp: "|egrep '^#' | sed 's/^#//' | /usr/lib/sendmail -bs"

   which will feed the commands to an SMTP interpreter.  A better
   solution would appropriately check for errors and send back an error
   message to the sender.
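
   The "#" convention can also be sketched in Python.  The two
   functions below are invented for this illustration: bsmtp_encode
   prefixes each SMTP line before it is mailed to b-smtp, and
   bsmtp_decode mirrors the egrep and sed stages of the alias by
   keeping only "#" lines and removing the prefix.  Like the alias, it
   does no error checking.

```python
def bsmtp_encode(smtp_lines):
    """Prefix every SMTP command or data line with "#"."""
    return ["#" + line for line in smtp_lines]


def bsmtp_decode(mail_lines):
    """Keep only lines starting with "#" and strip that one character."""
    return [line[1:] for line in mail_lines if line.startswith("#")]
```

   Lines without the prefix, such as the mail headers added in
   transit, are dropped by bsmtp_decode.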





   An example BSMTP message from seismo.CSS.GOV to cbosgd.ATT.COM is
   shown here.  This sample is the file shipped over the UUCP link for
   input to the command "rmail b-smtp".  Note that the RFC-822 message
   is between the DATA line and the period line.  The envelope
   information is passed in the MAIL FROM and RCPT TO lines.  The name
   of the sending system is in the HELO line.  The actual envelope
   information (above the # lines) is ignored and need not be present.

      From foo!bar Sun Jan 12 23:59:00 1986 remote from seismo
      Date: Tue, 18 Feb 86 13:07:36 EST
      From: mark@ucbvax.Berkeley.EDU
      Message-Id: <8602181807.AA10228@mark@ucbvax.Berkeley.EDU>
      To: b-smtp@cbosgd.ATT.COM

      #HELO seismo.CSS.GOV
      #MAIL FROM:<mark@ucbvax.Berkeley.EDU>
      #RCPT TO:<mark@cbosgd.ATT.COM>
      #DATA
      #Date: Tue, 18 Feb 86 13:07:36 EST
      #From: mark@ucbvax.Berkeley.EDU
      #Message-Id: <8602181807.AA10228@mark@ucbvax.Berkeley.EDU>
      #To: mark@cbosgd.ATT.COM
      #
      #This is a sample message.
      #.
      #QUIT

2.4  Envelope

   The standard input of the command should begin with a single line

      From domain!user date remote from system

   followed immediately by the RFC-822 format headers and body of the
   message.  It is possible that there will be additional From_ lines
   preceding this line - these lines may be added, one line for each
   system the message passes through.  It is also possible that the
   "system" fields will be stacked into a single line, with many !'s in
   the "user" string.  The ">" character may precede the "From".  In
   general, this is the "envelope" information, and should follow the
   same conventions that previous UUCP mail has followed.  The primary
   difference is that, when the system names are stacked up, if
   previously the result would have been a!b!c!mysys!me, the new result
   will be a!b!c!mysys!domain!me, where domain will contain at least one
   period, and "mysys" is often the 6 letter UUCP name for the same
   system named by "domain".  If the "domain!" is redundant, it may be
   omitted from the envelope, either in the source path or in the
   destination address.

   The receiving system may discard extra "From_" lines if it folds the
   information into a single From_ line. It passes the
   path!domain!user along as the "envelope" information containing the
   address of the sender of the message, and possibly preserves the
   forwarding date and system in a newly generated header line, such as
   Received or Sent-By.  (Adding Received using this information is
   discouraged, since the line appears to have been added on a different
   system than the one actually adding it.  That other system may have
   actually included a Received line too! The Sent-By line is similar to
   Received, but the date need not be converted into RFC-822 format, and
   the line is not claimed to have been added by the system whose name
   is mentioned.)

   If the receiving system passes the message along to another system,
   it will add a "From_" line to the front, giving the same user@domain
   address for the sender, and its own name for the system.  If the
   receiving system stores the message in a local mailbox, it is
   recommended that a single "From_" line be generated at the front of
   the message, keeping the date (in the same format, since certain mail
   reading programs are sensitive to this format), and not using the
   "remote from system" syntax.

   Note - if an intermediate system adds text such as "system!" to the
   front of a "user@domain" syntax address, either in the envelope or
   the body, this is a violation of this standard and of RFC-822.
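
   A receiving system's handling of From_ lines can be sketched in
   Python as below.  The names parse_from_line and fold_from_lines are
   invented for this illustration.  The folding step assumes the
   conventional rmail behavior, which this section does not spell out:
   the most recently added line comes first, each "remote from" system
   is appended to the path in order, and the user string is taken from
   the last From_ line.  The host names in the usage note are
   hypothetical.

```python
import re

# "From domain!user date remote from system", optionally preceded by ">".
FROM_LINE = re.compile(r"^>?From (\S+) (.*?)(?: remote from (\S+))?$")


def parse_from_line(line):
    """Return (user, date, system); system is None without "remote from"."""
    match = FROM_LINE.match(line)
    if match is None:
        raise ValueError("not a From_ line: %r" % line)
    return match.groups()


def fold_from_lines(lines):
    """Fold stacked From_ lines into one sender path such as a!b!domain!me."""
    systems = []
    user = None
    for line in lines:
        user, _date, system = parse_from_line(line)
        if system is not None:
            systems.append(system)
    if user is None:
        raise ValueError("no From_ lines given")
    return "!".join(systems + [user])
```

   For two stacked lines naming the systems "a" and "mysys" and the
   sender att.com!me, fold_from_lines returns a!mysys!att.com!me.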

2.5  Routing

   In order to properly route mail, it is sometimes necessary to know
   what software a destination or intermediate machine is running, or
   what conventions it follows.  We have tried to minimize the amount of
   this information that is necessary, but the support of subdomains may
   require that different methods are used in different situations.  For
   purposes of predicting the behavior of other hosts, we divide hosts
   into three classes. These classes are:

   Class 1   old-style UUCP ! routing only.  We assume that the host
             understands local user names:

                  rmail user

             and bang paths

                  rmail host1!host2!user

             but we assume nothing more about the host.  If we have
             no information about a host, we can treat it as class 1
             with no problems, since we make no assumptions about
             how it will handle hybrid addresses.

   Class 2   Old style UUCP ! routing, and 4.2BSD style domain
             parsing.  We assume the capabilities of class 1, plus
             the ability to understand

                  rmail user@domain

             if the "domain" is one outside the UUCP zone which
             the host knows about.  Class 2 hosts do not necessarily
             understand domain!user or have routers.  Hosts in non-
             UUCP RFC-920 domains are considered class 2, even though
             they may not understand host!user.

   Class 3   All class 1 and 2 features are present.  In addition,
             class 3 hosts must be able to route UUCP mail for hosts
             that are not immediately adjacent and also understand
             the syntax

                  rmail domain!user

             as described above.  All gateways into UUCP must be
             class 3.

   This document describes what class 3 hosts must be able to process.
   Classes 1 and 2 already exist, and will continue to exist for a long
   time, but are viewed as "older systems" that may eventually be
   upgraded to class 3 status.

3.  Algorithm

   The algorithm for delivering a message to an address "user@domain"
   over UUCP links can be summarized as follows:

      a.  If the address is actually of the form @domain1:user@domain2,
          the "domain" used for the remainder should be "domain1"
          instead of "domain2", and the bang form reads
          domain1!domain2!user.

      b.  Determine d: the most specific part of "domain" that is
          recognized locally.  This part will be a suffix of "domain".
          This can be done by scanning through a table with entries that
          go from specific to general, comparing entries with "domain"
          to see if the entries are at the tail of "domain".  For
          example, with the address "mark@osgd.cb.att.com", if the local
          host recognizes "uucp" and "att.com", d would be "att.com".
          The final entry in the table will be the null string, matching
          any completely unrecognized domain.

      c.  Look in the found table entry for g: the name of the
          "gateway", and for r: a UUCP !-style route to reach g.  G is
          not necessarily directly connected to the local host, but
          should be viewed as a gateway into the d domain.  (The values
          of g and r for a given d may be different on different hosts,
          although g will often be the same.)

      d.  Look at the beginning of r to find the "next hop" host n. N
          will always be directly connected to the local host.

      e.  Determine, if possible, the class of g and n.

      f.  Create an appropriate destination string s to be interpreted
          by n.  (See below.)

      g.  Pass the message off to n with destination information s.

      In an environment with other types of networks that do not use
      UUCP !  parsing, the table will probably contain additional
      information, such as which type of link to use.  The path
      information may be replaced in other environments by information
      specific to the network.
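
      Steps (b) through (d) can be sketched in Python as follows.  This
      is only an illustration: the table contents, the gateway names
      (attgw, uucpgw, relay, hub) and the function name find_route are
      hypothetical, and the table is assumed to be ordered from specific
      to general with a null-string entry last.

```python
def find_route(domain, table):
    """Return (d, gateway, route, next_hop) for the first matching entry.

    `table` is a list of (suffix, gateway, route) tuples ordered from
    specific to general, so the first match is the most specific one.
    """
    domain = domain.lower()
    for suffix, gateway, route in table:
        # Match whole labels only: "att.com" must not match "batt.com".
        if suffix == "" or domain == suffix or domain.endswith("." + suffix):
            next_hop = route.split("!", 1)[0]   # step d: first host in r
            return suffix, gateway, route, next_hop
    raise LookupError("table has no null-string default entry")


# Hypothetical table; every host name in it is invented for illustration.
TABLE = [
    ("att.com", "attgw", "hub!attgw"),
    ("uucp", "uucpgw", "uucpgw"),
    ("", "relay", "hub!relay"),
]

print(find_route("osgd.cb.att.com", TABLE))
```

      With this table the address mark@osgd.cb.att.com resolves to
      d = "att.com", as in the example in step (b).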

      The first entries in the table mentioned in part (b) are normally
      very specific, and allow well known routes to be constructed
      directly instead of routing through the domain tree.  The domain
      tree should be reserved for cases where no better information is
      available, or where traffic is very light, or where the default
      route is the best available.  If a better route is available, that
      information can be put in the table.  If a host has any
      significant amount of traffic sent to a second host, it is
      normally expected that the two hosts will set up a direct UUCP
      link and make an entry in their tables to send mail directly, even
      if they are in separate domains.  Routing tables should be
      constructed to try to keep paths short and inexpensive for as much
      traffic as possible.

      Here are some hints for the construction of the destination string
      s (step f above).  The "envelope recipient" information (the
      argument(s) to rmail) may be in either domain ! form
      (host.com!user) or domain @ form (user@host.com) as long as the
      sending site is sure the next hop is class 3.  If the next hop is
      not class 3, or the sending site is not sure, the ! form should be
      used, if possible, since it is hard to predict what the next hop
      would do with a hybrid address.

      If the gateway is known to be class 3, domain ! form may be used,
      but if the sending site is not sure, and the entire destination
      string was matched in the lookup (rather than some parent domain),
      the 6 letter ! form should be used: r!user, for example:
      dumbhost!host!user.  If the gateway appears to actually be a
      gateway for a subdomain, e.g. because a parent domain was matched,
      (such as the address user@host.gateway.com, where host.gateway.com
      was not found but gateway.com was) it can be assumed to be at
      class 3.  This allows routes such as
      dumbhost!domain!host.domain.com!user to be used with a reasonable
      degree of safety.  If a direct link exists to the destination
      host, the user@domain syntax or the domain!user syntax may be
      used.

      All hosts conforming to this standard are class 3, and all
      subdomain gateways must be class 3 hosts.

4.  Example

   Suppose host A.D.COM sends mail to host C.D.COM.  Let's suppose that
   the 6 letter names for these hosts are aname and dname, and that the
   intermediate host to be routed through has name bname.

   The user on A types

      mail user@c.d.com

   The user interface creates a file such as

      Date:  9 Jan 1985   8:39 EST
      From: myname@A.D.COM (My Name)
      Subject: sample message
      To: user@c.d.com

      This is a sample message

   and passes it to the transport mechanism with a command such as

      sendmail user@c.d.com < file

   The transport mechanism looks up a route to c.d.com.  It does not
   find c.d.com in its database, so it looks up d.com, and finds that
   the path is bname!dname!%s, and that c.d.com is a class 3 host.
   Plugging in c.d.com!user, it gets the path bname!dname!c.d.com!user.
   (If it had found c.d.com with path bname!cname!%s, it would have
   omitted the domain from the resulting path: bname!cname!user, since
   it is not sure whether the destination host is class 1, 2, or 3.)
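
   The path construction just described might look like this in Python.
   It is a sketch under stated assumptions: build_destination is a
   hypothetical helper, the route template is assumed to contain "%s"
   and at least one hop, and the caller is assumed to know whether the
   full host name or only a parent domain was found in the database.

```python
def build_destination(template, user, host, exact_match):
    """Fill a '%s' route template and split off the next hop.

    exact_match=True  -> the host itself was found; its class is unknown,
                         so only the plain ! form (route!user) is used.
    exact_match=False -> only a parent domain was found; the gateway is
                         assumed to be class 3, so host!user is passed on.
    """
    target = user if exact_match else host + "!" + user
    path = template.replace("%s", target)
    next_hop, rmail_arg = path.split("!", 1)
    return next_hop, rmail_arg


next_hop, arg = build_destination("bname!dname!%s", "user", "c.d.com", False)
print("uux - %s!rmail %s" % (next_hop, arg))
```

   For the d.com entry above this yields next hop bname and the rmail
   argument dname!c.d.com!user.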

   It prepends a From_ line and passes it to uux:

      uux - bname!rmail dname!c.d.com!user < file2

   where file2 contains

      From A.D.COM!user Wed Jan  9 12:43:35 1985 remote from aname
      Date:  9 Jan 1985   8:39 EST
      From: myname@A.D.COM (My Name)
      Subject: sample message
      To: user@c.d.com

      This is a sample message

   (Note the blank line at the end of the message - at least one blank
   line is required.) This results in the command

      rmail dname!c.d.com!user

   running on B.  B prepends its own from line and passes the mail
   along:

      uux - dname!rmail c.d.com!user < file3

   where file3 contains

      From nuucp Wed Jan  9 12:43:35 1985 remote from bname
      >From A.D.COM!user Wed Jan  9 11:21:48 1985 remote from aname
      Date:  9 Jan 1985   8:39 EST
      From: myname@A.D.COM (My Name)
      Subject: sample message
      To: user@c.d.com

      This is a sample message

   The command

      rmail c.d.com!user

   is run on C, which stacks the From_ lines

      From bname!aname!A.D.COM!user Wed Jan  9 12:43:35 1985
      Date:  9 Jan 1985   8:39 EST
      From: myname@A.D.COM (My Name)
      Subject: sample message
      To: user@c.d.com

      This is a sample message

   and stores the message locally, probably in this same format.
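
   The stacking of From_ lines on the final host can be illustrated
   with a short Python sketch.  The function name and regular
   expression are assumptions made for this example; the sketch takes
   the date from the first (most recent) line and the user from the
   last (oldest) line, which reproduces the example above but is not
   claimed to be what any particular rmail does.

```python
import re

# "From user date remote from host", optionally prefixed with ">".
FROM_RE = re.compile(r"^>?From (\S+) (.+?) remote from (\S+)$")


def stack_from_lines(lines):
    """Collapse a list of From_ lines into a single From_ line."""
    if not lines:
        raise ValueError("no From_ lines given")
    hosts = []
    user = date = None
    for line in lines:
        m = FROM_RE.match(line)
        if m is None:
            raise ValueError("not a From_ line: %r" % line)
        if date is None:
            date = m.group(2)       # date of the most recent hop
        hosts.append(m.group(3))    # hosts, newest first
        user = m.group(1)           # ends up as the oldest line's user
    return "From %s!%s %s" % ("!".join(hosts), user, date)


print(stack_from_lines([
    "From nuucp Wed Jan  9 12:43:35 1985 remote from bname",
    ">From A.D.COM!user Wed Jan  9 11:21:48 1985 remote from aname",
]))
```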

5.  Summary

   Hosts conforming to this standard should accept all of the following
   forms:

      rmail localuser               (no !%@ in user)
      rmail hosta!hostb!user        (no !%@ in user)
      rmail user@domain             (only . in domain)
      rmail domain!user             (at least 1 . in domain)
      rmail domain.!user            (in case domain has no dots)
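
   As a rough illustration, the forms above could be told apart with a
   Python function like the one below.  The function name, the label
   strings it returns and the "hybrid" category are choices made for
   this sketch only; the standard itself does not define them.

```python
def classify(arg):
    """Classify an rmail argument into one of the forms listed above."""
    if not any(c in arg for c in "!%@"):
        return "localuser"
    if "@" in arg or "%" in arg:
        user, _, domain = arg.rpartition("@")
        # A clean user@domain has no further routing characters.
        if user and domain and not any(c in user + domain for c in "!%@"):
            return "user@domain"
        return "hybrid"
    # Only "!" routing remains; a dot in the first component marks a domain.
    first = arg.split("!", 1)[0]
    return "domain!user" if "." in first else "bang path"


for example in ("localuser", "hosta!hostb!user", "user@domain.com",
                "domain.com!user", "domain.!user", "gateway!user@domain"):
    print(example, "->", classify(example))
```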

   The "envelope" portion of the message ("From_" lines) should conform
   to existing conventions, using ! routing.  The "heading" portion of
   the message (the Word: lines such as Date:, From:, To:, and Subject:)
   must conform to RFC-822.  All header addresses must be in the @ form.
   The originating site should ensure that the addresses conform to
   RFC-822, since no requirement is placed on forwarding sites or
   gateways to transform addresses into legal RFC-822 format.  (Such
   forwarding sites and gateways should NOT, however, change a legal
   RFC-822 address such as user@domain into an illegal RFC-822 address
   such as gateway!user@domain, even if forwarding to a class 1 UUCP
   host.)

6.  References

   [1]  Postel, J., "Simple Mail Transfer Protocol", RFC-821,
        USC/Information Sciences Institute, August, 1982.

   [2]  Crocker, D., "Standard for the Format of ARPA Internet Text
        Messages", RFC-822, Department of Electrical Engineering,
        University of Delaware, August, 1982.

   [3]  Postel, J., and J. K. Reynolds, "Domain Requirements", RFC-920,
        USC/Information Sciences Institute, October, 1984.























Horton                                                         [Page 12]


-- 
alan@manawatu.planet.co.nz==alan@manawatu.gen.nz~~brown_a@kosmos.wcc.govt.nz
Manawatu Internet Services,   "We should grant power over our affairs only to
Box 678, Palmerston North,     those who are reluctant to hold it and then only
New Zealand +64 25 480-204     under conditions that increase the reluctance."


Msg#: 4723 *bbs.tbbs*
10-08-94 11:16:44
From: NEWS
  To: ALL
Subj: RE: NETWORK SPECIFICATIONS: RFC 1036
From: alan@papaioea.manawatu.planet.co.nz (Alan Brown)
Newsgroups:
comp.bbs.tbbs,comp.bbs.misc,comp.bbs.majorbbs,alt.bbs,alt.bbs.allsysop,alt.bbs.a
miga.excelsior,alt.bbs.cnet,alt.bbs.first-class,alt.bbs.metal,alt.bbs.pcboard,al
t.bbs.renegade,alt.bbs.searchlight,alt.bbs.wildcat,alt.bbs.watergate
Subject: Re: Network specifications: RFC 1036
Date: 8 Oct 1994 20:27:04 +1300
Organization: PlaNet (Manawatu) Palmerston North, New Zealand
Reply-To: /dev/null







Network Working Group                                          M. Horton
Request for Comments:  1036                       AT&T Bell Laboratories
Obsoletes: RFC-850                                              R. Adams
                                              Center for Seismic Studies
                                                           December 1987


              Standard for Interchange of USENET Messages



STATUS OF THIS MEMO

    This document defines the standard format for the interchange of
    network News messages among USENET hosts.  It updates and replaces
    RFC-850, reflecting version B2.11 of the News program.  This memo is
    distributed as an RFC to make this information easily accessible to
    the Internet community.  It does not specify an Internet standard.
    Distribution of this memo is unlimited.

1.  Introduction

    This document defines the standard format for the interchange of
    network News messages among USENET hosts.  It describes the format
    for messages themselves and gives partial standards for transmission
    of news.  Transmission is not entirely specified, in order to give a
    good deal of flexibility to the hosts to choose transmission
    hardware and software, to batch news, and so on.

    There are five sections to this document.  Section two defines the
    format.  Section three defines the valid control messages.  Section
    four specifies some valid transmission methods.  Section five
    describes the overall news propagation algorithm.

2.  Message Format

    The primary consideration in choosing a message format is that it
    fit in with existing tools as well as possible.  Existing tools
    include implementations of both mail and news.  (The notesfiles
    system from the University of Illinois is considered a news
    implementation.)  A standard format for mail messages has existed
    for many years on the Internet, and this format meets most of the
    needs of USENET.  Since the Internet format is extensible,
    extensions to meet the additional needs of USENET are easily made
    within the Internet standard.  Therefore, the rule is adopted that
    all USENET news messages must be formatted as valid Internet mail
    messages, according to the Internet standard RFC-822.  The USENET
    News standard is more restrictive than the Internet standard,


Horton & Adams                                                  [Page 1]

RFC 1036              Standard for USENET Messages         December 1987


    placing additional requirements on each message and forbidding use
    of certain Internet features.  However, it should always be possible
    to use a tool expecting an Internet message to process a news
    message.  In any situation where this standard conflicts with the
    Internet standard, RFC-822 should be considered correct and this
    standard in error.

    Here is an example USENET message to illustrate the fields.

              From: jerry@eagle.ATT.COM (Jerry Schwarz)
              Path: cbosgd!mhuxj!mhuxt!eagle!jerry
              Newsgroups: news.announce
              Subject: Usenet Etiquette -- Please Read
              Message-ID: <642@eagle.ATT.COM>
              Date: Fri, 19 Nov 82 16:14:55 GMT
              Followup-To: news.misc
              Expires: Sat, 1 Jan 83 00:00:00 -0500
              Organization: AT&T Bell Laboratories, Murray Hill

              The body of the message comes here, after a blank line.

      Here is an example of a message in the old format (before the
      existence of this standard). It is recommended that
      implementations also accept messages in this format to ease upward
      conversion.

               From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
               Newsgroups: news.misc
               Title: Usenet Etiquette -- Please Read
               Article-I.D.: eagle.642
               Posted: Fri Nov 19 16:14:55 1982
               Received: Fri Nov 19 16:59:30 1982
               Expires: Mon Jan 1 00:00:00 1990

               The body of the message comes here, after a blank line.

      Some news systems transmit news in the A format, which looks like
      this:

                Aeagle.642
                news.misc
                cbosgd!mhuxj!mhuxt!eagle!jerry
                Fri Nov 19 16:14:55 1982
                Usenet Etiquette - Please Read
                The body of the message comes here, with no blank line.

    A standard USENET message consists of several header lines, followed
    by a blank line, followed by the body of the message.  Each header
    line consists of a keyword, a colon, a blank, and some additional
    information.  This is a subset of the Internet standard, simplified
    to allow simpler software to handle it.  The "From" line may
    optionally include a full name, in the format above, or use the
    Internet angle bracket syntax.  To keep the implementations simple,
    other formats (for example, with part of the machine address after
    the close parenthesis) are not allowed.  The Internet convention of
    continuation header lines (beginning with a blank or tab) is
    allowed.

    Certain headers are required, and certain other headers are
    optional.  Any unrecognized headers are allowed, and will be passed
    through unchanged.  The required header lines are "From", "Date",
    "Newsgroups", "Subject", "Message-ID", and "Path".  The optional
    header lines are "Followup-To", "Expires", "Reply-To", "Sender",
    "References", "Control", "Distribution", "Keywords", "Summary",
    "Approved", "Lines", "Xref", and "Organization".  Each of these
    header lines will be described below.
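
    A minimal Python sketch of reading such a message and checking for
    the required headers is given here.  The names parse_article and
    missing_required are hypothetical, header names are compared
    exactly as written, and no validation of header values is
    attempted.

```python
REQUIRED = ("From", "Date", "Newsgroups", "Subject", "Message-ID", "Path")


def parse_article(text):
    """Split a message into a header dictionary and a body string."""
    head, _, body = text.partition("\n\n")   # first blank line ends headers
    headers = {}
    name = None
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and name is not None:
            # Continuation line: append to the previous header.
            headers[name] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            headers[name] = value.strip()
    return headers, body


def missing_required(headers):
    """Return the required header names that are absent, in REQUIRED order."""
    return [h for h in REQUIRED if h not in headers]
```

    Unrecognized headers simply stay in the dictionary, so they can be
    passed through unchanged.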

2.1.  Required Header lines

2.1.1.  From

    The "From" line contains the electronic mailing address of the
    person who sent the message, in the Internet syntax.  It may
    optionally also contain the full name of the person, in parentheses,
    after the electronic address.  The electronic address is the same as
    the entity responsible for originating the message, unless the
    "Sender" header is present, in which case the "From" header might
    not be verified.  Note that in all host and domain names, upper and
    lower case are considered the same, thus "mark@cbosgd.ATT.COM",
    "mark@cbosgd.att.com", and "mark@CBosgD.ATt.COm" are all equivalent.
    User names may or may not be case sensitive, for example,
    "Billy@cbosgd.ATT.COM" might be different from
    "BillY@cbosgd.ATT.COM".  Programs should avoid changing the case of
    electronic addresses when forwarding news or mail.

    RFC-822 specifies that all text in parentheses is to be interpreted
    as a comment.  It is common in Internet mail to place the full name
    of the user in a comment at the end of the "From" line.  This
    standard specifies a more rigid syntax.  The full name is not
    considered a comment, but an optional part of the header line.
    Either the full name is omitted, or it appears in parentheses after
    the electronic address of the person posting the message, or it
    appears before an electronic address which is enclosed in angle
    brackets.  Thus, the three permissible forms are:

              From: mark@cbosgd.ATT.COM
              From: mark@cbosgd.ATT.COM (Mark Horton)
              From: Mark Horton <mark@cbosgd.ATT.COM>

    Full names may contain any printing ASCII characters from space
    through tilde, except that they may not contain "(" (left
    parenthesis), ")" (right parenthesis), "<" (left angle bracket), or
    ">" (right angle bracket).  Additional restrictions may be placed on
    full names by the mail standard, in particular, the characters ","
    (comma), ":" (colon), "@" (at), "!" (bang), "/" (slash), "="
    (equal), and ";" (semicolon) are inadvisable in full names.
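
    The three permissible "From" forms can be recognized with a few
    regular expressions, as in the Python sketch below.  The patterns
    are a simplification made up for this example: they require an "@"
    in the address and only enforce the four characters that are
    forbidden in full names, not the further characters the mail
    standard makes inadvisable.

```python
import re

ADDR = r"[^\s()<>]+@[^\s()<>]+"
FORMS = [
    re.compile(r"^(?P<addr>%s)$" % ADDR),                          # address only
    re.compile(r"^(?P<addr>%s) \((?P<name>[^()<>]+)\)$" % ADDR),   # addr (Name)
    re.compile(r"^(?P<name>[^()<>]+) <(?P<addr>%s)>$" % ADDR),     # Name <addr>
]


def parse_from(value):
    """Return (address, full_name_or_None), or None if no form matches."""
    for form in FORMS:
        m = form.match(value.strip())
        if m:
            return m.group("addr"), m.groupdict().get("name")
    return None


print(parse_from("Mark Horton <mark@cbosgd.ATT.COM>"))
```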

2.1.2.  Date

    The "Date" line (formerly "Posted") is the date that the message was
    originally posted to the network.  Its format must be acceptable
    both in RFC-822 and to the getdate(3) routine that is provided with
    the Usenet software.  This date remains unchanged as the message is
    propagated throughout the network.  One format that is acceptable to
    both is:

                      Wdy, DD Mon YY HH:MM:SS TIMEZONE

    Several examples of valid dates appear in the sample message above.
    Note in particular that ctime(3) format:

                          Wdy Mon DD HH:MM:SS YYYY

    is not acceptable because it is not a valid RFC-822 date.  However,
    since older software still generates this format, news
    implementations are encouraged to accept this format and translate
    it into an acceptable format.

    There is no hope of having a complete list of timezones.  Universal
    Time (GMT), the North American timezones (PST, PDT, MST, MDT, CST,
    CDT, EST, EDT) and the +/-hhmm offset specified in RFC-822 should be
    supported.  It is recommended that times in message headers be
    transmitted in GMT and displayed in the local time zone.
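
    Translating a ctime(3) date into the acceptable format could be
    done along these lines in Python.  Note the assumption: a ctime(3)
    string carries no timezone, so the zone has to be supplied by the
    caller, and "GMT" is only this sketch's default.  The function name
    is hypothetical, and the day and month names rely on the default C
    locale.

```python
from datetime import datetime


def ctime_to_usenet(date_text, zone="GMT"):
    """Convert 'Wdy Mon DD HH:MM:SS YYYY' to 'Wdy, DD Mon YY HH:MM:SS ZONE'."""
    parsed = datetime.strptime(date_text, "%a %b %d %H:%M:%S %Y")
    return parsed.strftime("%a, %d %b %y %H:%M:%S ") + zone


print(ctime_to_usenet("Fri Nov 19 16:14:55 1982"))
# prints: Fri, 19 Nov 82 16:14:55 GMT
```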

2.1.3.  Newsgroups

    The "Newsgroups" line specifies the newsgroup or newsgroups in which
    the message belongs.  Multiple newsgroups may be specified,
    separated by a comma.  Newsgroups specified must all be the names of
    existing newsgroups, as no new newsgroups will be created by simply
    posting to them.

    Wildcards (e.g., the word "all") are never allowed in a "Newsgroups"
    line.  For example, a newsgroup comp.all is illegal, although a
    newsgroup rec.sport.football is permitted.

    If a message is received with a "Newsgroups" line listing some valid
    newsgroups and some invalid newsgroups, a host should not remove
    invalid newsgroups from the list.  Instead, the invalid newsgroups
    should be ignored.  For example, suppose host A subscribes to the
    classes btl.all and comp.all, and exchanges news messages with host
    B, which subscribes to comp.all but not btl.all.  Suppose A receives
    a message with Newsgroups: comp.unix,btl.general.

    This message is passed on to B because B receives comp.unix, but B
    does not receive btl.general.  A must leave the "Newsgroups" line
    unchanged.  If it were to remove btl.general, the edited header
    could eventually re-enter the btl.all class, resulting in a message
    that is not shown to users subscribing to btl.general.  Also,
    follow-ups from outside btl.all would not be shown to such users.
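
    The forwarding decision in this example can be sketched in Python
    as shown here.  The matching rule is an assumption made for
    illustration (a subscription such as "comp.all" is treated as a
    simple prefix match on "comp."), and the function names are
    hypothetical.  The point of the sketch is that the "Newsgroups"
    line is only read, never edited.

```python
def matches(group, pattern):
    """True if a newsgroup falls under a subscription pattern."""
    if pattern.endswith(".all"):
        # "comp.all" -> prefix "comp." (simplified wildcard handling)
        return group.startswith(pattern[:-len("all")])
    return group == pattern


def should_forward(newsgroups_line, subscriptions):
    """Forward if any listed newsgroup is wanted; unwanted ones are ignored."""
    groups = [g.strip() for g in newsgroups_line.split(",")]
    return any(matches(g, p) for g in groups for p in subscriptions)


line = "comp.unix,btl.general"
print(should_forward(line, ["comp.all"]))   # host B in the example above
```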

2.1.4.  Subject

    The "Subject" line (formerly "Title") tells what the message is
    about.  It should be suggestive enough of the contents of the
    message to enable a reader to make a decision whether to read the
    message based on the subject alone.  If the message is submitted in
    response to another message (e.g., is a follow-up) the default
    subject should begin with the four characters "Re: ", and the
    "References" line is required.  For follow-ups, the use of the
    "Summary" line is encouraged.

2.1.5.  Message-ID

    The "Message-ID" line gives the message a unique identifier.  The
    Message-ID may not be reused during the lifetime of any previous
    message with the same Message-ID.  (It is recommended that no
    Message-ID be reused for at least two years.)  Message-ID's have the
    syntax:

                     <string not containing blank or ">">

    In order to conform to RFC-822, the Message-ID must have the format:

                          <unique@full_domain_name>

    where full_domain_name is the full name of the host at which the
    message entered the network, including a domain that host is in, and
    unique is any string of printing ASCII characters, not including "<"
    (left angle bracket), ">" (right angle bracket), or "@" (at sign).

    For example, the unique part could be an integer representing a
    sequence number for messages submitted to the network, or a short
    string derived from the date and time the message was created.  For
    example, a valid Message-ID for a message submitted from host ucbvax
    in domain "Berkeley.EDU" would be "<4123@ucbvax.Berkeley.EDU>".
    Programmers are urged not to make assumptions about the content of
    Message-ID fields from other hosts, but to treat them as unknown
    character strings.  It is not safe, for example, to assume that a
    Message-ID will be under 14 characters, that it is unique in the
    first 14 characters, nor that it does not contain a "/".

    The angle brackets are considered part of the Message-ID.  Thus, in
    references to the Message-ID, such as the ihave/sendme and cancel
    control messages, the angle brackets are included.  White space
    characters (e.g., blank and tab) are not allowed in a Message-ID.
    Slashes ("/") are strongly discouraged.  All characters between the
    angle brackets must be printing ASCII characters.
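
    A syntax check for the <unique@full_domain_name> form might be
    written as the following Python sketch.  The regular expression is
    this example's own reading of the rules above (printing ASCII, no
    white space, no "<", ">" or "@" inside either part); it does not
    check that the domain exists, and the function name is
    hypothetical.

```python
import re

# Printing ASCII 0x21-0x7E, excluding "<" (0x3C), ">" (0x3E) and "@" (0x40).
PART = r"[!-;=?A-~]+"
MESSAGE_ID = re.compile(r"<%s@%s>" % (PART, PART))


def is_valid_message_id(value):
    """True if value looks like <unique@full_domain_name>, brackets included."""
    return MESSAGE_ID.fullmatch(value) is not None


print(is_valid_message_id("<4123@ucbvax.Berkeley.EDU>"))
```

    Apart from such a check, a Message-ID from another host is best
    treated as an opaque string, as urged above.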

2.1.6.  Path

    This line shows the path the message took to reach the current
    system.  When a system forwards the message, it should add its own
    name to the list of systems in the "Path" line.  The names may be
    separated by any punctuation character or characters (except "."
    which is considered part of the hostname).  Thus, the following are
    valid entries:

                   cbosgd!mhuxj!mhuxt
                   cbosgd, mhuxj, mhuxt
                   @cbosgd.ATT.COM,@mhuxj.ATT.COM,@mhuxt.ATT.COM
                   teklabs, zehntel, sri-unix@cca!decvax

    (The latter path indicates a message that passed through decvax,
    cca, sri-unix, zehntel, and teklabs, in that order.) Additional
    names should be added from the left.  For example, the most recently
    added name in the fourth example was teklabs.  Letters, digits,
    periods and hyphens are considered part of host names; other
    punctuation, including blanks, are considered separators.

    Normally, the rightmost name will be the name of the originating
    system.  However, it is also permissible to include an extra entry
    on the right, which is the name of the sender.  This is for upward
    compatibility with older systems.

    The "Path" line is not used for replies, and should not be taken as
    a mailing address.  It is intended to show the route the message
    traveled to reach the local host.  There are several uses for this
    information.  One is to monitor USENET routing for performance
    reasons.  Another is to establish a path to reach new hosts.
    Perhaps the most important use is to cut down on redundant USENET
    traffic by failing to forward a message to a host that is known to
    have already received it.  In particular, when host A sends a
    message to host B, the "Path" line includes A, so that host B will
    not immediately send the message back to host A.  The name each host
    uses to identify itself should be the same as the name by which its
    neighbors know it, in order to make this optimization possible.

    A host adds its own name to the front of a path when it receives a
    message from another host.  Thus, if a message with path "A!X!Y!Z"
    is passed from host A to host B, B will add its own name to the path
    when it receives the message from A, e.g., "B!A!X!Y!Z".  If B then
    passes the message on to C, the message sent to C will contain the
    path "B!A!X!Y!Z", and when C receives it, C will change it to
    "C!B!A!X!Y!Z".

    Special upward compatibility note:  Since the "From", "Sender", and
    "Reply-To" lines are in Internet format, and since many USENET hosts
    do not yet have mailers capable of understanding Internet format, it
    would break the reply capability to completely sever the connection
    between the "Path" header and the reply function.  It is recognized
    that the path is not always a valid reply string in older
    implementations, and no requirement to fix this problem is placed on
    implementations.  However, the existing convention of placing the
    host name and an "!"  at the front of the path, and of starting the
    path with the host name, an "!", and the user name, should be
    maintained when possible.
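
    The handling of the "Path" line described in this section can be
    illustrated with a few lines of Python.  The function names are
    hypothetical, and the comparison of host names is exact, which
    assumes each host uses the same name its neighbors know it by.
    Because the rightmost entry may be a user name rather than a host,
    the result of already_seen is a hint, not a guarantee.

```python
import re


def path_hosts(path):
    """Split a Path value into names; letters, digits, '.' and '-' are name characters."""
    return re.findall(r"[A-Za-z0-9.-]+", path)


def prepend_host(path, name):
    """Add the local host name to the front of the path, using '!'."""
    return name + "!" + path


def already_seen(path, host):
    """True if host appears in the path, so forwarding to it can be skipped."""
    return host in path_hosts(path)


print(path_hosts("teklabs, zehntel, sri-unix@cca!decvax"))
print(prepend_host("A!X!Y!Z", "B"))
```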

2.2.  Optional Headers

2.2.1.  Reply-To

    This line has the same format as "From".  If present, mailed replies
    to the author should be sent to the name given here.  Otherwise,
    replies are mailed to the name on the "From" line. (This does not
    prevent additional copies from being sent to recipients named by the
    replier, or on "To" or "Cc" lines.)  The full name may be optionally
    given, in parentheses, as in the "From" line.

2.2.2.  Sender

    This field is present only if the submitter manually enters a "From"
    line.  It is intended to record the entity responsible for
    submitting the message to the network.  It should be verified by the
    software at the submitting host.

    For example, if John Smith is visiting CCA and wishes to post a
    message to the network, using friend Sarah Jones' account, the
    message might read:

              From: smith@ucbvax.Berkeley.EDU (John Smith)
              Sender: jones@cca.COM (Sarah Jones)

    If a gateway program enters a mail message into the network at host
    unix.SRI.COM, the lines might read:

              From: John.Doe@A.CS.CMU.EDU
              Sender: network@unix.SRI.COM

    The primary purpose of this field is to be able to track down
    messages to determine how they were entered into the network.  The
    full name may be optionally given, in parentheses, as in the "From"
    line.

2.2.3.  Followup-To

    This line has the same format as "Newsgroups".  If present, follow-
    up messages are to be posted to the newsgroup or newsgroups listed
    here.  If this line is not present, follow-ups are posted to the
    newsgroup or newsgroups listed in the "Newsgroups" line.

    If the keyword poster is present, follow-up messages are not
    permitted.  The message should be mailed to the submitter of the
    message via mail.
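
    Combined with the "Reply-To" rule in section 2.2.1, the choice of
    where a follow-up goes could be sketched in Python as follows.  The
    function name and the ("mail", ...) / ("post", ...) return values
    are inventions of this example; headers is assumed to be a plain
    dictionary of header names to values.

```python
def followup_target(headers):
    """Decide whether a follow-up is mailed or posted, and where."""
    followup = headers.get("Followup-To", "").strip()
    if followup == "poster":
        # Follow-ups are not permitted: reply by mail instead.
        return "mail", headers.get("Reply-To") or headers["From"]
    groups = followup or headers["Newsgroups"]
    return "post", [g.strip() for g in groups.split(",")]


print(followup_target({"From": "jerry@eagle.ATT.COM",
                       "Newsgroups": "news.announce",
                       "Followup-To": "news.misc"}))
```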

2.2.4.  Expires

    This line, if present, is in a legal USENET date format.  It
    specifies a suggested expiration date for the message.  If not
    present, the local default expiration date is used.  This field is
    intended to be used to clean up messages with a limited usefulness,
    or to keep important messages around for longer than usual.  For
    example, a message announcing an upcoming seminar could have an
    expiration date the day after the seminar, since the message is not
    useful after the seminar is over.  Since local hosts have local
    policies for expiration of news (depending on available disk space,
    for instance), users are discouraged from providing expiration dates
    for messages unless there is a natural expiration date associated
    with the topic.  System software should almost never provide a
    default "Expires" line.  Leave it out and allow local policies to be
    used unless there is a good reason not to.

2.2.5.  References

    This field lists the Message-ID's of any messages prompting the
    submission of this message.  It is required for all follow-up
    messages, and forbidden when a new subject is raised.
    Implementations should provide a follow-up command, which allows a
    user to post a follow-up message.  This command should generate a
    "Subject" line which is the same as the original message, except
    that if the original subject does not begin with "Re: " or "re: ", the
    four characters "Re: " are inserted before the subject.  If there is
    no "References" line on the original header, the "References" line
    should contain the Message-ID of the original message (including the
    angle brackets).  If the original message does have a "References"
    line, the follow-up message should have a "References" line
    containing the text of the original "References" line, a blank, and
    the Message-ID of the original message.
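
    A follow-up command could derive the two headers as in this Python
    sketch.  The function name is hypothetical, the original message is
    assumed to be available as a dictionary of headers, and trimming of
    an overly long "References" line is left out.

```python
def followup_headers(original):
    """Build the Subject and References headers for a follow-up."""
    subject = original["Subject"]
    if not subject.startswith(("Re:", "re:")):
        subject = "Re: " + subject
    references = original["Message-ID"]          # angle brackets included
    if original.get("References"):
        references = original["References"] + " " + references
    return {"Subject": subject, "References": references}


print(followup_headers({"Subject": "Usenet Etiquette -- Please Read",
                        "Message-ID": "<642@eagle.ATT.COM>"}))
```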

    The purpose of the "References" header is to allow messages to be
    grouped into conversations by the user interface program.  This
    allows conversations within a newsgroup to be kept together, and
    potentially users might shut off entire conversations without
    unsubscribing to a newsgroup.  User interfaces need not make use of
    this header, but all automatically generated follow-ups should
    generate the "References" line for the benefit of systems that do
    use it, and manually generated follow-ups (e.g., typed in well after
    the original message has been printed by the machine) should be
    encouraged to include them as well.

    It is permissible to not include the entire previous "References"
    line if it is too long.  An attempt should be made to include a
    reasonable number of backwards references.

2.2.6.  Control

    If a message contains a "Control" line, the message is a control
    message.  Control messages are used for communication among USENET
    host machines, not to be read by users.  Control messages are
    distributed by the same newsgroup mechanism as ordinary messages.
    The body of the "Control" header line is the message to the host.

    For upward compatibility, messages that match the newsgroup pattern
    "all.all.ctl" should also be interpreted as control messages.  If no
    "Control" header is present on such messages, the subject is used as
    the control message.  However, messages on newsgroups matching this
    pattern do not conform to this standard.

    Also for upward compatibility, if the first 4 characters of the
    "Subject:" line are "cmsg", the rest of the "Subject:" line should
    be interpreted as a control message.
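
    The three ways of recognizing a control message can be sketched
    as below.  The function name and the dict of headers are
    assumptions, and so is the reading of "all.all.ctl" as a
    three-component newsgroup name ending in "ctl"; the text does
    not define the pattern syntax precisely.

```python
def control_command(headers):
    """Return the control message as a list of words, or None."""
    # Preferred form: an explicit "Control" header.
    if "Control" in headers:
        return headers["Control"].split()

    subject = headers.get("Subject", "")
    # Compatibility: "cmsg" as the first four characters of the Subject.
    if subject.startswith("cmsg"):
        return subject[4:].split()

    # Compatibility: newsgroups matching "all.all.ctl".
    # Assumption: each "all" stands for exactly one dot-free component.
    for group in headers.get("Newsgroups", "").split(","):
        parts = group.strip().split(".")
        if len(parts) == 3 and parts[2] == "ctl":
            return subject.split()
    return None
```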

2.2.7.  Distribution

    This line is used to alter the distribution scope of the message.
    It is a comma separated list similar to the "Newsgroups" line.  User
    subscriptions are still controlled by "Newsgroups", but the message
    is sent to all systems subscribing to the newsgroups on the
    "Distribution" line in addition to the "Newsgroups" line.  For the
    message to be transmitted, the receiving site must normally receive
    one of the specified newsgroups AND must receive one of the
    specified distributions.  Thus, a message concerning a car for sale
    in New Jersey might have headers including:

                   Newsgroups: rec.auto,misc.forsale
                   Distribution: nj,ny

    so that it would only go to persons subscribing to rec.auto or
    misc.forsale within New Jersey or New York.  The intent of this header
    is to restrict the distribution of a newsgroup further, not to
    increase it.  A local newsgroup, such as nj.crazy-eddie, will
    probably not be propagated by hosts outside New Jersey that do not
    show such a newsgroup as valid.  A follow-up message should default
    to the same "Distribution" line as the original message, but the
    user can change it to a more limited one, or escalate the
    distribution if it was originally restricted and a more widely
    distributed reply is appropriate.
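
    The "one of the newsgroups AND one of the distributions" test
    might look like the sketch below.  It assumes exact name
    matching rather than newsgroup patterns, and it assumes that a
    message with no "Distribution" line is not restricted at all;
    the function and parameter names are illustrative only.

```python
def should_transmit(headers, site_groups, site_distributions):
    """Decide whether a neighboring site should be sent this message."""
    groups = {g.strip() for g in headers["Newsgroups"].split(",")}
    wants_group = bool(groups & set(site_groups))

    # Assumption: no Distribution line means no extra restriction.
    if "Distribution" not in headers:
        return wants_group

    dists = {d.strip() for d in headers["Distribution"].split(",")}
    # Both conditions must hold: a wanted newsgroup AND a wanted distribution.
    return wants_group and bool(dists & set(site_distributions))
```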

2.2.8.  Organization

    The text of this line is a short phrase describing the organization
    to which the sender belongs, or to which the machine belongs.  The
    intent of this line is to help identify the person posting the
    message, since host names are often cryptic enough to make it hard
    to recognize the organization by the electronic address.

2.2.9.  Keywords

    A few well-selected keywords identifying the message should be on
    this line.  This is used as an aid in determining if this message is
    interesting to the reader.

2.2.10.  Summary

    This line should contain a brief summary of the message.  It is
    usually used as part of a follow-up to another message.  Again, it
    is very useful to the reader in determining whether to read the
    message.

2.2.11.  Approved

    This line is required for any message posted to a moderated
    newsgroup.  It should be added by the moderator and consist of his
    mail address.  It is also required with certain control messages.

2.2.12.  Lines

    This contains a count of the number of lines in the body of the
    message.

2.2.13.  Xref

    This line contains the name of the host (with domains omitted) and a
    white space separated list of colon-separated pairs of newsgroup
    names and message numbers.  These are the newsgroups listed in the
    "Newsgroups" line and the corresponding message numbers from the
    spool directory.

    This is only of value to the local system, so it should not be
    transmitted.  For example, in:

               Path: seismo!lll-crg!lll-lcc!pyramid!decwrl!reid
               From: reid@decwrl.DEC.COM (Brian Reid)
               Newsgroups: news.lists,news.groups
               Subject: USENET READERSHIP SUMMARY REPORT FOR SEP 86
               Message-ID: <5658@decwrl.DEC.COM>
               Date: 1 Oct 86 11:26:15 GMT
               Organization: DEC Western Research Laboratory
               Lines: 441
               Approved: reid@decwrl.UUCP
               Xref: seismo news.lists:461 news.groups:6378

    the "Xref" line shows that the message is message number 461 in the
    newsgroup news.lists, and message number 6378 in the newsgroup
    news.groups, on host seismo.  This information may be used by
    certain user interfaces.
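
    A user interface could split the "Xref" value as in this small
    sketch (the function name is hypothetical):

```python
def parse_xref(value):
    """Split an Xref value into (host, {newsgroup: message number})."""
    host, *pairs = value.split()
    locations = {}
    for pair in pairs:
        # Each pair is "<newsgroup>:<number>"; split at the last colon.
        group, _, number = pair.rpartition(":")
        locations[group] = int(number)
    return host, locations
```

    For the example header above, this yields host "seismo" with
    message numbers 461 and 6378 for the two newsgroups.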

3.  Control Messages

    This section lists the control messages currently defined.  The body
    of the "Control" header line is the control message.  Messages are a
    sequence of zero or more words, separated by white space (blanks or
    tabs).  The first word is the name of the control message, remaining
    words are parameters to the message.  The remainder of the header
    and the body of the message are also potential parameters; for
    example, the "From" line might suggest an address to which a
    response is to be mailed.

    Implementors and administrators may choose to allow control messages
    to be carried out automatically, or to queue them for manual
    processing.  However, manually processed messages should be dealt
    with promptly.

    Failed control messages should NOT be mailed to the originator of
    the message, but to the local "usenet" account.

3.1.  Cancel

                     cancel <Message-ID>


    If a message with the given Message-ID is present on the local
    system, the message is cancelled.  This mechanism allows a user to
    cancel a message after the message has been distributed over the
    network.

    If the system is unable to cancel the message as requested, it
    should not forward the cancellation request to its neighbor systems.

    Only the author of the message or the local news administrator is
    allowed to send this message.  The verified sender of a message is
    the "Sender" line, or if no "Sender" line is present, the "From"
    line.  The verified sender of the cancel message must be the same as
    either the "Sender" or "From" field of the original message.  A
    verified sender in the cancel message is allowed to match an
    unverified "From" in the original message.
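
    The sender check for cancel messages can be sketched as below.
    The names are illustrative, and comparing header values as
    exact strings is a simplifying assumption; a real system would
    presumably compare the mail addresses themselves.

```python
def verified_sender(headers):
    """The "Sender" line if present, otherwise the "From" line."""
    return headers.get("Sender") or headers["From"]


def may_cancel(cancel_headers, original_headers, is_local_admin=False):
    """Check whether a cancel message is allowed to cancel the original."""
    # The local news administrator is always allowed.
    if is_local_admin:
        return True
    sender = verified_sender(cancel_headers)
    # The verified sender must match either field of the original.
    return sender in (original_headers.get("Sender"),
                      original_headers.get("From"))
```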

3.2.  Ihave/Sendme

                   ihave <Message-ID list> [<remotesys>]
                   sendme <Message-ID list> [<remotesys>]

    This message is part of the ihave/sendme protocol, which allows one
    host (say A) to tell another host (B) that a particular message has
    been received on A.  Suppose that host A receives message
    "<1234@ucbvax.Berkeley.edu>", and wishes to transmit the message to
    host B.

    A sends the control message "ihave <1234@ucbvax.Berkeley.edu> A" to
    host B (by posting it to newsgroup to.B).  B responds with the
    control message "sendme <1234@ucbvax.Berkeley.edu> B" (on newsgroup
    to.A), if it has not already received the message.  Upon receiving
    the sendme message, A sends the message to B.

    This protocol can be used to cut down on redundant traffic between
    hosts.  It is optional and should be used only if the particular
    situation makes it worthwhile.  Frequently, the outcome is that,
    since most original messages are short, and since there is a high
    overhead to start sending a new message with UUCP, it costs as much
    to send the ihave as it would cost to send the message itself.

    One possible solution to this overhead problem is to batch requests.
    Several Message-ID's may be announced or requested in one message.
    If no Message-ID's are listed in the control message, the body of
    the message should be scanned for Message-ID's, one per line.
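
    On the receiving side, deciding what to ask for in the "sendme"
    reply might look like this sketch.  The function name, the
    list-of-words form of the control line, and the use of a set as
    the record of messages already received are all assumptions.

```python
def wanted_ids(control_words, body, history):
    """Message-IDs announced by an ihave that are not yet in `history`."""
    # Message-IDs are the "<...>" words; a trailing host name is skipped.
    ids = [w for w in control_words[1:]
           if w.startswith("<") and w.endswith(">")]
    if not ids:
        # No IDs in the control line: scan the body, one per line.
        ids = [line.strip() for line in body.splitlines() if line.strip()]
    return [i for i in ids if i not in history]
```

    The returned list becomes the Message-ID list of the "sendme"
    control message; an empty list means no reply is needed.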

3.3.  Newgroup

                      newgroup <groupname> [moderated]

    This control message creates a new newsgroup with the given name.
    Since no messages may be posted or forwarded until a newsgroup is
    created, this message is required before a newsgroup can be used.
    The body of the message is expected to be a short paragraph
    describing the intended use of the newsgroup.

    If the second argument is present and it is the keyword moderated,
    the group should be created moderated instead of the default of
    unmoderated.  The newgroup message should be ignored unless there is
    an "Approved" line in the same message header.

3.4.  Rmgroup

                            rmgroup <groupname>

    This message removes a newsgroup with the given name.  Since the
    newsgroup is removed from every host on the network, this command
    should be used carefully by a responsible administrator.  The
    rmgroup message should be ignored unless there is an "Approved:"
    line in the same message header.

3.5.  Sendsys

                           sendsys (no arguments)

    The sys file, listing all neighbors and the newsgroups to be sent to
    each neighbor, will be mailed to the author of the control message
    ("Reply-To", if present, otherwise "From").  This information is
    considered public information, and it is a requirement of membership
    in USENET that this information be provided on request, either
    automatically in response to this control message, or manually, by
    mailing the requested information to the author of the message.
    This information is used to keep the map of USENET up to date, and
    to determine where netnews is sent.

    The format of the file mailed back to the author should be the same
    as that of the sys file.  This format has one line per neighboring
    host (plus one line for the local host), containing four colon
    separated fields.  The first field has the host name of the
    neighbor, the second field has a newsgroup pattern describing the
    newsgroups sent to the neighbor.  The third and fourth fields are
    not defined by this standard.  The sys file is not the same as the
    UUCP L.sys file.  A sample response is:

      From: cbosgd!mark  (Mark Horton)
      Date: Sun, 27 Mar 83 20:39:37 -0500
      Subject: response to your sendsys request
      To: mark@cbosgd.ATT.COM

      Responding-System: cbosgd.ATT.COM
      cbosgd:osg,cb,btl,bell,world,comp,sci,rec,talk,misc,news,soc,to,
            test
      ucbvax:world,comp,to.ucbvax:L:
      cbosg:world,comp,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews
            /cbosg
      cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
      sescent:world,comp,bell,btl,cb,to.sescent:F:/usr/spool/outnews
            /sescent
      npois:world,comp,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
      mhuxi:world,comp,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
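
    A program reading such a response could recover the first two
    fields as sketched below.  This assumes that only the sys lines
    are passed in, that host lines start in column one, and that an
    indented line continues the previous one (as the wrapped lines
    in the sample suggest).  The function name is hypothetical.

```python
def parse_sys(text):
    """Map each host name to its list of newsgroup patterns."""
    # Rejoin wrapped lines (assumption: leading white space = continuation).
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and entries:
            entries[-1] += line.strip()
        else:
            entries.append(line.strip())

    neighbors = {}
    for entry in entries:
        fields = entry.split(":")
        host = fields[0]
        patterns = fields[1].split(",") if len(fields) > 1 else []
        # Fields three and four are not defined by the standard; ignore them.
        neighbors[host] = [p for p in patterns if p]
    return neighbors
```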

3.6.  Version

                           version (no arguments)

    The name and version of the software running on the local system is
    to be mailed back to the author of the message ("Reply-to" if
    present, otherwise "From").

3.7.  Checkgroups

    The message body is a list of "official" newsgroups and their
    description, one group per line.  They are compared against the list
    of active newsgroups on the current host.  The names of any obsolete
    or new newsgroups are mailed to the user "usenet" and descriptions
    of the new newsgroups are added to the help file used when posting
    news.
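
    The comparison itself is a pair of set differences, as in this
    sketch.  It assumes each body line is a newsgroup name, white
    space, and a description; the text above does not spell out the
    line format, and the function name is illustrative.

```python
def checkgroups(body, active):
    """Return (new, obsolete) newsgroup names, each sorted."""
    official = {}
    for line in body.splitlines():
        parts = line.split(None, 1)
        if parts:
            # First word is the group name, the rest is its description.
            official[parts[0]] = parts[1].strip() if len(parts) > 1 else ""

    new = sorted(set(official) - set(active))       # official, not active here
    obsolete = sorted(set(active) - set(official))  # active here, not official
    return new, obsolete
```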

4.  Transmission Methods

    USENET is not a physical network, but rather a logical network
    resting on top of several existing physical networks.  These
    networks include, but are not limited to, UUCP, the Internet, an
    Ethernet, the BLICN network, an NSC Hyperchannel, and a BERKNET.
    What is important is that two neighboring systems on USENET have
    some method to get a new message, in the format listed here, from
    one system to the other, and once on the receiving system, processed
    by the netnews software on that system.  (On UNIX systems, this
    usually means the rnews program being run with the message on the
    standard input. <1>)

    It is not a requirement that USENET hosts have mail systems capable
    of understanding the Internet mail syntax, but it is strongly
    recommended.  Since "From", "Reply-To", and "Sender" lines use the
    Internet syntax, replies will be difficult or impossible without an
    Internet mailer.  A host without an Internet mailer can attempt to
    use the "Path" header line for replies, but this field is not
    guaranteed to be a working path for replies.  In any event, any host
    generating or forwarding news messages must have an Internet address
    that allows them to receive mail from hosts with Internet mailers,
    and they must include their Internet address on their From line.

4.1.  Remote Execution

    Some networks permit direct remote command execution.  On these
    networks, news may be forwarded by spooling the rnews command with
    the message on the standard input.  For example, if the remote
    system is called remote, news would be sent over a UUCP link
    with the command:

                              uux - remote!rnews

    and on a Berknet:

                              net -mremote rnews

    It is important that the message be sent via a reliable mechanism,
    normally involving the possibility of spooling, rather than direct
    real-time remote execution.  This is because, if the remote system
    is down, a direct execution command will fail, and the message will
    never be delivered.  If the message is spooled, it will eventually
    be delivered when both systems are up.

4.2.  Transfer by Mail

    On some systems, direct remote spooled execution is not possible.
    However, most systems support electronic mail, and a news message
    can be sent as mail.  One approach is to send a mail message which
    is identical to the news message: the mail headers are the news
    headers, and the mail body is the news body.  By convention, this
    mail is sent to the user newsmail on the remote machine.

    One problem with this method is that it may not be possible to
    convince the mail system that the "From" line of the message is
    valid, since the mail message was generated by a program on a
    system different from the source of the news message.  Another
    problem is that error messages caused by the mail transmission
    would be sent to the originator of the news message, who has no
    control over news transmission between two cooperating hosts
    and does not know whom to contact.  Transmission error messages
    should be directed to a responsible contact person on the
    sending machine.

    A solution to this problem is to encapsulate the news message into a
    mail message, such that the entire message (headers and body) is
    part of the body of the mail message.  The convention here is that
    such mail is sent to user rnews on the remote system.  A mail
    message body is generated by prepending the letter N to each line of
    the news message, and then attaching whatever mail headers are
    convenient to generate.  The N's are attached to prevent any special
    lines in the news message from interfering with mail transmission,
    and to prevent any extra lines inserted by the mailer (headers,
    blank lines, etc.) from becoming part of the news message.  A
    program on the receiving machine receives mail to rnews, extracting
    the message itself and invoking the rnews program.  An example in
    this format might look like this:

                Date: Mon, 3 Jan 83 08:33:47 MST
                From: news@cbosgd.ATT.COM
                Subject: network news message
                To: rnews@npois.ATT.COM

                NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
                NFrom: derek@sask.UUCP (Derek Andrew)
                NNewsgroups: misc.test
                NSubject: necessary test
                NMessage-ID: <176@sask.UUCP>
                NDate: Mon, 3 Jan 83 00:59:15 MST
                N
                NThis really is a test.  If anyone out there more than 6
                Nhops away would kindly confirm this note I would
                Nappreciate it.  We suspect that our news postings
                Nare not getting out into the world.
                N

    Using mail solves the spooling problem, since mail must always be
    spooled if the destination host is down.  However, it adds more
    overhead to the transmission process (to encapsulate and extract the
    message) and makes it harder for software to give different
    priorities to news and mail.
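
    The N-prefix encapsulation and its inverse can be sketched as
    below.  The function names are hypothetical, and the extractor
    is meant to be applied to the mail body only.

```python
def encapsulate(news_text):
    """Prepend the letter N to every line of a news message."""
    return "".join("N" + line for line in news_text.splitlines(keepends=True))


def extract(mail_body):
    """Recover the news message, dropping lines the mailer added."""
    kept = [line[1:] for line in mail_body.splitlines(keepends=True)
            if line.startswith("N")]
    return "".join(kept)
```

    Lines without the leading N, such as blank lines inserted by a
    mailer, are discarded on extraction, which is the purpose of
    the prefix.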

4.3.  Batching

    Since news messages are usually short, and since a large number of
    messages are often sent between two hosts in a day, it may make
    sense to batch news messages.  Several messages can be combined into
    one large message, using conventions agreed upon in advance by the
    two hosts.  One such batching scheme is described here; its use is
    highly recommended.

    News messages are combined into a script, separated by a header of
    the form:


                   #! rnews 1234

    where 1234 is the length of the message in bytes.  Each such line is
    followed by a message containing the given number of bytes.  (The
    newline at the end of each line of the message is counted as one
    byte, for purposes of this count, even if it is stored as <CARRIAGE
    RETURN><LINE FEED>.)  For example, a batch of messages might look
    like this:

                #! rnews 239
                From: jerry@eagle.ATT.COM (Jerry Schwarz)
                Path: cbosgd!mhuxj!mhuxt!eagle!jerry
                Newsgroups: news.announce
                Subject: Usenet Etiquette -- Please Read
                Message-ID: <642@eagle.ATT.COM>
                Date: Fri, 19 Nov 82 16:14:55 EST
                Approved: mark@cbosgd.ATT.COM

                Here is an important message about USENET Etiquette.
                #! rnews 234
                From: jerry@eagle.ATT.COM (Jerry Schwarz)
                Path: cbosgd!mhuxj!mhuxt!eagle!jerry
                Newsgroups: news.announce
                Subject: Notes on Etiquette message
                Message-ID: <643@eagle.ATT.COM>
                Date: Fri, 19 Nov 82 17:24:12 EST
                Approved: mark@cbosgd.ATT.COM

                There was something I forgot to mention in the last
                message.

    Batched news is recognized because the first character in the
    message is #.  The message is then passed to the unbatcher for
    interpretation.

    The second argument (in this example rnews) determines which
    batching scheme is being used.  Cooperating hosts may use whatever
    scheme is appropriate for them.
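
    A batcher and unbatcher for the "#! rnews" scheme might look
    like this sketch.  It works on bytes and assumes each newline
    is stored as a single byte, so that the counts agree with the
    rule above; the function names are illustrative.

```python
def batch(messages):
    """Combine messages (bytes) into one batch with length headers."""
    return b"".join(b"#! rnews %d\n" % len(m) + m for m in messages)


def unbatch(data):
    """Split a batch (bytes) back into the individual messages."""
    messages = []
    pos = 0
    while pos < len(data):
        end = data.index(b"\n", pos)          # end of the header line
        words = data[pos:end].split()
        if len(words) != 3 or words[0] != b"#!" or words[1] != b"rnews":
            raise ValueError("expected '#! rnews <count>' header")
        count = int(words[2])
        start = end + 1
        messages.append(data[start:start + count])
        pos = start + count                   # next header follows directly
    return messages
```

    Because the unbatcher trusts the byte count, a message body may
    itself contain lines beginning with "#!" without confusing it.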

5.  The News Propagation Algorithm

    This section describes the overall scheme of USENET and the
    algorithm followed by hosts in propagating news to the entire
    logical network.  Since all hosts are affected by incorrectly
    formatted messages and by propagation errors, it is important
    for the method to be standardized.

    USENET is a directed graph.  Each node in the graph is a host
    computer, and each arc in the graph is a transmission path from
    one host to another host.  Each arc is labeled with a newsgroup
    pattern, specifying which newsgroup classes are forwarded along
    that link.  Most arcs are bidirectional, that is, if host A
    sends a class of newsgroups to host B, then host B usually sends
    the same class of newsgroups to host A.  This bidirectionality
    is not, however, required.

    USENET is made up of many subnetworks.  Each subnet has a name, such
    as comp or btl.  Each subnet is a connected graph, that is, a path
    exists from every node to every other node in the subnet.  In
    addition, the entire graph is (theoretically) connected.  (In
    practice, some political considerations have caused some hosts to be
    unable to post messages reaching the rest of the network.)

    A message is posted on one machine to a list of newsgroups. That
    machine accepts it locally, then forwards it to all its neighbors
    that are interested in at least one of the newsgroups of the
    message.  (Site A deems host B to be "interested" in a newsgroup if
    the newsgroup matches the pattern on the arc from A to B.  This
    pattern is stored in a file on the A machine.)  The hosts receiving
    the incoming message examine it to make sure they really want the
    message, accept it locally, and then in turn forward the message to
    all their interested neighbors.  This process continues until the
    entire network has seen the message.

    An important part of the algorithm is the prevention of loops.  The
    above process would cause a message to loop along a cycle forever.
    In particular, when host A sends a message to host B, host B will
    send it back to host A, which will send it to host B, and so on.
    One solution to this is the history mechanism.  Each host keeps
    track of all messages it has seen (by their Message-ID) and
    whenever a message comes in that it has already seen, the incoming
    message is discarded immediately.  This solution is sufficient to
    prevent loops, but additional optimizations can be made to avoid
    sending messages to hosts that will simply throw them away.

    One optimization is that a message should never be sent to a machine
    listed in the "Path" line of the header.  When a machine name is
    in the "Path" line, the message is known to have passed through the
    machine.  Another optimization is that, if the message originated
    on host A, then host A has already seen the message.  Thus, if a
    message is posted to newsgroup misc.misc, it will match the pattern
    misc.all (where all is a metasymbol that matches any string), and
    will be forwarded to all hosts that subscribe to misc.all (as
    determined by what their neighbors send them).  These hosts make up
    the misc subnetwork.  A message posted to btl.general will reach all
    hosts receiving btl.all, but will not reach hosts that do not get
    btl.all.  In effect, the message reaches the btl subnetwork.  A
    message posted to newsgroups misc.misc,btl.general will reach all
    hosts subscribing to either of the two classes.
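
    The flooding algorithm, with the history check and the "Path"
    optimization, can be simulated as in the sketch below.  The
    graph representation, the function names, and the pattern rule
    (components compared one by one, "all" matching any single
    component, shorter patterns matching as prefixes) are
    assumptions made for illustration.

```python
from collections import deque


def matches(pattern, group):
    """Assumed pattern rule: "all" matches any one component."""
    p, g = pattern.split("."), group.split(".")
    if len(p) > len(g):
        return False
    return all(a == "all" or a == b for a, b in zip(p, g))


def flood(links, origin, newsgroups):
    """Return the set of hosts that end up accepting the message.

    `links` maps host -> {neighbor: [patterns on the arc to neighbor]}.
    """
    seen = {origin}                      # stands in for each host's history
    queue = deque([(origin, [origin])])  # (host, Path so far)
    while queue:
        host, path = queue.popleft()
        for neighbor, patterns in links.get(host, {}).items():
            if neighbor in path:         # never send to a host in the Path
                continue
            if not any(matches(p, g) for p in patterns for g in newsgroups):
                continue                 # neighbor is not "interested"
            if neighbor in seen:         # history: duplicate is discarded
                continue
            seen.add(neighbor)
            queue.append((neighbor, path + [neighbor]))
    return seen
```

    With arcs labeled misc.all between A, B and D, and btl.all
    between A and C, a message posted on A to misc.misc reaches A,
    B and D but not C.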

Notes

    <1>  UNIX is a registered trademark of AT&T.






-- 
alan@manawatu.planet.co.nz==alan@manawatu.gen.nz~~brown_a@kosmos.wcc.govt.nz
Manawatu Internet Services,   "We should grant power over our affairs only to
Box 678, Palmerston North,     those who are reluctant to hold it and then only
New Zealand +64 25 480-204     under conditions that increase the reluctance."


Msg#: 4755 *bbs.tbbs*
10-08-94 11:26:33
From: NEWS
  To: ALL
Subj: RE: NETWORK SPECIFICATIONS: SON-OF-RFC 1036 (DRAFT
From: alan@papaioea.manawatu.planet.co.nz (Alan Brown)
Newsgroups:
comp.bbs.tbbs,comp.bbs.misc,comp.bbs.majorbbs,alt.bbs,alt.bbs.allsysop,alt.bbs.a
miga.excelsior,alt.bbs.cnet,alt.bbs.first-class,alt.bbs.metal,alt.bbs.pcboard,al
t.bbs.renegade,alt.bbs.searchlight,alt.bbs.wildcat,alt.bbs.watergate
Subject: Re: Network specifications: Son-of-RFC 1036 (draft under development)
Date: 8 Oct 1994 20:31:02 +1300
Organization: PlaNet (Manawatu) Palmerston North, New Zealand
Reply-To: /dev/null




          INTERNET DRAFT to be        NEWS                      sec. -




                      News Article Format and Transmission

                                 Henry Spencer



          Status of this Memo

          This  document  is  intended  to  become  an Internet Draft.
          Internet Drafts are working documents of the Internet  Engi-
          neering  Task  Force  (IETF),  its  Areas,  and  its Working
          Groups.  Note that other groups may also distribute  working
          documents as Internet Drafts.

          Internet  Drafts  are draft documents valid for a maximum of
          six months.  Internet Drafts may be  updated,  replaced,  or
          obsoleted  by other documents at any time.  It is not appro-
          priate to use Internet Drafts as reference  material  or  to
          cite  them  other  than  as  a  "working  draft" or "work in
          progress".

          Please check the I-D  abstract  listing  contained  in  each
          Internet Draft directory to learn the current status of this
          or any other Internet Draft.  (Actually, this  draft  is  at
          too early a stage to even be listed there yet.)

          It is hoped that a later version of this Draft will obsolete
          RFC 1036 and will become an Internet standard.

          References to the "successor to this  Draft"  refer  not  to
          later  versions  of this draft, but to a hypothetical future
          rewrite of this Draft (in the same way that this Draft is  a
          rewrite of RFC 1036).

          Distribution of this memo is unlimited.


          Abstract

          This Draft defines the format and procedures for interchange
          of network news articles.  It is hoped that a later  version
          of this Draft will obsolete RFC 1036, reflecting more recent
          experience and accommodating future directions.

          Network news articles resemble mail messages but are broadcast
          to potentially-large audiences, using a flooding algorithm
          that propagates one copy to each interested host (or
          group thereof), typically stores only one copy per host, and
          does not require any central  administration  or  systematic
          registration  of  interested users.  Network news originated
          as the medium  of  communication  for  Usenet,  circa  1980.



          2 June 1994                 - 1 -       expires 15 July 1994

          Since  then  Usenet has grown explosively, and many Internet
          sites participate in it.  In addition, the  news  technology
          is now in widespread use for other purposes, on the Internet
          and elsewhere.

          This Draft primarily codifies and organizes existing practice.
          A few small extensions have been added in an attempt
          to solve problems that are considered serious.  Major exten-
          sions (e.g. cryptographic authentication) that need signifi-
          cant development effort are left to be undertaken  as  inde-
          pendent efforts.


          Table of Contents

          TBW


          1. Introduction

          Network  news articles resemble mail messages but are broad-
          cast to potentially-large audiences, using a flooding  algo-
          rithm  that  propagates one copy to each interested host (or
          groups thereof), typically stores only one  copy  per  host,
          and  does  not require any central administration or system-
          atic registration of interested users.  Network news  origi-
          nated as the medium of communication for Usenet, circa 1980.
          Since then Usenet has grown explosively, and  many  Internet
          sites  participate  in it.  In addition, the news technology
          is now in widespread use for other purposes, on the Internet
          and elsewhere.

          The  earliest  news  interchange used the so-called "A News"
          article  format.   Shortly  thereafter,  an  article  format
          vaguely  resembling  Internet  mail  was  devised  and  used
          briefly.  Both of those  formats  are  completely  obsolete;
          they  are  documented  in  appendix A for historical reasons
          only.  With publication of RFC 850 [rrr] in 1983, news arti-
          cles  came  to closely resemble Internet mail messages, with
          some restrictions and some  additional  headers.   RFC  1036
          [rrr]  in 1987 updated RFC 850 without making major changes.

          In the intervening five years, the RFC 1036  article  format
          has  proven  quite  satisfactory,  although minor extensions
          appear desirable to match recent developments in areas  such
          as  multi-media  mail.  RFC 1036 itself has not proven quite
          so satisfactory.  It is often  rather  vague  and  does  not
          address  some  issues  at  all;  this has caused significant
          interoperability problems at times, and implementations have
          diverged  somewhat.  Worse, although it was intended primar-
          ily to document existing  practice,  it  did  not  precisely
          match  existing  practice even at the time it was published,
          and the deviations have grown since.

          This Draft attempts to specify the format of  articles,  and
          the  procedures  used  to exchange them and process them, in
          sufficient detail to allow full interoperability.  In  addi-
          tion,  some  tentative suggestions are made about directions
          for future development, in an attempt to  avert  unnecessary
          divergence  and  consequent loss of interoperability.  Major
          extensions (e.g.  cryptographic  authentication)  that  need
          significant  development effort are left to be undertaken as
          independent efforts.

               NOTE: One question this all may raise is:  why  is
               there  no  News-Version header, analogous to MIME-
               Version, specifying a version number corresponding
               to  this specification?  The answer is: it doesn't
               appear  to  be  useful,  given  news's   backward-
               compatibility  constraints.   The  major  use of a
               version number  is  indicating  which  of  several
               INCOMPATIBLE  interpretations  is  relevant.   The
               impossibility of orchestrating any sort of  simul-
               taneous change over news's installed base makes it
               necessary to avoid such incompatible  changes  (as
               opposed  to extensions) entirely.  MIME has a ver-
               sion number mostly because it introduced incompat-
               ible  changes  to  the  interpretation  of several
               "Content-"  headers.   This  Draft   attempts   no
               changes  in interpretation and it appears doubtful
               that future Drafts will find it feasible to intro-
               duce any.

               UNRESOLVED  ISSUE:  Should  this  be reconsidered?
               Only if the header has SPECIFIC IDENTIFIABLE  uses
               today.  Otherwise it's just useless added bulk.

          As  in  this  Draft's  predecessors, the exact means used to
          transmit articles from one host to another is not specified.
          NNTP  [rrr]  is probably the most common transmission method
          on the Internet, but a number of others are known to  be  in
          use,  including  the UUCP protocol [rrr] extensively used in
          the early days of Usenet and still much used on its  fringes
          today.

          Several  of  the mechanisms described in this Draft may seem
          somewhat strange or even bizarre at first reading.  As  with
          Internet  mail, there is no reasonable possibility of updat-
          ing the entire installed base of news software promptly,  so
          interoperability  with  old  software  is  crucial  and will
          remain so.  Compatibility with existing practice and
          robustness in an imperfect world necessarily take priority
          over elegance.


          2. Definitions, Notations, and Conventions


          2.1. Textual Notations

          Throughout this Draft, "MAIL" is short for "RFC 822 [rrr] as
          amended  by  RFC  1123  [rrr]".   (RFC 1123's amendments are
          mostly relatively small, but they  are  not  insignificant.)
          See  also  the  discussion  in  section 3 about this Draft's
          relationship to MAIL.  "MIME" is short for  "RFCs  1341  and
          1342" (or their updated replacements).

               UNRESOLVED ISSUE: Update these numbers.

          "ASCII"  is  short  for "the ANSI X3.4 character set" [rrr].
          While "ASCII" is often misused to refer to various character
          sets  somewhat similar to X3.4, in this Draft, "ASCII" means
          X3.4 and only X3.4.

               NOTE: The name is traditional (to the point  where
               the  ANSI standard sanctions it) even though it is
               no longer an acronym for the name of the standard.

               NOTE:  ASCII,  X3.4,  contains 128 characters, not
               all of them printable.  Character sets  with  more
               characters   are  not  ASCII,  although  they  may
               include it as a subset.

          Certain words used to define the significance of  individual
          requirements are capitalized.  "MUST" means that the item is
          an absolute  requirement  of  the  specification.   "SHOULD"
          means that the item is a strong recommendation: there may be
          valid reasons to ignore it  in  unusual  circumstances,  but
          this  should  be  done  only after careful study of the full
          implications and a firm conclusion  that  it  is  necessary,
          because  there are serious disadvantages to doing so.  "MAY"
          means that the item is truly optional, and implementors  and
          users  are warned that conformance is possible but not to be
          relied on.

          The term "compliant", applied to implementations etc., indi-
          cates  satisfaction  of  all  relevant  "MUST"  and "SHOULD"
          requirements.  The term "conditionally compliant"  indicates
          satisfaction  of all relevant "MUST" requirements but viola-
          tion of at least one relevant "SHOULD" requirement.
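
          The two definitions above amount to a simple decision rule.
          A minimal sketch in Python follows; the function name, the
          list arguments, and the label "non-compliant" (a term this
          Draft does not itself define) are assumptions made for
          illustration only.

```python
def compliance_level(must_violations, should_violations):
    """Classify an implementation using the section 2.1 definitions.

    Both arguments are lists of violated requirements (hypothetical
    inputs; the Draft does not prescribe how violations are recorded).
    """
    if must_violations:
        # Any violated MUST rules out both defined terms.
        return "non-compliant"  # assumed label, not defined by the Draft
    if should_violations:
        return "conditionally compliant"
    return "compliant"
```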

          This Draft contains explanatory notes  using  the  following
          format.   These  may be skipped by persons interested solely
          in the content of the specification.   The  purpose  of  the
          notes  is to explain why choices were made, to place them in
          context, or to suggest possible implementation techniques.

               NOTE: While such explanatory notes may seem super-
               fluous  in  principle,  they  often help the less-
               than-omniscient reader grasp the  purpose  of  the
               specification and the constraints involved.  Given
               the limitations of natural language  for  descrip-
               tive  purposes, this improves the probability that
               implementors and users will  understand  the  true
               intent  of  the  specification  in cases where the
               wording is not entirely clear.

          All numeric values are given  in  decimal  unless  otherwise
          indicated.   Octets  are  assumed  to be unsigned values for
          this purpose.  Large numbers are  written  using  the  North
          American  convention, in which "," separates groups of three
          digits but otherwise has no significance.


          2.2. Syntax Notation

          Although the mechanisms specified  in  this  Draft  are  all
          described  in prose, most are also described formally in the
          modified BNF notation of RFC 822.  Implementors will need to
          be  familiar  with  this  notation  to fully understand this
          specification, and are referred to RFC 822  for  a  complete
          explanation  of  the modified BNF notation.  Here is a brief
          illustrative example:

               sentence  = clause *( punct clause ) "."
               punct     = ":" / ";"
               clause    = 1*word [ "(" clause ")" / "," 1*word ]
               word      = <any English word>

          This defines a sentence as some clauses separated by  puncts
          and  ended  by  a period, a punct as a colon or semicolon, a
          clause as at least one <word> optionally followed by  either
          a  parenthesized  clause  or  a  comma and at least one more
          <word>, and a <word> as (informally) any English  word.   <>
          are  used to enclose names when (and only when) distinguish-
          ing them from surrounding text is useful.  The full form  of
          the  repetition  notation  is <m>"*"<n><thing>, denoting <m>
          through <n> repetitions of <thing>; <m>  defaults  to  zero,
          <n>  to  infinity, and the "*" and <n> can be omitted if <m>
          and <n> are equal, so 1*word is one or more  words,  1*5word
          is one through five words, and 2word is exactly two words.
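
          The repetition notation described above can be decoded
          mechanically.  The following Python sketch is illustrative
          only: the function name and the use of None to stand for
          "infinity" are assumptions, and it handles only a simple
          rule name, not a parenthesized group.

```python
import re

# <m>"*"<n><thing>: optional digits, optional "*", optional digits, a name.
_REPEAT = re.compile(r"(\d*)(\*?)(\d*)([A-Za-z][A-Za-z0-9-]*)")


def parse_repetition(term):
    """Return (minimum, maximum, name); maximum None means no upper bound."""
    match = _REPEAT.fullmatch(term)
    if match is None:
        raise ValueError(f"not a repetition term: {term!r}")
    low, star, high, name = match.groups()
    if star:
        # <m> defaults to zero and <n> to infinity when "*" is present.
        minimum = int(low) if low else 0
        maximum = int(high) if high else None
    else:
        # Without "*", the count is exact; a bare name means exactly one.
        minimum = maximum = int(low) if low else 1
    return minimum, maximum, name
```

          With this sketch, parse_repetition("1*5word") yields
          (1, 5, "word") and parse_repetition("2word") yields
          (2, 2, "word"), matching the examples in the text.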

          The  character  "\"  is not special in any way in this nota-
          tion.

          This Draft is intended  to  be  self-contained;  all  syntax
          rules  used in it are defined within it, and a rule with the
          same name as one found in MAIL does not necessarily have the
          same  definition.   The lexical layer of MAIL is NOT, repeat
          NOT, used in this  Draft,  and  its  presence  must  not  be
          assumed; notably, this Draft spells out all places where
          white space is permitted/required and all places where
          constructs resembling MAIL comments can occur.

               NOTE:  News  parsers  historically  have been much
               less permissive than MAIL parsers.


          2.3. Definitions

          The term "character set", wherever it is used in this Draft,
          refers to a coded character set, in the sense of ISO charac-
          ter set standardization work, and must not be misinterpreted
          as meaning merely "a set of characters".

          In this Draft, ASCII character 32 is referred to as "blank";
          the word "space" has a more generic meaning.

          An "article" is the unit of news, analogous to a MAIL  "mes-
          sage".

          A "poster" is a human being (or software equivalent) submit-
          ting a  possibly-compliant  article  to  be  "posted":  made
          available  for  reading  on  all relevant hosts.  A "posting
          agent" is software that assists posters to prepare articles,
          including  determining  whether the final article is compli-
          ant, passing it on to a  relayer  for  posting  if  so,  and
          returning  it  to  the poster with an explanation if not.  A
          "relayer" is  software  which  receives  allegedly-compliant
          articles  from  posting  agents and/or other relayers, files
          copies in a "news database", and possibly passes  copies  on
          to other relayers.

               NOTE:  While  the  same software may well function
               both as a relayer and as part of a posting  agent,
               the  two  functions are distinct and should not be
               confused.  The  posting  agent's  purpose  is  (in
               part) to validate an article, supply header
               information that can or should be supplied
               automatically, and generally take reasonable actions
               in an attempt to transform the poster's submission
               into a compliant article.  The relayer's purpose is to
               move already-compliant articles around efficiently
               without damaging them.

          A "reader" is a human being reading news articles.  A "read-
          ing agent" is software which presents articles to a  reader.

               NOTE:  Informal usage often uses "reader" for both
               these meanings, but this  introduces  considerable
               potential  for  confusion and misunderstanding, so
               this Draft takes care to make the distinction.

          A "newsgroup" is a single news  forum,  a  logical  bulletin
          board, having a name and nominally intended for articles on
          a specific topic.  An article is "posted to" a single
          newsgroup or several newsgroups.  When an article is posted to
          more than one newsgroup, it is said  to  be  "cross-posted";
          note that this differs from posting the same text as part of
          each of several articles, one per newsgroup.  A  "hierarchy"
          is  the set of all newsgroups whose names share a first com-
          ponent (see the name syntax in section 5.5).
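
          As an illustration of the "hierarchy" and "cross-posted"
          definitions, here is a small Python sketch.  It assumes that
          name components are separated by "." and that the Newsgroups
          header lists names separated by ","; the authoritative syntax
          is in section 5.5, and the function names are hypothetical.

```python
def hierarchy(newsgroup):
    """Return the first component of a newsgroup name."""
    return newsgroup.split(".", 1)[0]


def is_cross_posted(newsgroups_content):
    """True if the Newsgroups header content names more than one group."""
    groups = [name for name in newsgroups_content.split(",") if name]
    return len(groups) > 1
```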

          A newsgroup may be "moderated", in  which  case  submissions
          are  not  posted  directly,  but mailed to a "moderator" for
          consideration and possible posting.   Moderators  are  typi-
          cally  human but may be implemented partially or entirely in
          software.

          A "followup" is an article containing a response to the
          contents of an earlier article (the followup's "precursor").  A
          "followup agent" is a combination of reading agent and post-
          ing agent that aids in the preparation and posting of a fol-
          lowup.

          Text  comparisons  are  "case-sensitive"  if  they  consider
          uppercase  letters  (e.g. "A") different from lowercase let-
          ters (e.g. "a"), and "case-insensitive" if letters differing
          only  in  case  (e.g. "A" and "a") are considered identical.
          Categories of text are said to be case-(in)sensitive if com-
          parisons of such texts to others are case-(in)sensitive.

          A  "cooperating  subnet"  is  a set of news-exchanging hosts
          which is sufficiently well-coordinated (typically via a cen-
          tral  administration of some sort) that stronger assumptions
          can be made about hosts in the set than about news hosts  in
          general.  This is typically used to relax restrictions which
          are otherwise required for worst-case interoperability; mem-
          bers  of  a cooperating subnet MAY interchange articles that
          do not conform to this Draft's specifications, provided  all
          members  have  agreed  to this and provided the articles are
          not permitted to leak out of the subnet.  The word  "subnet"
          is  used to emphasize that a cooperating subnet is typically
          not an isolated universe; care must be  taken  that  traffic
          leaving  the  subnet  complies  with the restrictions of the
          larger net, not just those of the cooperating subnet.

          A "message ID" is a unique identifier for an  article,  usu-
          ally supplied by the posting agent which posted it.  It dis-
          tinguishes the article from every other article ever  posted
          anywhere (in theory).  Articles with the same message ID are
          treated as identical copies of the same article even if they
          are not in fact identical.
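
          The practical consequence of the last sentence is that
          duplicate detection depends on the message ID alone.  A
          minimal Python sketch, assuming a plain dictionary stands in
          for the news database (the function name and storage model
          are hypothetical):

```python
def file_article(database, message_id, article):
    """File an article unless one with the same message ID is present.

    Returns True if the article was filed, False if it was treated as
    a duplicate.  The article text itself is never compared.
    """
    if message_id in database:
        return False
    database[message_id] = article
    return True
```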

          A  "gateway"  is  software  which receives news articles and
          converts them to messages of some other kind (e.g. mail to a
          mailing list), or vice-versa; in essence it is a translating
          relayer that straddles boundaries between different  methods
          of message exchange.  The most common type of gateway
          connects newsgroup(s) to mailing list(s), either
          unidirectionally or bidirectionally, but there are also gateways
          between news networks using this  Draft's  news  format  and
          those using other formats.

          A  "control  message"  is an article which is marked as con-
          taining control information; a  relayer  receiving  such  an
          article  will  (subject  to  permissions  etc.) take actions
          beyond just filing and passing on the article.

               NOTE: "Control article" would be  more  consistent
               terminology, but "control message" is already well
               established.

          An article's "reply address" is the address to which  mailed
          replies  should  be  sent.  This is the address specified in
          the article's From header (see section 5.2), unless it  also
          has a Reply-To header (see section 6.3).

          The  notation  (e.g.)  "(ASCII  17)"  following a name means
          "this name refers to the ASCII character having  value  17".
          An  "ASCII printable character" is an ASCII character in the
          range 33-126.  An "ASCII  control  character"  is  an  ASCII
          character  in the  range  0-31, or the character DEL (ASCII
          127).  A "non-ASCII character" is a character having a value
          exceeding 127.

               NOTE: Blank is neither an "ASCII printable charac-
               ter" nor an "ASCII control character".
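
          The character classes defined above partition the code
          values as shown in this Python sketch.  The function name and
          the returned labels are assumptions chosen for illustration.

```python
def classify(code):
    """Classify a character code using the section 2.3 definitions."""
    if code < 0:
        raise ValueError("character codes are non-negative")
    if code > 127:
        return "non-ASCII"
    if code <= 31 or code == 127:
        return "control"      # 0-31, or DEL (ASCII 127)
    if code == 32:
        return "blank"        # neither printable nor control
    return "printable"        # 33-126
```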


          2.4. End Of Line

          How the end of a text line is  represented  depends  on  the
          context  and  the implementation.  For Internet transmission
          via protocols such as SMTP [rrr], an  end-of-line  is  a  CR
          (ASCII  13)  followed  by an LF (ASCII 10).  ISO C [rrr] and
          many modern operating systems indicate  end-of-line  with  a
          single  character,  typically  ASCII LF (aka "newline"), and
          this is the normal convention when news is  transmitted  via
          UUCP.  A variety of other methods are in use, including
          out-of-band methods in which there is no specific character that
          means end-of-line.

          This Draft does not constrain how end-of-line is represented
          in news, except that characters other than CR  and  LF  MUST
          not  be  usurped  for  use  in  end-of-line representations.
          Also, obviously, all software dealing with a particular copy
          of  an  article  must  agree  on  the convention to be used.
          "EOL" is used to mean "whatever  end-of-line  representation
          is  appropriate";  it  is  not  necessarily  a  character or
          sequence of characters.

               NOTE: If faced with picking an EOL  representation
               in the absence of other constraints, use of a sin-
               gle character simplifies processing, and the ASCII
               standard  [rrr] specifies that if one character is
               to be used for  this  purpose,  it  should  be  LF
               (ASCII 10).

               NOTE:  Inside  MIME encodings, use of the Internet
               canonical EOL representation (CR followed  by  LF)
               is mandatory.  See [rrr].
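
          Software that moves articles between a single-LF local
          convention and the CR LF Internet convention has to convert
          EOLs at the boundary.  The Python sketch below assumes exactly
          those two conventions; the function names are hypothetical,
          and other representations (including out-of-band ones) are
          not covered.

```python
def to_wire(text):
    """Convert EOLs to CR LF for transmission.

    Existing CR LF pairs are first reduced to LF so that already
    converted text is not given a second CR.
    """
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def from_wire(text):
    """Convert CR LF EOLs to the single-LF local convention."""
    return text.replace("\r\n", "\n")
```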


          2.5. Case-Sensitivity

          Text  in  newsgroup  names, header parameters, etc. is case-
          sensitive unless stated otherwise.

               NOTE: This is at  variance  with  MAIL,  which  is
               case-insensitive  unless  stated otherwise, but is
               consistent  with  news  historical  practice   and
               existing news software.  See the comments on
               backward compatibility in section 1.


          2.6. Language

          Various constant strings in this Draft, such as header names
          and  month  names,  are derived from English words.  Despite
          their derivation, these words do NOT change when the  poster
          or  reader employing them is interacting in a language other
          than English.  Posting and reading agents  SHOULD  translate
          as  appropriate  in  their  interaction  with  the poster or
          reader, but the forms that actually appear in  articles  are
          always the English-derived ones defined in this Draft.


          3. Relation To MAIL (RFC 822 etc.)

          The  primary  intent of this Draft is to completely describe
          the news article format as a subset of MAIL's message format
          augmented by some new headers.  Unless explicitly noted
          otherwise, the intent throughout is that an article MUST also
          be a valid MAIL message.

               NOTE:  Despite  obvious  similarities between news
               and mail, opinions vary on whether it is  possible
               or  desirable to unify them into a single service.
               However, it is unquestionably  both  possible  and
               useful to employ some of the same tools for manip-
               ulating both mail messages and news  articles,  so
               there  is specific advantage to be had in defining
               them compatibly.  Furthermore, there is no apparent
               need to re-invent the wheel when slight extensions to
               an existing definition will suffice.

          Given that this Draft  attempts  to  be  self-contained,  it
          inevitably  contains  considerable repetition of information
          found in MAIL.  This raises the possibility of unintentional
          conflicts.  Unless specifically noted otherwise, any wording
          in this Draft which  permits  behavior  that  is  not  MAIL-
          compliant  is  erroneous  and should be followed only to the
          extent that the result remains compliant with MAIL.

               NOTE: RFC 1036 said "where this standard conflicts
               with  [RFC 822], RFC-822 should be considered cor-
               rect and this standard in  error".   Taken  liter-
               ally, this was obviously incorrect, since RFC 1036
               imposed a number of restrictions not found in  RFC
               822.   The  intent,  however,  was  reasonable: to
               indicate  that  UNINTENTIONAL   differences   were
               errors in RFC 1036.

          Implementors and users should note that MAIL is deliberately
          an extensible standard, and most extensions devised for mail
          are  also relevant to (and compatible with) news.  Note par-
          ticularly MIME [rrr],  summarized  briefly  in  appendix  B,
          which extends MAIL in a number of useful ways that are defi-
          nitely relevant to news.   Also  of  note  is  the  work  in
          progress  on  reconciling  PEM (Privacy Enhanced Mail, which
          defines extensions for  authentication  and  security)  with
          MIME, after which this may also be relevant to news.

               UNRESOLVED ISSUE: Update the MIME/PEM information.

          Similarly, descriptions here of MIME  facilities  should  be
          considered  correct  only  to  the  extent  that they do not
          require or legitimize practices  that  would  violate  those
          RFCs.   (Note that this Draft does extend the application of
          some MIME facilities, but this is an extension  rather  than
          an alteration.)


          4. Basic Format

          4.1. Overall Syntax

          The overall syntax of a news article is:

               article         = 1*header separator body
               header          = start-line *continuation
               start-line      = header-name ":" space [ nonblank-text ] eol
               continuation    = space nonblank-text eol
               header-name     = 1*name-character *( "-" 1*name-character )
               name-character  = letter / digit
               letter          = <ASCII letter A-Z or a-z>
               digit           = <ASCII digit 0-9>
               separator       = eol
               body            = *( [ nonblank-text / space ] eol )
               eol             = <EOL>
               nonblank-text   = [ space ] text-character *( space-or-text )
               text-character  = <any ASCII character except NUL (ASCII 0),
                                   HT (ASCII 9), LF (ASCII 10), CR (ASCII 13),
                                   or blank (ASCII 32)>
               space           = 1*( <HT (ASCII 9)> / <blank (ASCII 32)> )
               space-or-text   = space / text-character

          An  article consists of some headers followed by a body.  An
          empty line separates the two.  The headers contain structured
          information about the article and its transmission.  A
          header begins with a header name identifying it, and can  be
          continued  onto  subsequent lines by beginning the continua-
          tion line(s) with white space.   (Note  that  section  4.2.3
          adds some restrictions to the header syntax indicated here.)
          The body is largely-unstructured text  significant  only  to
          the poster and the readers.
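
          The division of an article into headers and body can be
          sketched in Python as follows.  This is a minimal
          illustration, assuming the article is held as text with a
          single LF as EOL; the function name is hypothetical, and the
          header lines are returned unparsed.

```python
def split_article(text):
    """Split an article into (header_lines, body).

    The separator is the first truly empty line; a line containing
    only white space does not qualify.
    """
    lines = text.split("\n")
    for index, line in enumerate(lines):
        if line == "":
            if index == 0:
                raise ValueError("article has no headers")
            # Everything after the separator, empty lines included,
            # belongs to the body.
            return lines[:index], "\n".join(lines[index + 1:])
    raise ValueError("no separator line found")
```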

               NOTE:  Terminology here follows the current custom
               in the news community, rather than the  MAIL  con-
               vention  of  (sometimes) referring to what is here
               called a "header" as a "header field" or  "field".

          Note that the separator line must be truly empty, not just a
          line containing white space.  Further empty lines  following
          it  are  part  of the body, as are empty lines at the end of
          the article.

               NOTE: Some systems  make  no  distinction  between
               empty lines and lines consisting entirely of white
               space;  indeed,  some  systems  cannot   represent
               entirely  empty  lines.  The grammar's requirement
               that header continuation lines contain some
               printable text is meant to ensure that the empty/space
               distinction cannot confuse identification  of  the
               separator line.

               NOTE:  It  is tempting to authorize posting agents
               to strip empty lines at the beginning and  end  of
               the  body,  but such empty lines could possibly be
               part of a preformatted document.

          Implementors are warned that trailing white  space,  whether
          alone  on  the  line or not, MAY be significant in the body,
          notably in early versions of  the  "uuencode"  encoding  for
          binary  data.  Trailing white space MUST be preserved unless
          the article is known to have originated within a cooperating
          subnet  that  avoids using significant trailing white space,
          and SHOULD be preserved regardless.   Posters  SHOULD  avoid
          using  conventions  or  encodings  which make trailing white
          space significant;  for  encoding  of  binary  data,  MIME's
          "base64"  encoding  is recommended.  Implementors are warned
          that ISO C implementations  are  not  required  to  preserve
          trailing  white space, and special precautions may be neces-
          sary in implementations which do not.

               NOTE: Unfortunately, the signature-delimiter  con-
               vention (described in section 4.3.2) does use sig-
               nificant trailing white space.  It's too  late  to
               fix  this;  there  is work underway on defining an
               organized signature convention as  part  of  MIME,
               which is a preferable solution in the long run.

          Posters  are warned that some very old relayer software mis-
          behaves when the first non-empty line  of  an  article  body
          begins with white space.

          4.2. Headers


          4.2.1. Names and Contents

          Despite  the  restrictions  on header-name syntax imposed by
          the grammar, relayers and  reading  agents  SHOULD  tolerate
          header  names containing any ASCII printable character other
          than colon (":", ASCII 58).

               NOTE: MAIL header  names  can  contain  any  ASCII
               printable  character (other than colon) in theory,
               but in practice, arbitrary header names are  known
               to  cause trouble for some news software.  Section
               4.1's restriction to alphanumeric sequences  sepa-
               rated by hyphens is believed to permit all widely-
               used header names without causing problems for any
               widely-used  software.   Software  is nevertheless
               encouraged to cope correctly with the  full  range
               of  possibilities,  since aberrations are known to
               occur.

          Relayers MUST disregard headers not described in this  Draft
          (that  is,  with  header names not mentioned in this Draft),
          and pass them on unaltered.
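
          This section distinguishes the strict header-name grammar of
          section 4.1 from the wider range of names that relayers and
          reading agents are asked to tolerate.  The Python sketch
          below expresses both as regular expressions; the pattern and
          function names are assumptions made for illustration.

```python
import re

# Section 4.1: alphanumeric sequences separated by single hyphens.
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

# Section 4.2.1: any ASCII printable character (33-126) except colon (58).
TOLERATED_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")


def is_strict_name(name):
    """True if the name satisfies the section 4.1 grammar."""
    return STRICT_NAME.fullmatch(name) is not None


def is_tolerated_name(name):
    """True if the name is within the range software should tolerate."""
    return TOLERATED_NAME.fullmatch(name) is not None
```

          A name such as "Foo_Bar" fails the strict test but passes the
          tolerant one; a name containing a blank or a colon fails
          both.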

          Posters wishing to convey non-standard information in headers
          SHOULD use header names beginning with "X-".  No standard
          header name will ever be of this form.  Reading agents SHOULD
          ignore "X-" headers, or at least treat them with great care.

          The order of headers in an article is not significant.
          However, posting agents are encouraged to put mandatory headers
          (see section 5) first, followed  by  optional  headers  (see
          section 6), followed by headers not defined in this Draft.

               NOTE:  While  relayers  and reading agents must be
               prepared to handle any order, having the  signifi-
               cant  headers (the precise definition of "signifi-
               cant" depends on  context)  first  can  noticeably
               improve  efficiency,  especially in memory-limited
               environments where it is difficult to buffer up an
               arbitrary  quantity of headers while searching for
               the few that matter.

          Header names are case-insensitive.   There  is  a  preferred
          case  convention,  which  posters  and posting agents SHOULD
          use: each hyphen-separated "word" has its initial letter (if
          any)  in  uppercase  and  the rest in lowercase, except that
          some abbreviations have all letters  uppercase  (e.g.  "Mes-
          sage-ID"  and "MIME-Version").  The forms used in this Draft
          are the preferred forms for the  headers  described  herein.
          Relayers  and  reading agents are warned that articles might
          not obey this convention.

               NOTE: Although software must be prepared  for  the
               possibility  of random use of case in header names
               (and other case-independent text), establishing  a
               preferred  convention reduces pointless diversity,
               and may permit optimized software that  looks  for
               the  preferred  forms  before  resorting  to less-
               efficient case-insensitive searches.
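
          The preferred case convention and the case-insensitive
          comparison of header names can be sketched in Python as
          follows.  The table of all-uppercase abbreviations contains
          only the two examples given in the text and is an assumption,
          not a complete list; the function names are hypothetical.

```python
# Abbreviations written entirely in uppercase (illustrative, incomplete).
ALL_CAPS = {"id": "ID", "mime": "MIME"}


def preferred_case(name):
    """Rewrite a header name in the preferred case convention."""
    words = name.split("-")
    return "-".join(ALL_CAPS.get(w.lower(), w.capitalize()) for w in words)


def same_header_name(first, second):
    """Compare two header names case-insensitively."""
    return first.lower() == second.lower()
```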

          In general, a header can consist of several lines, with each
          continuation line beginning with white space.  The EOLs pre-
          ceding continuation lines are ignored when processing such a
          header, effectively combining the start-line and the contin-
          uations into a single logical line.  The logical line,  less
          the  header  name,  colon, and any white space following the
          colon, is the "header content".
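
          The unfolding rule can be sketched as follows.  This is a
          minimal illustration, assuming the header lines have already
          been split at EOLs (with the EOLs removed); the function name
          unfold_headers is hypothetical.

```python
def unfold_headers(lines):
    """Combine start-lines and continuation lines into (name, content) pairs."""
    logical = []
    for line in lines:
        if line[:1] in (" ", "\t") and logical:
            # Continuation line: the preceding EOL is ignored, but the
            # leading white space stays part of the logical line.
            logical[-1] += line
        else:
            logical.append(line)

    headers = []
    for text in logical:
        name, _, rest = text.partition(":")
        # The header content is the logical line less the header name,
        # the colon, and any white space following the colon.
        headers.append((name, rest.lstrip(" \t")))
    return headers
```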


          4.2.2. Undesirable Headers

          A header whose content is empty  is  said  to  be  an  empty
          header.   Relayers  and  reading  agents SHOULD not consider
          presence or absence of an empty header to alter  the  seman-
          tics  of  an  article  (although  syntactic  rules,  such as
          requirements that certain header names appear at  most  once
          in  an  article,  MUST  still be satisfied).  Posting agents
          SHOULD delete empty headers  from  articles  before  posting
          them.


          Headers  that merely state defaults explicitly (e.g., a Fol-
          lowup-To header with the  same  content  as  the  Newsgroups
          header,   or   a  MIME  Content-Type  header  with  contents
          "text/plain; charset=us-ascii") or  state  information  that
          reading  agents  can  typically  determine easily themselves
          (e.g. the length of the body in octets) are redundant,  con-
          veying no information whatsoever.  Headers that state infor-
          mation which cannot possibly be of use to a significant num-
          ber  of relayers, reading agents, or readers (e.g., the name
          of the software package used as the posting agent) are  use-
          less and pointless.  Posters and posting agents SHOULD avoid
          including redundant or useless headers in articles.

               NOTE: Information that someone,  somewhere,  might
               someday  find useful is best omitted from headers.
               (There's quite enough of it  in  article  bodies.)
               Headers  should contain information of known util-
               ity only.  This is not meant to preclude inclusion
               of  information  primarily meant for news-software
               debugging, but such information should be included
               only  if there is real reason, preferably based on
               experience, to suspect that it  may  be  genuinely
               useful.  Articles passing through gateways are the
               only obvious case  where  inclusion  of  debugging
               information appears clearly legitimate.  (See sec-
               tion 10.1.)

               NOTE: A useful rule of thumb for  software  imple-
               mentors  is:  "if  I had to pay a dollar a day for
               the transmission of this  header,  would  I  still
               think it worthwhile?".


          4.2.3. White Space and Continuation

          The  colon  following the header name on the start-line MUST
          be followed by white space, even if the header is empty.  If
          the  header  is not empty, at least some of the content MUST
          appear on the start-line.  Posting agents MUST enforce these
          restrictions,  but  relayers (etc.) SHOULD accept even arti-
          cles that violate them.

               NOTE: MAIL does not require white space after  the
               colon,  but  it  is  usual.  RFC 1036 required the
               white space,  even  in  empty  headers,  and  some
               existing   software  demands  it.   In  MAIL,  and
               arguably in RFC  1036  (although  the  wording  is
               vague), it is technically legitimate for the white
               space to be part of  a  continuation  line  rather
               than the start-line, but not all existing software
               will accept  this.   Deleting  empty  headers  and
               placing some content on the start-line avoids this
               issue...  which  is  desirable  because   trailing
               blanks,  easily  deleted by accident, are best not
               made significant in headers.

          In general, posters and  posting  agents  SHOULD  use  blank
          (ASCII  32), not tab (ASCII 9), where white space is desired
          in headers.  Existing software does not consistently  accept
          tab  as  synonymous with blank in all contexts.  In particu-
          lar, RFC 1036 appeared to specify that the character immedi-
          ately  following  the colon after a header name was required
          to be a blank, and some news software insists  on  that,  so
          this  character MUST be a blank.  Again, posting agents MUST
          enforce these restrictions but relayers SHOULD be more  tol-
          erant.
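
          A posting agent's check of a start-line might look roughly
          like the sketch below.  The function name check_start_line,
          its has_continuation parameter, and the wording of the
          problem strings are all hypothetical; a relayer would be
          expected to be more tolerant than this.

```python
def check_start_line(start_line, has_continuation=False):
    """Return a list of problems found in a header start-line.

    has_continuation tells the check that continuation lines follow,
    i.e. that the header is not empty even if the start-line is.
    """
    name, colon, rest = start_line.partition(":")
    if not colon:
        return ["no colon after header name"]

    problems = []
    if not rest.startswith(" "):
        # A tab, or content immediately after the colon, is rejected.
        problems.append("colon not followed by a blank (ASCII 32)")
    if has_continuation and rest.strip(" \t") == "":
        problems.append("no content on the start-line")
    return problems
```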

          Since  the white space beginning a continuation line remains
          a part of the logical line, headers  can  be  "broken"  into
          multiple  lines  only at white space.  Posting agents SHOULD
          not break headers unnecessarily.  Relayers  SHOULD  preserve
          existing header breaks, and SHOULD not introduce new breaks.
          Breaking headers SHOULD be a last resort; relayers and read-
          ing agents SHOULD handle long header lines gracefully.  (See
          the discussion of size limits in section 4.6.)


          4.3. Body

          Although the article body is unstructured for  most  of  the
          purposes  of  this  Draft, structure MAY be imposed on it by
          other means, notably MIME headers (see appendix B).


          4.3.1. Body Format Issues

          The body of an article MAY be empty, although posting agents
          SHOULD  consider this an error condition (meriting returning
          the article to the poster for revision).   A  posting  agent
          which does not reject such an article SHOULD issue a warning
          message to the poster and supply  a  non-empty  body.   Note
          that  the separator line MUST be present even if the body is
          empty.

               NOTE: An empty body is  probably  a  poster  error
               except, arguably, for some control messages... and
               even they really ought to have a  body  explaining
               the  reason  for  the  control  message.  Some old
               reading agents are known to generate empty  bodies
               for "cancel" control messages, so posting agents
               might opt not to reject body-less articles in such
               cases  (although  it  would  be  better to fix the
               reading agents to request a body).  However,  some
               existing  news software is known to react badly to
               body-less articles, hence the request for  posting
               agents to insert a body in such cases.


               NOTE:  A possible posting-agent-supplied body text
               (already used by one widespread posting agent)  is
               "This  article  was  probably generated by a buggy
               news reader.".  (The use of "reader" to  refer  to
               the  reading  agent  is traditional, although this
               Draft uses more precise terminology.)

               NOTE: The requirement for the separator line  even
               in  a bodyless article is inherited from MAIL, and
               also distinguishes legitimately-bodyless  articles
               from articles accidentally truncated in the middle
               of the headers.
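
          How a posting agent might assemble an article with the
          separator line and a substitute body is sketched below.  The
          name assemble_article is hypothetical, "\n" is assumed as the
          local EOL representation, and treating a body of only blank
          lines as empty is an assumption of this sketch.

```python
DEFAULT_BODY = "This article was probably generated by a buggy news reader."


def assemble_article(header_lines, body_lines):
    """Join headers, the separator line, and the body into one text."""
    if not any(line.strip() for line in body_lines):
        # Empty body: supply a non-empty one rather than post as-is.
        body_lines = [DEFAULT_BODY]
    # The empty string becomes the separator line between headers and body;
    # the final "\n" ensures the body ends with an EOL.
    return "\n".join(header_lines + [""] + body_lines) + "\n"
```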

          Note that an article body is a sequence of lines  terminated
          by  EOLs,  not  arbitrary  binary data, and in particular it
          MUST end with an EOL.  However, relayers  SHOULD  treat  the
          body  of  an  article as an uninterpreted sequence of octets
          (except as mandated by changes of EOL representation and  by
          control-message  processing)  and SHOULD avoid imposing con-
          straints on it.  See also section 4.6.

          4.3.2. Body Conventions

          Although body lines can in principle be very long (see  sec-
          tion  4.6  for  some  discussion  of length limits), posters
          SHOULD restrict body line lengths to circa 70-75 characters.
          On  systems  where  text  is conventionally stored with EOLs
          only at paragraph breaks and  other  "hard  return"  points,
          with  software  breaking lines as appropriate for display or
          manipulation, posting agents SHOULD insert EOLs as necessary
          so that posted articles comply with this restriction.
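
          For a posting agent on a system that stores one paragraph per
          line, inserting EOLs could be sketched with Python's standard
          textwrap module.  The name wrap_body and the width of 72 are
          illustrative choices; words longer than the width are left
          unbroken here, which is an assumption of this sketch.

```python
import textwrap


def wrap_body(paragraphs, width=72):
    """Turn a list of one-line paragraphs into wrapped body lines."""
    lines = []
    for index, paragraph in enumerate(paragraphs):
        if index:
            lines.append("")  # single empty line between paragraphs
        wrapped = textwrap.wrap(
            paragraph,
            width=width,
            break_long_words=False,   # do not split e.g. long addresses
            break_on_hyphens=False,
        )
        lines.extend(wrapped or [""])
    return lines
```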

               NOTE:  News  originated in environments where line
               breaks in plain text files were  supplied  by  the
               user, not the software.  Be this good or bad, much
               reading-agent and posting-agent  software  assumes
               that  news  articles follow this convention, so it
               is often inconvenient to read or respond to  arti-
               cles  which  violate it.  The "70-75" number comes
               from the widespread use of display  devices  which
               are 80 columns wide, and the desire to leave a bit
               of margin for quoting etc. (see below).

          Reading agents confronted with body lines much  longer  than
          the  available  output-device  width  SHOULD  break lines as
          appropriate.  Posters are warned that such  breaks  may  not
          occur exactly where the poster intends.

               NOTE:  "As  appropriate"  would  typically include
               breaking lines when supplying the text of an arti-
               cle to be quoted in a reply or followup, something
               that line-breaking reading agents often neglect to
               do now.


          Although  styles  vary widely, for plain text it is usual to
          use no left margin, leave the right edge ragged, use a  sin-
          gle  empty  line  to  separate paragraphs, and employ normal
          natural-language usage on matters such  as  upper/lowercase.
          (In  particular,  articles SHOULD not be written entirely in
          uppercase.  In environments where posters have  access  only
          to  uppercase,  posting agents SHOULD translate it to lower-
          case.)

               NOTE: Most people find substantial bodies of  text
               entirely  in  uppercase  relatively  hard to read,
               while all-lowercase  text  merely  looks  slightly
               odd.   The  common  association  of uppercase with
               strong emphasis adds to this.

          Tone of voice does not carry well in written text, and  mis-
          understandings are common when sarcasm, parody, or exaggera-
          tion for humorous effect is attempted without explicit warn-
          ing.   It has become conventional to use the sequence ":-)",
          which (on most output devices) resembles a  rotated  "smiley
          face"  symbol,  as  a  marker for text not meant to be taken
          literally, especially when humor is intended.  This practice
          aids  communication  and averts unintended ill-will; posters
          are urged to use it.  A variety of analogous  sequences  are
          used with less-standardized meanings [Sanderson].

          The  order  of arrival of news articles at a particular host
          depends somewhat on  transmission  paths,  and  occasionally
          articles are lost for various reasons.  When responding to a
          previous article, posters SHOULD not assume that all readers
          understand the exact context.  It is common to quote some of
          the previous article to establish context.  This  SHOULD  be
          done  by  prefacing  each  quoted line (even if it is empty)
          with the character ">".  This will result in multiple levels
          of ">" when quoted context itself contains quoted context.

               NOTE:  It  may seem superfluous to put a prefix on
               empty lines, but it simplifies  implementation  of
               functions  such as "skip all quoted text" in read-
               ing agents.
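
          The quoting convention, and the "skip all quoted text"
          function it simplifies, might be sketched as follows.  The
          names quote_lines and skip_quoted are hypothetical.

```python
def quote_lines(body_lines):
    """Preface every line, including empty ones, with ">"."""
    return [">" + line for line in body_lines]


def skip_quoted(body_lines):
    """Return only the lines that are not quoted context."""
    return [line for line in body_lines if not line.startswith(">")]
```

          Because empty lines are prefixed too, skip_quoted needs no
          special case for blank lines inside quoted context.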

          Readability is enhanced if quoted text and new text are sep-
          arated by an empty line.

          Posters  SHOULD  edit  quoted context to trim it down to the
          minimum  necessary.   However,  posting  agents  SHOULD  not
          attempt  to enforce this by imposing overly-simplistic rules
          like "no more than 50% of the lines should be quotes".

               NOTE: While encouraging trimming is desirable, the
               50%  rule  imposed  by  some old posting agents is
               both inadequate and counterproductive.  Posters do
               not  respond  to  it by being more selective about
               quoting; they respond by padding short  responses,
               or  by  using  different  quoting styles to defeat
               automatic analysis.  The former  adds  unnecessary
               noise  and  volume,  while the latter also defeats
               more useful forms of automatic analysis that read-
               ing agents might wish to do.

               NOTE:  At  the  very  least, if a minimum-unquoted
               quota is being set, article  bodies  shorter  than
               (say)  20  lines, or perhaps articles which exceed
               the quota by only a few lines, should  be  exempt.
               This  avoids the ridiculous situation of complain-
               ing about a 5-line response to a 6-line quote.

               NOTE: A more subtle posting-agent rule,  suggested
               for  experimental  use, is to reject articles that
               appear to contain quoted signatures  (see  below).
               This  is almost certainly the result of a careless
               poster not bothering to trim down quoted  context.
               Also,  if  a  posting agent or followup agent pre-
               sents an article template to the poster for  edit-
               ing,  it  really  should  take note of whether the
               poster actually made any changes, and refrain from
               posting an unmodified template.

          Some  followup  agents supply "attribution" lines for quoted
          context, indicating where it first appeared and under  whose
          name.   When  multiple  levels  of  quoting  are present and
          quoted context is edited for  brevity,  "inner"  attribution
          lines  are not always retained.  The editing process is also
          somewhat error-prone.   Reading  agents  (and  readers)  are
          warned not to assume that attributions are accurate.

               UNRESOLVED  ISSUE:  Should  a  standard format for
               attribution lines be defined?   There  is  already
               considerable diversity... but automatic news anal-
               ysis would be substantially aided  by  a  standard
               convention.

          Early  difficulties in inferring return addresses from arti-
          cle headers led to "signatures": short closing texts,  auto-
          matically  added  to  the end of articles by posting agents,
          identifying the poster and giving his network addresses etc.
          If  a  poster or posting agent does append a signature to an
          article, the signature SHOULD be preceded with  a  delimiter
          line  containing  (only)  two hyphens (ASCII 45) followed by
          one blank (ASCII  32).   Posting  agents  SHOULD  limit  the
          length  of  signatures,  since  verbose  excess bordering on
          abuse is common if no restraint is imposed;  4  lines  is  a
          common limit.
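
          Handling of the signature delimiter might be sketched as
          below.  The names are hypothetical, and treating the LAST
          delimiter line in the body as the start of the signature is
          an assumption of this sketch, not something the text above
          specifies.

```python
SIG_DELIMITER = "-- "   # two hyphens followed by one blank
MAX_SIG_LINES = 4       # the common limit mentioned above


def append_signature(body_lines, sig_lines):
    """Posting-agent side: add a delimited, length-limited signature."""
    if len(sig_lines) > MAX_SIG_LINES:
        raise ValueError("signature longer than %d lines" % MAX_SIG_LINES)
    return body_lines + [SIG_DELIMITER] + sig_lines


def split_signature(body_lines):
    """Reading-agent side: return (text_lines, signature_lines)."""
    for index in range(len(body_lines) - 1, -1, -1):
        # Exact match: "--" without the trailing blank is not a delimiter.
        if body_lines[index] == SIG_DELIMITER:
            return body_lines[:index], body_lines[index + 1:]
    return body_lines, []
```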

               NOTE:  While  signatures  are  arguably a blemish,
               they are a well-understood convention, and convey-
               ing  the same information in headers exposes it to
               mangling and makes it rather less conspicuous.   A
               standard  delimiter  line  makes  it  possible for
               reading agents to handle signatures  specially  if
               desired.    (This  is  unfortunately  hampered  by
               extensive misunderstanding of, and misuse of,  the
               delimiter.)

               NOTE: The choice of delimiter is somewhat unfortu-
               nate, since it relies on preservation of  trailing
               white  space,  but  it  is too well-established to
               change.  There is work underway to define  a  more
               sophisticated  signature  scheme  as part of MIME,
               and this will  presumably  supersede  the  current
               convention in due time.

               NOTE:  Four  75-column  lines of signature text is
               300 characters, which is ample to convey name  and
               mail-address  information  in  all  but  the  most
               bizarre situations.


          4.4. Characters And Character Sets

          Header and body lines MAY contain any ASCII characters other
          than CR (ASCII 13), LF (ASCII 10), and NUL (ASCII 0).

               NOTE:  CR  and  LF are excluded because they clash
               with common  EOL  conventions.   NUL  is  excluded
               because  it  clashes with the C end-of-string con-
               vention, which is  significant  to  most  existing
               news   software.    These   three  characters  are
               unlikely to be transmitted successfully.

          However, posters SHOULD avoid using ASCII control characters
          except for tab (ASCII 9), formfeed (ASCII 12), and backspace
          (ASCII 8).  Tab signifies sufficient horizontal white  space
          to  reach  the next of a set of fixed positions; posters are
          warned that there is no standard set of positions,  so  tabs
          should be avoided if precise spacing is essential.  Formfeed
          signifies a point at which a reading agent SHOULD pause  and
          await  reader  interaction  before  displaying further text.
          Backspace SHOULD be used only for  underlining,  done  by  a
          sequence of underscores (ASCII 95) followed by an equal num-
          ber of backspaces, signifying that the same number  of  text
          characters  following  are  to  be  underlined.  Posters are
          warned that underlining  is  not  available  on  all  output
          devices  and  is  best  not relied on for essential meaning.
          Reading agents SHOULD recognize underlining and translate it
          to the appropriate commands for devices that support it.
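
          The underlining convention (underscores, then an equal number
          of backspaces, then the text) might be generated and
          recognized roughly as follows.  The names underline and
          parse_underlining are hypothetical; malformed sequences are
          simply left in place as plain text, which is a choice made
          for this sketch only.

```python
import re

# A run of underscores followed by a run of backspaces (ASCII 8).
_UNDERLINE = re.compile(r"(_+)(\x08+)")


def underline(text):
    """Encode text so that all of it is underlined."""
    n = len(text)
    return "_" * n + "\b" * n + text


def parse_underlining(line):
    """Split a line into (text, is_underlined) segments."""
    segments = []
    pos = 0
    for match in _UNDERLINE.finditer(line):
        n = len(match.group(2))
        if match.start() < pos or len(match.group(1)) < n:
            continue  # inside underlined text, or too few underscores
        # Underscores beyond the last n are ordinary text.
        plain = line[pos:match.end(1) - n]
        if plain:
            segments.append((plain, False))
        segments.append((line[match.end():match.end() + n], True))
        pos = match.end() + n
    if line[pos:]:
        segments.append((line[pos:], False))
    return segments
```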

               NOTE: Interpretation of almost all control charac-
               ters  is  device-specific  to  some  degree,   and
               devices  differ.   Tabs  and  underlining are sup-
               ported, to some extent, by most modern devices and
               reading  agents, hence the cautious exemptions for
               them.  The underlining method is specified because
               the  inverse method, text and then underscores, is
               tempting to the naive... but if sent unaltered  to
               a  device  that shows only the most recent of sev-
               eral overstruck characters rather than  a  compos-
               ite, the result can be utterly unreadable.

               NOTE: A common interpretation of tab is that it is
               a request to space forward to  the  next  position
               whose  number  is  one  more than a multiple of 8,
               with positions numbered sequentially  starting  at
               1.  (So tab positions are 9, 17, 25, ...)  Reading
               agents not constrained by existing system  conven-
               tions might wish to use this interpretation.
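
               The common tab interpretation described above can be
               worked through in a short sketch.  The function name
               expand_tabs is hypothetical; Python's built-in
               str.expandtabs(8) follows the same rule.

```python
def expand_tabs(line, interval=8):
    """Replace tabs with blanks, using tab positions 9, 17, 25, ..."""
    out = []
    column = 1  # positions are numbered sequentially starting at 1
    for ch in line:
        if ch == "\t":
            # Next position whose number is one more than a multiple
            # of the interval.
            target = ((column - 1) // interval + 1) * interval + 1
            out.append(" " * (target - column))
            column = target
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```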

               NOTE: It will typically be necessary for a reading
               agent to catch and interpret  formfeed,  not  just
               send  it  to  the output device.  The actions per-
               formed by typical output devices  on  receiving  a
               formfeed  are neither adequate for nor appropriate
               to the pause-for-interaction meaning.

          Cooperating subnets which wish to employ non-ASCII character
          sets  by using escape sequences (employing, e.g., ESC (ASCII
          27), SO (ASCII 14), and SI (ASCII 15)) to alter the  meaning
          of  superficially-ASCII  characters  MAY do so, but MUST use
          MIME headers to alert reading agents to the particular char-
          acter  set(s)  and escape sequences in use.  A reading agent
          SHOULD not pass such an escape sequence through,  unaltered,
          to  the  output  device  unless  the agent confirms that the
          sequence is one used to affect character sets and has reason
          to  believe  that the device is capable of interpreting that
          particular sequence properly.

               NOTE:  Cooperating-subnet  organizers  are  warned
               that  some very old relayers strip certain control
               characters out of articles they pass  along.   ESC
               is known to be among the affected characters.

               NOTE:  There  are  now standard Internet encodings
               for Japanese [rrr] and Vietnamese [rrr] in partic-
               ular.

          Articles  MUST  not  contain  any octet with value exceeding
          127, i.e. any octet that is not an ASCII character.
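
          A check of one line against the character rules of this
          section might be sketched as below.  The name
          check_line_octets and the problem strings are hypothetical,
          and the line is assumed to have had its EOL removed already.

```python
FORBIDDEN = {0, 10, 13}  # NUL, LF, CR


def check_line_octets(line):
    """Return (offset, problem) pairs for one header or body line.

    line is a bytes object holding the line without its EOL.
    """
    problems = []
    for offset, octet in enumerate(line):
        if octet in FORBIDDEN:
            problems.append((offset, "forbidden control character"))
        elif octet > 127:
            problems.append((offset, "non-ASCII octet"))
    return problems
```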

               NOTE: This rule, like others, may  be  relaxed  by
               unanimous  consent of the members of a cooperating
               subnet, provided suitable precautions are taken to
               ensure  that  rule-violating  articles do not leak
               out of the subnet.  (This has already been done in
               many  areas  where  ASCII  is not adequate for the
               local language(s).)  Beware that articles contain-
               ing non-ASCII octets in headers are a violation of
               the MAIL specifications and  are  not  valid  MAIL
               messages.   MIME  offers a way to encode non-ASCII
               characters in ASCII for use in headers;  see  sec-
               tion 4.5.

               NOTE: While there is great interest in using 8-bit
               character sets, not all software  can  yet  handle
               them  correctly.  Hence the restriction to cooper-
               ating subnets.  MIME  encodings  can  be  used  to
               transmit  such  characters  while remaining within
               the octet restriction.

          In anticipation of the day when it is possible to  use  non-
          ASCII  characters  safely  anywhere,  and to provide for the
          (substantial) cooperating subnets  that  are  already  using
          them, transmission paths SHOULD treat news articles as unin-
          terpreted sequences of octets (except perhaps for  transfor-
          mations  between  EOL  representations)  and relayers SHOULD
          treat non-ASCII characters in articles as  ordinary  charac-
          ters.

               NOTE:  8-bit  enthusiasts  are warned that not all
               software conforms to  these  recommendations  yet.
               In particular, standard NNTP [rrr] is a 7-bit pro-
               tocol, and  there  may  be  implementations  which
               enforce  this rule.  Be warned, also, that it will
               never be safe to send raw binary data in the  body
               of news articles, because changes of EOL represen-
               tation may (will!) corrupt it.

          Except  where  cooperating  subnets   permit   more   direct
          approaches,  MIME [rrr] headers and encodings SHOULD be used
          to transmit non-ASCII content using  ASCII  characters;  see
          section  4.5, appendix B, and the MIME RFCs for details.  If
          article content can be expressed in  ASCII,  it  SHOULD  be.
          Failing  that, the order of preference for character sets is
          that described in MIME [rrr].

               NOTE: Using the MIME facilities, it is possible to
               transmit ANY character set, and ANY form of binary
               data, using only ASCII characters.  Equally impor-
               tant,  such  articles  are self-describing and the
               reading agent can tell which octet-to-symbol  map-
               ping  is  intended!  Designation of some preferred
               character sets is intended to minimize the  number
               of character sets that a reading agent must under-
               stand in order to display most articles  properly.

          Articles  containing  non-ASCII  characters,  articles using
          ASCII characters (values 0 through 127)  to  refer  to  non-
          ASCII  symbols, and articles using escape sequences to shift
          character sets SHOULD include MIME headers indicating  which
          character set(s) and conventions are being used, and MUST do
          so  unless  such  articles  are  strictly  confined   to   a
          cooperating subnet which has its own pre-agreed conventions.
          MIME encodings are preferred over all these techniques.   If
          it  comes to a relayer's attention that it is being asked to
          pass an article using such techniques outward across what it
          knows  to  be  the boundary of such a cooperating subnet, it
          MUST report this error to its administrator, and MAY  refuse
          to  pass the article beyond the subnet boundary.  If it does
          pass the article, it MUST re-encode it with  MIME  encodings
          to make it conform to this Draft.

               NOTE:  Such re-encoding is a non-trivial task, due
               to MIME rules such as the  prohibition  of  nested
               encodings.   It's not just a matter of pouring the
               body through a simple filter.

          Reading agents SHOULD note MIME headers and attempt to  show
          the   reader  the  closest  possible  approximation  to  the
          intended content.  They SHOULD not just send the  octets  of
          the  article to the output device unaltered, unless there is
          reason to believe that the output device will indeed  inter-
          pret  them  correctly.   Reading  agents MUST not pass ASCII
          control characters or escape sequences, other than  as  dis-
          cussed above, unaltered to the output device; only by chance
          would the result be the desired one, and  there  is  serious
          potential  for  harmful  side  effects, either accidental or
          malicious.

               NOTE: Exactly what to  do  with  unwanted  control
               characters/sequences  depends on the philosophy of
               the reading agent, but passing  them  straight  to
               the  output device is almost always wrong.  If the
               reading agent wants to mark the presence of such a
               character/sequence  in  circumstances  where  only
               ASCII printable characters are  available,  trans-
               lating  it  to "#" might be a suitable method; "#"
               is a conspicuous character seldom used  in  normal
               text.
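
               The "#" translation might be sketched as follows, under
               the assumptions that the output device shows only ASCII
               printable characters, and that formfeed and backspace
               have already been interpreted by the reading agent
               before this step.  The name sanitize_for_display is
               hypothetical.

```python
def sanitize_for_display(line, keep="\t"):
    """Replace anything but printable ASCII (and tab) with "#"."""
    return "".join(
        ch if ch in keep or 32 <= ord(ch) < 127 else "#"
        for ch in line
    )
```

               For example, an ESC-based screen-clearing sequence in an
               article body reaches the device as harmless printable
               text.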

               NOTE: Reading agents should be aware that many old
               output devices (or the transmission paths to them)
               zero out the top bit of octets sent to them.  This
               can transform non-ASCII characters into ASCII con-
               trol characters.

          Followup  agents MUST be careful to apply appropriate trans-
          formations of representation to  the  outbound  followup  as
          well  as  the  inbound  precursor.  A followup to an article
          containing non-ASCII material is very likely to contain non-
          ASCII material itself.


          4.5. Non-ASCII Characters In Headers

          All  octets found in headers MUST be ASCII characters.  How-
          ever, it is desirable to have a way  of  encoding  non-ASCII
          characters,  especially  in "human-readable" headers such as
          Subject.  MIME [rrr]  provides  a  way  to  do  this.   Full
          details  may be found in the MIME specifications; herewith a
          quick summary to alert software authors to the issues...

               encoded-word  = "=?" charset "?" encoding "?" codes "?="
               charset       = 1*tag-char
               encoding      = 1*tag-char
               tag-char      = <ASCII printable character except
                               !()<>@,;:\"[]/?=>
               codes         = 1*code-char
               code-char     = <ASCII printable character except ?>

          An encoded word is a sequence of ASCII printable  characters
          that  specifies the character set, encoding method, and bits
          of (potentially) non-ASCII characters.   Encoded  words  are
          allowed  only in certain positions in certain headers.  Spe-
          cific headers impose restrictions on the content of  encoded
          words beyond that specified in this section.  Posting agents
          MUST ensure that any material  resembling  an  encoded  word
          (complete  with  all delimiters), in a context where encoded
          words may appear, really is an encoded word.

               NOTE: The  syntax  is  a  bit  ugly,  but  it  was
               designed  to  minimize  chances  of confusion with
               legitimate header contents, and to satisfy  diffi-
               cult constraints on use within existing headers.

          An  encoded word MUST not be more than 75 octets long.  Each
          line of a header containing encoded word(s) MUST be at  most
          76 octets long, not counting the EOL.

               NOTE:  These  limits are meant to bound the looka-
               head needed to determine whether text that  begins
               "=?" is really an encoded word.

          The  details  of  charsets and encodings are defined by MIME
          [rrr]; the sequence of preferred character sets is the  same
          as  MIME's.   Encoded  words  SHOULD not be used for content
          expressible in ASCII.
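
          As an illustration only (this sketch is not part of the
          Draft), the encoded-word grammar and the 75-octet limit
          given above might be checked in Python roughly as follows.
          It assumes that "ASCII printable" means codes 33 through
          126, so that space is excluded; the names
          parse_encoded_word and MAX_ENCODED_WORD are invented here.
          It checks syntax only and does not decode anything.

```python
import re

# Assumption: "ASCII printable" means codes 33..126, so space is excluded.
_TAG_EXCLUDED = '!()<>@,;:\\"[]/?='
_TAG_CHAR = "[" + "".join(
    re.escape(chr(c)) for c in range(33, 127) if chr(c) not in _TAG_EXCLUDED
) + "]"
_CODE_CHAR = "[!->@-~]"  # every printable ASCII character except "?"

_ENCODED_WORD = re.compile(
    r"=\?(" + _TAG_CHAR + r"+)\?(" + _TAG_CHAR + r"+)\?(" + _CODE_CHAR + r"+)\?="
)

MAX_ENCODED_WORD = 75  # octets, per the limit stated in this section


def parse_encoded_word(token):
    """Return (charset, encoding, codes), or None if token is not an encoded word."""
    if len(token) > MAX_ENCODED_WORD:
        return None
    match = _ENCODED_WORD.fullmatch(token)
    return match.groups() if match else None
```

          The length test comes first so that an over-long token is
          rejected without being scanned, which is the bounded
          lookahead that the limit is meant to permit.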

          When an encoded word is used, other than in a newsgroup name
          (see  section  5.5),  it MUST be separated from any adjacent
          non-space characters  (including  other  encoded  words)  by
          white  space.   Reading  agents  displaying  the contents of
          encoded words (as opposed  to  their  encoded  form)  should
          ignore white space adjacent to encoded words.

               UNRESOLVED  ISSUE:  Should this section be deleted
               entirely, or made much more terse?   The  material
               is relevant, but too complex to discuss fully.

               NOTE: The deletion of intervening white space per-
               mits using multiple encoded words, implicitly con-
               catenated  by  the  deletion,  to encode text that
               will not fit within a single 75-character  encoded
               word.

          Reading-agent  implementors  are  warned  that although this
          Draft completely specifies where encoded words may appear in
          the  headers  it  defines, there are other headers (e.g. the
          MIME Content-Description header) that MAY contain them.


          4.6. Size Limits

          Implementations SHOULD avoid fixed constraints on the  sizes
          of  lines  within  an  article and on the size of the entire
          article.

          Relayers SHOULD treat the body of an article as an  uninter-
          preted  sequence of octets (except as mandated by changes of
          EOL representation and processing of control messages),  not
          to be altered or constrained in any way.

          If  it  is  absolutely  necessary  for  an implementation to
          impose a limit on the length of header lines, body lines, or
          header  logical  lines,  that  limit  shall be at least 1000
          octets, including EOL representations.  Relayers and  trans-
          mission  paths  confronted  with lines beyond their internal
          limits (if any)  MUST  not  simply  inject  EOLs  at  random
          places;  they MAY break headers (as described in 4.2.3) as a
          last resort, and otherwise they MUST either  pass  the  long
          lines  through  unaltered,  or refuse to pass the article at
          all (see section 9.1 for further discussion).

               NOTE: The limit here is essentially the same mini-
               mum  as  that  specified  for SMTP mail in RFC 821
               [rrr].  Implementors are  warned  that  Path  (see
               section  5.6)  and  References  (see  section 6.5)
               headers, in particular, often become several  hun-
               dred  characters  long,  so  1000 is not an overly
               generous limit.

          All implementations  MUST  be  able  to  handle  an  article
          totalling  at least 65,000 octets, including headers and EOL
          representations, gracefully and efficiently.  All  implemen-
          tations  SHOULD  be  able  to handle an article totalling at
          least 1,000,000 (one million) octets, including headers  and
          EOL  representations,  gracefully  and efficiently.  "Grace-
          fully and efficiently" is  intended  to  preclude  not  only
          failures,  but also major loss of performance, serious prob-
          lems in error recovery, or resource consumption beyond  what
          is reasonably necessary.

               NOTE:  The intent here is to prohibit lowering the
               existing  de-facto  limit   any   further,   while
               strongly  encouraging  movement  towards  a higher
               one.  Actually, although improvements  are  desir-
               able  in some cases, much news software copes rea-
               sonably well with very large articles.   The  same
               cannot  be said of the communications software and
               protocols used to transmit news from one  host  to
               another, especially when slow communications links
               are  involved.   Occasional  huge  articles   that
               appear now (by accident or through ignorance) typ-
               ically leave trails of  failing  software,  system
               problems,  and irate administrators in their wake.

               NOTE: It is intended that the  successor  to  this
               Draft will raise the "MUST" limit to 1,000,000 and
               the "SHOULD" limit still further.

          Posters SHOULD limit  posted  articles  to  at  most  60,000
          octets,  including  headers  and EOL representations, unless
          the articles are being posted only within a cooperating sub-
          net which is known to be capable of handling larger articles
          gracefully.  Posting agents presented with a  large  article
          SHOULD warn the poster and request confirmation.

               NOTE:  The difference between this and the earlier
               "MUST" limit is margin for header growth,  differ-
               ing  EOL  representations,  and transmission over-
               heads.

               NOTE: Disagreeable though these limits are, it  is
               a fact that in current networks, an article larger
               than 64K (after header growth etc.) simply is  not
               transmitted  reliably.   Note  also  the  comments
               above on the trauma caused  by  single  extremely-
               large articles now; the problems are real and cur-
               rent.  These problems arguably  should  be  fixed,
               but this will not happen network-wide in the imme-
               diate future.  Hence  the  restriction  of  larger
               articles to cooperating subnets, for now.

          Posters  using  non-ASCII characters in their text MUST take
          into account the overhead involved in MIME encoding,  unless
          the  article's  propagation  will  be  entirely limited to a
          cooperating subnet which does not  use  MIME  encodings  for
          non-ASCII  characters.   For  example,  MIME base64 encoding
          involves growth by a factor  of  approximately  4/3,  so  an
          article  which would likely have to use this encoding should
          be at most about 45,000 octets before encoding.
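
          The arithmetic behind the 45,000-octet figure can be
          sketched as follows (an illustration, not part of the
          Draft).  The function names are invented here, and the
          calculation deliberately ignores the EOLs that MIME
          inserts into base64 text and any header growth, which is
          why the Draft says "approximately" and "about".

```python
import base64

POSTER_LIMIT = 60_000  # octets; the SHOULD limit for posters given above


def base64_size(n_octets):
    """Octets of base64 text for n_octets of input, ignoring line breaks."""
    return 4 * ((n_octets + 2) // 3)


def max_unencoded_size(limit=POSTER_LIMIT):
    """Largest input whose base64 form, ignoring line breaks, fits in limit."""
    return (limit // 4) * 3


def measured_base64_size(data):
    """Size actually produced by the standard library, for comparison."""
    return len(base64.b64encode(data))
```

          Every 3 input octets become 4 output characters, so
          60,000 * 3/4 = 45,000 octets of input just fill the
          poster's limit before line breaks are counted.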

          Posters SHOULD use  MIME  "message/partial"  conventions  to
          facilitate  automatic  reassembly  of a large document split
          into smaller pieces for posting.  It is recommended that the
          content identifier used should be a message ID, generated by
          the same means as article message IDs (see section 5.3), and
          that  all  parts  should have a See-Also header (see section
          6.16) giving the message IDs of at least the previous  parts
          and preferably all the parts.

               NOTE:  See-Also  is  more correct for this purpose
               than References, although References is in  common
               use  today  (with  less-formal reassembly arrange-
               ments).  MIME reassemblers should probably examine
               articles  suggested  by References headers if See-
               Also headers  are  not  present  to  indicate  the
               whereabouts   of   the   other   parts   of  "mes-
               sage/partial" articles.

          To repeat: implementations SHOULD avoid fixed constraints on
          the  sizes of lines within an article and on the size of the
          entire article.


          4.7. Example

          Here is a sample article:

               From: jerry@eagle.ATT.COM (Jerry Schwarz)
               Path: cbosgd!mhuxj!mhuxt!eagle!jerry
               Newsgroups: news.announce
               Subject: Usenet Etiquette -- Please Read
               Message-ID: <642@eagle.ATT.COM>
               Date: Mon, 17 Jan 1994 11:14:55 -0500 (EST)
               Followup-To: news.misc
               Expires: Wed, 19 Jan 1994 00:00:00 -0500
               Organization: AT&T Bell Laboratories, Murray Hill

               body
               body
               body



          5. Mandatory Headers

          An article MUST have one, and only one, of each of the  fol-
          lowing headers: Date, From, Message-ID, Subject, Newsgroups,
          Path.
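
          A relayer or posting agent might test the "one, and only
          one" rule along the following lines.  This is a sketch
          under stated assumptions, not part of the Draft: it
          assumes the article is held as text with "\n" as the EOL,
          that header names are compared without regard to case,
          and that a line beginning with a space or tab continues
          the previous header.  All function names are invented
          here.

```python
from collections import Counter

MANDATORY = ("Date", "From", "Message-ID", "Subject", "Newsgroups", "Path")


def header_names(article_text):
    """Yield the header names found before the first empty line."""
    for line in article_text.split("\n"):
        if line == "":
            break  # an empty line ends the headers
        if line[0] in " \t":
            continue  # continuation of the previous header
        name, colon, _ = line.partition(":")
        if colon:
            yield name


def mandatory_header_problems(article_text):
    """Map each mandatory header that does not appear exactly once to its count."""
    counts = Counter(name.lower() for name in header_names(article_text))
    return {h: counts[h.lower()] for h in MANDATORY if counts[h.lower()] != 1}
```

          An empty result means that every mandatory header appears
          exactly once; the sample article in section 4.7 passes.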

               NOTE: MAIL specifies (if read most carefully) that
               there  must be exactly one Date header and exactly
               one From header, but otherwise does  not  restrict
               multiple  appearances  of  headers.   (Notably, it
               permits  multiple   Message-ID   headers!)    This
               appears  singularly  useless,  or even harmful, in
               the context of news, and much current  news  soft-
               ware  will  not  tolerate  multiple appearances of
               mandatory headers.

          Note also that there are situations, discussed in the  rele-
          vant  parts  of  section  6,  where  References,  Sender, or
          Approved headers are mandatory.

          In the discussions of the individual headers, the content of
          each is specified using the syntax notation.  The convention
          used is that the content of, for example, the Subject header
          is defined as <Subject-content>.


          5.1. Date

          The  Date header contains the date and time when the article
          was submitted for transmission:

               Date-content  = [ weekday "," space ] date space time
               weekday       = "Mon" / "Tue" / "Wed" / "Thu"
                             / "Fri" / "Sat" / "Sun"
               date          = day space month space year
               day           = 1*2digit
               month         = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun"
                             / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec"
               year          = 4digit / 2digit
               time          = hh ":" mm [ ":" ss ] space timezone
               timezone      = "UT" / "GMT"
                             / ( "+" / "-" ) hh mm [ space "(" zone-name ")" ]
               hh            = 2digit
               mm            = 2digit
               ss            = 2digit
               zone-name     = 1*( <ASCII printable character except ()\> / space )

          This is a restricted subset of the MAIL date format.

          If a weekday is given, it MUST be consistent with the  date.
          The  modern  Gregorian  calendar  is used, and dates MUST be
          consistent with its usual conventions; for example,  if  the
          month  is  May,  the day must be between 1 and 31 inclusive.
          The year SHOULD be given as four digits, and posting  agents
          SHOULD  enforce this; however, relayers MUST accept the two-
          digit form, and MUST interpret it  as  having  the  implicit
          prefix "19".

               NOTE: Two-digit year numbers can, should, and must
               be phased out by 1999.
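
          The year and weekday rules above might be applied as in
          the following sketch (an illustration, not part of the
          Draft).  It takes the date fields as already-separated
          strings; the function names are invented here, and
          Python's datetime.date is relied on to reject impossible
          dates such as 31 Apr.

```python
import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def full_year(year_text):
    """Apply the implicit "19" prefix to the two-digit form."""
    if len(year_text) == 2:
        return 1900 + int(year_text)
    if len(year_text) == 4:
        return int(year_text)
    raise ValueError("year must have two or four digits")


def check_date(weekday, day, month, year_text):
    """Return the date; raise ValueError if it is impossible or inconsistent.

    weekday is None when the optional weekday was not given.
    """
    date = datetime.date(full_year(year_text), MONTHS.index(month) + 1, int(day))
    if weekday is not None and WEEKDAYS[date.weekday()] != weekday:
        raise ValueError("weekday does not match the date")
    return date
```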

          The time is given on  the  24-hour  clock,  e.g.  two  hours
          before  midnight  is  "22:00" or "22:00:00".  The hh must be
          between 00 and 23 inclusive, the mm between 0 and 59  inclu-
          sive, and the ss between 0 and 61 inclusive.

               NOTE:  Leap  seconds  very  occasionally result in
               minutes that are 61 or 62 seconds long.

          The date and time SHOULD be  given  in  the  poster's  local
          timezone,  including  a  specification of that timezone as a
          numeric offset (which SHOULD include the timezone name, e.g.
          "EST",  supplied  in  parentheses  like a MAIL comment).  If
          not, they MUST be given in Universal Time (abbreviated "UT";
          "GMT"  is a historical synonym for "UT").  The timezone name
          in parentheses, if present,  is  a  comment;  software  MUST
          ignore  it, except that reading agents might wish to display
          it to the reader.  Timezone names other than "UT" and  "GMT"
          MUST appear only in the comment.

               NOTE: Attempts to deal with a full set of timezone
               names have all foundered on  the  vast  number  of
               such  names in use and the duplications (for exam-
               ple, there are at least FIVE  different  timezones
               called  "EST"  by somebody).  Even the limited set
               of North American zone names authorized by MAIL is
               subject to confusion and misinterpretation.  Hence
               the flat ban on non-UT timezone  names  except  as
               comments.

               NOTE:  RFC 1036 specified that use of GMT (aka UT,
               UTC) was preferred.  However, the local  time  (in
               the  poster's timezone) is arguably information of
               possible interest to the reader, and this requires
               some indication of the poster's timezone.  Numeric
               offsets are an unambiguous way of doing this,  and
               their  use was indeed sanctioned by RFC 1036 (that
               is, this is a change of preference only).

               NOTE:  There  is  frequent  confusion,   including
               errors  in  some news software, regarding the sign
               of numeric timezones.   Zones  west  of  Greenwich
               have  negative offsets.  For example, North Ameri-
               can Eastern Standard Time is zone -0500 and  North
               American Eastern Daylight Time is zone -0400.

               NOTE:  Implementors  are  warned  that the hh in a
               timezone can go up to about 14; it is not  limited
               to  12.   This  is  because the International Date
               Line does  not  run  exactly  along  the  boundary
               between zone -1200 and zone +1200.
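
          The timezone rules, including the sign convention and the
          treatment of the zone name as a comment, might be coded as
          in this sketch (an illustration, not part of the Draft).
          The name parse_timezone is invented here.  The sketch
          accepts any two-digit hh that Python's datetime.timezone
          will represent, rather than enforcing a particular upper
          bound, since the Draft says only "about 14".

```python
import datetime
import re

# "UT" / "GMT" / ( "+" / "-" ) hh mm [ space "(" zone-name ")" ]
# The zone-name class is space plus printable ASCII except "(", ")" and "\".
_TIMEZONE = re.compile(
    r"(UT|GMT)|([+-])(\d\d)(\d\d)(?: \([ -'*-\[\]-~]+\))?"
)


def parse_timezone(text):
    """Return a datetime.timezone for a timezone field; raise ValueError if invalid."""
    match = _TIMEZONE.fullmatch(text)
    if match is None:
        raise ValueError("not a valid timezone: " + repr(text))
    if match.group(1):
        return datetime.timezone.utc  # "GMT" is a synonym for "UT"
    hours, minutes = int(match.group(3)), int(match.group(4))
    if minutes > 59:
        raise ValueError("minutes out of range")
    sign = -1 if match.group(2) == "-" else 1  # west of Greenwich is negative
    # The parenthesized zone name is a comment and is ignored.
    return datetime.timezone(sign * datetime.timedelta(hours=hours, minutes=minutes))
```

          A bare name such as "EST" is rejected, in keeping with the
          ban on non-UT timezone names outside the comment.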

               NOTE: The comments in section 2.6 regarding trans-
               lation to other languages are relevant here.   The
               Date-content format, and the spellings of its com-
               ponents, as  found  in  articles  themselves,  are
               always as defined in this Draft, regardless of the
               language  used  to  interact  with   readers   and
               posters.  Reading and posting agents should trans-
               late  as  appropriate.   Actually,  even  English-
               language  reading and posting agents will probably
               want to do some degree of translation on dates, if
               only   to   abbreviate   the  lengthy  format  and
               (perhaps) translate to and from the reader's time-
               zone.


          5.2. From

          The  From header contains the electronic address, and possi-
          bly the full name, of the article's author:

               From-content  = address [ space "(" paren-phrase ")" ]
                             /  [ plain-phrase space ] "<" address ">"
               paren-phrase  = 1*( paren-char / space / encoded-word )
               paren-char    = <ASCII printable character except ()<>\>
               plain-phrase  = plain-word *( space plain-word )
               plain-word    = unquoted-word / quoted-word / encoded-word
               unquoted-word = 1*unquoted-char
               unquoted-char = <ASCII printable character except
                               !()<>@,;:\".[]>
               quoted-word   = quote 1*( quoted-char / space ) quote
               quote         = <" (ASCII 34)>
               quoted-char   = <ASCII printable character except "()<>\>
               address       = local-part "@" domain
               local-part    = unquoted-word *( "." unquoted-word )
               domain        = unquoted-word *( "." unquoted-word )

          (Encoded words are described in section 4.5.)  The full name
          is  distinguished  from  the  electronic  address  either by
          enclosing the former in parentheses (making  it  resemble  a
          MAIL  comment, after the address) or by enclosing the latter
          in angle brackets.  The second form is  preferred.   In  the
          first  form, encoded words inside the full name MUST be com-
          posed  entirely  of  <paren-char>s.   In  the  second  form,
          encoded  words  inside the full name may not contain charac-
          ters other than letters (of either case),  digits,  and  the
          characters "!", "*", "+", "-", "/", "=", and "_".  The local
          part is case-sensitive (except that all case counterparts of
          "postmaster"  are  deemed  equivalent),  the domain is case-
          insensitive, and all other parts of  the  From  content  are
          comments  which  MUST  be  ignored  by news software (except
          insofar as reading agents may wish to display  them  to  the
          reader).   Posters  and  posting  agents MUST restrict them-
          selves to this subset of the MAIL From syntax; relayers  MAY
          accept  a  broader subset, but see the discussion in section
          9.1.

               NOTE: The syntax here is a  restricted  subset  of
               the  MAIL  From  syntax, with quoting particularly
               restricted, for simple  parsing.   In  particular,
               the  presence of "<" in the From content indicates
               that the second form is being used, otherwise  the
               first  form is being used.  The major restrictions
               here are those already de-facto imposed by  exist-
               ing software.

               NOTE: Overly-lenient posting agents sometimes per-
               mit the second form with a  full  name  containing
               "("  or  ")",  but it is extremely rare for a full
               name to contain "<" or ">" even in mail.   Accord-
               ingly,  reading  agents wishing to robustly deter-
               mine which form is in use in a particular  article
               should  key on the presence or absence of "<", not
               the presence or absence of "(".
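
          Keying on "<" as suggested, a reading agent might split a
          From content as in the following sketch (an illustration,
          not part of the Draft).  It assumes that "ASCII
          printable" means codes 33 through 126, validates only the
          address against the grammar, and returns the full name
          uninterpreted since it is a comment.  The name parse_from
          is invented here.

```python
import re

# Assumption: "ASCII printable" means codes 33..126, so space is excluded.
_EXCLUDED = '!()<>@,;:\\".[]'
_UNQUOTED_CHAR = "[" + "".join(
    re.escape(chr(c)) for c in range(33, 127) if chr(c) not in _EXCLUDED
) + "]"
_DOTTED_WORDS = _UNQUOTED_CHAR + r"+(?:\." + _UNQUOTED_CHAR + "+)*"
_ADDRESS = re.compile("(" + _DOTTED_WORDS + ")@(" + _DOTTED_WORDS + ")")


def parse_from(content):
    """Return (local_part, domain, full_name); full_name may be None."""
    if "<" in content:
        # Second form: [ plain-phrase space ] "<" address ">"
        phrase, _, rest = content.partition("<")
        if not rest.endswith(">"):
            raise ValueError("missing '>'")
        address, name = rest[:-1], phrase.rstrip(" ") or None
    else:
        # First form: address [ space "(" paren-phrase ")" ]
        address, separator, rest = content.partition(" (")
        name = None
        if separator:
            if not rest.endswith(")"):
                raise ValueError("missing ')'")
            name = rest[:-1]
    match = _ADDRESS.fullmatch(address)
    if match is None:
        raise ValueError("address does not match the grammar")
    # The domain is case-insensitive; the local part is left untouched.
    return match.group(1), match.group(2).lower(), name
```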

          The address SHOULD be a valid and complete  Internet  domain
          address,  capable  of  being  successfully  mailed  to by an
          Internet host (possibly via an MX record and  a  forwarder).
          The  pseudo-domain  ".uucp" MAY be used for hosts registered
          in the UUCP maps (e.g. name "xyz.uucp" for  registered  site
          "xyz"), but such hosts SHOULD discontinue this usage (either
          by arranging a proper Internet address and forwarder, or  by
          using  the "% hack" (see below)), as soon as possible.  Bit-
          net hosts SHOULD use Internet addresses, avoiding the  obso-
          lescent  ".bitnet"  pseudo-domain.   Other  forms of address
          MUST not be used.

               NOTE: "Other forms" specifically include  UK-style
               "backward"  domains  ("uk.oxbridge.cs"  is  in the
               Czech Republic, not the UK), pure-UUCP  addressing
               ("knee!shin!foot"            instead            of
               "foot%shin@knee.uucp"),  and  abbreviated  domains
               ("zebra.zoo"  instead of "zebra.zoo.toronto.edu").

          If it is necessary to use the local part to specify a  rout-
          ing relative to the nearest Internet host, this MUST be done
          using the "% hack", using "%" as a secondary "@".  For exam-
          ple, to specify that mail to the address should go to Inter-
          net host "foo.bar.edu", then  to  non-Internet  host  "ein",
          then  to  non-Internet  host  "deux",  for delivery there to
          mailbox "fred", a suitable address would be:

               fred%deux%ein@foo.bar.edu

          Analogous forms using "!" in the  local  part  MUST  not  be
          used, as they are ambiguous; they should be expressed in the
          "%" form.

               NOTE: "a!b@c" can be interpreted as either "b%c@a"
               or  "b%a@c",  and there is no consistency in which
               choice is made.  Such addresses  consequently  are
               unreliable.   The  "%"  form  does not suffer from
               this problem, and although its use  is  officially
               discouraged,  it  is  a  de-facto standard, to the
               point that MAIL recognizes it.
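
          The reading of a "% hack" address described above, with
          "%" acting as a secondary "@", can be sketched as follows
          (an illustration, not part of the Draft; the name
          percent_route is invented here).

```python
def percent_route(address):
    """Return (mailbox, hosts), with hosts listed in the order mail visits them."""
    local_part, at_sign, internet_host = address.rpartition("@")
    if not at_sign:
        raise ValueError("no '@' in address")
    if "!" in local_part:
        raise ValueError("'!' routing in the local part is ambiguous")
    mailbox, *relays = local_part.split("%")
    if not mailbox:
        raise ValueError("empty mailbox")
    # The rightmost "%" host is reached first, directly after the Internet host.
    return mailbox, [internet_host] + relays[::-1]
```

          For "fred%deux%ein@foo.bar.edu" this yields mailbox
          "fred" and the hosts foo.bar.edu, ein, deux, in that
          order, matching the example in the text.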

          Relayers MUST not, repeat MUST not, repeat MUST not, rewrite
          From  lines,  in any way, however minor or innocent-seeming.
          Trying to "fix" a non-conforming address  has  a  very  high
          probability  of  making  things worse.  Either pass it along
          unchanged, or reject the article.

               NOTE: An additional reason for banning the use  of
               "!" addressing is that it has a much higher proba-
               bility of being rewritten into mangled unrecogniz-
               ability by old relayers.

          Posters  and  posting agents SHOULD avoid use of the charac-
          ters "!" and "@" in full names, as they may trigger unwanted
          header rewriting by old, simple-minded news software.

               NOTE: Also, the characters "." and ",", not infre-
               quently found in names (e.g., "John  W.  Campbell,
               Jr."), are NOT, repeat NOT, allowed in an unquoted
               word.  A From header like the following  MUST  not
               be written without the quotation marks:

                    From: "John W. Campbell, Jr." <editor@analog.com>


          5.3. Message-ID

          The  Message-ID  header contains the article's message ID, a
          unique identifier  distinguishing  the  article  from  every
          other article:

               Message-ID-content  = message-id
               message-id          = "<" local-part "@" domain ">"

          As  with  From addresses, a message ID's local part is case-
          sensitive and its domain is case-insensitive.  The  "<"  and
          ">"  are  parts  of the message ID, not peculiarities of the
          Message-ID header.

               NOTE: News message IDs are a restricted subset  of
               MAIL message IDs.  In particular, no existing news
               software copes properly with MAIL quoting  conven-
               tions  within  the local part, so they are forbid-
               den.  This is unfortunate, particularly for  X.400
               gateways  that  often  wish  to include characters
               which are not legal in unquoted message  IDs,  but
               it  is  impossible to fix net-wide.  See the notes
               on gatewaying in section 10.

          The domain in the message ID SHOULD  be  the  full  Internet
          domain name of the posting agent's host.  Use of the ".uucp"
          pseudo-domain (for hosts registered in the UUCP maps) or the
          ".bitnet"  pseudo-domain  (for Bitnet hosts) is permissible,
          but SHOULD be avoided.

          Posters and posting agents MUST generate the local part of a
          message ID using an algorithm which obeys the specified syn-
          tax (words separated by ".",  with  certain  characters  not
          permitted)  (see  section  5.2  for  details),  and will not
          repeat itself (ever).  The  algorithm  SHOULD  not  generate
          message  IDs which differ only in case of letters.  Note the
          specification in section 6.5 of a recommended convention for
          indicating  subject  changes.  Otherwise the algorithm is up
          to the implementor.

               NOTE: The crucial use of message IDs is to distin-
               guish  circulating  articles  from  each other and
               from articles circulated recently.  They are  also
               potentially  useful  as  permanent  indexing keys,
               hence the requirement for permanent  uniqueness...
               but   indexers  cannot  absolutely  rely  on  this
               because the earlier RFCs  urged  it  but  did  not
               demand  it.  All major implementations have always
               generated  permanently-unique   message   IDs   by
               design,  but  in  some  cases this is sensitive to
               proper administration,  and  duplicates  may  have
               occurred by accident.

               NOTE:  The most popular method of generating local
               parts is to use the date and time, plus  some  way
               of distinguishing between simultaneous postings on
               the same host (e.g. a process number), and  encode
               them  in a suitably-restricted alphabet.  An older
               but now  less-popular  alternative  is  to  use  a
               sequence  number,  incremented  each time the host
               generates a new message ID; this is workable,  but
               requires  careful  design  to  cope  properly with
               simultaneous  posting  attempts,  and  is  not  as
               robust  in  the presence of crashes and other mal-
               functions.

               NOTE: Some buggy news software  considers  message
               IDs  completely case-insensitive, hence the advice
               to  avoid  relying  on  case  distinctions.    The
               restrictions  placed  on  the  "alphabet" of local
               parts and domains in section 5.2 have  the  useful
               side effect of making it unnecessary to parse mes-
               sage IDs in complex ways to break them into  case-
               sensitive and case-insensitive portions.

          The  local  part of a message ID MUST not be "postmaster" or
          any other string that would compare equal to "postmaster" in
          a  case-insensitive  comparison.   Message  IDs  MUST  be no
          longer than 250 octets, including the "<" and ">".

               NOTE: "Postmaster"  is  an  irksome  exception  to
               case-sensitivity  in  local  parts, inherited from
               MAIL, and simply avoiding it is the  best  way  to
               deal  with it (not that it's likely, but the issue
               needs to be dealt  with).   The  length  limit  is
               undesirable,  but is present in widely-used exist-
               ing software.  The limit is actually  255,  but  a
               small safety margin is wise.
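
          The syntax, "postmaster", and length rules for message IDs
          might be checked together as in this sketch (an
          illustration, not part of the Draft).  It assumes that
          "ASCII printable" means codes 33 through 126; the names
          is_valid_message_id and MAX_MESSAGE_ID are invented here.

```python
import re

# Assumption: "ASCII printable" means codes 33..126, so space is excluded.
_EXCLUDED = '!()<>@,;:\\".[]'
_UNQUOTED_CHAR = "[" + "".join(
    re.escape(chr(c)) for c in range(33, 127) if chr(c) not in _EXCLUDED
) + "]"
_DOTTED_WORDS = _UNQUOTED_CHAR + r"+(?:\." + _UNQUOTED_CHAR + "+)*"
_MESSAGE_ID = re.compile("<(" + _DOTTED_WORDS + ")@(" + _DOTTED_WORDS + ")>")

MAX_MESSAGE_ID = 250  # octets, including the "<" and ">"


def is_valid_message_id(text):
    """True if text obeys the syntax, length, and "postmaster" rules above."""
    if len(text) > MAX_MESSAGE_ID:
        return False
    match = _MESSAGE_ID.fullmatch(text)
    if match is None:
        return False
    return match.group(1).lower() != "postmaster"
```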

          5.4. Subject

          The  Subject header's content (the "subject" of the article)
          is a short phrase describing the topic of the article:

               Subject-content  = [ "Re: " ] nonblank-text

          Encoded words MAY appear in this header.

          If the article is a followup, the subject SHOULD begin  with
          "Re: "  (a  "back reference").  If the article is not a fol-
          lowup, the subject MUST not begin  with  a  back  reference.
          Back references are case-insensitive, although "Re: " is the
          preferred form.  A followup  agent  assisting  a  poster  in
          preparing a followup SHOULD prepend a back reference, UNLESS
          the subject already begins with one.  If the  poster  deter-
          mines  that  the topic of the followup differs significantly
          from what is described in the subject, a new, more  descrip-
          tive,  subject  SHOULD  be  substituted (with no back refer-
          ence).  An article whose subject begins with a  back  refer-
          ence  MUST  have a References header referencing the precur-
          sor.

               NOTE: A back reference  is  FOUR  characters,  the
               fourth being a blank.  RFC 1036 was confused about
               this.  Observe also that only ONE  back  reference
               should be present.

               NOTE:  There  is a semi-standard convention, often
               used, in which a subject change is flagged by mak-
               ing the new Subject-content of the form:

                    new topic (was: old topic)

               possibly  with  "old  topic"  somewhat  truncated.
               Posters wishing to  do  something  like  this  are
               urged  to  use  this exact form, to simplify auto-
               mated analysis.

          For historical reasons, the  subject  MUST  not  begin  with
          "cmsg " (note that this sequence ends with a blank).

               NOTE:  Some  old  news  software  takes  a subject
               beginning with "cmsg " as an indication  that  the
               article is a control message (see sections 6.6 and
               7).  This mechanism is obsolete  and  undesirable,
               but accidental triggering of it is still possible.
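
          A followup agent's handling of the subject might be
          sketched as follows (an illustration, not part of the
          Draft).  The function names are invented here, and the
          choice to drop a back reference from the old topic inside
          "(was: ...)" is an assumption, not something the Draft
          specifies.

```python
BACK_REFERENCE = "Re: "  # four characters, the fourth being a blank


def has_back_reference(subject):
    """Back references are matched without regard to case."""
    return subject[:4].lower() == BACK_REFERENCE.lower()


def followup_subject(subject):
    """Prepend a back reference unless the subject already begins with one."""
    return subject if has_back_reference(subject) else BACK_REFERENCE + subject


def changed_subject(new_topic, old_subject):
    """Build "new topic (was: old topic)", with no leading back reference."""
    old_topic = old_subject[4:] if has_back_reference(old_subject) else old_subject
    return new_topic + " (was: " + old_topic + ")"


def begins_with_cmsg(subject):
    """True for subjects that old software may mistake for control messages."""
    return subject.startswith("cmsg ")
```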

          The subject SHOULD be terse.  Posters SHOULD avoid trying to
          cram  their  entire  article into the headers; even the sim-
          plest query usually benefits  from  a  sentence  or  two  of
          elaboration  and  context, and the details of header display
          vary widely among reading agents.

               NOTE: All-in-the-subject  articles  are  sometimes
               the  result of misunderstandings over the interac-
               tion protocol of a posting agent.  Posting  agents
               might wish to give special attention to the possi-
               bility that a poster specifying a very  long  sub-
               ject  might have thought he was typing the body of
               the article.


          5.5. Newsgroups

          The Newsgroups header's content specifies which newsgroup(s)
          the article is posted to:

               Newsgroups-content  = newsgroup-name *( ng-delim newsgroup-name )
               newsgroup-name      = plain-component *( "." component )
               component           = plain-component / encoded-word
               plain-component     = component-start *13component-rest
               component-start     = lowercase / digit
               lowercase           = <letter a-z>
               component-rest      = component-start / "+" / "-" / "_"
               ng-delim            = ","

          Encoded words used in newsgroup names MUST not contain char-
          acters other than letters, digits, "+", "-", "/", "_",  "=",
          and "?"  (although they may encode them).

          A  newsgroup  name consists of one or more components, which
          may be plain components or (except for  the  first)  encoded
          words.   A plain component MUST contain at least one letter,
          MUST begin with a letter or digit, and MUST  not  be  longer
          than  14  characters.  The first component MUST begin with a
          letter; subsequent components SHOULD begin  with  a  letter.
          Newsgroup  names  MUST not contain uppercase letters, except
          where required by encodings in encoded words.  The sequences
          "all" and "ctl" MUST not be used as components.

               NOTE:  The  alphabet  and  syntax specified
               encompasses all  existing  names  of  widespread
               newsgroups,  while  avoiding  various  forms  that are
               known to cause problems.  Important existing soft-
               ware  uses  various non-alphanumeric characters as
               punctuation  adjacent  to  newsgroup  names.   (It
               would,  in  fact,  be  preferable  to ban "+" from
               newsgroup  names,  were  it   not   that   several
               widespread  newsgroups related to the C++ program-
               ming language already use it.)

               NOTE: Much existing software  converts  the  news-
               group  name  into  a directory path and stores the
               articles themselves using  numeric  filenames,  so
               all-digit  name components can be troublesome; the
               "Great Renaming" early in the  history  of  Usenet
               included  revisions  of several newsgroup names to
               eliminate such components.

               NOTE: The same storage technique is the reason for
               the  14-character limit.  The limit is now largely
               historical, since most modern  systems  have  much
               larger limits on the length of a directory entry's
               name, but many old systems are still in use.  Sys-
               tems  with  shorter  limits  also  exist, but news
               software on such systems has had to deal with  the
               problem   already,   since   there   are   several
               widespread newsgroups with 14-character components
               in  their  names.  Implementors are warned that it
               is intended that the successor to this Draft  will
               increase  the 14-character limit, and are urged to
               fix their software to handle longer  names  grace-
               fully  (if  such  fixes  are  necessary, given the
               intended domain of application of  the  particular
               software).

               NOTE:  The requirement that the first character of
               a name be a letter accommodates existing  software
               which assumes it can tell the difference between a
               newsgroup name and other possible syntactic  enti-
               ties  by  inspecting the first character.  Similar
               considerations motivate excluding  "+",  "-",  and
               "_"  from  coming  first  in  a component, and the
               preference for components that do not  begin  with
               digits.   The "all" sequence is used as a wildcard
               symbol in much existing software,  and  the  "ctl"
               sequence  was  involved  in an obsolete historical
               mechanism for marking control  messages,  so  they
               are best avoided.
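
          The plain-component rules above lend themselves to a small
          validator.  The following is a sketch under one stated
          assumption: it checks names built only from plain components
          and does not attempt to handle encoded words.  All names in it
          are hypothetical.

```python
import re

# One plain component: starts with a lowercase letter or digit, then up to
# 13 more characters from the allowed set (14 characters in total).
_COMPONENT_RE = re.compile(r"[a-z0-9][a-z0-9+_-]{0,13}")
_RESERVED = {"all", "ctl"}


def is_valid_plain_component(component):
    """Check one plain component against the rules of section 5.5."""
    if _COMPONENT_RE.fullmatch(component) is None:
        return False  # bad character, bad first character, empty, or too long
    if not any(ch.isalpha() for ch in component):
        return False  # MUST contain at least one letter
    return component not in _RESERVED


def is_valid_newsgroup_name(name):
    """Check a newsgroup name made only of plain components (assumption)."""
    components = name.split(".")
    if not all(is_valid_plain_component(c) for c in components):
        return False
    # The first component MUST begin with a letter.
    return components[0][0].isalpha()
```

          Components that begin with a digit are accepted after the
          first position because the Draft only says they SHOULD begin
          with a letter; a stricter posting agent could warn about them.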

               NOTE:  Possibly  newsgroup  names should have been
               case-insensitive, but all existing software treats
               them  as  case-sensitive.   (RFC  977 [rrr] claims
               that they are case-insensitive in NNTP, but exist-
               ing  implementations are believed to ignore this.)
               The simplest solution is just to ban use of upper-
               case  letters,  since no widespread newsgroup name
               uses them anyway; this avoids any  possibility  of
               confusion.

               NOTE:  The syntax has the disadvantage of containing
               no white space, making it impossible to continue a
               Newsgroups  header across several lines.
               Implementors of relayers and  reading  agents  are
               warned  that  it is intended that the successor to
               this Draft will change the definition of  ng-delim
               to:

                    ng-delim = "," [ space ]

               and  are  urged  to  fix  their software to handle
               (i.e., ignore) white space following  the  commas.
               Meanwhile, posters must avoid inserting such space
               (despite  the  natural-language  convention  which
               permits  it)  and  posting  agents should strip it
               out.
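
          A minimal sketch of both halves of this advice: a tolerant
          parser for relayers and reading agents, and a strict
          normalizer for posting agents.  The function names are
          illustrative assumptions.

```python
def parse_newsgroups(content):
    """Split a Newsgroups content on commas, ignoring white space after them."""
    return [name.lstrip(" \t") for name in content.split(",")]


def normalize_newsgroups(content):
    """Posting-agent side: emit the strict form, with no white space."""
    return ",".join(parse_newsgroups(content))
```

          Only white space FOLLOWING a comma is ignored, which matches
          the intended ng-delim definition and nothing more.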

               NOTE: Encoded words  as  components  are  somewhat
               problematic,  but are clearly desirable for use in
               non-English-speaking nations.  They are  not  sub-
               ject to the 14-character limit, and this (plus the
               possibility of "/" within them) may  require  spe-
               cial handling in news software.

          Encoded words are allowed in newsgroup names ONLY where non-
          ASCII characters are necessary to the name, and must use the
          "b"  encoding  [rrr] and the first suitable character set in
          the MIME order of preferred character sets [rrr].

               NOTE: Since the  newsgroup  name  is  the  encoded
               form,  NOT the underlying non-ASCII form, there is
               room for terrible confusion here if the choice  of
               encoding  for a particular name is not fully stan-
               dardized.

          Posters SHOULD use only the names of existing newsgroups  in
          the  Newsgroups  header,  because newsgroups are NOT created
          simply by being posted to.  However,  it  is  legitimate  to
          cross-post to newsgroup(s) which do not exist on the posting
          agent's host, provided that at least one of  the  newsgroups
          DOES  exist  there,  and  followup  agents  MUST accept this
          (posting agents MAY accept it, but SHOULD at least alert the
          poster to the situation and request confirmation).  Relayers
          MUST not rewrite Newsgroups headers in any way, even if some
          or all of the newsgroups do not exist on the relayer's host.

               NOTE: Early experience  with  news  software  that
               created  newsgroups  when they were mentioned in a
               Newsgroups header was thoroughly negative: posters
               frequently mistype newsgroup names.

               NOTE:  While it is legitimate for some of an arti-
               cle's newsgroups not to exist on the host where it
               is  posted,  this  IS  a  rather unusual situation
               except in followups (which should go to all  news-
               groups  the  precursor  was posted to, even if not
               all of them reach the site where the  followup  is
               being posted).

               NOTE:   Rewriting   Newsgroups  headers  to  strip
               locally-unknown   newsgroups   is    superficially
               attractive.    However,   early   experience  with
               exactly that policy was thoroughly negative:  news
               propagation   is  more  redundant  and  much  less
               orderly than many people imagine, and in  particu-
               lar  it  is  not  unheard-of  for  the (sometimes)
               fastest path between two (say) U of Toronto  sites
               to  pass  outside  U  of  Toronto... in which case
               newsgroup stripping can cause incomplete  propaga-
               tion.   Having  an  article's  set  of  newsgroups
               change as it propagates can also  result  in  fol-
               lowups  not  achieving the same propagation as the
               original.  It's been tried; it's more trouble than
               it's worth; don't do it.

               NOTE:  In particular, newsgroup stripping superfi-
               cially looks like a solution  to  the  problem  of
               duplicate  regional newsgroup names.  For example,
               both University of Toronto and University of Texas
               have  "ut.general" newsgroups, and material cross-
               posted to that name and a global newsgroup appears
               in  both universities' local newsgroups.  However,
               the side effects  of  stripping  are  sufficiently
               unacceptable  to  disqualify  it for this purpose.
               Don't do it.

          Cross-posting an article to several relevant  newsgroups  is
          far  superior  to  posting separate articles with duplicated
          content to each newsgroup, because reading agents can detect
          the  situation  and  show the article to a reader only once.
          Posters SHOULD cross-post rather than duplicate-post.

               NOTE: On the other hand, cross-posting to a  large
               number  of  newsgroups  usually indicates that the
               poster has not thought about his  audience;  arti-
               cles  are rarely pertinent to more than (say) half
               a dozen newsgroups.  Posting agents might wish  to
               request confirmation when the number of newsgroups
               exceeds (say) five in the presence of a  Followup-
               To  header,  or (say) two in the absence of such a
               header.
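
          The confirmation heuristic suggested above might look like
          this in a posting agent.  The thresholds are the "(say)"
          figures from the note, exposed as parameters because they are
          suggestions rather than requirements; the function name is
          hypothetical.

```python
def needs_crosspost_confirmation(newsgroups, has_followup_to,
                                 limit_with_followup=5, limit_without=2):
    """Return True if the poster should be asked to confirm the cross-post."""
    limit = limit_with_followup if has_followup_to else limit_without
    # Count distinct names so an accidental duplicate does not trip the check.
    return len(set(newsgroups)) > limit
```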

               NOTE: One problem with cross-postings is  what  to
               do  with an article cross-posted to a set of news-
               groups including both  moderated  and  unmoderated
               ones.   Posters  tend to expect such an article to
               show up immediately in the unmoderated newsgroups,
               especially if they do not realize that one or more
               of the newsgroups is moderated.  However, since it
               is  not  possible for a moderator to retroactively
               add an already-posted article to a moderated news-
               group,  the only correct action is to mail such an
               article to one (and only one)  of  the  moderators
               for  action.   It is probably best for the posting
               agent to detect this situation and ask the  poster
               what  action is preferred.  The acceptable choices
               are to alter the newsgroup list or to  mail  to  a
               moderator  of  the  poster's  choice;  the posting
               agent should NOT  offer  duplicate-posting  as  an
               easy-to-request  option (if only because many mod-
               erators will reject a submission that has  already
               been posted to unmoderated newsgroups).

               NOTE:  An  article cross-posted to multiple moder-
               ated newsgroups really should have  approval  from
               all  the  moderators  involved.   In practice, the
               only straightforward way to do this is to send the
               article  to  one  of them and have him consult the
               others.

          A newsgroup SHOULD not appear more than once  in  the  News-
          groups header.

          Newsgroup  names  having only one component are reserved for
          newsgroups whose propagation is restricted to a single  host
          (or  the  administrative  equivalent).  It is inadvisable to
          name a newsgroup "poster"  because  that  word  has  special
          meaning  in  the  Followup-To header (see section 6.1).  The
          names "control" and "junk" are frequently used for
          pseudo-newsgroups internal to relayer implementations, and hence
          are also best avoided.

               NOTE: Beware of the  duplicate-regional-newsgroup-
               names  problem  mentioned  above.   In particular,
               there are many, many hosts with a newsgroup  named
               "general",  and  some surprising things show up in
               such newsgroups when  people  cross-post.   It  is
               probably  better  to  use  multi-component  names,
               which are less likely to  be  duplicated.   Fred's
               Widget  House should use "fwh.general" rather than
               just  "general"  as  its  in-house  general-topics
               newsgroup.

          It is conventional to reserve newsgroup names beginning with
          "to." for test messages sent  on  an  essentially  point-to-
          point basis (see also the ihave/sendme protocol described in
          section 7.2); newsgroup names beginning  with  "to."  SHOULD
          not be used for any other purpose.  The second (and possibly
          later) components of such a name should, together,  comprise
          the  relayer name (see section 5.6) of a relayer.  The news-
          group exists only at the named relayer  and  its  neighbors.
          The  neighbors all pass that newsgroup to the named relayer,
          while the named relayer does not pass it to anyone.

          The order of newsgroup names in the Newsgroups header is not
          significant.


          5.6. Path

          The Path header's content indicates which relayers the arti-
          [...]
          and a delimiter, to the Path content in all articles
          it processes.  A relayer MUST not pass an article to a
          neighboring relayer whose name is already  mentioned  in  an
          article's  path list, unless this is explicitly requested by
          the neighbor  in  some  way.   The  Path  content  is  case-
          sensitive.

               NOTE:  The Path header supplied by a posting agent
               should normally contain only the local part.   The
               relayer  that the posting agent passes the article
               to for posting will prepend its  relayer  name  to
               get the path list started.

               NOTE:  Observe that the trailing local part is NOT
               part of the path list.  This Path header:

                    Path: fee!fie!foe!fum

               contains three relayer names:  "fee",  "fie",  and
               "foe".  A relayer named "fum" is still eligible to
               be sent this article.
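
          The distinction between the path list and the trailing local
          part can be sketched as follows.  This assumes the strict "!"
          delimiter with no white space; the function names are
          illustrative only.

```python
def split_path(path_content):
    """Return (path_list, local_part) for a Path content like 'fee!fie!foe!fum'."""
    *path_list, local_part = path_content.split("!")
    return path_list, local_part


def may_send_to(path_content, neighbor_name):
    """A neighbor already named in the path list is skipped.

    The comparison is deliberately case-sensitive, as the Path content is.
    """
    path_list, _local_part = split_path(path_content)
    return neighbor_name not in path_list


def prepend_relayer(path_content, relayer_name):
    """What a relayer does to the Path content of each article it processes."""
    return relayer_name + "!" + path_content
```

          With the header shown above, may_send_to("fee!fie!foe!fum",
          "fum") is true because "fum" is the local part, not a relayer
          name in the path list.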

               NOTE: This syntax has the disadvantage of contain-
               ing  no  white space, making it impossible to con-
               tinue a Path header across several lines.   Imple-
               mentors  of relayers and reading agents are warned
               that it is intended that  the  successor  to  this
               Draft will change the definition of path delimiter
               to:

                    path-delimiter = "!" [ space ]

               and are urged to  fix  their  software  to  handle
               (i.e.,  ignore) white space following the exclama-
               tion points.  They are urged to hurry;  some  ill-
               behaved  systems  reportedly  already feel free to
               add such white space.

               NOTE: RFC 1036 allows considerably more flexibility
               in  choice  of delimiter, in theory, but this
               flexibility has never  been  used  and  most  news
               software  does  not  implement  it  properly.  The
               grammar reflects the current  reality.   Note,  in
               particular,  that  RFC 1036 treats "_" as a delim-
               iter, but in fact it is known to appear in relayer
               names occasionally.

          Because  an  article will not propagate to a relayer already
          mentioned in its path list, the path list MUST  not  contain
          any  names  other  than  those  of  relayers the article has
          passed through AS NEWS.  This is trivially obvious for
          normal news articles, but requires attention from the
          moderators of moderated newsgroups and the implementors
          and maintainers of gateways.

               NOTE:  For  the  same  reason,  a  relayer and its
               neighbors need to agree on the choice  of  relayer
               name,  and  names  should  not  be changed without
               notifying neighbors.

          Relayer names need to be unique  among  all  relayers  which
          will  ever  see  the articles using them.  A relayer name is
          normally either an "official" name for the host the  relayer
          runs  on,  or  some  other "official" name controlled by the
          same organization.  Except in cooperating subnets that agree
          to  some  other  convention, and don't let articles using it
          escape beyond the subnet, a relayer name MUST  be  either  a
          UUCP  name  registered  in the UUCP maps (without any domain
          suffix such as ".UUCP"), or a complete Internet domain name.
          Use  of a (registered) UUCP name is recommended, where prac-
          tical, to keep the length of the path list down.

          The use of Internet domain names in the path  list  presents
          one problem: domain names are case-insensitive, but the path
          list is case-sensitive.   Relayers  using  domain  names  as
          their  relayer names MUST pick a standard form for the name,
          and use that form consistently to the exclusion of all  oth-
          ers.   The  preferred  form for this purpose, which relayers
          SHOULD use, is the all-lowercase form.
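
          A small sketch of why one standard form matters.  It assumes,
          purely for illustration, that any relayer name containing a
          "." is a domain name and that names without one are UUCP
          names; the helper name is hypothetical.

```python
def canonical_relayer_name(name):
    """Lowercase domain-style names; leave other (e.g. UUCP) names untouched."""
    return name.lower() if "." in name else name


# A case-sensitive path-list check fails to recognize a differently-cased
# spelling of the same domain name, so the article could be sent back.
path_list = ["zoo.toronto.edu", "utzoo"]
seen_raw = "Zoo.Toronto.EDU" in path_list
seen_canonical = canonical_relayer_name("Zoo.Toronto.EDU") in path_list
```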

               NOTE: It is arguably  unfortunate  that  the  path
               list is case-sensitive, but it is much too late to
               change this.   Most  Internet  sites  do,  in  any
               event,  use  one  standardized  form of their name
               almost everywhere.

          In the ordinary case, where the poster is the author of  the
          article,  the  local  part following the path list SHOULD be
          the local part of the poster's full Internet domain  mailing
          address.

               NOTE:  It  should  be just the local part, not the
               full address.  The character "@" does  not  appear
               in a Path header.

          The  Path content somewhat resembles a mailing address, par-
          ticularly in the UUCP world with its manual routing and  "!"
          address  syntax.   Historically, this resemblance was impor-
          tant, and the  Path  content  was  often  used  as  a  reply
          address.  This practice has always been somewhat unreliable,
          since news paths are not always mail paths and news  relayer
          names  are  not  always recognized by mail handlers, and its
          reliability has generally worsened  in  recent  times.   The
          widespread   use  of  and  recognition  of  Internet  domain
          addresses, even outside the  actual  Internet,  has  largely
          eliminated  the  problem.   Readers  SHOULD not use the Path
          content as a reply address.   On  the  other  hand,  relayer
          administrators  are  urged  not  to break this usage without
          good reason; where practical, paths followed by news  SHOULD
          be  traversable  by mail, and mail handlers SHOULD recognize
          relayer names as host names.

          It will typically be difficult or impractical  for  gateways
          and  moderators to supply a Path content that is useful as a
          reply address for the author, bearing in mind that the  path
          list they supply will normally be empty.  (To reiterate: the
          path list MUST not contain any names  other  than  those  of
          relayers  the  article  has  passed  through AS NEWS.)  They
          SHOULD supply a local part that will result in replies to  a
          Path-derived  address  being  returned  to the sender with a
          brief explanation.   Software  permitting,  the  local  part
          "not-for-mail" is recommended.

               NOTE:  A  moderator  or  gateway administrator who
               supplies a local part that delivers such  mail  to
               an  administrative  mailbox  will quickly discover
               why it should be  bounced  automatically!   It  is
               best, however, for the returned message to include
               an explanation  of  what  has  probably  happened,
               rather than just a mysterious "undeliverable mail"
               complaint, since the sender may not be aware  that
               his/her  software  is unwisely using the Path con-
               tent as a reply  address.   Reply  software  might
               wish  to  question  attempts  to  reply to a Path-
               derived address ending in "not-for-mail" (which is
               why a specific name is being recommended here).


          6. Optional Headers

          Many  MAIL  headers,  and many of those specified in present
          and future MAIL extensions, are  potentially  applicable  to
          news.   Headers  specific to MAIL's point-to-point transmis-
          sion paradigm, e.g. To and Cc, SHOULD  not  appear  in  news
          articles.   (Gateways  wishing  to preserve such information
          for debugging probably SHOULD hide it under different names;
          prefixing  "X-"  to  the original headers, resulting in e.g.
          "X-To", is suggested.)

          The following optional headers are either specific  to  news
          or  of particular note in news articles; an article MAY con-
          tain some or all of them.  (Note that there are some circum-
          stances  in  which  some  of  them  are mandatory; these are
          explained under the individual headers.)   An  article  MUST
          not contain two or more headers with any one of these header
          names.

               NOTE: The ban on duplicate header names  does  not
               apply  to  headers  not specified in this Draft at
               all, such as "X-" headers.   Software  should  not
               assume  that  all  header names in a given article
               are unique.


          6.1. Followup-To

          The Followup-To header contents specify  which  newsgroup(s)
          followups should be posted to:

               Followup-To-content = Newsgroups-content / "poster"

          The  syntax  is  the same as that of the Newsgroups content,
          with the exception that the magic word "poster"  means  that
          followups  should  be  mailed to the article's reply address
          rather than posted.  In  the  absence  of  Followup-To,  the
          default  newsgroup(s)  for a followup are those in the News-
          groups header.
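
          One way a followup agent might apply these defaults is
          sketched below.  It assumes the article's headers are already
          available as a simple dictionary of header name to content,
          and it takes the reply address to be Reply-To when present and
          From otherwise (see section 6.3).  The function name is
          hypothetical.

```python
def followup_destination(headers):
    """Return ('mail', address) or ('post', [newsgroup, ...]) for a followup."""
    followup_to = headers.get("Followup-To")
    if followup_to == "poster":
        # Mail to the reply address instead of posting.
        return "mail", headers.get("Reply-To", headers["From"])
    # Without Followup-To, default to the precursor's Newsgroups.
    groups = followup_to if followup_to is not None else headers["Newsgroups"]
    return "post", groups.split(",")
```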

               NOTE: The way to request that followups be  mailed
               to  a specific address other than that in the From
               line is  to  supply  "Followup-To: poster"  and  a
               Reply-To header.  Putting a mailing address in the
               Followup-To  line  is  incorrect;  posting  agents
               should reject or rewrite such headers.

               NOTE:   There  is  no  syntax  for  "no  followups
               allowed"  because   "Followup-To: poster"   accom-
               plishes this effect without extra machinery.

          Although it is generally desirable to limit followups to the
          smallest reasonable set of newsgroups, especially  when  the
          precursor was cross-posted widely, posting agents SHOULD not
          supply a Followup-To header except at the poster's  explicit
          request.

               NOTE: In particular, it is incorrect for the posting
               agent to assume that followups to a cross-posted
               article should be directed to the first
               newsgroup only.  Trimming the list  of  newsgroups



          2 June 1994                - 42 -       expires 15 July 1994





          INTERNET DRAFT to be        NEWS                    sec. 6.1


               should  be  the poster's decision, not the posting
               agent's.  However, when an article is to be cross-
               posted  to  a considerable number of newsgroups, a
               posting agent might wish to SUGGEST to the  poster
               that followups go to a shorter list.


          6.2. Expires

          The  Expires  header  content specifies a date and time when
          the article is deemed to be no longer useful and  should  be
          removed ("expired"):

               Expires-content = Date-content

          The  content syntax is the same as that of the Date content.
          In the absence of Expires, the default  is  decided  by  the
          administrators  of  each  host  the article reaches, who MAY
          also restrict the extent to which the Expires header is hon-
          ored.

          The Expires header has two main applications: removing arti-
          cles whose utility ends on  a  specific  date  (e.g.,  event
          announcements which can be removed once the day of the event
          is past) and preserving articles expected to be of prolonged
          usefulness  (e.g.,  information  aimed  at  new readers of a
          newsgroup).  The latter  application  is  sometimes  abused.
          Since individual hosts have local policies for expiration of
          news (depending on  available  disk  space,  for  instance),
          posters  SHOULD  not  provide  Expires  headers for articles
          unless there is a natural expiration  date  associated  with
          the  topic.   Posting  agents  MUST  not  provide  a default
          Expires header.  Leave it out and allow local policies to be
          used unless there is a good reason not to.  Expiry dates are
          properly the decision  of  individual  host  administrators;
          posters  and  moderators  SHOULD  set only expiry dates that
          most administrators would agree with.

               NOTE: A poster preparing an Expires header for  an
               article  whose  utility  ends  on  a  specific day
               should typically  specify  the  NEXT  day  as  the
               expiry  date.   A  meeting  on July 7th remains of
               interest on the 7th.
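
          The "next day" advice amounts to simple date arithmetic.  The
          sketch below assumes UTC and assumes that the date format
          produced by Python's email.utils is acceptable as a Date
          content; neither assumption comes from the Draft, and the
          function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone
from email.utils import format_datetime


def expiry_for_event(event_day):
    """Return the start of the day AFTER the event, as an aware UTC datetime."""
    next_day = event_day + timedelta(days=1)
    return datetime.combine(next_day, time.min, tzinfo=timezone.utc)


def expires_header(event_day):
    """Build an Expires header line for an article about a one-day event."""
    return "Expires: " + format_datetime(expiry_for_event(event_day))
```

          For the meeting on July 7th mentioned above, the computed
          expiry falls on July 8th, so the article is still available
          on the day of the meeting.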


          6.3. Reply-To

          The Reply-To header content specifies a reply  address  dif-
          ferent from the author's address given in the From header:

               Reply-To-content = From-content

          In the absence of Reply-To, the reply address is the address
          in the From header.

          Use of a Reply-To header is preferable to including a  simi-
          lar  request  in the article body, because reply-preparation
          software can take account of Reply-To automatically.


          6.4. Sender

          The Sender header identifies the poster, in the  event  that
          this differs from the author identified in the From header:

               Sender-content = From-content

          In  the  absence of Sender, the default poster is the author
          (named in the From header).

               NOTE: The intent is that the Sender header have  a
               fairly  high probability of identifying the person
               who really posted the  article.   The  ability  to
               specify  a  From  header naming someone other than
               the poster is useful but can be abused.

          If the poster supplies a From header, the posting agent MUST
          ensure that a Sender header is present, unless it can verify
          that the mailing address in the From header is a valid mail-
          ing address for the poster.  A poster-supplied Sender header
          MAY be used, if its mailing address is  verifiably  a  valid
          mailing  address for the poster; otherwise the posting agent
          MUST supply a Sender header and delete (or rename,  e.g.  to
          X-Unverifiable-Sender) any poster-supplied Sender header.
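
          A posting agent's handling of Sender might be sketched as
          below.  How an address is actually verified is outside the
          scope of this sketch: "is_verified" is a caller-supplied check
          and "verified_address" is whatever Sender content the agent
          itself can vouch for.  Both are assumptions, as is the
          dictionary representation of headers.

```python
def apply_sender_rules(headers, verified_address, is_verified):
    """Return a copy of `headers` with the Sender rules of section 6.4 applied."""
    result = dict(headers)
    supplied = result.get("Sender")
    if supplied is not None:
        if is_verified(supplied):
            return result  # a verifiable poster-supplied Sender MAY be used
        # Otherwise rename the poster's header and supply our own.
        result["X-Unverifiable-Sender"] = result.pop("Sender")
        result["Sender"] = verified_address
        return result
    if "From" in result and not is_verified(result["From"]):
        # From cannot be verified, so a Sender header MUST be present.
        result["Sender"] = verified_address
    return result
```

          Renaming rather than deleting keeps the poster's original
          value available for debugging, which is one of the two options
          the text allows.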

               NOTE:  It  might  be  useful to preserve a poster-
               supplied Sender header so that the poster can sup-
               ply  the full-name part of the content.  The mail-
               ing address, however, must be right.   Hence,  the
               posting  agent  must generate the Sender header if
               it is unable to verify the mailing  address  of  a
               poster-supplied one.

               NOTE:  NNTP implementors, in particular, are urged
               to note this requirement  (which  would  eliminate
               the  need  for  ad  hoc headers like NNTP-Posting-
               Host), although there are admittedly  some  imple-
               mentation  difficulties.   A user name from an RFC
               1413 server and a host name from an  inverse  map-
               ping  of  the  address, perhaps with a "full name"
               comment noting  the  origin  of  the  information,
               would be at least a first approximation:

                    Sender: fred@zoo.toronto.edu (RFC-1413@reverse-lookup; not verified)

               While  this does not completely meet the specs, it
               comes a lot closer than not having a Sender header
               at all.  Even just supplying a placeholder for the
               user name:

                    Sender: somebody@zoo.toronto.edu (user name unknown)

               would be better than nothing.
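
               As an illustration only (not part of the Draft),
               the two fallback forms above could be produced by
               a helper like the following Python sketch.  The
               function name and its parameters are hypothetical;
               how the host and user names are obtained (inverse
               mapping, RFC 1413 query) is left to the caller.

```python
def fallback_sender(host, user=None):
    """Build a Sender header line from unverified connection data.

    host: name obtained from an inverse mapping of the client address.
    user: name reported by an RFC 1413 server, or None if unknown.
    """
    if user:
        # The comment part records where the information came from.
        return f"Sender: {user}@{host} (RFC-1413@reverse-lookup; not verified)"
    # No user name available: fall back to the placeholder form.
    return f"Sender: somebody@{host} (user name unknown)"
```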


          6.5. References

          The References header content lists message IDs  of  precur-
          sors:

               References-content = message-id *( space message-id )

          A followup MUST have a References header, and an article
          which is not a followup MUST not have a References header.
          A followup to an article which had no References header MUST
          have a References header containing the precursor's message
          ID.  A followup to an article which had a References header
          MUST have a References header containing the precursor's
          References content, plus the precursor's message ID appended
          to the end of the list.
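
               As an illustration only (not part of the Draft),
               the construction rule for a followup's References
               content might be sketched in Python as below.  The
               function and parameter names are hypothetical, and
               header folding is not handled.

```python
def followup_references(precursor_message_id, precursor_references=None):
    """Return the References content for a followup.

    precursor_references is the precursor's References content, or None
    (or an empty string) if the precursor had no References header.
    """
    if precursor_references:
        # Keep the precursor's content as it stands (including any runs
        # of blanks) and append the precursor's own message ID.
        return precursor_references.rstrip() + " " + precursor_message_id
    # Precursor had no References header: its message ID starts the list.
    return precursor_message_id
```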

               NOTE: Use the See-Also header (section  6.16)  for
               interconnection  of  articles  which  are not in a
               followup relationship to each other.

               NOTE: In retrospect, RFCs 850 and  1036,  and  the
               implementations  whose  practice they represented,
               erred here.  The proper MAIL  header  to  use  for
               references  to  precursors is In-Reply-To, and the
               References header is meant to be used for the pur-
               poses  here ascribed to See-Also.  This incompati-
               bility is far too solidly established to be fixed,
               unfortunately.   The  best  that can be done is to
               provide a clear mapping between the two, and  urge
               gateways to do the transformation.  The news usage
               is (now) a deliberate violation of the MAIL speci-
               fications;  articles  containing  news  References
               headers are technically not valid  MAIL  messages,
               although  it  is  unlikely that much MAIL software
               will notice because the incompatibility  is  at  a
               subtle  semantic  level  that  does not affect the
               syntax.

               UNRESOLVED ISSUE: Would it be better to just  give
               up  and  admit  that news uses References for both
               purposes?

               UNRESOLVED ISSUE: Should the syntax be generalized
               to  include  URLs  as alternatives to message IDs?
               Perhaps not; too many things know about References
               already.   And non-articles can't be precursors of
               articles, not really.

          Followup agents SHOULD not shorten References  headers.   If
          it  is absolutely necessary to shorten the header, as a des-
          perate last resort, a followup agent MAY do this by deleting
          some  of  the  message IDs.  However, it MUST not delete the
          first message ID, the last three message IDs (including that
          of  the immediate precursor), or any message ID mentioned in
          the body of the followup.  If it is possible  for  the  fol-
          lowup agent to determine the Subject content of the articles
          identified in the References header, it MUST not delete  the
          message  ID of any article where the Subject content changed
          (other than by prepending of a back  reference).   The  fol-
          lowup  agent MUST not delete any message ID whose local part
          ends with "_-_" (underscore (ASCII 95), hyphen  (ASCII  45),
          underscore);  followup  agents are urged to use this form to
          mark subject changes, and to avoid using it otherwise.

               NOTE: As software capable of exploiting References
               chains  has grown more common, the random shorten-
               ing permitted by RFC 1036 has become  increasingly
               troublesome.   ANY  shortening is undesirable, and
               software should do it only in cases of dire neces-
               sity.  In such cases, these rules attempt to limit
               the damage.

               NOTE: The first message ID is  very  important  as
               the  starting point of the "thread" of discussion,
               and absolutely should not be deleted.  Keeping the
               last  three  message  IDs  gives  thread-following
               software a fighting chance to reconstruct  a  full
               thread  even  if  an  article  or  two is missing.
               Keeping message IDs mentioned in the body is obvi-
               ously desirable.

               NOTE:  Subject changes are difficult to determine,
               but they are significant as possible beginnings of
               new  threads.  The "_-_" convention is provided so
               that posting agents (which have  more  information
               about  subjects)  can  flag  articles containing a
               subject change in a way that followup  agents  can
               detect  without access to the articles themselves.
               The sequence is  chosen  as  one  that  is  fairly
               unlikely to occur by accident.

               NOTE: Is "_-_" really worth having?

          When a References header is shortened, at least three blanks
          SHOULD be left between adjacent message IDs  at  each  point
          where  deletions  were  made.  Software preparing new Refer-
          ences headers SHOULD preserve multiple blanks in older  Ref-
          erences content.
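
               As an illustration only (not part of the Draft),
               the deletion rules above might be applied as in
               the following Python sketch.  The function name,
               the max_ids target and the oldest-first deletion
               order are assumptions; a message ID is treated as
               "mentioned in the body" if it occurs there as a
               literal substring, and the content is assumed to
               be unfolded.

```python
import re

MESSAGE_ID = re.compile(r"<[^<>\s]+>")

def shorten_references(content, body, max_ids):
    """Drop deletable message IDs, oldest first, until max_ids remain.

    If too few IDs are deletable, the result may still exceed max_ids.
    """
    ids = MESSAGE_ID.findall(content)
    # gaps[i] is the run of blanks between ids[i] and ids[i + 1].
    gaps = MESSAGE_ID.split(content)[1:-1]

    def protected(i):
        local_part = ids[i][1:-1].split("@", 1)[0]
        return (i == 0                          # start of the thread
                or i >= len(ids) - 3            # last three, incl. precursor
                or ids[i] in body               # mentioned in the followup body
                or local_part.endswith("_-_"))  # marks a subject change

    excess = len(ids) - max_ids
    dropped = set()
    for i in range(len(ids)):
        if excess <= 0:
            break
        if not protected(i):
            dropped.add(i)
            excess -= 1

    out = []
    deleted_before = False
    for i, mid in enumerate(ids):
        if i in dropped:
            deleted_before = True
            continue
        if out:
            gap = gaps[i - 1]
            if deleted_before and len(gap) < 3:
                gap = "   "  # at least three blanks mark a deletion point
            out.append(gap)  # older multi-blank gaps are kept unchanged
        out.append(mid)
        deleted_before = False
    return "".join(out)
```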

               NOTE:  It's desirable to have some marker of where
               deletions occurred, but the restricted  syntax  of
               the  header  makes  this  difficult.   Extra white
               space is not a very good marker, since it  may  be
               deleted  by  software  that ill-advisedly rewrites
               headers, but at least it  doesn't  break  existing
               software.

          To  repeat:  followup  agents  SHOULD not shorten References
          headers.

               NOTE:  Unfortunately,  reading  agents  and  other
               software  analyzing References patterns have to be
               prepared for the worst anyway.  The worst includes
               random  deletions  and the possibility of circular
               References chains (when References is  misused  in
               place of See-Also, section 6.16).


          6.6. Control

          The  Control  header  content marks the article as a control
          message, and specifies the desired actions (other  than  the
          usual ones of filing and passing on the article):

               Control-content  = verb *( space argument )
               verb             = 1*( letter / digit )
               argument         = 1*<ASCII printable character>

          The  verb  indicates  what  action  should be taken, and the
          argument(s) (if any) supply details.   In  some  cases,  the
          body  of  the  article  may also contain details.  Section 7
          describes the standard verbs.   See  also  the  Also-Control
          header (section 6.15).
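
               As an illustration only (not part of the Draft),
               Control content could be split into verb and
               arguments as in the following Python sketch.  The
               function name is hypothetical, and "space" is
               assumed to mean any run of white space.

```python
import re

VERB = re.compile(r"[A-Za-z0-9]+")

def parse_control(content):
    """Split Control content into (verb, [arguments])."""
    parts = content.split()
    if not parts or not VERB.fullmatch(parts[0]):
        raise ValueError("verb must consist of letters and digits")
    for arg in parts[1:]:
        # Arguments may contain any printable ASCII character, so they
        # must never be handed to a command interpreter unquoted.
        if not all(33 <= ord(ch) <= 126 for ch in arg):
            raise ValueError("argument contains a non-printable character")
    return parts[0], parts[1:]
```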

               NOTE:  Control  messages  are  often processed and
               filed rather differently than normal articles.

               NOTE: The restriction of verbs to letters and dig-
               its  is new, but is consistent with existing prac-
               tice and potentially simplifies implementation  by
               avoiding  characters significant to command inter-
               preters.  Beware that the arguments are  under  no
               such restriction in general.

               NOTE:  Two  other  conventions  for distinguishing
               control messages from normal  articles  were  for-
               merly  in  use:  a  three-component newsgroup name
               ending in  ".ctl"  or  a  subject  beginning  with
               "cmsg "  was  considered to imply that the article
               was a  control  message.   These  conventions  are
               obsolete.  Do not use them.

          An  article  with  a  Control  header MUST not have an Also-
          Control or Supersedes header.


          6.7. Distribution

          The Distribution  header  content  specifies  geographic  or
          organizational limits on an article's propagation:

               Distribution-content  = distribution *( dist-delim distribution )
               dist-delim            = ","
               distribution          = plain-component

          A distribution is syntactically identical to a one-component
          newsgroup name, and must satisfy the same rules and restric-
          tions.   In the absence of Distribution, the default distri-
          bution is "world".

               NOTE: This syntax has the disadvantage of contain-
               ing  no  white space, making it impossible to con-
               tinue a Distribution header across several  lines.
               Implementors  of  relayers  and reading agents are
               warned that it is intended that the  successor  to
               this  Draft  will  change  the  definition of dist
               delimiter to:

                    dist-delim = "," [ space ]

               and are urged to  fix  their  software  to  handle
               (i.e., ignore) white space following the commas.
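
               As an illustration only (not part of the Draft), a
               parser that already tolerates the anticipated
               white space might look like the Python sketch
               below.  The function name is hypothetical; the
               sketch is slightly more liberal than the proposed
               syntax, since it also ignores white space before
               a comma.

```python
def parse_distribution(content=None):
    """Return the list of distributions named in a Distribution header.

    content is the header content, or None if the header is absent.
    """
    if content is None:
        return ["world"]  # default distribution when the header is absent
    # Split on commas and ignore white space around each distribution.
    return [d.strip() for d in content.split(",") if d.strip()]
```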

          A relayer MUST not pass an article to another relayer unless
          configuration information  specifies  transmission  to  that
          other  relayer  of  BOTH  (a)  at least one of the article's
          newsgroup(s), and (b) at least one of the article's  distri-
          bution(s).   In effect, the only role of distributions is to
          limit propagation, by preventing  transmission  of  articles
          that would have been transmitted had the decision been based
          solely on newsgroups.
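
               As an illustration only (not part of the Draft),
               the two-part test can be sketched in Python as
               follows.  The function name is hypothetical, and
               the feed configuration is assumed to be plain
               sets of exact names; real configurations usually
               use patterns.

```python
def may_pass(article_groups, article_dists, feed_groups, feed_dists):
    """Decide whether a relayer may pass an article to one neighbour.

    feed_groups and feed_dists are the newsgroups and distributions the
    local configuration says are transmitted to that neighbour.
    """
    group_ok = any(g in feed_groups for g in article_groups)
    dist_ok = any(d in feed_dists for d in article_dists)
    # BOTH conditions must hold; distributions can only restrict.
    return group_ok and dist_ok
```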

          A posting agent might wish to present  a  menu  of  possible
          distributions, or suggest a default, but normally SHOULD not
          supply a default without giving the poster a chance to over-
          ride  it.  A followup agent SHOULD initially supply the same
          Distribution header as found in the precursor, although  the
          poster MAY alter this if appropriate.

          Despite  the syntactic similarity and some historical confu-
          sion, distributions are  NOT  newsgroup  names.   The  whole
          point  of putting a distribution on an article is that it is
          DIFFERENT from the newsgroup(s).  In general,  a  meaningful
          distribution  corresponds to some sort of region of propaga-
          tion: a geographical area, an organization, or a cooperating
          subnet.

               NOTE:  Distributions  have  historically  suffered
               from the completely uncontrolled nature  of  their
               name  space,  the  lack  of feedback to posters on
               incomplete propagation resulting from use of  ran-
               dom  trash  in Distribution headers, and confusion
               with  newsgroups  (arising  partly  because   many
               regions  and  organizations DO have internal news-
               groups with names resembling their  internal  dis-
               tributions).  This has resulted in much garbage in
               Distribution headers, notably the pointless  prac-
               tice  of  automatically supplying the first compo-
               nent of  the  newsgroup  name  as  a  distribution
               (which is MOST unlikely to restrict propagation!).
               Many sites have opted to maximize  propagation  of
               such  ill-formed  articles by essentially ignoring
               distributions.  This unfortunately interferes with
               legitimate uses.  The situation is bad enough that
               distributions must be considered  largely  useless
               except  within  cooperating  subnets  that make an
               organized effort to restrain propagation of  their
               internal distributions.

               NOTE:  The  distributions "world" and "local" have
               no standard magic meaning (except that the  former
               is  the  default  distribution  if none is given).
               Some pieces of software do assign such meanings to
               them.


          6.8. Keywords

          The  Keywords header content is one or more phrases intended
          to describe some aspect of the content of the article:

               Keywords-content = plain-phrase *( "," [ space ] plain-phrase )

          Keywords, separated  by  commas,  each  follow  the  <plain-
          phrase>  syntax  defined  in  section 5.2.  Encoded words in
          keywords MUST not contain characters other than letters  (of
          either case), digits, and the characters "!", "*", "+", "-",
          "/", "=", and "_".

               NOTE: Posters and posting agents are asked to take
               note that keywords are separated by commas, not by
               white space.  The following Keywords  header  con-
               tains  only  one  keyword  (a  rather unlikely and
               improbable one):

                    Keywords: Thompson Ritchie Multics Linux

               and should probably have been written:

                    Keywords: Thompson, Ritchie, Multics, Linux

               This  particular  error  is  unfortunately  rather
               widespread.

               NOTE:   Reading  agents  and  archivers  preparing
               indexes of articles should bear in mind that user-
               chosen  keywords are notoriously poor for indexing
               purposes unless the keywords  are  picked  from  a
               predefined  set (which they are not in this case).
               Also, some followup agents unwisely propagate  the
               Keywords  header  from the precursor into the fol-
               lowup by default.  At least one news-based experi-
               ment has found the contents of Keywords headers to
               be completely valueless for indexing.
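
               As an illustration only (not part of the Draft),
               the comma rule for Keywords can be shown with a
               small Python sketch.  The function name is
               hypothetical, and quoted words containing commas
               are assumed not to occur.

```python
def parse_keywords(content):
    """Split Keywords content into individual keywords.

    Keywords are separated by commas, NOT by white space; white space
    inside a keyword is kept (collapsed to single blanks).
    """
    return [" ".join(k.split()) for k in content.split(",")]
```

               Applied to the two headers in the first NOTE of
               this section, the sketch yields a single
               four-word keyword for the first form and four
               separate keywords for the second.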


          6.9. Summary

          The Summary header content is a short phrase summarizing the
          article's content:

               Summary-content = nonblank-text

          As with the subject, no restriction is placed on the content
          since it is intended solely for display to humans.

               NOTE: Reading agents should be aware that the Sum-
               mary  header  is often used as a sort of secondary
               Subject header,  and  (if  present)  its  contents
               should  perhaps  be  displayed when the subject is
               displayed.

          The summary SHOULD be terse.  Posters SHOULD avoid trying to
          cram  their  entire  article into the headers; even the sim-
          plest query usually benefits from a sentence or two of elab-
          oration  and context, and not all reading agents display all
          headers.


          6.10. Approved

          The Approved header content indicates the mailing  addresses
          (and  possibly  the  full  names) of the persons or entities
          approving the article for posting:

               Approved-content = From-content *( "," [ space ] From-content )

          An Approved header is required in all postings to  moderated
          newsgroups;  the presence or absence of this header allows a
          posting agent to distinguish between articles posted by  the
          moderator  (which are normal articles to be posted normally)
          and attempted  contributions  by  others  (which  should  be
          mailed  to  the moderator for approval).  An Approved header
          is also required in certain control messages, to reduce  the
          probability  of accidental posting of same; see the relevant
          parts of section 7.

               NOTE: There is, at present, no way to authenticate
               Approved   headers  to  ensure  that  the  claimed
               approval really was bestowed.   Nor  is  there  an
               established  mechanism for even maintaining a list
               of legitimate approvers (such a list would quickly
               become  out  of date if it had to be maintained by
               hand).  Such  mechanisms,  presumably  relying  on
               cryptographic  authentication,  would  be a worth-
               while extension to this  Draft,  and  experimental
               work  in this area is encouraged.  (The problem is
               harder than it sounds because news is used on many
               systems  which do not have real-time access to key
               servers.)

               NOTE: Relayer implementors, please note  well:  it
               is the POSTING AGENT that is authorized to distin-
               guish between  moderator  postings  and  attempted
               contributions,  and to mail the latter to the mod-
               erator.  As discussed  in  section  9.1,  relayers
               MUST  not,  repeat  MUST  not,  send such mail; on
               receipt of an unApproved article  in  a  moderated
               newsgroup,  they  should  discard the article, NOT
               transform it into a mail message  (except  perhaps
               to a local administrator).

               NOTE:  RFC  1036  restricted  Approved to a single
               From-content.  However, multiple moderation is  no
               longer  rare, and multi-moderator Approved headers
               are already in use.


          6.11. Lines

          The Lines header content indicates the number  of  lines  in
          the body of the article:

               Lines-content = 1*digit

          The line count includes all body lines, including the signa-
          ture if any, including empty lines (if any) at beginning  or
          end  of  the body.  (The single empty separator line between
          the headers and the body is not  part  of  the  body.)   The
          "body"  here  is  the  body  as found in the posted article,
          AFTER all transformations such as MIME encodings.

          Reading agents SHOULD not  rely  on  the  presence  of  this
          header, since it is optional (and some posting agents do not
          supply it).  They MUST not rely on it being  precise,  since
          it frequently is not.

               NOTE: The average line length in article bodies is
               surprisingly consistent at  about  40  characters,
               and  since  the  line count typically is used only
               for approximate judgements ("is this too  long  to
               read  quickly?"),  dividing  the byte count of the
               body by 40 gives an  estimate  of  the  body  line
               count that is adequate for normal use.  This esti-
               mate is NOT adequate if the  body  has  been  MIME
               encoded...  but neither is the Lines header, since
               at least one major relayer  will  supply  a  Lines
               header for an article that lacks one, and will not
               consider the possibility of  MIME  encodings  when
               computing the line count.
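
               As an illustration only (not part of the Draft),
               the exact count and the divide-by-40 estimate
               might be sketched in Python as follows.  Both
               function names are hypothetical; the body is
               assumed to be available as bytes with "\n" line
               endings, and the estimate is rounded down.

```python
def count_body_lines(body: bytes) -> int:
    """Count all body lines, including empty ones and the signature."""
    count = body.count(b"\n")
    if body and not body.endswith(b"\n"):
        count += 1  # final line lacks a terminating newline
    return count

def estimate_body_lines(body_byte_count: int) -> int:
    """Rough line count from the body size, assuming ~40 bytes per line."""
    return body_byte_count // 40
```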

               NOTE:  It  would  be better to have a Content-Size
               header as part of MIME, so that body  parts  could
               have  their  own sizes, and so that the units used
               could be appropriate to the data type (line  count
               is  not a useful measure of the size of an encoded
               image, for example).  Doing this is preferable  to
               trying to fix Lines.

               UNRESOLVED ISSUE: Update on Content-Size?

          Relayers  SHOULD  discard this header if they find it neces-
          sary to re-encode the article in such a way that the  origi-
          nal Lines header would be rendered incorrect.


          6.12. Xref

          The Xref header content indicates where an article was filed
          by the last relayer to process it:

               Xref-content     = relayer 1*( space location )
               relayer          = relayer-name
               location         = newsgroup-name ":" article-locator
               article-locator  = 1*<ASCII printable character>

          The relayer's name is included so that software  can  deter-
          mine  which  relayer generated the header (and specifically,
          whether it really was the one  that  filed  the  copy  being
          examined).   The locations specify what newsgroups the arti-
          cle was filed under (which may  differ  from  those  in  the
          Newsgroups  header)  and where it was filed under them.  The
          exact form of an article locator is implementation-specific.

               NOTE:  Reading agents can exploit this information
               to avoid presenting the same article to  a  reader
               several   times.   The  information  is  sometimes
               available in system databases, but  having  it  in
               the article is convenient.  Relayers traditionally
               generate an Xref header only  if  the  article  is
               cross-posted, but this is not mandatory, and there
               is at  least  one  new  application  ("mirroring":
               keeping  news  databases  on  two hosts identical)
               where the header is useful in all articles.

               NOTE: The traditional form of an  article  locator
               is  a  decimal number, with articles in each news-
               group  numbered  consecutively  starting  from  1.
               NNTP  [rrr] demands that such a model be provided,
               and there may be other software which expects  it,
               but  it  seems desirable to permit flexibility for
               unorthodox implementations.

          A relayer inserting an Xref  header  into  an  article  MUST
          delete  any  previous  Xref  header.  A relayer which is not
          inserting its own Xref header  SHOULD  delete  any  previous
          Xref  header.   A  relayer  MAY  delete the Xref header when
          passing an article on to another relayer.

               NOTE: RFC 1036 specified that the Xref header  was
               not  transmitted  when  an  article  was passed to
               another relayer, but the  major  news  implementa-
               tions  have  never  obeyed this rule, and applica-
               tions like mirroring depend on this  disobedience.

          A  relayer MUST use the same name in Xref headers as it uses
          in Path headers.  Reading agents MUST ignore an Xref  header
          containing  a  relayer  name  that differs from the one that
          begins the path list.
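
               As an illustration only (not part of the Draft), a
               reading agent might parse and check an Xref header
               as in the following Python sketch.  The function
               names are hypothetical, and the path list is
               assumed to use "!" as its delimiter.

```python
def parse_xref(content):
    """Split Xref content into (relayer, [(newsgroup, locator), ...])."""
    parts = content.split()
    if len(parts) < 2:
        raise ValueError("Xref needs a relayer name and a location")
    locations = []
    for loc in parts[1:]:
        # Newsgroup names contain no ":", so split at the first one;
        # the locator itself is implementation-specific.
        group, sep, locator = loc.partition(":")
        if not (sep and group and locator):
            raise ValueError(f"bad location: {loc!r}")
        locations.append((group, locator))
    return parts[0], locations

def usable_locations(xref_content, path_content):
    """Return the Xref locations, or None if the header must be ignored."""
    relayer, locations = parse_xref(xref_content)
    first_in_path = path_content.split("!", 1)[0]
    return locations if relayer == first_in_path else None
```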


          6.13. Organization

          The Organization header content is a short phrase  identify-
          ing the poster's organization:

               Organization-content = nonblank-text

          This header is typically supplied by the posting agent.  The
          Organization content SHOULD  mention  geographical  location
          (e.g.  city  and  country)  when  it is not obvious from the
          organization's name.

               NOTE: The motive here is that the organization  is
               often difficult to guess from the mailing address,
               is not always supplied in  a  signature,  and  can
               help identify the poster to the reader.

               NOTE: There is no "s" in "Organization".

          The  Organization  content  is  provided  for identification
          only, and does not imply that  the  poster  speaks  for  the
          organization  or  that  the  article represents organization
          policy.  Posting agents SHOULD permit the poster to override
          a local default Organization header.


          6.14. Supersedes

          The  Supersedes header content specifies articles to be can-
          celled on arrival of this one:

               Supersedes-content = message-id *( space message-id )

          Supersedes is equivalent to Also-Control (section 6.15) with
          an implicit verb of "cancel" (section 7.1).

               NOTE:  Supersedes is normally used where the arti-
               cle is an updated version of the one(s) being can-
               celled.

               NOTE: Although the ability to use multiple message
               IDs in Supersedes is highly desirable (see section
               7.1), posters are warned that existing implementa-
               tions often do not correctly handle more than one.

               NOTE: There is no "c" in "Supersedes".

          An  article  with a Supersedes header MUST not have an Also-
          Control or Control header.


          6.15. Also-Control

          The Also-Control header content marks the article as being a
          control  message IN ADDITION to being a normal news article,
          and specifies the desired actions:

               Also-Control-content = Control-content

          An article with an Also-Control header is filed  and  passed
          on  normally,  but the content of the Also-Control header is
          processed as if it were found in a Control header.

               NOTE: It is sometimes desirable to piggyback  con-
               trol  actions  on  a  normal  article, so that the
               article will be filed normally but  will  also  be
               acted  on  as  a  control message.  This header is
               essentially a generalization of Supersedes.

               NOTE: Be warned that  some  old  relayers  do  not
               implement Also-Control.
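
               As an illustration only (not part of the Draft),
               the relationship among Control, Also-Control and
               Supersedes (sections 6.6, 6.14 and 6.15) might be
               sketched in Python as follows.  The function name
               and the dictionary representation of headers are
               hypothetical; header names are assumed to be
               already in canonical capitalization.

```python
def control_action(headers):
    """Return (control_content, treat_as_normal_article).

    headers maps header names to their content.  control_content is
    None when the article requests no control action.
    """
    names = ("Control", "Also-Control", "Supersedes")
    present = [name for name in names if name in headers]
    if len(present) > 1:
        # The three headers are mutually exclusive.
        raise ValueError("conflicting headers: " + ", ".join(present))
    if not present:
        return None, True
    name = present[0]
    if name == "Supersedes":
        # Equivalent to Also-Control with an implicit verb of "cancel".
        return "cancel " + headers[name], True
    # Also-Control articles are filed and passed on normally; Control
    # articles are control messages only.
    return headers[name], name == "Also-Control"
```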

          An  article with an Also-Control header MUST not have a Con-
          trol or Supersedes header.


          6.16. See-Also

          The See-Also header content lists message  IDs  of  articles
          that are related to this one but are not its precursors:

               See-Also-content = message-id *( space message-id )

          See-Also  resembles References, but without the restrictions
          imposed on References by the followup rules.

               NOTE: See-Also provides a  way  to  group  related
               articles,  such  as the parts of a single document
               that had to be split across multiple articles  due
               to  its size, or to cross-reference between paral-
               lel threads.

               NOTE: See the discussion (in section 6.5) on  MAIL
               compatibility issues of References and See-Also.

               NOTE:  In the specific case where it is desired to
               essentially make another article PART of the  cur-
               rent  one,  e.g. for annotation of the other arti-
               cle, MIME's "message/external-body" convention can
               be used to do so without actual inclusion.  "news-
               message-ID" was registered as a standard external-
               body  access method, with a mandatory NAME parame-
               ter giving the message ID  and  an  optional  SITE
               parameter  suggesting an NNTP site that might have
               the article available  (if  it  is  not  available
               locally), by IANA 22 June 1993.

               UNRESOLVED  ISSUE: Could the syntax be generalized
               to include URLs as alternatives  to  message  IDs?
               Here  it makes much more sense than in References.


          6.17. Article-Names

          The Article-Names header content indicates any special  sig-
          nificance the article may have in particular newsgroups:

               Article-Names-content  = 1*( name-clause space )
               name-clause            = newsgroup-name ":" article-name
               article-name           = letter 1*( letter / digit / "-" )

          Each  name  clause  specifies  a  newsgroup (which SHOULD be
          among those in the Newsgroups header) and  an  article  name
          local  to  that  newsgroup.   Article  names  MAY be used by
          relayers to file the article in special ways,  or  they  MAY
          just  be  noted  for  possible  special attention by reading
          agents.  Article names are case-sensitive.
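
               As an illustration only (not part of the Draft),
               Article-Names content might be parsed as in the
               following Python sketch.  The function name is
               hypothetical; the length check applies the
               14-character limit stated later in this section.

```python
import re

# article-name = letter 1*( letter / digit / "-" )
ARTICLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]+")

def parse_article_names(content):
    """Return [(newsgroup, article_name), ...] from Article-Names content."""
    clauses = []
    for clause in content.split():
        group, sep, name = clause.partition(":")
        if (not sep or not group
                or not ARTICLE_NAME.fullmatch(name) or len(name) > 14):
            raise ValueError(f"bad name clause: {clause!r}")
        # Names are case-sensitive, so they are kept exactly as written.
        clauses.append((group, name))
    if not clauses:
        raise ValueError("at least one name clause is required")
    return clauses
```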

               NOTE: This header provides a way to  mark  special
               postings, such as introductions, frequently-asked-
               question lists, etc., so that reading agents  have
               a  way  of  finding them automatically.  The news-
               group name is  specified  for  each  article  name
               because  the  names may be newsgroup-specific; for
               example, many frequently-asked-question lists  are
               posted  to  "news.answers"  in  addition  to their
               "home" newsgroup, and they would not be  known  by
               the same name(s) in both newsgroups.

          The  Article-Names header SHOULD be ignored unless the arti-
          cle also contains an Approved header.

               NOTE: This stipulation is made in anticipation  of
               the  possibility  that  Approved  headers  will be
               involved in cryptographic authentication.

          The presence of an Article-Names header does not necessarily
          imply  that  the  article  will  be  retained unusually long
          before expiration, or that previous article(s) with  similar
          Article-Names  headers  will  be  cancelled  by its arrival.
          Posters preparing special postings SHOULD include  appropri-
          ate  other  headers,  such  as  Expires  and  Supersedes, to
          request such actions.

          Different networks MAY establish different sets  of  article
          names  for the special postings they deem significant; it is
          preferable for usage to  be  standardized  within  networks,
          although  it might be desirable for individual newsgroups to
          have different naming conventions in some situations.  Arti-
          cle  names  MUST  be  14  characters or less.  The following
          names are suggested but are not mandatory:

          intro       Introduction to the newsgroup for newcomers.

          charter     Charter, rules, organization,  moderation  poli-
                      cies, etc.

          background  Biographies  of special participants, history of
                      the newsgroup, notes on related newsgroups, etc.

          subgroups   Descriptions  of sub-newsgroups under this news-
                      group, e.g. "sci.space.news" under  "sci.space".

          facts       Information relating to the purpose of the news-
                      group, e.g. an acronym glossary in  "sci.space".

          references  Where  to get more information: books, journals,
                      FTP repositories, etc.

          faq         Answers to frequently-asked questions.

          menu        If present, a list  of  all  the  other  article
                      names   local  to  this  newsgroup,  with  brief
                      descriptions of their contents.

          Such articles may be divided into subsections using the MIME
          "multipart/mixed"  conventions.  If size considerations make
          it necessary to split  such  articles,  names  ending  in  a
          hyphen  and  a  part  number  are  suggested; for example, a
          three-part frequently-asked-questions list could have  arti-
          cle names "faq-1", "faq-2", and "faq-3".

               NOTE: It is somewhat premature to attempt to stan-
               dardize article names, since this is essentially a
               new  feature  with  no experience behind it.  How-
               ever, if reading agents are to attach special sig-
               nificance to these names, some attempt at standard
               conventions  is  imperative.   This  is  a   first
               attempt at providing some.


          6.18. Article-Updates

          The  Article-Updates  header content indicates what previous
          articles this one is deemed (by the poster) to update (i.e.,
          replace):

               Article-Updates-content  = message-id *( space message-id )

          Each  message ID identifies a previous article that this one
          is deemed to update.  This MUST not cause the previous arti-
          cle(s)  to be cancelled or otherwise altered, unless this is
          implied by other headers (e.g. Supersedes);  Article-Updates
          is  merely an advisory which MAY be noted for special atten-
          tion by reading agents.

               NOTE: This header provides a way to mark  articles
               which  are  only  minor  updates of previous ones,
               containing no significant new information and  not
               worth reading if the previous ones have been read.

               NOTE: If suitable conventions using MIME multipart
               bodies  and  the "message/external-body" body-part
               type can be developed, a replacing  article  might
               contain  only differences between the old text and
               the new text, rather than  a  complete  new  copy.
               This  is  the  motivation  for not making Article-
               Updates also  function  as  Supersedes  does:  the
               replacing  article  might  depend on the continued
               presence of the replaced article.


          7. Control Messages

          The following sections document the  currently-defined  con-
          trol  messages.   "Message"  is used herein as a synonym for
          "article" unless context indicates otherwise.

          Posting agents are warned that since  certain  control  mes-
          sages require article bodies in quite specific formats, sig-
          natures SHOULD not be appended to such articles, and it  may
          be  wise to take greater care than usual to avoid unintended
          (although perhaps well-meaning) alterations to text supplied
          by  the  poster.  Relayers MUST assume that control messages
          mean what they say; they MAY be obeyed as  is  or  rejected,
          but MUST not be reinterpreted.

          The  execution  of the actions requested by control messages
          is subject to local administrative restrictions,  which  MAY
          deny   requests  or  refer  them  to  an  administrator  for
          approval.  The descriptions below are generally  phrased  in
          terms  suggesting mandatory actions, but any or all of these
          MAY be subject to local administrative approval (either as a
          class  or case-by-case).  Analogously, where the description
          below specifies that a message or portion thereof is  to  be
          ignored, this action MAY include reporting it to an adminis-
          trator.

               NOTE: The  exact  choice  of  local  action  might
               depend   on   what   action  the  control  message
               requests, who it claims to come from, etc.

          Relayers MUST propagate even control messages  they  do  not
          understand.

          In  the  following sections, each type of control message is
          defined syntactically by  defining  its  arguments  and  its
          body.   For example, "cancel" is defined by defining cancel-
          arguments and cancel-body.


          7.1. cancel

          The cancel message requests that one or more previous  arti-
          cles be "cancelled":

               cancel-arguments  = message-id *( space message-id )
               cancel-body       = body

          The  argument(s)  identify  the articles to be cancelled, by
          message ID.  The body is  a  comment,  which  software  MUST
          ignore,  and SHOULD contain an indication of why the cancel-
          lation was requested.  The cancel message SHOULD  be  posted
          to  the same newsgroup(s), with the same distribution(s), as
          the article(s) it is attempting to cancel.

               NOTE: Using the same newsgroups and  distributions
               maximizes the chances of the cancel message propa-
               gating everywhere the target articles went.

               NOTE: RFC 1036 permitted only a single  message-id
               in  a cancel message.  Support for cancelling mul-
               tiple articles is highly desirable, especially for
               use  with  Supersedes (see section 6.14).  If sev-
               eral revisions of an article appear in  fast  suc-
               cession,  each using Supersedes to cancel the pre-
               vious one, it is possible for a middle revision to
               be  destroyed  by cancellation before it is propa-
               gated onward to cancel its predecessor.   Allowing
               each   article   to  cancel  several  predecessors
               greatly alleviates this problem.  (Posting  agents
               preparing a cancel of an article which itself can-
               cels other articles might wish to add those  arti-
               cles  to  the cancel-arguments.)  However, posters
               should be aware that much old  software  does  not
               implement   multiple  cancellation  properly,  and
               should avoid using it when  reliable  cancellation
               is vitally important.

          When  an  article (the "target article") is to be cancelled,
          there are four cases of interest: the article hasn't arrived
          yet,  it  has  arrived  and  been filed and is available for
          reading, it has expired and  been  archived  on  some  less-
          accessible  storage  medium,  or  it  has  expired  and been
          deleted.  The next few paragraphs discuss each case in  turn
          (in reverse order, which is convenient for the explanation).

          EXPIRED AND DELETED.  Take no action.

          EXPIRED AND ARCHIVED.  If the article is readily  accessible
          and can be deleted or made unreadable easily, treat as under
          AVAILABLE below.   Otherwise  treat  as  under  EXPIRED  AND
          DELETED.

               NOTE:  While it is desirable for archived articles
               to be cancellable, this can easily involve rewrit-
               ing  an  entire  archive volume just to get rid of
               one article, perhaps with manual actions  required
               to arrange it.  It is difficult to envision a sit-
               uation so dire as to require  such  measures  from
               hundreds  or  thousands  of administrators, or for
               that matter one  in  which  widespread  compliance
               with such a request is likely.

          AVAILABLE.   Compare  the  mailing  addresses  from the From
          lines of the cancel message and the target article,  bearing
          in mind that local parts (except for "postmaster") are case-
          sensitive and domains are case-insensitive.  If they do  not
          match,  either  refer  the  issue  to an administrator for a
          case-by-case decision, or treat as if they matched.
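
          The comparison rule above can be sketched as follows.  This
          is a minimal illustration with hypothetical function names,
          and it assumes the bare mailing addresses have already been
          extracted from the two From lines.

```python
def split_address(addr):
    """Split 'local@domain' at the last '@'."""
    local, at, domain = addr.rpartition("@")
    if not at:
        raise ValueError("not a local@domain address: %r" % addr)
    return local, domain


def same_mailbox(a, b):
    """Compare two mailing addresses as described for the AVAILABLE case."""
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    # Domains are case-insensitive.
    if domain_a.lower() != domain_b.lower():
        return False
    # "postmaster" is the one local part that is matched without regard to case.
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    # All other local parts are case-sensitive.
    return local_a == local_b
```

          A False result here does not mean the cancel is dropped; it
          means the mismatch handling described above applies.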

               NOTE: It is generally trivial to  forge  articles,
               so  nothing  short of cryptographic authentication
               is really adequate to ensure that  a  cancel  came
               from  the original article's author.  Moreover, it
               is highly desirable to  permit  authorities  other
               than  the  author to cancel articles, to allow for
               cases in which the author is unavailable,  uncoop-
               erative,  or malicious, and in which damage and/or
               legal problems may be minimized by prompt  cancel-
               lation.  Reliable authentication that would permit
               such administrative cancels would be a  worthwhile
               extension  to this Draft, and experimental work in
               this area is encouraged.

               NOTE: Meanwhile, a simple check  of  addresses  is
               useful  accident  prevention  and catches at least
               the most simple-minded forgers.  Since the  intent
               is  accident prevention rather than ironclad secu-
               rity, use of the From address is appropriate,  all
               the  more  so  because in the presence of gateways
               (especially  redundant  multiple  gateways),   the
               author may not have full control over Sender head-
               ers.

               NOTE: The "refer... or treat as if  they  matched"
               rule  is  intended  to specifically forbid quietly
               ignoring cancels with mismatched addresses.

          If the addresses match, then if  technically  possible,  the
          relayer  MUST delete the target article completely and imme-
          diately.  Failing that, it  MUST  make  the  target  article
          unreadable  (preferably  to  everyone, minimally to everyone
          but the administrator) and  either  arrange  for  it  to  be
          deleted  as  soon  as possible or notify an administrator at
          once.

               NOTE:  To  allow  for  events  such  as   criminal
               actions,   malicious   forgeries,   and  copyright
               infringements, where damage and/or legal  problems
               may  be minimized by prompt cancellation, complete
               removal is strongly preferred over  merely  making
               the  target article unreadable.  The potential for
               malice is outweighed by the importance  of  really
               getting  rid of the target article in some legiti-
               mate cases.  (In cases  of  inadvertent  copyright
               violation  in  particular,  the ability to quickly
               remedy the  violation  is  of  considerable  legal
               importance.)   Failing  that, making it unreadable
               is better than nothing.

               NOTE: Merely annotating the article so that  read-
               ers  see  an  indication that the author wanted it
               cancelled is not acceptable.  Making  the  article
               unreadable is the minimum action.

               NOTE: There have been experiments with making can-
               celled articles unreadable,  so  that  local  news
               administrators  could  reverse  cancellations.  In
               practice, administrators almost never  find  cause
               to  do  so.  Removal appears to be clearly prefer-
               able where technically feasible.

          NOT ARRIVED YET.  If practical, retain  the  cancel  message
          until  the  target article does arrive, or until there is no
          further possibility of it arriving and being  accepted  (see
          section  9.2),  and  then treat as under AVAILABLE.  Failing
          that, arrange for the target article to be rejected and dis-
          carded if it does arrive.

               NOTE:  It  may  well  be impractical to retain the
               control message, given uncertainty  about  whether
               the  target  article  will  ever arrive.  Existing
               practice in such cases is to assume that addresses
               would  match  and  arrange the equivalent of dele-
               tion.  This is often done  by  making  a  spurious
               entry  in  a  database of already-seen message IDs
               (see section 9.3), so that  if  the  article  does
               arrive, it will be rejected as a duplicate.
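
          The spurious-entry technique described in the note above
          might look like the following toy sketch.  The History class
          and its method names are hypothetical stand-ins for a real
          relayer's database of already-seen message IDs.

```python
class History:
    """Toy stand-in for a database of already-seen message IDs."""

    def __init__(self):
        self.seen = set()

    def offer(self, message_id):
        """Return True if the article should be accepted, False if duplicate."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        return True

    def precancel(self, message_id):
        """Record a target that has not arrived, so a later arrival is rejected."""
        self.seen.add(message_id)
```

          After precancel() has been called for a message ID, a later
          offer() of that ID reports it as a duplicate.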

          The  cancel  message  MUST be propagated onward in the usual
          fashion, regardless of which of the four cases  applied,  so
          that the target article will be cancelled everywhere even if
          cancellation and target article follow different routes.

               NOTE: RFC 1036 appeared to require stopping cancel
               propagation  in the NOT ARRIVED YET case, although
               the wording was somewhat unclear.  This appears to
               have  been  an  unwise  decision;  there are known
               cases of important  cancellations  (in  situations
               of, e.g., inadvertent copyright violation) achiev-
               ing rather  poorer  propagation  than  the  target
               article.   News  propagation  is often a much less
               orderly process  than  the  authors  of  RFC  1036
               apparently   envisioned.   Modern  implementations
               generally propagate the cancellation regardless.

          Posting agents meant for  use  by  ordinary  posters  SHOULD
          reject  an  attempt  to  post a cancel message if the target
          article is available and the mailing  address  in  its  From
          header  does  not match the one in the cancel message's From
          header.

               NOTE: This, again, is primarily  accident  preven-
               tion.


          7.2. ihave, sendme

          The  ihave  and  sendme  control  messages implement a crude
          batched predecessor of the NNTP [rrr]  protocol.   They  are
          largely  obsolete  in the Internet, but still see use in the
          UUCP environment, especially for backup feeds that  normally
          are active only when a primary feed path has failed.

               NOTE:  The  ihave and sendme messages defined here
               have ABSOLUTELY NOTHING TO DO WITH  NNTP,  despite
               similarities of terminology.

          The two messages share the same syntax:

               ihave-arguments   = *( message-id space ) relayer-name
               sendme-arguments  = ihave-arguments
               ihave-body        = *( message-id eol )
               sendme-body       = ihave-body

          Message IDs MUST appear in either the arguments or the body,
          but not both.  Relayers SHOULD  generate  the  form  putting
          message  IDs  in  the  body, but the other form MUST be sup-
          ported for backward compatibility.
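
          A receiving relayer could accept both forms roughly as
          sketched below.  The function name is hypothetical, the
          angle-bracket test is only a rough stand-in for the full
          message-id syntax, and a message with no IDs at all is
          accepted here as an assumption.

```python
def parse_ihave(arguments, body):
    """Return (relayer_name, message_ids) for an ihave or sendme message.

    `arguments` is the text following the control-message name and
    `body` is the article body.
    """
    words = arguments.split()
    if not words:
        raise ValueError("relayer name is required")
    relayer = words[-1]          # the relayer name is the final argument
    arg_ids = words[:-1]         # older form: message IDs as arguments
    body_ids = [line.strip() for line in body.splitlines() if line.strip()]
    if arg_ids and body_ids:
        raise ValueError("message IDs given in both arguments and body")
    ids = arg_ids or body_ids
    for mid in ids:
        if not (mid.startswith("<") and mid.endswith(">")):
            raise ValueError("not a message ID: %r" % mid)
    return relayer, ids
```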

               NOTE: RFC 1036 made the relayer name optional, but
               difficulties could easily ensue in determining the
               origin of the message, and this option is believed
               to be unused nowadays.  Putting the message IDs in
               the body is strongly preferred over  putting  them
               in the arguments because it lends itself much bet-
               ter to large numbers of message IDs and avoids the
               empty-body problem mentioned in section 4.3.1.

          The  ihave  message  states that the named relayer has filed
          articles with the specified message IDs,  which  may  be  of
          interest to the relayer(s) receiving the ihave message.  The
          sendme message requests that the relayer receiving  it  send
          the  articles  having the specified message IDs to the named
          relayer.

          These control messages  are  normally  sent  essentially  as
          point-to-point messages, by using "to." newsgroups (see sec-
          tion 5.5) that are sent only to the relayer the messages are
          intended  for.  The two relayers MUST be neighbors, exchang-
          ing news directly with each other.  Each relayer  advertises
          its new arrivals to the other using ihave messages, and each
          uses sendme messages to request the articles it lacks.
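
          The exchange just described amounts to a set difference.  A
          minimal sketch, assuming a hypothetical build_sendme helper
          and a set holding the message IDs already filed locally:

```python
def build_sendme(ihave_ids, have, my_name):
    """Return (control_text, body) for a sendme reply, or None if nothing is needed.

    `ihave_ids` are the IDs advertised by the neighbor, `have` is the set
    of IDs already filed locally, `my_name` is this relayer's name.
    """
    wanted = [mid for mid in ihave_ids if mid not in have]
    if not wanted:
        return None
    # Preferred form: IDs in the body, one per line; relayer name as argument.
    body = "".join(mid + "\n" for mid in wanted)
    return "sendme " + my_name, body
```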

               NOTE: Arguably these point-to-point  control  mes-
               sages  should  flow  by  some other protocol, e.g.
               mail, but administrative  and  interfacing  issues
               are  simplified if the news system doesn't need to
               talk to the mail system.

          To reduce overhead, ihave and sendme messages SHOULD be sent
          relatively  infrequently and SHOULD contain substantial num-
          bers of message IDs.  If ihave and sendme are being used  to
          implement  a  backup  feed,  it may be desirable to insert a
          delay between reception of an  ihave  and  generation  of  a
          sendme,  so that a slightly slow primary feed will not cause
          large numbers of articles to be requested unnecessarily  via
          sendme.


          7.3. newgroup

          The  newgroup  control message requests that a new newsgroup
          be created:

               newgroup-arguments  = newsgroup-name [ space moderation ]
               moderation          = "moderated" / "unmoderated"
               newgroup-body       = body
                                   / [ body ] descriptor [ body ]
               descriptor          = descriptor-tag eol description-line eol
               descriptor-tag      = "For your newsgroups file:"
               description-line    = newsgroup-name space description
               description         = nonblank-text [ " (Moderated)" ]

          The first argument names the newsgroup to  be  created,  and
          the  second  one (if present) indicates whether it is moder-
          ated.  If there  is  no  second  argument,  the  default  is
          "unmoderated".

               NOTE:  Implementors are warned that there is occa-
               sional use of other forms in the second  argument.
               It  is  suggested  that  such  violations  of this
               Draft, which are  also  violations  of  RFC  1036,
               cause  the  newgroup  message  to be ignored.  RFC
               1036 was slightly vague about how second arguments
               other than "moderated" were to be treated (specif-
               ically,  whether  they  were   illegal   or   just
               ignored),  but  it  is  thought  that all existing
               major implementations  will  handle  "unmoderated"
               correctly,  and it appears desirable to tighten up
               the specs to make it possible for other  forms  to
               be used in future.

          The  body  is  a comment, which software MUST ignore, except
          that if it contains a descriptor, the  description  line  is
          intended  to be suitable for addition to a list of newsgroup
          descriptions.  The  description  cannot  be  continued  onto
          later  lines,  but  is  not  constrained  to  any particular
          length.  Moderated newsgroups  have  descriptions  that  end
          with the string " (Moderated)" (note that this string begins
          with a blank).
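
          The argument and descriptor rules above can be sketched as
          follows.  Function names are hypothetical, and splitting the
          description line on any run of white space is an assumption
          about the "space" token.

```python
DESCRIPTOR_TAG = "For your newsgroups file:"


def parse_newgroup_arguments(arguments):
    """Return (newsgroup_name, is_moderated); raise ValueError on other forms."""
    words = arguments.split()
    if len(words) == 1:
        return words[0], False   # no second argument: default is unmoderated
    if len(words) == 2 and words[1] in ("moderated", "unmoderated"):
        return words[0], words[1] == "moderated"
    raise ValueError("unrecognized newgroup arguments: %r" % arguments)


def find_description(body):
    """Return (newsgroup_name, description) from the body, or None.

    The tag must match exactly (case-sensitive) and be alone on its line;
    the description is the single line that follows it.
    """
    lines = body.splitlines()
    for i in range(len(lines) - 1):
        if lines[i] == DESCRIPTOR_TAG:
            parts = lines[i + 1].split(None, 1)
            if len(parts) == 2:
                return parts[0], parts[1]
            return None
    return None
```

          A caller can then test description.endswith(" (Moderated)")
          to cross-check the moderation status given in the arguments.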

               NOTE: It is unfortunate that the description  line
               is part of the body, rather than being supplied in
               a header, but this is established practice.  News-
               group  creators  are cautioned that the descriptor
               tag must be reproduced  exactly  as  given  above,
               alone  on  a  line,  and  is  case-sensitive.  (To
               reduce errors in this regard, posting agents might
               wish to question or reject newgroup messages which
               do not contain a descriptor.)   Given  the  desire
               for  short lines, description writers should avoid
               content-free  phrases  like  "discussion  of"  and
               "news  about",  and  stick  to  defining  what the
               newsgroup is about.

          The remainder of the body SHOULD contain an  explanation  of
          the  purpose of the newsgroup and the decision to create it.

               NOTE: Criteria for newsgroup creation vary  widely
               and  are  outside  the scope of this Draft, but if
               formal procedures of one kind or another were fol-
               lowed  in  the  decision,  the body should mention
               this.  Administrators often look for such informa-
               tion  when  deciding  whether  to comply with cre-
               ation/deletion requests.

          A newgroup message which lacks an Approved  header  MUST  be
          ignored.

               NOTE:  It would also be desirable to ignore a new-
               group message unless its Approved header  names  a
               person who is authorized (in some sense) to create
               such a newsgroup.  A cooperating subnet with  suf-
               ficiently  strong  coordination to maintain a cor-
               rect and current list of authorized creators might
               wish  to  do  so  for its internal newsgroups.  It
               also (or alternatively) might  wish  to  ignore  a
               newgroup  message  for  an internal newsgroup that
               was posted (or  cross-posted)  to  a  non-internal
               newsgroup.

               NOTE:  As  mentioned in section 6.10, some form of
               (cryptographic?) authentication of Approved  head-
               ers would be highly desirable, especially for con-
               trol messages.

          It would be desirable to provide some  way  of  supplying  a
          moderator's  address  in  a newgroup message for a moderated
          newsgroup, but this will  cause  problems  unless  effective
          authentication  is available, so it is left for future work.

               NOTE: This leaves news administrators  stuck  with
               the  annoying chore of arranging proper mailing of
               moderated-newsgroup submissions.  On Usenet,  this
               can  be  simplified  by  exploiting  a  forwarding
               facility that some major sites provide: they main-
               tain forwarding addresses, each the name of a mod-
               erated newsgroup with all periods (".", ASCII  46)
               replaced by hyphens ("-", ASCII 45), which forward
               mail to the current  newsgroup  moderators.   More
               advice  on the subject of forwarding to moderators
               can be found in the document titled "How  to  Con-
               struct  the  Mailpaths  File", posted regularly to
               the Usenet newsgroups news.lists, news.admin.misc,
               and news.answers.
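
               The forwarding-address convention in this note is a
               simple character substitution.  A minimal sketch; the
               function name and the forwarding host passed to it are
               hypothetical:

```python
def submission_address(newsgroup, forwarder):
    """Build the forwarding address for a moderated newsgroup.

    Periods in the newsgroup name become hyphens; `forwarder` is the
    domain of a site offering the forwarding service.
    """
    return newsgroup.replace(".", "-") + "@" + forwarder
```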

          A newgroup message naming a newsgroup that already exists is
          requesting a change in the moderation status or  description
          of the newsgroup.  The same rules apply.


          7.4. rmgroup

          The rmgroup message requests that a newsgroup be deleted:

               rmgroup-arguments  = newsgroup-name
               rmgroup-body       = body

          The sole argument is the newsgroup name.  The body is a com-
          ment, which software  MUST  ignore;  it  SHOULD  contain  an
          explanation of the decision to delete the newsgroup.

               NOTE:  Criteria for newsgroup deletion vary widely
               and are outside the scope of this  Draft,  but  if
               formal procedures of one kind or another were fol-
               lowed in the decision,  the  body  should  mention
               this.  Administrators often look for such informa-
               tion when deciding whether  to  comply  with  cre-
               ation/deletion requests.

          A  rmgroup  message  which  lacks an Approved header MUST be
          ignored.

               NOTE: It would  also  be  desirable  to  ignore  a
               rmgroup message unless its Approved header names a
               person who is authorized (in some sense) to delete
               such  a newsgroup.  A cooperating subnet with suf-
               ficiently strong coordination to maintain  a  cor-
               rect and current list of authorized deleters might
               wish to do so for  its  internal  newsgroups.   It
               also  (or  alternatively)  might  wish to ignore a
               rmgroup message for an internal newsgroup that was
               posted  (or  cross-posted) to a non-internal news-
               group.

          Unexpected  deletion  of  a  newsgroup  being  a  disruptive
          action,   implementations  are  strongly  advised  to  refer
          rmgroup messages to an administrator by default, unless per-
          haps the message can be determined to have originated within
          a cooperating subnet whose members are considered  trustwor-
          thy.  Abuses have occurred.


          7.5. sendsys, version, whogets

          The  sendsys  message  requests  that  a  description of the
          relayer's news feeds to other  relayers  be  mailed  to  the
          article's reply address:

               sendsys-arguments  = [ relayer-name ]
               sendsys-body       = body

          If  there  is an argument, relayers other than the one named
          by the argument MUST not respond.  The body  is  a  comment,
          which software MUST ignore; it SHOULD contain an explanation
          of the reason for the request.

          The version message requests that the name  and  version  of
          the relayer software be mailed to the reply address:

               version-arguments  =
               version-body       = body

          There  are no arguments.  The body is a comment, which soft-
          ware MUST ignore; it SHOULD contain an  explanation  of  the
          reason for the request.

          The  whogets  message  requests  that  a  description of the
          relayer and its news feeds to other relayers  be  mailed  to
          the article's reply address:

               whogets-arguments  = newsgroup-name [ space relayer-name ]
               whogets-body       = body

          The  first  argument  is the name of the "target newsgroup",
          specifying the newsgroup for which  propagation  information
          is desired.  This MUST be a complete newsgroup name, not the
          name of a hierarchy or a portion of a newsgroup name that is
          not  itself  the  name of a newsgroup.  If there is a second
          argument, only the relayer named  by  that  argument  should
          respond.  The body is a comment, which software MUST ignore;
          it SHOULD contain an  explanation  of  the  reason  for  the
          request.

               NOTE:  Whogets  is  intended  as a replacement for
               sendsys (and version) with  a  precisely-specified
               reply  format.   Since  the  syntax for specifying
               what newsgroups get sent to  what  other  relayers
               varies  widely  between different forms of relayer
               software, the only practical  way  to  standardize
               the  reply  format is to indicate a specific news-
               group and ask  where  THAT  newsgroup  propagates.
               The  requirement  that  it be a complete newsgroup
               name is intended to (largely) avoid the problem of
               having  to  answer "yes and no" in cases where not
               all newsgroups in a hierarchy are sent.
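
          As an illustration only, the whogets argument syntax  above
          can  be  handled  as in the following Python sketch.  It is
          not part of the Draft.  The function names are hypothetical,
          names  are  assumed to be runs of non-space characters, and
          the "known newsgroup" test anticipates a rule stated  later
          in this section.

```python
def parse_whogets_arguments(arguments: str):
    """Split whogets arguments into (target_newsgroup, relayer_or_None).

    Assumption: newsgroup and relayer names contain no spaces and are
    separated by exactly one space, per the syntax shown above.
    """
    parts = arguments.split(" ")
    if len(parts) not in (1, 2) or any(p == "" for p in parts):
        raise ValueError("expected: newsgroup-name [ space relayer-name ]")
    target = parts[0]
    relayer = parts[1] if len(parts) == 2 else None
    return target, relayer


def should_answer_whogets(arguments, my_relayer_name, known_newsgroups):
    """Decide whether this relayer is one that may answer the request."""
    target, relayer = parse_whogets_arguments(arguments)
    if relayer is not None and relayer != my_relayer_name:
        return False  # the request names a different relayer
    # A hierarchy name or partial name is not a complete newsgroup name,
    # so it is simply absent from the set of known newsgroups.
    return target in known_newsgroups
```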

          Any of these messages lacking an  Approved  header  MUST  be
          ignored.   Response  to  any  of  these  messages  SHOULD be
          delayed for at least 24 hours, and  no  response  should  be
          attempted  if  the  message has been cancelled in that time.
          Also, no response SHOULD be attempted unless the local  part
          of    the    destination   address   is   "newsmap".    News
          administrators SHOULD arrange for mail to "newsmap" on their
          systems  to  be  discarded (without reply) unless legitimate
          use is in progress.

               NOTE: Because these messages can cause many,  many
               relayers  to  send  mail  to one person, such mes-
               sages, specifying mailing to an innocent  person's
               mailbox, have been forged as a half-witted practi-
               cal joke.  A delay gives  administrators  time  to
               notice a fraudulent message and act (by cancelling
               the message, preparing to divert the flood of mail
               into the bit bucket, or both).  Restriction of the
               destination  address  to  "newsmap"  reduces   the
               appeal  of fraud by making it impossible to use it
               to harass a normal user.  (A site which  does  NOT
               discard  mail  to "newsmap", but rather bounces it
               back, may incur higher communications  costs  than
               if  the mail had been accepted into a user's mail-
               box... but a  malicious  forger  could  accomplish
               this  anyway, by using an address whose local part
               is very unlikely to be a legitimate mailbox name.)

               NOTE: RFC 1036 did not require the Approved header
               for these control messages.  This has  been  added
               because  of  the  possibility  that  cryptographic
               authentication of  Approved  headers  will  become
               available.
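
          The response rules above (Approved  header,  24-hour  delay,
          cancellation,  "newsmap"  local  part) might be combined as
          in the following Python sketch.  It is an illustration under
          stated  assumptions, not the Draft's algorithm: the function
          name, the dict representation of headers,  and  the  exact-
          match comparison of the local part are all hypothetical.

```python
from datetime import datetime, timedelta

MIN_DELAY = timedelta(hours=24)  # "at least 24 hours"


def may_respond(headers, reply_address, received_at, now, cancelled):
    """Return True if a sendsys/version/whogets request may be answered.

    headers       -- dict of header name -> content (assumed representation)
    reply_address -- the article's reply address, e.g. "newsmap@example.org"
    received_at   -- datetime at which the request arrived here
    now           -- current datetime
    cancelled     -- True if the request has been cancelled meanwhile
    """
    if "Approved" not in headers:
        return False  # messages lacking Approved MUST be ignored
    if cancelled:
        return False  # no response if cancelled during the delay
    if now - received_at < MIN_DELAY:
        return False  # too early; try again later
    local_part = reply_address.rsplit("@", 1)[0]
    return local_part == "newsmap"


arrived = datetime(1994, 6, 2, 12, 0)
print(may_respond({"Approved": "admin@example.org"}, "newsmap@example.org",
                  arrived, arrived + timedelta(hours=25), False))
```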

          The  body of the reply to a sendsys message SHOULD be of the
          form:

               sendsys-reply      = responder 1*sys-line
               responder          = "Responding-System:" space domain eol
               sys-line           = relayer-name ":" newsgroup-patterns [ ":" text ] eol
               newsgroup-patterns = newsgroup-name *( "," newsgroup-name )

          The first line identifies the  responding  system,  using  a
          syntax  resembling a header (but note that it is part of the
          BODY).  Remaining lines indicate what newsgroups are sent to
          what other systems.  The syntax of newsgroup patterns is not
          well standardized; the form described is common (often  with
          newsgroup  names  only  partially  given, denoting all names
          starting with a particular set of components) but  not  uni-
          versal.   The  whogets  message  provides  a  better-defined
          alternative.

          The reply to a version message is  of  somewhat  ill-defined
          form,  with  a  body normally consisting of a single line of
          text that somehow describes the version of the relayer soft-
          ware.   The whogets message provides a better-defined alter-
          native.

          The body of the reply to a whogets message MUST  be  of  the
          form:

               whogets-reply      = responder-domain responder-relayer response-date
                                    responding-to arrived-via responder-version
                                    whogets-delimiter *pass-line
               responder-domain   = "Responding-System:" space domain eol
               responder-relayer  = "Responding-Relayer:" space relayer-name eol
               response-date      = "Response-Date:" space date eol
               responding-to      = "Responding-To:" space message-id eol
               arrived-via        = "Arrived-Via:" path-list eol
               responder-version  = "Responding-Version:" space nonblank-text eol
               whogets-delimiter  = eol
               pass-line          = relayer-name [ space domain ] eol

          The  first  six lines identify the responding relayer by its
          Internet domain name  (use  of  the  ".uucp"  and  ".bitnet"
          pseudo-domains is permissible, for registered hosts in them,
          but discouraged) and its relayer name, specify the date when
          the  reply  was  generated and the message ID of the whogets
          message being replied to, give the path list (from the  Path
          header)  of  the  whogets  message (which MAY, if absolutely
          necessary, be truncated to a  convenient  length,  but  MUST
          contain at least the leading three relayer names), and indi-
          cate the version of relayer software responding.  Note  that
          these  lines  are  part of the BODY even though their format
          resembles that of  headers.   Despite  the  apparently-fixed
          order  specified by the syntax above, they can appear in any
          order, but there must be exactly one of each.

          After those preliminaries, and an empty  line  to  unambigu-
          ously  define their end, the remaining lines are the relayer
          names (which MAY be accompanied by the corresponding  domain
          names,  if  known)  of  systems  which the responding system
          passes the target newsgroup to.   Only  the  names  of  news
          relayers are to be included.

               NOTE:  It is desirable for a reply to identify its
               source  by  both  domain  name  and  relayer  name
               because news propagation is governed by the latter
               but location in a broader context is  best  deter-
               mined by the former.  The date and whogets message
               ID should, in principle, be present  in  the  MAIL
               headers,  but are included in the body for robust-
               ness in the presence of  uncooperative  mail  sys-
               tems.   The  reason for the path list is discussed
               below.  Adding version information eliminates  the
               need for a separate message to gather it.

               NOTE: The limitation of pass lines to contain only
               names of news relayers is meant to  exclude  names
               used within a single host (as identifiers for mail
               gateways,  portions  of  ihave/sendme  implementa-
               tions, etc.), which do not actually refer to other
               hosts.
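
          A reader of whogets replies has to accept the six  prelimi-
          nary  lines  in  any  order, insist on exactly one of each,
          and treat everything after the empty line  as  pass  lines.
          The  following Python sketch shows one way to do that.  The
          function name is hypothetical, EOL is assumed to be a  sin-
          gle  newline  in the string, and field contents are kept as
          uninterpreted text.

```python
REQUIRED = ("Responding-System", "Responding-Relayer", "Response-Date",
            "Responding-To", "Arrived-Via", "Responding-Version")


def parse_whogets_reply(body: str):
    """Return (fields, pass_lines) parsed from a whogets reply body."""
    lines = body.split("\n")
    try:
        end = lines.index("")  # the empty whogets-delimiter line
    except ValueError:
        raise ValueError("missing empty delimiter line")
    fields = {}
    for line in lines[:end]:
        name, sep, value = line.partition(":")
        if not sep or name not in REQUIRED:
            raise ValueError(f"unexpected line: {line!r}")
        if name in fields:
            raise ValueError(f"duplicate field: {name}")
        fields[name] = value.strip()
    missing = [n for n in REQUIRED if n not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    passes = []
    for line in lines[end + 1:]:
        if line == "":
            continue  # tolerate trailing empty lines (an assumption)
        relayer, _, domain = line.partition(" ")
        passes.append((relayer, domain or None))
    return fields, passes
```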

          A relayer which is unaware of the existence  of  the  target
          newsgroup  MUST  not  reply  to  a  whogets  message at all,
          although this MUST not influence  decisions  on  whether  to
          pass the article on to other relayers.

               NOTE:  While this may result in discontinuous maps
               in  cases  where  some  hosts  have  not   honored
               requests for creation of a newsgroup, it will also
               prevent a flood of useless responses in the  event
               that  a  whogets  message  intended to map a small
               region "leaks" out to a larger one.  The possibil-
               ity  of  discontinuous  recognition of a newsgroup
               does make it important that  the  whogets  message
               itself  continue  to  propagate (if other criteria
               permit).  This is also the reason for  the  inclu-
               sion  of  the  whogets  message's path list, or at
               least the leading portion of it, in the reply:  to
               permit  reconstruction  of  at least small gaps in
               maps.

          Different networks set different rules for the legitimacy of
          these  messages, given that they may reveal details of orga-
          nization-internal topology  that  are  sometimes  considered
          proprietary.

               NOTE:  On  Usenet,  in  particular, willingness to
               respond to these messages is held to be  a  condi-
               tion of network membership: the topology of Usenet
               is public information.  Organizations  wishing  to
               belong to such networks while keeping their inter-
               nal topology confidential might wish  to  organize
               their  internal news software so that all articles
               reaching outsiders appear  to  be  from  a  single
               "gatekeeper"  system, with the details of internal
               topology hidden behind that system.

               UNRESOLVED ISSUE: It might be useful to have a way
               to set some sort of hop limit for these.


          7.6. checkgroups

          The   checkgroups  control  message  contains  a  supposedly
          authoritative list of the valid newsgroups within some  sub-
          set of the newsgroup name space:

               checkgroups-arguments  =
               checkgroups-body       = [ invalidation ] valid-groups
                                      / invalidation
               invalidation           = "!" plain-component *( "," plain-component ) eol
               valid-groups           = 1*( description-line eol )

          There are no arguments.  The body lines (except possibly for
          an initial invalidation) each contain a description line for
          a  newsgroup, as defined under the newgroup message (section
          7.3).

               NOTE: Some other, ill-defined, forms of the check-
               groups body were formerly used.  See appendix A.

          The  checkgroups message applies to all hierarchies contain-
          ing any of the newsgroups listed in the  body.   The  check-
          groups  message asserts that the newsgroups it lists are the
          only newsgroups in those hierarchies.  If there is an inval-
          idation,  it asserts that the hierarchies it names no longer
          contain any newsgroups.
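
          The following Python sketch shows one way a checkgroups body
          could  be  read  and  compared  against a local list.  It is
          illustrative only.  It assumes that the newsgroup name is the
          first  whitespace-delimited  token  of each description line
          (the precise syntax is in section 7.3), that  a  "hierarchy"
          is identified by the first component of a newsgroup name, and
          that EOL is a newline.  All function names are hypothetical.

```python
def parse_checkgroups_body(body: str):
    """Return (invalidated_hierarchies, groups, hierarchies)."""
    lines = [line for line in body.split("\n") if line.strip()]
    invalidated = []
    if lines and lines[0].startswith("!"):
        invalidated = lines[0][1:].split(",")
        lines = lines[1:]
    groups = {}
    for line in lines:
        parts = line.split(None, 1)  # name, then the rest of the line
        groups[parts[0]] = parts[1] if len(parts) > 1 else ""
    if not invalidated and not groups:
        raise ValueError("empty checkgroups body")
    hierarchies = {name.split(".", 1)[0] for name in groups}
    return invalidated, groups, hierarchies


def compare_with_local(groups, hierarchies, local_groups):
    """Return (missing_locally, extra_locally) for the listed hierarchies.

    The result is meant for a report to the administrator; nothing is
    created or deleted here.
    """
    in_scope = {g for g in local_groups
                if g.split(".", 1)[0] in hierarchies}
    missing = sorted(set(groups) - in_scope)
    extra = sorted(in_scope - set(groups))
    return missing, extra
```

          Returning the differences rather than acting on  them  keeps
          the decision with the news administrator.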

          Processing a checkgroups message MAY cause a local  list  of
          newsgroup  descriptions to be updated.  It SHOULD also cause
          the local lists of newsgroups  (and  their  moderation  sta-
          tuses)  in  the  mentioned hierarchies to be checked against
          the message.  The results of the check MAY be used for auto-
          matic  corrective  action,  or  MAY  be reported to the news
          administrator in some way.

               NOTE:  Automatically  updating   descriptions   of
               existing  newsgroups  is  relatively safe.  In the
               case of newsgroup additions or  deletions,  simply
               notifying  the administrator is generally the wis-
               est action, unless  perhaps  the  message  can  be
               determined to have originated within a cooperating
               subnet whose members are considered trustworthy.

               NOTE: There is a problem with the checkgroups con-
               cept:  not all newsgroups in a hierarchy necessar-
               ily  propagate  to  the  same  set  of   machines.
               (Notably,  there  is  a set of newsgroups known as
               the "inet" newsgroups, which have relatively  lim-
               ited  distribution  but coexist in several hierar-
               chies with  more  widely-distributed  newsgroups.)
               The  advice  of checkgroups should always be taken
               with a grain of salt, and should never be followed
               blindly.


          8. Transmission Formats

          While  this  Draft  does  not  specify  transmission methods
          except to place a few constraints on them,  there  are  some
          data  formats  used only for transmission that are unique to
          news.


          8.1. Batches

          For efficient bulk transmission and processing of news arti-
          cles,  it is often desirable to transmit a number of them as
          a single block of data, a "batch".  The format  of  a  batch
          is:

               batch         = 1*( batch-header article )
               batch-header  = "#! rnews " article-size eol
               article-size  = 1*digit

          A batch is a sequence of articles, each prefixed by a header
          line that includes its size.  The article size is a  decimal
          count of the octets in the article, counting each EOL as one
          octet regardless of how it is actually represented.

               NOTE: A relayer might wish to accept either a sin-
               gle article or a batch as input.  Since "#" cannot
               appear in a header name, examination of the  first
               octet of the input will reveal its nature.

               NOTE:  In  the  header  line, there is exactly one
               blank before "rnews", there is exactly  one  blank
               after "rnews", and the EOL immediately follows the
               article size.  Beware that some  software  inserts
               non-standard trash after the size.

               NOTE: Despite the similarity of this format to the
               executable-script format used  by  some  operating
               systems,  it  is  EXTREMELY  unwise  to  just feed
               incoming batches to a command interpreter  in  the
               anticipation  that  it  will  run  a command named
               "rnews" to process the batch.  Unless arrangements
               are  made  to  very  tightly restrict the range of
               commands that can be executed by this  means,  the
               security implications are disastrous.
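
          As a sketch of the batch format, the following Python  code
          splits  an unencoded batch into articles by reading the data
          itself, never by handing it to a  command  interpreter.   It
          assumes  that  each EOL is stored as a single LF octet (so
          that the size in the header equals the number  of  bytes  to
          read),  and  it rejects trailing trash after the size rather
          than tolerating it.  The function names are hypothetical.

```python
import io

PREFIX = b"#! rnews "  # one blank before and one after "rnews"


def looks_like_batch(data: bytes) -> bool:
    """A '#' cannot begin a header name, so it marks a batch."""
    return data[:1] == b"#"


def split_batch(data: bytes):
    """Yield each article (as bytes) contained in an unencoded batch."""
    stream = io.BytesIO(data)
    while True:
        header = stream.readline()
        if header == b"":
            return  # end of batch
        if not header.startswith(PREFIX) or not header.endswith(b"\n"):
            raise ValueError(f"bad batch header: {header!r}")
        size_text = header[len(PREFIX):-1]
        if not size_text.isdigit():
            raise ValueError(f"bad article size: {size_text!r}")
        size = int(size_text)
        article = stream.read(size)
        if len(article) != size:
            raise ValueError("batch ends before article is complete")
        yield article
```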


          8.2. Encoded Batches

          When transmitting news, especially over communications links
          that are slow or are billed by the bit, it is  often  desir-
          able  to  batch  news  and  apply  data  compression  to the
          batches.   Transmission  links  sending  compressed  batches
          SHOULD use out-of-band means of communication to specify the
          compression algorithm being used.  If there  is  no  way  to
          send out-of-band information along with a batch, the follow-
          ing encapsulation for a compressed batch MAY be used:

               ec-batch             = "#! " compression-keyword eol compressed-batch
               compression-keyword  = "cunbatch"

          A line containing a keyword indicating the type of  compres-
          sion  is  followed  by the compressed batch.  The only truly
          widespread compression keyword  at  present  is  "cunbatch",
          indicating  compression  using  the widely-distributed "com-
          press" program.  Other compression keywords MAY be  used  by
          mutual agreement between the hosts involved.

               NOTE:  An encapsulated compressed batch is NOT, in
               general, a text file, despite  having  an  initial
               text  line.  This combination of text and non-text
               data is often  awkward  to  handle;  for  example,
               standard  decompression  programs  cannot  be used
               without first stripping off the initial line,  and
               that  in  turn is painful to do because many text-
               handling tools that are  superficially  suited  to
               the  job  do  not  cope  well  with non-text data.
               Hence the recommendation that out-of-band communi-
               cation be used instead when possible.

               NOTE: For UUCP transmission, where a batch is typ-
               ically transmitted by invoking the remote  command
               "rnews"  with  the  batch  as  its input stream, a
               plausible out-of-band method for indicating a com-
               pression  type would be to give a compression key-
               word in an option to "rnews", perhaps in the form:

                    rnews -d decompressor

               where  "decompressor"  is the name of a decompres-
               sion program (e.g. "uncompress" for a  batch  com-
               pressed  with  "compress"  or "gunzip" for a batch
               compressed with "gzip").  How  this  decompression
               program  is  located  and invoked by the receiving
               relayer is implementation-specific.

               NOTE: See the notes in section 8.1 on the inadvis-
               ability  of  feeding  batches  directly to command
               interpreters.

               NOTE: There is exactly one blank between "#!"  and
               the  compression  keyword, and the EOL immediately
               follows the keyword.
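
          Because an encapsulated compressed batch is a text line fol-
          lowed  by  non-text data, it is safest to handle it as bytes
          throughout.  The Python sketch below separates  the  keyword
          line  from  the  compressed  data  without  interpreting the
          latter.  The function name is hypothetical, EOL  is  assumed
          to  be  a single LF octet, and decompression itself is left
          to whatever program the receiving relayer uses.

```python
def split_encoded_batch(data: bytes):
    """Return (compression_keyword, compressed_data) for an ec-batch."""
    if not data.startswith(b"#! "):
        raise ValueError("not an encapsulated compressed batch")
    eol = data.find(b"\n")
    if eol < 0:
        raise ValueError("keyword line has no EOL")
    keyword = data[3:eol].decode("ascii")
    # Exactly one blank after "#!" and nothing after the keyword, so a
    # plain "#! rnews <size>" batch header is rejected here.
    if keyword == "" or " " in keyword:
        raise ValueError(f"bad compression keyword: {keyword!r}")
    return keyword, data[eol + 1:]
```

          The  payload  is returned untouched, so the initial line is
          stripped  without  passing  the  non-text  data  through  a
          text-handling tool.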

          8.3. News Within Mail

          It is often desirable to transmit news as mail,  either  for
          the  convenience of a human recipient or because that is the
          only type of transmission available on a restrictive  commu-
          nication path.

          Given  the  similarity  between the news format and the MAIL
          format, it is superficially attractive to just send the news
          article  as  a  mail  message.  This is typically a mistake:
          mail-handling software often feels free to manipulate  vari-
          ous  headers  in  undesirable  ways  (in some cases, such as
          Sender, such manipulation is actually mandatory),  and  mail
          transmission  problems etc. MUST be reported to the adminis-
          trators responsible for the mail transmission rather than to
          the  article's author.  In general, news sent as mail should
          be encapsulated to separate the mail headers  and  the  news
          headers.

          When  the intended recipient is a human, any convenient form
          of encapsulation may be used.  Recommended  practice  is  to
          use   MIME  encapsulation  with  a  content  type  of  "mes-
          sage/news", given that news articles have additional  seman-
          tics beyond what "message/rfc822" implies.

               NOTE:  "message/news" was registered as a standard
               subtype by IANA 22 June 1993.

          When mail is being used as a transmission path  between  two
          relayers,  however,  a  standard  method is desirable.  Cur-
          rently the standard method is to send the mail to an address
          whose  local part is "rnews", with whatever mail headers are
          necessary for successful  transmission.   The  news  article
          (including its headers) is sent as the body of the mail mes-
          sage, with an "N" prepended to each line.

               NOTE: The "N" reduces the probability of an  inno-
               cent line in a news article being taken as a magic
               command to mail software, and makes  it  easy  for
               receiving software to strip off any lines added by
               mail software (e.g. the trailing empty line  added
               by some UUCP mail software).
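
          The "N" convention is simple enough to show in a  few  lines
          of  Python.   This  is  a sketch, not a reference implementa-
          tion: the function names are hypothetical, EOL is assumed to
          be  a  newline,  and the decoder simply drops every line that
          does not begin with "N".

```python
def encode_for_mail(article: str) -> str:
    """Prefix every line of the article with "N"."""
    lines = article.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the article ended with an EOL
    return "".join("N" + line + "\n" for line in lines)


def decode_from_mail(body: str) -> str:
    """Recover the article, discarding lines added by mail software."""
    return "".join(line[1:] + "\n"
                   for line in body.split("\n")
                   if line.startswith("N"))
```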

          This  method  has its weaknesses.  In particular, it assumes
          that the mail  transmission  channel  can  transmit  nearly-
          arbitrary body text undamaged.  When mail is being used as a
          transmission path of last resort, however, the  mail  system
          often has inconvenient preconceived notions about the format
          of message bodies.  Various  ad-hoc  encoding  schemes  have
          been used to avoid such problems.  The recommended method is
          to send a news article or batch as the body of a  MIME  mail
          message,  using content type "application/news-transmission"
          and MIME's "base64" encoding (which is specifically designed
          to survive all known major mail systems).

               NOTE:  In  the  process, MIME conventions could be
               used to fragment and reassemble an  article  which
               is  too  large to be sent as a single mail message
               over a transmission path  that  restricts  message
               length.   In addition, the "conversions" parameter
               to the content type could be used to indicate what
               (if  any)  compression  method has been used.  And
               the Content-MD5 header [RFC 1544] can be used as a
               "checksum" to provide high confidence of detecting
               accidental damage to the contents.
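
          The recommended encapsulation can be sketched with  Python's
          standard  "email"  package,  as  below.   This is only an
          illustration: the function names and addresses are hypothet-
          ical, no compression or fragmentation is attempted, and the
          Content-MD5 value is computed over the article  octets  as
          given.

```python
import base64
import email
import email.policy
import hashlib
from email.message import EmailMessage


def wrap_for_mail(article: bytes, sender: str, recipient: str) -> EmailMessage:
    """Build a MIME message carrying an article or batch in base64."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(article, maintype="application",
                    subtype="news-transmission", cte="base64")
    digest = hashlib.md5(article).digest()
    msg["Content-MD5"] = base64.b64encode(digest).decode("ascii")
    return msg


def parse_mail(raw: bytes) -> EmailMessage:
    """Parse received mail using the modern email policy."""
    return email.message_from_bytes(raw, policy=email.policy.default)


def unwrap_from_mail(msg: EmailMessage) -> bytes:
    """Return the carried octets, checking Content-MD5 when present."""
    if msg.get_content_type() != "application/news-transmission":
        raise ValueError("not a news transmission")
    article = msg.get_content()  # bytes, with base64 already undone
    expected = msg["Content-MD5"]
    if expected is not None:
        actual = base64.b64encode(hashlib.md5(article).digest()).decode("ascii")
        if str(expected).strip() != actual:
            raise ValueError("Content-MD5 mismatch: contents damaged")
    return article
```

          A round trip through wrap_for_mail(), as_bytes(), parse_mail()
          and  unwrap_from_mail() should return the original octets
          unchanged, including any that are not ASCII.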

               UNRESOLVED ISSUE: The "conversions"  parameter  no
               longer exists.  What should be done about this, if
               anything?

               NOTE: It might look tempting to use a content type
               such  as  "message/X-netnews",  but MIME bans non-
               trivial encodings of the entire body  of  messages
               with  content  type  "message".   The intent is to
               avoid obscuring nested structure underneath encod-
               ings.   For inter-relayer news transmission, there
               is no nested structure  of  interest,  and  it  is
               important  that  the entire article (including its
               headers, not just its body) be  protected  against
               the  vagaries  of intervening mail software.  This
               situation appears to fit the MIME  description  of
               circumstances in which "application" is the proper
               content type.

               NOTE:  "application/news-transmission",   with   a
               "conversions" parameter, was registered as a stan-
               dard subtype by IANA 22 June 1993.

               UNRESOLVED ISSUE: The "conversions"  parameter  no
               longer  exists  in  MIME.  What should we do about
               this?


          8.4. Partial Batches

               UNRESOLVED ISSUE: The existing  batch  conventions
               assemble  (potentially)  many  articles  into  one
               batch.  Handling very large articles would be sub-
               stantially  less  troublesome  if there was also a
               fragmentation convention  for  splitting  a  large
               article  into  several  batches.   Is  this  worth
               defining at this time?


          9. Propagation and Processing

          Most aspects of news propagation and processing  are  imple-
          mentation-specific.   The  basic propagation algorithms, and
          certain details of how they  are  implemented,  nevertheless
          need to be standard.

          There  are  two  important principles that news implementors
          (and administrators) need to keep in mind.  The first is the
          well-known Internet Robustness Principle:

               Be liberal in what you accept, and conservative in what you send.

          However, in the case of news there is an even more important
          principle, derived from a much older code of  practice,  the
          Hippocratic  Oath  (we  will  thus call this the Hippocratic
          Principle):

               First, do no harm.

          It is VITAL to realize that decisions which might be  merely
          suboptimal  in a smaller context can become devastating mis-
          takes when amplified by the actions of  thousands  of  hosts
          within a few hours.


          9.1. Relayer General Issues

          Relayers  MUST not alter the content of articles unnecessar-
          ily.  Well-intentioned attempts  to  "improve"  headers,  in
          particular,  typically do more harm than good.  It is neces-
          sary for a relayer to prepend its own name to the Path  con-
          tent  (see section 5.6) and permissible for it to rewrite or
          delete the Xref header (see  section  6.12).   Relayers  MAY
          delete the thoroughly-obsolete headers described in appendix
          A.3, although this behavior no longer seems useful enough to
          encourage.   Other  alterations  SHOULD  be  avoided  at all
          costs, as per the Hippocratic Principle.

               NOTE: As discussed in section 2.3, tidying up  the
               headers  of  a user-prepared article is the job of
               the posting agent, not the relayer.  The relayer's
               purpose  is  to  move  already-compliant  articles
               around efficiently without  damaging  them.   Note
               that  in  existing  implementations, specific pro-
               grams may contain both posting-agent functions and
               relayer  functions.  The distinction is that post-
               ing-agent functions are invoked only  on  articles
               posted   by   local  posters,  never  on  articles
               received from other relayers.

               NOTE: A particular corollary of this rule is  that
               relayers  should not add headers unless truly nec-
               essary.  In particular, this is not SMTP;  do  not
               add Received headers.

          Relayers  MUST  not pass non-conforming articles on to other
          relayers, except perhaps in a cooperating  subnet  that  has
          agreed  to  permit certain kinds of non-conforming behavior.
          This is a direct  consequence  of  the  Internet  Robustness
          Principle.

          The  two  preceding paragraphs may appear to be in conflict.
          What  is  to  be  done  when  a  non-conforming  article  is
          received?  The Robustness Principle argues that it should be
          accepted but must not be passed on to other  relayers  while
          still non-conforming, and the Hippocratic Principle strongly
          discourages attempts at repair.  The  conclusion  that  this
          appears  to lead to is correct: a non-conforming article MAY
          be accepted for local filing and processing, or  it  MAY  be
          discarded  entirely,  but  it MUST not be passed on to other
          relayers.
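
          The resulting decision can be written down  compactly.   The
          Python sketch below is illustrative only: the names are hypo-
          thetical, the conformance test itself is outside its scope,
          and  the use of "!" to join Path entries is an assumption
          here (the Path syntax is defined in section 5.6).

```python
from enum import Enum


class Disposition(Enum):
    RELAY = "file locally and pass on"
    LOCAL_ONLY = "file locally, do not pass on"
    DISCARD = "discard entirely"


def dispose(conforming: bool, keep_nonconforming: bool) -> Disposition:
    """Choose what to do with an incoming article.

    No branch repairs the article: a non-conforming article is either
    kept for local use or dropped, and is never relayed.
    """
    if conforming:
        return Disposition.RELAY
    if keep_nonconforming:
        return Disposition.LOCAL_ONLY
    return Disposition.DISCARD


def prepend_to_path(path_content: str, my_name: str) -> str:
    """The one alteration a relayer must make: add its name to Path."""
    return my_name + "!" + path_content
```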

          A relayer MUST not respond to the arrival of an  article  by
          sending mail to any destination, other than a local adminis-
          trator, except by explicit prearrangement with  the  recipi-
          ent.   Neither  posting an article (other than certain types
          of control message, see section 7.5) nor being the moderator
          of  a  moderated  newsgroup constitutes such prearrangement.
          UNDER NO CIRCUMSTANCES WHATSOEVER may a relayer  attempt  to
          send  mail to either an article's originator or a moderator.

               NOTE: Reporting apparent errors in message  compo-
               sition  is  the  job  of  a  posting  agent, not a
               relayer.  The same is true of  mailing  moderated-
               newsgroup  postings to moderators.  In networks of
               thousands of cooperating relayers,  it  is  simply
               unacceptable  for  there  to  be  any circumstance
               whatsoever that causes any significant fraction of
               them  to simultaneously send mail to the same des-
               tination.  (Some control messages are  exceptions,
               although  perhaps  ill-advised ones.)  What might,
               in a smaller network, be a useful notification  or
               forwarding becomes a deluge of near-identical mes-
               sages that can bring mail software  to  its  knees
               and  severely  inconvenience  recipients.  Modera-
               tors, in particular,  historically  have  suffered
               grievously from this.

          Notification  of  problems  in  incoming  articles MAY go to
          local administrators, or at most  (by  prearrangement!)   to
          the administrators of the neighboring relayer(s) that passed
          on the problematic articles.

               NOTE: It would be desirable to notify  the  author
               that his posting is not propagating as he expects.
               However, there is no known method for  doing  this
               that  will  scale  up gracefully.  (In particular,
               "notify only if within N relayers of the  origina-
               tor" falls down in the presence of commercial news
               services like UUNET:  there  may  be  hundreds  or
               thousands  of  relayers within a couple of hops of
               the originator.)  The best that can be done  right
               now is to notify neighbors, in hopes that the word
               will eventually propagate up the line, or organize
               regional monitoring at major hubs.

          If it is necessary to alter an article, e.g. translate it to
          another character  set  or  alter  its  EOL  representation,
          strenuous  efforts should be made to ensure that such trans-
          formations are reversible, and that relayers or other  soft-
          ware  that might wish to reverse them know exactly how to do
          so.

               NOTE:  For  example,  a  cooperating  subnet  that
               exchanges articles using a non-ASCII character set
               like EBCDIC should define a  standard,  reversible
               ASCII-EBCDIC mapping and take pains to see that it
               is used at all points where the subnet  meets  the
               outside.   If  the only reason for using EBCDIC is
               that the readers typically employ EBCDIC devices,
               it  would  be  more  robust to employ ASCII as the
               interchange format and do  the  transformation  in
               the reading and posting agents.


          9.2. Article Acceptance And Propagation

          When  a  relayer  first  receives an article, it must decide
          whether to accept it.  (This applies regardless  of  whether
          the  article arrived by itself or as part of a batch, and in
          principle regardless of whether it  originated  as  a  local
          posting or as traffic from another relayer.)  In a
          cooperating subnet with well-controlled propagation paths, some of
          the  tests  specified  here  MAY  be delegated to centrally-
          located relayers; that is, relayers that  can  receive  news
          ONLY  via  one of the central relayers might simplify accep-
          tance testing based on the assumption that incoming  traffic
          has  already  passed  the  full  set  of  tests at a central
          relayer.

          The wording that follows is based on a model in which  arti-
          cles  arrive on a relayer's host before acceptance tests are
          done.  However, depending on the degree  of  integration  of
          the  transport  mechanisms  and  the relayer, some or all of
          these tests MAY be  done  before  the  article  is  actually
          transmitted,  so  that articles which definitely will not be
          accepted need not be transmitted at all.

          The wording that follows also specifies a  particular  order
          for  the  acceptance tests.  While this order is the obvious
          one, the tests MAY be done in any order.

          First, the relayer MUST verify that the article is  a  legal
          news  article, with all mandatory headers present with legal
          contents.

               NOTE: This check in principle is done by the first
               relayer  to see an article, so an article received
               from another relayer should always be  legal,  but
               there  is  enough  old  software still operational
               that this cannot be taken  for  granted;  see  the
               discussion of the Internet Robustness Principle in
               section 9.1.

          Second, the relayer MUST determine whether  it  has  already
          seen  this  article (identified by its message ID).  This is
          normally done by retaining a history of all article  message
          IDs seen in the last N days, where the value of N is decided
          by the relayer's administrator but SHOULD  be  at  least  7.
          Since  N cannot practically be infinite, articles whose Date
          content indicates that  they  are  older  than  N  days  are
          declared "stale" and are deemed to have been seen already.

               NOTE:  This check is important because news propa-
               gation  topology  is  typically  redundant,  often
               highly  so,  and  it  is not at all uncommon for a
               relayer to receive the same article  from  several
               neighbors.   The  history  of already-seen message
               IDs can get quite large, hence the desire to limit
               its  length... but it is important that it be long
               enough that slowly-propagating  articles  are  not
               classed  as  stale.   News  propagation within the
               Internet is normally very  rapid,  but  when  UUCP
               links  are  involved, end-to-end delays of several
               days are not rare, so a week is not a particularly
               generous minimum.

               NOTE:  Despite generally more rapid propagation in
               recent times, it is still not unheard-of for  some
               propagation  paths  to  be  very  slow.   This can
               introduce the possibility of old articles arriving
               again after they are gone from the history.  Hence
               the "stale" rule.
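
          The duplicate and "stale" tests above can be sketched in
          Python as follows.  This is only an illustration under
          stated assumptions: the History class and its method names
          are hypothetical, the history is held in memory, and the
          caller is assumed to have parsed the Date header already.

```python
from datetime import datetime, timedelta, timezone


class History:
    """Remembers message IDs seen in the last keep_days days (hypothetical)."""

    def __init__(self, keep_days=7):
        # N is chosen by the administrator; the text says at least 7.
        self.keep = timedelta(days=keep_days)
        self.seen = {}  # message ID -> time the article was first received

    def already_seen(self, message_id, article_date, now):
        """Return True if the article is a duplicate or is stale."""
        # Stale rule: older than the history window, so deemed already seen.
        if now - article_date > self.keep:
            return True
        if message_id in self.seen:
            return True
        self.seen[message_id] = now
        return False

    def expire(self, now):
        """Drop entries older than the window, bounding the history size."""
        cutoff = now - self.keep
        self.seen = {m: t for m, t in self.seen.items() if t >= cutoff}
```

          A relayer would call already_seen() for each incoming
          article and run expire() periodically; both the in-memory
          dictionary and the expiry schedule are choices made for
          this sketch only.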

          Third, the relayer MUST determine whether any of  the  arti-
          cle's newsgroups are "subscribed to" by the host, i.e. fit a
          description of what hierarchies or newsgroups the site wants
          to receive.

               NOTE:  This  check is significant because informa-
               tion  on  what  newsgroups  a  relayer  wishes  to
               receive  is often stored at its neighbors, who may
               not have up-to-date information  or  may  simplify
               the  rules for implementation reasons.  As a hedge
               against the possibility of missed or delayed  new-
               group  control  messages,  relayers  may  wish  to
               observe a notion of a newsgroup subscription  that
               is  independent of the list of newsgroups actually
               known to the relayer.  This would permit reception
               and  relaying  of  articles in newsgroups that the
               relayer is not (yet) aware  of,  subject  to  more
               general  criteria  indicating that they are likely
               to be of interest.

          Once an article has been accepted, it may be  passed  on  to
          other  relayers.  The fundamental news propagation rule is a
          flooding algorithm: on receiving and accepting  an  article,
          send  it to all neighboring relayers not already in its path
          list that are sent its newsgroup(s) and distribution(s).

               NOTE: The path list's role in loop prevention  may
               appear  relatively unimportant, given that looping
               articles would typically be rejected as duplicates
               anyway.    However,   the   path  list's  role  in
               preventing superfluous transmissions is not  triv-
               ial.   In  particular,  the  path list is the only
               thing that prevents relayer  X,  on  receiving  an
               article  from relayer Y, from sending it back to Y
               again.  (Indeed, the usual  symptom  of  confusion
               about  relayer  names  is that incoming news loops
               back in this manner.)  The looping articles  would
               be rejected as duplicates, but doubling the
               communications load on every news transmission path is
               not to be taken lightly!
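
          The flooding rule can be illustrated with a short Python
          sketch.  It is a simplification under stated assumptions:
          the function name and the shape of the neighbor table are
          hypothetical, and subscriptions are matched by exact set
          membership rather than by hierarchy patterns.

```python
def flood_targets(path, newsgroups, distributions, neighbors):
    """Return the neighbors that should be sent an accepted article.

    path          -- relayer names already in the article's path list
    newsgroups    -- the article's newsgroups
    distributions -- the article's distributions (may be empty)
    neighbors     -- hypothetical table: name -> {"groups": set,
                     "distributions": set}
    """
    targets = []
    for name, feed in neighbors.items():
        if name in path:
            continue  # the path list shows this neighbor already has it
        if not set(newsgroups) & feed["groups"]:
            continue  # neighbor is sent none of the article's newsgroups
        if distributions and not set(distributions) & feed["distributions"]:
            continue  # neighbor is sent none of the article's distributions
        targets.append(name)
    return targets
```

          Note that the only per-article state consulted is the path
          list; nothing here tries to anticipate which neighbors
          might receive the article by another route.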

          In  general,  relayers SHOULD not make propagation decisions
          by "anticipation": relayer X, noting that the article's path
          list  already  contains relayer Y, decides not to send it to
          relayer Z because X anticipates that Z will get the  article
          by  a  better  path.  If that is generally true, then why is
          there a news feed from X to Z at all?  In fact, the  "better
          path"  may  be running slowly or may be down.  News propaga-
          tion is very robust precisely because some redundant  trans-
          mission  is  done  "just  in  case".  If it is imperative to
          limit unnecessary traffic on a path, use of  NNTP  [rrr]  or
          ihave/sendme  (see  section  7.2) to pass articles only when
          necessary is better than arbitrary  decisions  not  to  pass
          articles at all.

          Anticipation  is  occasionally  justified  in special cases.
          Such cases should involve  both  (1)  a  cooperating  subnet
          whose   propagation  paths  are  well-understood  and  well-
          monitored, with failures and  slowdowns  noticed  and  dealt
          with  promptly, and (2) a persistent pattern of heavy unnec-
          essary traffic on a path that is either slow or costly.   In
          addition,  there  should be some reason why neither NNTP nor
          ihave/sendme is suitable as a solution to the problem.


          9.3. Administrator Contact

          It is desirable to have a standardized contact address for a
          relayer's  administrators, in the spirit of the "postmaster"
          address for mail administrators.  Mail addressed  to  "news-
          master"  on a relayer's host MUST go to the administrator(s)
          of  that  relayer.   Mail  addressed  to  "usenet"  on   the
          relayer's  host  SHOULD be handled likewise.  Mail addressed
          to either  address  on  other  hosts  using  the  same  news
          database SHOULD be handled likewise.

               NOTE: These addresses are case-sensitive, although
               it would be desirable for sequences equivalent  to
               them using case-insensitive comparison to be handled
               likewise.  While "newsmaster" seems the preferred
               network-independent address, by analogy to
               "postmaster", there is  an  existing  practice  of
               using  "usenet"  for this purpose, and so "usenet"
               should be supported if at all possible (especially
               on  hosts  belonging  to  Usenet!).   The  address
               "news" is also sometimes used  for  purposes  like
               this, but less consistently.


          10. Gatewaying

          Gatewaying of traffic between news networks using this Draft
          and those using other exchange mechanisms can be useful, but
          must  be done cautiously.  Gateway administrators are taking
          on significant responsibilities, and must recognize that the
          consequences of error can be quite serious.


          10.1. General Gatewaying Issues

          This section will primarily address the problems of gateway-
          ing traffic INTO news networks.  Little can  be  said  about
          the  other  direction without some specific knowledge of the
          network(s)  involved.   However,  the  two  issues  are  not
          entirely  independent:  if  a  non-news network is gatewayed
          into a news network at more than one point, traffic injected
          into  the  non-news  network  by  one  gateway may appear at
          another as a candidate for injection back into the news net-
          work.

          This raises a more general principle, the single most
          important issue for gatewaying:

               Above all, prevent loops.

          The normal loop prevention of news transmission  is  vitally
          dependent on the Message-ID header.  Any gateway which finds
          it necessary to remove this header, alter it,  or  supersede
          it (by moving it into the body), MUST take equally effective
          precautions against looping.

               NOTE: There are few things more effective at turn-
               ing  news readers into a lynch mob than a malfunc-
               tioning gateway, or pair of gateways,  that  takes
               in news articles, mangles them just enough to pre-
               vent news relayers from recognizing them as dupli-
               cates,  and  regurgitates  them back into the news
               stream.  This happens rather too often.

          Gateway implementors should realize that gateways  have  all
          the  responsibilities  of relayers, plus the added complica-
          tions introduced by transformations between different infor-
          mation  formats.   Much of section 9's discussion of relayer
          issues is relevant to  gateways  as  well.   In  particular,
          gateways SHOULD keep a history of recently-seen articles, as
          described in section 9.2, and not assume that articles  will
          never reappear.  This is particularly important for networks
          that have their own concept  analogous  to  message  IDs:  a
          gateway  should  keep  a  history  of traffic seen from BOTH
          directions.

          If at all possible, articles entering the  non-news  network
          SHOULD  be  marked  in some way so that they will NOT be re-
          gatewayed back into news.  Multiple gateways obviously  must
          agree  on  the  marking method used; if it is done by having
          them know each other's names, name changes MUST be
          coordinated with great care.  If marking cannot be done, all
          transformations MUST be reversible so  that  a  re-gatewayed
          article  is  identical to the original (except perhaps for a
          longer Path header).

          Gateways MUST not pass control messages (articles containing
          Control, Also-Control, or Supersedes headers) without remov-
          ing the headers that  make  them  control  messages,  unless
          there  are compelling reasons to believe that they are rele-
          vant to both sides and that conventions are compatible.   If
          it  is truly desirable to pass them unaltered, suitable pre-
          cautions MUST be taken to ensure that there is NO  POSSIBIL-
          ITY of a looping control message.

               NOTE:  The damage done by looping articles is mul-
               tiplied a thousandfold  if  one  of  the  affected
               articles  is something like a sendsys message (see
               section  7.3)  that  requests  multiple  automatic
               replies.   Most  gateways  simply  should not pass
               control messages at all.  If some  unusual  reason
               dictates doing so, gateway implementors and admin-
               istrators are urged to consider bulletproof  rate-
               limiting  measures  for  the more destructive ones
               like sendsys, e.g. passing only one  per  hour  no
               matter how many are offered.
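
          A minimal Python sketch of the header removal described
          above follows.  The function names and the representation
          of headers as (name, value) pairs are assumptions made for
          illustration; a gateway may equally well decline to pass
          such articles at all.

```python
# Headers that make an article a control message, per the text above.
CONTROL_HEADERS = {"control", "also-control", "supersedes"}


def is_control_message(headers):
    """True if any header would cause the article to act as a control message."""
    return any(name.lower() in CONTROL_HEADERS for name, _ in headers)


def strip_control_headers(headers):
    """Return a copy of the (name, value) list without control headers."""
    return [(name, value) for name, value in headers
            if name.lower() not in CONTROL_HEADERS]
```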

          Gateways,  like  relayers, SHOULD make determined efforts to
          avoid mangling articles unnecessarily.  In the case of gate-
          ways,  some  transformations  may be inevitable, but keeping
          them to a minimum and ensuring that they are  reversible  is
          still highly desirable.

          Gateways  MUST avoid destroying information.  In particular,
          the restrictions of section 4.2.2  are  best  taken  with  a
          grain  of salt in the context of gateways.  Information that
          does not translate directly  into  news  headers  SHOULD  be
          retained, perhaps in "X-" headers, both because it may be of
          interest to sophisticated readers and because it may be cru-
          cial to tracing propagation problems.

          Gateway implementors should take particular note of the dis-
          cussion of mailed replies, or  more  precisely  the  ban  on
          same,  in section 9.1.  Gateway problems MUST be reported to
          the local administration, not to the innocent originator  of
          traffic.   "Gateway  problems"  here  includes  all forms of
          propagation anomaly on the non-news  side  of  the  gateway,
          e.g.  unreachable  addresses  on  a mailing list.  Note that
          this  requires  consideration  of  possible  misbehavior  of
          "downstream" hosts, not just the gateway host.


          10.2. Header Synthesis

          News  articles prepared by gateways MUST be legal news arti-
          cles.  In particular, they MUST include all of the mandatory
          headers  (see  section  5)  and  MUST  fully  conform to the
          restrictions on said headers.  This often  requires  that  a
          gateway function not only as a relayer, but also partly as a
          posting agent, aiding in the synthesis of a conforming arti-
          cle from non-conforming input.

               NOTE:  The full-conformance requirement needs
               particularly careful attention when gatewaying mailing
               lists to news, because a number of constructs
               that are legal in MAIL headers are NOT permissible
               in  news  headers.   (Note  also that not all mail
               traffic fully conforms to even the MAIL specifica-
               tion.)   The  rest of this section will be phrased
               in terms of mail-to-news gatewaying, but  most  of
               it is more generally applicable.

          The mandatory headers generally present few problems.

          If no date information is available, the gateway should sup-
          ply a Date header with the gateway's current date.  If  only
          partial  information  is available (e.g. date but not time),
          this should be fleshed out to a full Date header  by  adding
          default values, not by mixing in parts of the gateway's cur-
          rent date.  (Defaults should be chosen so  that  fleshed-out
          dates  will  not  be in the future!)  It may be necessary to
          map timezone information to the restricted  forms  permitted
          in the news Date header.  See section 5.1.

               NOTE:  The  prohibition  of mixing dates is on the
               theory that it is better to admit  ignorance  than
               to lie.
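
          The Date rule can be sketched in Python as follows.  The
          function name, the dictionary of parsed date parts, the
          choice of 00:00:00 GMT as the default time, and the output
          layout are all assumptions of this sketch; section 5.1
          governs the actual syntax.

```python
from datetime import datetime, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def date_header(parts, now):
    """Build Date header contents for a gatewayed message.

    parts -- None if the message carried no date information, else a dict
             with "year", "month", "day" and optionally "hour", "minute",
             "second" (hypothetical parsed form)
    now   -- the gateway's current time as a timezone-aware UTC datetime
    """
    if not parts:
        d = now  # no information at all: use the gateway's current date
    else:
        # Partial information: fill gaps with fixed defaults (assumed to be
        # midnight GMT here), never with pieces of the gateway's clock.
        d = datetime(parts["year"], parts["month"], parts["day"],
                     parts.get("hour", 0), parts.get("minute", 0),
                     parts.get("second", 0), tzinfo=timezone.utc)
    return "%d %s %d %02d:%02d:%02d GMT" % (
        d.day, MONTHS[d.month - 1], d.year, d.hour, d.minute, d.second)
```

          Defaulting to the start of the day keeps a fleshed-out date
          from being later than the real one in most cases, although
          a sender well east of GMT can still defeat this simple
          default.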

          If  the author's address as supplied in the original message
          is not suitable for inclusion in a From header, the  gateway
          MUST  transform it so it is, e.g. by use of the "% hack" and
          the domain address of the gateway.  The desire  to  preserve
          information  is  NOT  an excuse for violating the rules.  If
          the transformation is drastic enough that there is reason to
          suspect  loss of information, it may be desirable to include
          the original form in an X- header,  but  the  From  header's
          contents MUST be as specified in section 5.2.

          If  the  message  contains a Message-ID header, the contents
          should be dealt with as discussed in section 10.3.  If there
          is no message ID present, it will be necessary to synthesize
          one, following the news rules (see section 5.3).

          Every effort should be made to produce a meaningful  Subject
          header;  see section 5.4.  Many news readers select articles
          to read based on Subject headers,  and  inserting  a  place-
          holder  like  "<no  subject available>" is considered highly
          objectionable.  Even synthesizing a Subject header by  pick-
          ing  out  the  first  half-dozen nouns and adjectives in the
          article body is better than using a  placeholder,  since  it
          offers SOME indication of what the article might contain.

          The contents of the Newsgroups header (section 5.5) are usu-
          ally predetermined by gateway configuration, but  a  gateway
          to  a network that has its own concept of newsgroups or dis-
          cussions might have to make transformations.  Such transfor-
          mations  should be reversible; otherwise confusion is likely
          on both sides.

          It will rarely be possible for gateways to  provide  a  Path
          header  that is both an accurate history of the relayers the
          article has passed  through  AS  NEWS  and  a  usable  reply
          address.   The  history function MUST be given priority; see
          the discussion in section 5.6.  It will usually be necessary
          for  a  gateway to supply an empty path list, abandoning the
          reply function.

          It is desirable for gatewayed articles  to  convey  as  much
          useful information as possible, e.g. by use of optional news
          headers (see section 6) when  the  relevant  information  is
          available.  Synthesis of optional headers can generally fol-
          low similar rules.

          Software synthesizing References  headers  should  note  the
          discussion  in  section  6.5  concerning the incompatibility
          between MAIL and news.  Also of interest is the  possibility
          of  incorporating  information  from In-Reply-To headers and
          from attribution lines in the body; an incomplete  or  some-
          what  conjectural References header is much better than none
          at all, and reading agents already have to cope with  incom-
          plete or slightly erroneous References lists.

          10.3. Message ID Mapping

          This  section, like the previous one, is phrased in terms of
          mail being gatewayed into news, but most of  the  discussion
          should be more generally applicable.

          A  particularly  sticky problem of gatewaying mail into news
          is supplying legal news message IDs.  Note,  in  particular,
          that  not  all  MAIL message IDs are legal in news; the news
          syntax (specified in section 5.3, with related  material  in
          5.2)  is  more  restrictive.   Generating a fully-conforming
          news article from a mail message  may  require  transforming
          the message ID somewhat.

          Generation and transformation of message IDs assumes partic-
          ular importance if a given mailing  list  (or  whatever)  is
          being handled by more than one gateway.  It is highly desir-
          able that the same article contents not appear twice in  the
          same  newsgroup,  which  requires that they receive the same
          message ID from all gateways.  Gateways SHOULD use the  fol-
          lowing  algorithm (possibly modified by the later discussion
          of gatewaying into more than  one  newsgroup)  unless  local
          considerations dictate another:

               1.   Separate message ID from surroundings, if necessary.
                    A plausible method for this is to start at the first
                    "<",  end at the next ">", and reject the message if
                    no ">" is found or a second "<" is seen  before  the
                    ">".  Also reject the message if the message ID
                    contains no "@" or more than one "@", or if it
                    contains no ".".  Also reject the message if the
                    message ID contains non-ASCII characters, ASCII
                    control characters, or white space.

                    NOTE:  Any  legitimate domain will include at
                    least one ".".  RFC 822 section 6.2.2 forbids
                    white space in this context when passing mail
                    on to non-MAIL software.

               2.   Delete the leading "<" and trailing ">".  Separate
                    message  ID into local part and domain at the "@".

               3.   In both  components,  transliterate  leading  dots
                    (".", ASCII 46), trailing dots, and dots after the
                    first in sequences  of  two  or  more  consecutive
                    dots, into underscores (ASCII 95).

               4.   In both components, transliterate disallowed char-
                    acters other than  dots  (see  the  definition  of
                    <unquoted-char>  in  section  5.2)  to underscores
                    (ASCII 95).

               5.   Form the message ID as

                         "<" local-part "@" domain ">"

               NOTE: This algorithm is approximately that of Rich
               Salz's successful gatewaying package.
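
          The five steps above can be sketched in Python as follows.
          This is an illustration, not the gatewaying package
          mentioned in the NOTE.  The function names are
          hypothetical, and the ALLOWED set is only a stand-in for
          <unquoted-char>, whose definition in section 5.2 is not
          reproduced here.

```python
import re
import string

# ASSUMPTION: placeholder for <unquoted-char> (section 5.2); substitute the
# real character set before relying on this.
ALLOWED = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def _fix_dots(part):
    """Step 3: leading dots, trailing dots, and repeated dots -> underscores."""
    core = part.strip(".")
    lead = len(part) - len(part.lstrip("."))
    trail = len(part) - lead - len(core)
    # Inside the component, keep the first dot of each run, replace the rest.
    core = re.sub(r"\.(\.+)", lambda m: "." + "_" * len(m.group(1)), core)
    return "_" * lead + core + "_" * trail


def _clean(part):
    """Steps 3 and 4 applied to one component."""
    part = _fix_dots(part)
    return "".join(c if c == "." or c in ALLOWED else "_" for c in part)


def map_message_id(text):
    """Return a news message ID, or None if the message should be rejected."""
    # Step 1: isolate the ID and apply the rejection tests.
    start = text.find("<")
    if start < 0:
        return None
    end = text.find(">", start)
    if end < 0 or "<" in text[start + 1:end]:
        return None
    body = text[start + 1:end]
    if body.count("@") != 1 or "." not in body:
        return None
    if any(ord(c) < 33 or ord(c) > 126 for c in body):
        return None  # control characters, white space, or non-ASCII
    # Step 2: split at the "@".
    local, domain = body.split("@")
    # Step 5: reassemble.
    return "<" + _clean(local) + "@" + _clean(domain) + ">"
```

          Because the mapping is deterministic, two gateways running
          the same code on the same mail message ID produce the same
          news message ID, which is the property the text asks for.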

          Despite  the  desire  to  keep message IDs consistent across
          multiple gateways, there is also a more  subtle  issue  that
          can  require a different approach.  If the same articles are
          being gatewayed into more than one newsgroup, and it is  not
          possible to arrange that all gateways gateway them to the
          same cross-posted set of newsgroups, then the message IDs in
          the different newsgroups MUST be DIFFERENT.

               NOTE:  Otherwise,  arrival  of  an  article in one
               newsgroup  will  prevent  it  from  appearing   in
               another,  and which newsgroup a particular article
               appears in will be an accident of which  direction
               it  arrives  from  first.  It is very difficult to
               maintain a coherent discussion when each  partici-
               pant  sees a randomly-selected 50% of the traffic.
               The fundamental problem here  is  that  the  basic
               assumption  behind  message IDs is being violated:
               the gateways are assigning the same message ID  to
               articles  that  differ  in  an  important  respect
               (Newsgroups header).

          In such cases, it is suggested that the newsgroup  name,  or
          an agreed-on abbreviation thereof, be prepended to the local
          part of the message ID (with a separating ".") by the  gate-
          way.   This  will ensure that multiple gateways generate the
          same message ID, while also ensuring  that  different  news-
          groups can be read independently.
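
          As a brief illustration, the suggested prefixing might look
          like this in Python.  The function name is hypothetical,
          and the input is assumed to be an already-legal news
          message ID of the form "<" local-part "@" domain ">".

```python
def prefix_message_id(message_id, newsgroup):
    """Prepend the newsgroup name and a "." to the local part of a message ID."""
    local, domain = message_id[1:-1].split("@", 1)
    return "<" + newsgroup + "." + local + "@" + domain + ">"
```

          For example, the ID <123@host.example> gatewayed into
          comp.misc would become <comp.misc.123@host.example>, and
          every gateway feeding that newsgroup would derive the same
          result.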

               NOTE:  It  is  preferable  to  have the gateway(s)
               cross-post the article, avoiding the  issue  alto-
               gether,  but  this may not be feasible, especially
               if one newsgroup is widespread and  the  other  is
               purely local.


          10.4. Mail to and from News

          Gatewaying mail to news, and vice-versa, is the most obvious
          form of news gatewaying.  It is common to  set  up  gateways
          between news and mail rather too casually.

          It  is hard to go very wrong in gatewaying news into a mail-
          ing list, except for the non-trivial matter of  making  sure
          that  error  reports  go  to the local administration rather
          than to the authors of news articles.  (This requires atten-
          tion  to  the  "envelope  address" as well as to the message
          headers.)  Doing the reverse connection  correctly  is  much
          harder than it looks.

               NOTE: In particular, just feeding the mail message
               to "inews -h" or the  equivalent  is  NOT,  repeat
               NOT,  adequate  to gateway mail to news.  Signifi-
               cant gatewaying software is  necessary  to  do  it
               right.   Not  all headers of mail messages conform
               to even the MAIL specifications,  never  mind  the
               stricter rules for news.

          It  is  useful to distinguish between two different forms of
          mail-to-news gatewaying: gatewaying a mailing  list  into  a
          newsgroup,  and  operating a "post-by-mail" service in which
          individual articles can be posted to a newsgroup by  mailing
          them  to a specific address.  In the first case, the message
          is already being  "broadcast",  and  the  situation  can  be
          viewed  as  gatewaying  one  form of news into another.  The
          second case is closer to that of a moderator posting submis-
          sions to a moderated newsgroup.

          In either case, the discussions in the preceding two
          sections are relevant, as is the Hippocratic Principle of
          section 9.  However, some additional considerations are
          specific to mail-to-news gatewaying.

          As mentioned in section 6, point-to-point  headers  like  To
          and  Cc  SHOULD  not  appear as such in news, although it is
          suggested that they be transformed to "X-" headers, e.g.  X-
          To  and X-Cc, to preserve their information content for pos-
          sible use  by  readers  or  troubleshooters.   The  Received
          header  is  entirely  specific to MAIL and SHOULD be deleted
          completely  during  gatewaying,  except  perhaps   for   the
          Received header supplied by the gateway host itself.
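
          The treatment of To, Cc, and Received described above can
          be sketched in Python.  The function name and the (name,
          value) representation are assumptions, and for simplicity
          this sketch drops every Received header, including the
          gateway host's own.

```python
POINT_TO_POINT = {"to", "cc"}


def mail_headers_to_news(headers):
    """Rewrite mail-specific headers for news; headers is a list of pairs."""
    result = []
    for name, value in headers:
        lowered = name.lower()
        if lowered in POINT_TO_POINT:
            # Preserve the information content under an "X-" name.
            result.append(("X-" + name, value))
        elif lowered == "received":
            continue  # entirely specific to MAIL
        else:
            result.append((name, value))
    return result
```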

          The  Sender  header is a tricky case, one where mailing-list
          and post-by-mail practice  should  differ.   For  gatewaying
          mailing  lists, the mailing-list host should be considered a
          relayer, and the From and Sender  headers  supplied  in  its
          transmissions left strictly untouched.  For post-by-mail, as
          for a moderator posting  a  mailed  submission,  the  Sender
          header should reflect the poster rather than the author.  If
          a post-by-mail gateway  receives  a  message  with  its  own
          Sender  header,  it might wish to preserve the content in an
          X-Sender header.

          It will generally be necessary to transform  between  mail's
          In-Reply-To/References convention and news's References/See-
          Also convention, to preserve correct semantics of cross ref-
          erences.   This also requires attention when going the other
          way, from news to mail.  See the discussion of  the  differ-
          ence in section 6.5.


          10.5. Gateway Administration

          Any  news  system will benefit from an attentive administra-
          tor, preferably assisted by automated monitoring for  anoma-
          lies.  This is particularly true of gateways.  Gateway soft-
          ware SHOULD be instrumented  so  that  unusual  occurrences,
          such  as  sudden  massive  surges  in  traffic, are reported
          promptly.  It is desirable, in fact, to go further:  gateway
          software  SHOULD endeavour to limit damage in the event that
          the administrator does not respond promptly.

               NOTE: For example, software might limit the
               gatewaying rate by queueing incoming traffic and
               emptying the queue at a finite maximum rate (well
               below  the  maximum  that the host is capable of!)
               which is set  by  the  administrator  and  is  not
               raised automatically.
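
          A minimal Python sketch of the queueing approach in the
          NOTE above.  The class name RateLimitedQueue, its method
          names, and the injectable clock are illustrative
          assumptions, not part of this Draft.

```python
import collections
import time


class RateLimitedQueue:
    """Hold gatewayed articles and release them at a capped rate.

    max_per_second is a fixed ceiling chosen by the administrator;
    nothing in this class ever raises it automatically.
    """

    def __init__(self, max_per_second, clock=time.monotonic):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._pending = collections.deque()
        self._next_release = clock()

    def enqueue(self, article):
        """Accept an incoming article without forwarding it yet."""
        self._pending.append(article)

    def __len__(self):
        return len(self._pending)

    def next_article(self):
        """Return one article if the rate limit allows it, else None."""
        now = self._clock()
        if self._pending and now >= self._next_release:
            # Enforce a minimum gap between releases, so a backlog
            # can never be flushed in a single burst.
            self._next_release = now + self._interval
            return self._pending.popleft()
        return None
```

          A gateway loop would call next_article() periodically and
          forward whatever it returns.  A growing len() of the queue
          is itself one of the anomalies worth reporting to the
          administrator.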

          Traffic gatewayed into a news network SHOULD include a suit-
          able  header,  perhaps  X-Gateway-Administrator,  giving  an
          electronic  address  that  can  be  used to report problems.
          This SHOULD be an address that goes direct to a  human,  not
          to  a  "routine administrative issues" mailbox that is exam-
          ined only occasionally, since the point is  to  be  able  to
          reach  the  administrator  quickly in an emergency.  Gateway
          administrators SHOULD arrange substitutes to  cover  gateway
          operation  (with suitable redirection of mail) when they are
          on vacation etc.


          11. Security And Related Issues

          Although the interchange format itself raises no significant
          security issues, the wider context does.


          11.1. Leakage

          The  most  obvious  form  of  security  problem with news is
          "leakage" of  articles  which  are  intended  to  have  only
          restricted circulation.  The flooding algorithm is EXTREMELY
          good at finding any path by which articles can leave a  sub-
          net  with  supposedly-restrictive  boundaries.   Substantial
          administrative effort is required to ensure that local news-
          groups remain local, unless connections to the outside world
          are tightly restricted.

          A related problem is that the sendme control message can  be
          used  to ask for any article by its message ID.  The useful-
          ness of this has declined  as  message-ID  generation  algo-
          rithms have become less predictable, but it remains a poten-
          tial problem for "secure" newsgroups.  Hosts with such news-
          groups  may wish  to  disable  the  sendme  control message
          entirely.

          The sendsys, version,  and  whogets  control  messages  also
          allow  "outsiders"  to  request  information  from "inside",
          which may reveal details of internal topology  (etc.)   that
          are  considered  confidential.   (Note that at least limited
          openness about such matters may be a condition of membership
          in such networks, e.g. Usenet.)

          Organizations  wishing to control these forms of leakage are
          strongly advised to designate a small  number  of  "official
          gateway"  hosts to handle all news exchange with the outside
          world, so that a bounded amount of administrative effort  is
          needed   to  control  propagation  and  eliminate  problems.
          Attempts to keep news out entirely, by refusing  to  support
          an  official  gateway,  typically result in large numbers of
          unofficial partial gateways appearing  over  time.   Such  a
          configuration is much more difficult to troubleshoot.

          A somewhat-related problem is the possibility of proprietary
          material being disclosed unintentionally  by  a  poster  who
          does  not  realize  how far his words will propagate, either
          from sheer misunderstanding or because of  errors  made  (by
          human or software) in followup preparation.  There is little
          that can be done about this except education.


          11.2. Attacks

          Although the limitations of the medium restrict what can  be
          done  to  attack  a host via news, some possibilities exist,
          most of them problems news shares with mail.

          If reading  agents  are  careless  about  transmitting  non-
          printable  characters  to  output devices, malicious posters
          may post articles  containing  control  sequences  ("letter-
          bombs")  meant to have various destructive effects on output
          devices.  Possible effects depend on the  device,  but  they
          can  include  hardware  damage  (e.g. by repeated writing of
          values into configuration memories that can tolerate only  a
          limited number of write cycles) and security violation (e.g.
          by reprogramming function keys potentially  used  by  privi-
          leged readers).
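
          A reading agent can defuse such sequences by never passing
          raw control characters to the output device.  The following
          Python sketch shows one way to do so; the function name and
          the choice of a visible \xNN escape are assumptions made
          for illustration.

```python
def make_safe_for_terminal(text):
    """Replace control characters with a visible \\xNN escape.

    Tab and newline pass through unchanged.  ESC, other C0 controls,
    DEL, and the 8-bit C1 controls are all rendered harmless.
    """
    safe = []
    for ch in text:
        code = ord(ch)
        if ch in "\t\n":
            safe.append(ch)
        elif code < 0x20 or code == 0x7F or 0x80 <= code <= 0x9F:
            safe.append("\\x%02x" % code)
        else:
            safe.append(ch)
    return "".join(safe)
```

          The same filtering is worth applying to header content
          that is displayed, not only to article bodies.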

          A  more  sophisticated variation on the letterbomb is inclu-
          sion of "Trojan horses"  in  programs.   Obviously,  readers
          must  be  cautious  about  using software found in news, but
          more subtly, reading agents must also exercise  care.   MIME
          messages  can  include  material  that is executable in some
          sense, such as PostScript documents (which  are  programs!),
          and letterbombs may be introduced into such material.

          Given  the  presence  of finite resources and other software
          limitations,  some  degree  of  system  disruption  can   be
          achieved  by  posting  otherwise-innocent  material in great
          volume, either in single huge articles (see section 4.6)  or
          in  a stream of modest-sized articles.  (Some would say that
          the steady growth of Usenet volume constitutes a subtle  and
          unintentional  attack  of  the latter type; certainly it can
          have disruptive effects if administrators are  inattentive.)
          Systems need some ability to cope with surges, because
          single huge articles occur occasionally as the result of
          software error, innocent misunderstanding, or deliberate
          malice, and downtime at upstream hosts can cause droughts,
          followed by floods, of legitimate articles.  (There is also
          a certain amount of normal variation; for example, Usenet
          traffic is noticeably lighter on weekends and during Christmas
          holidays, and rises noticeably at the start of the school term
          of North American universities.)  However, a site that
          normally receives little traffic may be quite vulnerable  to
          "swamping" attack if its software is insufficiently careful.

          In general, careless implementation may open doors that  are
          not  intrinsic  to  news.   In particular, implementation of
          control messages (see sections 6.6  and  7)  and  unbatchers
          (see sections 8.1 and 8.2) via a command interpreter requires
          substantial precautions to ensure  that  only  the  intended
          capabilities  are  available.   Care must also be taken that
          article-supplied text is  not  fed  to  programs  that  have
          escapes to command interpreters.
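
          One way to keep a command interpreter out of the picture
          entirely is to check the control verb against a fixed list
          and then run the handler from an argument vector.  The
          Python sketch below assumes a hypothetical directory of
          per-verb handler programs; the allow-list and the function
          names are illustrative only.

```python
import posixpath
import subprocess

# Hypothetical local policy: the only control verbs this host acts on.
ALLOWED_VERBS = frozenset({"cancel", "newgroup", "rmgroup", "checkgroups"})


def build_handler_argv(handler_dir, control_value):
    """Turn a Control header value into an argument vector.

    The verb must be on the allow-list.  Everything after it is
    passed along as inert arguments and never reaches a shell.
    """
    words = control_value.split()
    if not words or words[0] not in ALLOWED_VERBS:
        raise ValueError("refusing control message: %r" % control_value)
    return [posixpath.join(handler_dir, words[0])] + words[1:]


def run_handler(argv):
    """Run the handler directly, with no command interpreter between."""
    return subprocess.run(argv, shell=False, check=True)
```

          With this arrangement, text such as "; rm -rf /" in a
          control message arrives at the handler as ordinary
          argument strings rather than as commands.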

          Finally,  there  is considerable potential for malice in the
          sendsys, version, and whogets control  messages.   They  are
          not  harmful  to  the hosts receiving them as news, but they
          can be used to enlist those  hosts  (by  the  thousands)  as
          unwitting  allies  in a mail-swamping attack on a victim who
          may not even receive news.   The  precautions  discussed  in
          section  7.5  can reduce the potential for such attacks con-
          siderably, but the hazard cannot be eliminated  as  long  as
          these control messages exist.


          11.3. Anarchy

          The  highly  distributed nature of news propagation, and the
          lack of adequate authentication  protocols  (especially  for
          use  over  the less-interactive transport mechanisms such as
          UUCP), make article forgery relatively straightforward.   It
          may  be  possible to at least track a forgery to its source,
          once it is recognized as such, but clever forgers  can  make
          even  that  relatively difficult.  The assumption that forg-
          eries will be recognized as such is also not to be taken for
          granted;  readers  are notoriously prone to blindly assuming
          authenticity.  If  a  forged  article's  initial  path  list
          includes the relayer name of the supposed poster's host, the
          article will never be sent to that  host,  and  the  alleged
          author may learn about the forgery secondhand or not at all.
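
          The last point follows directly from how relayers use the
          path list.  The Python sketch below is a simplification
          with assumed function names; it treats every "!"-separated
          entry as a relayer name, although the final entry is
          usually a local part rather than a host.

```python
def names_in_path(path_value):
    """Split a path list on "!", the only Path delimiter."""
    return [name for name in path_value.split("!") if name]


def should_offer(path_value, neighbour):
    """A relayer does not send an article to a host already in its path."""
    return neighbour not in names_in_path(path_value)
```

          A forger who seeds the path list with the victim's relayer
          name therefore ensures that should_offer() is false for
          that host everywhere the article travels.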

          A particularly noxious form of forgery is the  forged  "can-
          cel"  control  message.  Notably, it is relatively straight-
          forward to write software that will automatically send out a
          (forged)  cancel message for any article meeting some crite-
          rion, e.g. written by a specific author.  The authentication
          problems discussed in section 7.1 make it difficult to solve
          this without crippling cancel's important functionality.

          A related problem is the possibility of  disagreements  over
          newsgroup  creation,  on  networks where such things are not
          decided by central authorities.  There have  been  cases  of
          "rmgroup wars", where one poster persistently sends out
          newgroup messages to create a newsgroup and another, equally
          persistently,  sends  out rmgroup messages asking that it be
          removed.  This is not particularly damaging, if relayers are
          configured  to  be cautious, but can cause serious confusion
          among innocent third parties who just want to  know  whether
          they can use the newsgroup for communication or not.


          11.4. Liability

          News shares the legal uncertainty surrounding other forms of
          electronic communication:  what  rules  apply  to  this  new
          medium  of  information  exchange?   News  is a particularly
          problematic case because it is  a  broadcast  medium  rather
          than  a point-to-point one like mail, and analogies to older
          forms of communication are particularly weak.

          Are news-carrying hosts common carriers, like the phone com-
          panies, providing communications paths without having either
          authority over or responsibility for content?  Or  are  they
          publishers,   responsible  for  the  content  regardless  of
          whether they are aware  of  it  or  not?   Or  something  in
          between?   Such  questions are particularly significant when
          the content is technically criminal, e.g. some types of sex-
          ually-oriented material in some jurisdictions, in which case
          ignorance of its presence may not be an adequate defence.

          Even in milder situations such as libel or copyright  viola-
          tion,  the  responsibilities  of  the  poster, his host, and
          other hosts carrying the traffic are unclear.  Note, in par-
          ticular, the problems arising when the article is a forgery,
          or when the alleged author claims it is a forgery but cannot
          prove this.

          A. Archeological Notes


          A.1. A-News Article Format

          The  obsolete  "A  News" article format consisted of exactly
          five lines of header information, followed by the body.  For
          example:

               Aeagle.642
               news.misc
               cbosgd!mhuxj!mhuxt!eagle!jerry
               Fri Nov 19 16:14:55 1982
               Usenet Etiquette - Please Read
               body
               body
               body

          The first line consisted of an "A" followed by an article ID
          (analogous to a message ID and used for  similar  purposes).
          The  second line was the list of newsgroups.  The third line
          was the path.  The fourth was the date, in the format  above
          (all  fields  fixed  width), resembling an Internet date but
          not quite the same.  The fifth was the subject.

          This format is documented for archeological  purposes  only.
          Do not generate articles in this format.
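
          Software that must read old archives may still need to
          recognize the format.  The Python sketch below only parses
          it and never generates it; the tuple and function names
          are assumptions, and the newsgroups line is kept as an
          uninterpreted string.

```python
from collections import namedtuple

ANewsArticle = namedtuple(
    "ANewsArticle", "article_id newsgroups path date subject body")


def parse_a_news(text):
    """Split an A News article into its five header lines and body."""
    lines = text.split("\n")
    if len(lines) < 5 or not lines[0].startswith("A"):
        raise ValueError("not an A News article")
    return ANewsArticle(
        article_id=lines[0][1:],   # drop the leading "A"
        newsgroups=lines[1],
        path=lines[2],
        date=lines[3],
        subject=lines[4],
        body="\n".join(lines[5:]),
    )
```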


          A.2. Early B-News Article Format

          The  obsolete  pseudo-Internet  article format, used briefly
          during the transition between the A News format and the mod-
          ern  format,  followed the general outline of a MAIL message
          but with some non-standard headers.  For example:

               From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
               Newsgroups: news.misc
               Title: Usenet Etiquette -- Please Read
               Article-I.D.: eagle.642
               Posted: Fri Nov 19 16:14:55 1982
               Received: Fri Nov 19 16:59:30 1982
               Expires: Mon Jan 1 00:00:00 1990

               body
               body
               body

          The From header contained the information now found  in  the
          Path header, plus possibly the full name now typically found
          in the From header.  The Title header contained what is  now
          the  Subject  content.   The Posted header contained what is
          now the Date content.  The Article-I.D. header contained  an
          article  ID,  analogous to a message ID and used for similar
          purposes.  The Newsgroups and Expires headers were  approxi-
          mately  as now.  The Received header contained the date when
          the latest relayer to process the article first saw it.  All
          dates were in the above format, with all fields fixed width,
          resembling an Internet date but not quite the same.

          This format is documented for archeological  purposes  only.
          Do not generate articles in this format.


          A.3. Obsolete Headers

          Early  versions of news software following the modern format
          sometimes generated headers like the following:

               Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
               Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
               Date-Received: Friday, 19-Nov-82 16:59:30 EST

          Relay-Version  contained  version  information   about   the
          relayer  that  last  processed the article.  Posting-Version
          contained version information about the posting  agent  that
          posted  the  article.  Date-Received contained the date when
          the last relayer to process the article first saw it  (in  a
          slightly nonstandard format).

          These  headers  are  documented  for  archeological purposes
          only.  Do not generate articles using them.


          A.4. Obsolete Control Messages

          There once was  a  senduuname  control  message,  resembling
          sendsys  but  requesting  transmission  of the list of hosts
          that the receiving  host  had  UUCP  connections  to.   This
          rapidly  ceased  to  be  of much use, and many organizations
          consider information about their internal connectivity to be
          confidential.

          Historically,  a  checkgroups  body consisting of one or two
          lines, the first of the form "-n newsgroup",  caused  check-
          groups to apply to only that single newsgroup.  This form is
          documented for archeological purposes only; do not use it.

          Historically, an article posted to a  newsgroup  whose  name
          had  exactly  three  components of which the third was "ctl"
          signified that article was to be taken as a control message.
          The  Subject  header  specified the actions, in the same way
          the Control header does now.  This form  is  documented  for
          archeological purposes only; do not use it; do not implement
          it.


          B. A Quick Tour Of MIME

          (The editor wishes to thank Luc Rooijakkers;  most  of  this
          appendix  is a lightly-edited version of a summary he kindly
          supplied.)

          MIME (Multipurpose Internet Mail Extensions) is an
          upward-compatible set of extensions to RFC 822, currently
          documented in RFCs 1341 and 1342.  This appendix summarizes
          these  documents.   See  the MIME RFCs for more information;
          they are very readable.

               UNRESOLVED ISSUE:  These  RFC  numbers  (here  and
               elsewhere  in  this  Draft) need updating when the
               new MIME RFCs come out.

          MIME defines the following new headers:

               MIME-Version
               Content-Type
               Content-Transfer-Encoding
               Content-ID
               Content-Description


          The MIME-Version header is mandatory for all  messages  con-
          forming  to  the  MIME specification and carries the version
          number of the MIME specification.  Example:

               MIME-Version: 1.0


          The Content-Type header indicates the content  type  of  the
          message.   Content types are split into a top-level type and
          a subtype, separated by a slash.  Auxiliary information  can
          also  be supplied, using an attribute-value notation.
          Example:

               Content-Type: text/plain; charset=us-ascii

          (In the absence of a Content-Type header this is in fact the
          default content type.)
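
          As an illustration, Python's standard email package splits
          a Content-Type header into these pieces and applies the
          text/plain default when the header is absent.  The helper
          name content_type_of is an assumption; note that the
          library reports a missing charset as None and does not
          fill in us-ascii itself.

```python
from email import message_from_string


def content_type_of(raw_article):
    """Return (top-level type, subtype, charset or None)."""
    msg = message_from_string(raw_article)
    return (msg.get_content_maintype(),
            msg.get_content_subtype(),
            msg.get_param("charset"))
```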

          Important type/subtype combinations are

          text/plain                Plain  text,  possibly  in  a non-
                                    ASCII character set.

          text/enriched             A very  simple  wordprocessor-like
                                    language    supporting   character
                                    attributes  (e.g.,   underlining),
                                    justification  control, and multi-
                                    ple character  sets.   (This  pro-
                                    posal  has  gone  through  several
                                    iterations and has recently  split
                                    off from the main MIME RFCs into a
                                    separate document.)

          message/rfc822            A mail  message  conforming  to  a
                                    slightly-relaxed  version  of  RFC
                                    822.

          message/partial           Part of a message (supporting  the
                                    transparent  splitting and joining
                                    of  messages  when  they  are  too
                                    large to be handled by some trans-
                                    port agent).

          message/external-body     A message whose body is  external.
                                    Possible  access  methods  include
                                    via mail, FTP, local file, etc.

          multipart/mixed           A message whose body  consists  of
                                    multiple  parts,  possibly of dif-
                                    ferent  types,  intended   to   be
                                    viewed in serial order.  Each part
                                    looks like  an  RFC  822  message,
                                    consisting  of headers and a body.
                                    Most of the RFC 822  headers  have
                                    no   defined  semantics  for  body
                                    parts.

          multipart/parallel        Likewise, except  that  the  parts
                                    are intended to be viewed in
                                    parallel (on user agents that
                                    support it).

          multipart/alternative     Likewise,  except  that  the parts
                                    are intended  to  be  semantically
                                    equivalent such that the part that
                                    best matches the  capabilities  of
                                    the  environment  should  be  dis-
                                    played.  For  example,  a  message
                                    may  include plain-text, enriched-
                                    text, and postscript  versions  of
                                    some document.

          multipart/digest          A variant of multipart/mixed espe-
                                    cially   intended   for    message
                                    digests  (the  default type of the
                                    parts is message/rfc822 instead of
                                    text/plain,  saving  on the number
                                    of headers for the parts).

          application/postscript    A       PostScript       document.
                                    (PostScript   is  a  trademark  of
                                    Adobe.)

          Other top-level types exist for  still  images,  audio,  and
          video samples.

          Some  of  the  above  types require the ability to transport
          binary data.  Since the existing message systems usually  do
          not  support this, MIME provides a Content-Transfer-Encoding
          header to indicate the kind of encoding used.  The  possible
          encodings are:

          7bit                 No encoding; the data consists of short
                               (less than 1000  characters)  lines  of
                               7-bit  ASCII  data,  delimited  by  EOL
                               sequences.  This is the default  encod-
                               ing.

          8bit                 Like  7bit,  except that bytes with the
                               high-order  bit  set  may  be  present.
                               Many  transmission  paths are incapable
                               of carrying  messages  which  use  this
                               encoding.

          binary               No  encoding; any sequence of bytes may
                               be present.   Many  transmission  paths
                               are   incapable  of  carrying  messages
                               which use this encoding.

          base64               The data  is  encoded  by  representing
                               every  group of 3 bytes as 4 characters
                               from the alphabet "A-Za-z0-9+/",  which
                               was  chosen  for  its  high  robustness
                               through  mail  gateways  (the  alphabet
                               used   by  uuencode  does  not  survive
                               ASCII-EBCDIC-ASCII  translations).   In
                               the final group of 4 characters, "=" is
                               used for those  characters  not  repre-
                               senting  data  bytes.   Line  length is
                               limited and EOLs in  the  encoded  form
                               are ignored.

          quoted-printable     Any  byte can be represented by a three
                               character "=XX" sequence where the  X's
                               are   upper  case  hexadecimal  digits.
                               Bytes representing printable 7-bit  US-
                               ASCII characters except "=" may be rep-
                               resented literally.   Tabs  and  blanks
                               may  be represented literally if not at
                               the end of a line.  Line length is lim-
                               ited,  and  an  EOL preceded by "=" was
                               inserted for this purpose  and  is  not
                               present in the original.

          The  base64  and  quoted-printable  encodings are applied to
          data in Internet canonical form, which means  that  any  EOL
          encoded  as  anything  but EOL must be an Internet canonical
          EOL:  CR followed by LF.
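
          The two encodings and the canonical-form rule can be tried
          out with Python's standard base64 and quopri modules.  This
          is a sketch only: the helper names are assumptions, and
          to_canonical_eol() handles just the common case of text
          that uses bare LF as its local EOL.

```python
import base64
import quopri


def to_canonical_eol(data):
    """Convert bare LF line endings to the canonical CR LF form."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def encode_base64(data):
    """Base64-encode text after canonicalising its line endings.

    encodebytes() also breaks the output into limited-length lines.
    """
    return base64.encodebytes(to_canonical_eol(data))


def encode_quoted_printable(data):
    """Encode data so that printable ASCII stays readable."""
    return quopri.encodestring(data)
```

          Three input bytes become four output characters, so
          encode_base64(b"abc") yields the single line "YWJj", and a
          lone input byte is padded out with two "=" characters.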

          The Content-Description header allows further description of
          a body part, analogous to the use of Subject for messages.

          Finally,  the  Content-ID  header  can  be used to assign an
          identification to body parts, analogous to the assignment of
          identifications to messages by Message-ID.

          Note  that  most  of  these  headers  are  structured header
          fields, as defined in RFC 822.  Consequently,  comments  are
          allowed  in  their  values.   The  following is a legal MIME
          header:

               Content-Type: (a comment) text (yeah)   /
                       plain    (and now some params:) ; charset= (guess what)
                  iso-8859-1 (we don't have iso-10646 yet, pity)
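
          To show what a parser has to cope with here, the Python
          sketch below removes RFC 822 comments (which may nest) and
          then all white space from such a header value.  It is an
          illustration with assumed function names, not a complete
          RFC 822 parser; in particular, normalize_content_type()
          assumes no parameter value is a quoted string containing
          white space.

```python
def strip_comments(value):
    """Remove parenthesized comments, honouring nesting and quoting."""
    out = []
    depth = 0            # current comment nesting level
    in_quotes = False    # inside a quoted-string, outside any comment
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and (depth or in_quotes):
            # A quoted-pair: keep it only when outside a comment.
            if not depth:
                out.append(value[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif not depth:
            if ch == '"':
                in_quotes = True
            out.append(ch)
        i += 1
    return "".join(out)


def normalize_content_type(value):
    """Drop comments, folding, and white space from a Content-Type value."""
    return "".join(strip_comments(value).split())
```

          Applied to the header above, normalize_content_type()
          reduces the value to "text/plain;charset=iso-8859-1".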


               NOTE: Although the MIME specification  was  devel-
               oped for mail, there is nothing precluding its use
               for news as well.  While it might simplify  imple-
               mentation  to  restrict the MIME headers somewhat,
               in the same way  that  other  news  headers  (e.g.
               From) are restricted subsets of the RFC-822 origi-
               nals,  this  would  add  yet  another   divergence
               between two formats that ought to be as compatible
               as possible.  In the case  of  the  MIME  headers,
               there  is no body of existing code posing compati-
               bility concerns.   A  full-featured  MIME  reading
               agent needs a full RFC-822 parser anyway, to
               properly handle body parts of types like
               message/rfc822, so there is little gain from
               restricting MIME headers.  Adopting the MIME spec-
               ification unchanged seems best.  However, article-
               level MIME headers  must  still  comply  with  the
               overall  news header syntax given in section 4, so
               that news software which is NOT interested in MIME
               need not contain a full RFC-822 parser.

          The  second  part  of MIME, RFC 1342 (Representation of Non-
          ASCII Text in Internet Message Headers), addresses the prob-
          lem  of  non-ASCII  characters  in headers.  An example of a
          header using the RFC 1342 mechanism is

               From: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>

          Such encodings are allowed in selected headers,  subject  to
          the restrictions listed in RFC 1342.
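
          As an illustration, Python's standard email.header module
          can decode such a header.  The helper name below is an
          assumption, and the exact white space between the decoded
          and plain portions depends on the library, so callers may
          want to normalize it.

```python
from email.header import decode_header


def decode_encoded_words(value):
    """Decode RFC 1342 encoded words in a header value to text."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)
```

          For the From header above, the encoded word decodes to
          "André" followed by a space (the "_" in the Q encoding
          stands for a space).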

          The MIME effort has also produced an RFC defining a Content-
          MD5 header [rrr 1544], containing an MD5-based "checksum" of
          the contents of an article or body part, giving high
          confidence of detecting accidental modifications to the
          contents.

          The  "metamail"  software  package  [rrr] helps provide MIME
          support with minimal changes to mailers,  and  may  also  be
          relevant to news reading agents.

          The PEM (Privacy Enhanced Mail) effort is pursuing analogous
          facilities to offer stronger  guarantees  against  malicious
          modifications,   unauthorized  eavesdropping,  and  forgery.
          This work too may be applicable to news, once it  is  recon-
          ciled with MIME (by efforts now underway).
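
          For comparison, the Content-MD5 value mentioned earlier in
          this appendix is simple to compute: it is the base64
          encoding of the 128-bit MD5 digest of the content.  The
          Python sketch below uses assumed function names and expects
          the caller to supply the content already in canonical form.
          It guards against accidents only; anyone able to alter the
          content can recompute the header as well.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value for a byte string."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode("ascii")


def matches_content_md5(body, header_value):
    """Check received content against its Content-MD5 header value."""
    return content_md5(body) == header_value.strip()
```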


          C. Summary of Changes Since RFC 1036

          This  Draft  is much longer than RFC 1036, so there is obvi-
          ously  much  change  in  content.   Much  of  this  is  just
          increased precision and rigor.  Noteworthy changes and
          additions include:

               + section 4.3's restrictions on article bodies

               + all references to MIME facilities

               + size limits on articles

               + precise specification of Date-content syntax

               + message IDs must never be re-used, ever

               + "!" is the only Path delimiter

               + multiple moderators in the Approved header

               + rules on References trimming, and the _-_ mechanism

               + generalization of the Xref rules

               + multiple message IDs in Cancel and Supersedes

               + Also-Control

               + See-Also

               + Article-Names

               + Article-Replacing

               + more precise rules for cancellation

               + cancellation authorization based on From, not Sender

               + "unmoderated" and descriptors in newgroup messages

               + restrictive rules on handling of sendsys and  version
                 messages

               + the whogets control message

               + precise specification of checkgroups messages

               + compression type preferably specified out-of-band

               + rules for encapsulating news in MIME mail

               + tighter specification of relayer functioning (section
                 9.1)

               + the "newsmaster" contact address

               + rules for gatewaying (section 10)

               + discussion of security issues (section 11)


          D. Summary of Completely New Features

          Most of this Draft merely documents existing  practice,  but
          there are a few attempts to extend it.  These are:

          TBW


          E. Summary of Differences From RFC 822+1123

          The   following  are  noteworthy  differences  between  this
          Draft's articles and MAIL messages:

               + generally less-permissive header syntax

               + notably, limited From syntax

               + MAIL header comments allowed in only a few contexts

               + slightly more restricted message-ID syntax

               + several more mandatory headers

               + duplicate headers forbidden

               + References/See-Also   versus   In-Reply-To/References
                 (section 6.5)

               + case sensitivity in some contexts

               + point-to-point  headers,  e.g.  To  and Cc, forbidden
                 (section 6)

               + several new headers
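
          Two of the differences listed above, "duplicate headers
          forbidden" and the ban on point-to-point headers such as To
          and Cc, lend themselves to a mechanical check.  The following
          is a minimal sketch only: the function name is hypothetical,
          the input is assumed to be already-unfolded "Name: value"
          lines, and comparing header names without regard to case is
          an assumption of the sketch.

```python
# Point-to-point headers named in the list above (lowercased for comparison).
FORBIDDEN = {"to", "cc"}


def check_headers(header_lines):
    """Return a list of problems found in unfolded "Name: value" header lines."""
    problems = []
    seen = set()
    for line in header_lines:
        name, sep, _value = line.partition(":")
        if not sep:
            problems.append("not a header line: %r" % (line,))
            continue
        key = name.strip().lower()  # assumption: names compared case-insensitively
        if key in FORBIDDEN:
            problems.append("point-to-point header forbidden: " + name.strip())
        if key in seen:
            problems.append("duplicate header: " + name.strip())
        seen.add(key)
    return problems
```

          An empty result means neither rule was violated; it does not
          mean the article satisfies the Draft's other requirements.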


          References

          [Sanderson] "Smileys", David Sanderson, O'Reilly & Associates
          Ltd., 1993.

          TBW


          Security Considerations

          Section 11 discusses security considerations in detail.


          Author's Address

               Henry Spencer
               henry@zoo.toronto.edu

               SP Systems
               Box 280 Stn. A
               Toronto, Ont. M5W1B2  Canada







-- 
alan@manawatu.planet.co.nz==alan@manawatu.gen.nz~~brown_a@kosmos.wcc.govt.nz
Manawatu Internet Services,   "We should grant power over our affairs only to
Box 678, Palmerston North,     those who are reluctant to hold it and then only
New Zealand +64 25 480-204     under conditions that increase the reluctance."




Msg#: 4948 *bbs.tbbs*
10-08-94 15:37:03
From: NEWS
  To: ALL
Subj: NETWORK SPECIFICATIONS: RFC 822
Newsgroups: comp.bbs.tbbs
From: phil.buonomo@starship.com
Subject: NETWORK SPECIFICATIONS: RFC 822
Organization: Starship II BBS 201-935-1485
Date: Sat, 08 Oct 94 07:43:17 -0400



 >                  /  "CST" / "CDT"                ;  Central:  - 6/ - 5
 >                  /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
 >                  /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
 >                  /  1ALPHA                       ; Military: Z = UT;
 >                                                  ;  A:-1; (J not used)
 >                                                  ;  M:-12; N:+1; Y:+12
 >                  / ( ("+" / "-") 4DIGIT )        ; Local differential
 >                                                  ;  hours+min. (HHMM)
 >
 >      5.2.  SEMANTICS
 >
 >           If included, day-of-week must be the day implied by the

Is there some reason for all this being posted?

