NEWSFEEDS (5)

NAME
     newsfeeds - determine where Usenet articles get sent

DESCRIPTION
     The  file  <config$_PATH_NEWSFEEDS>  (typically   /var/news/etc/newsfeeds
     specifies how incoming articles should be distributed to other sites.  It
     is parsed by the InterNetNews server innd(8)  when it starts up,  or  when
     directed to by ctlinnd(8) .

     The file is interpreted as a set of  lines  according  to  the  following
     rules.  If a line ends with a backslash, then the backslash, the newline,
     and any whitespace at the start of the next line  is  deleted.   This  is
     repeated  until the entire ``logical'' line is collected.  If the logical
     line is blank, or starts with a number sign (``#''), it is ignored.

     All other lines are interpreted as feed entries.  An entry should consist
     of  four colon-separated fields; two of the fields may have optional sub-
     fields, marked off by a slash.  Fields or sub-fields that  take  multiple
     parameters  should  be  separated by a comma.  Extra whitespace can cause
     problems.  Except for the site names, case is significant.  The format of
     an entry is:
          sitename[/exclude,exclude,...]\
               :pattern,pattern...[/distrib,distrib...]\
               :flag,flag...\
               :param
     Each field is described below.

     The sitename is the name of the site to which a news article can be sent.
     It  is  used  for  writing  log entries and for determining if an article
     should be forwarded to a  site.   If  sitename  already  appears  in  the
     article's  Path  header,  then  the article will not be sent to the site.
     The name is usually whatever the remote site uses to identify  itself  in
     the Path line, but can be almost any word that makes sense; special local
     entries (such as archivers or  gateways)  should  probably  end  with  an
     exclamation point to make sure that they do not have the same name as any
     real site.  For example, ``gateway'' is an obvious  name  for  the  local
     entry  that  forwards articles out to a mailing list.  If a site with the
     name ``gateway'' posts an article,  when  the  local  site  receives  the
     article  it will see the name in the Path and not send the article to its
     own ``gateway'' entry.  See also the  description  of  the  ``Ap''  flag,
     below.  If an entry has an exclusion sub-field, then the article will not
     be sent to that site if any of the names specified as excludes appear  in
     the  Path  header.   The  same sitename can be used more than once -- the
     appropriate action will be taken for each entry that should  receive  the
     article,  regardless of the name -- although this is recommended only for
     program feeds to avoid confusion.  Case is not significant in site names.

     The patterns specify which groups to send to the site and are interpreted
     to  build a ``subscription list'' for the site.  The default subscription
     is to get all groups.  The patterns in  the  field  are  wildmat(3) -style
     patterns,  and  are  matched in order against the list of newsgroups that
     the local site receives.  If the first  character  of  a  pattern  is  an
     exclamation  mark,  then any groups matching the pattern are removed from
     the subscription, otherwise any matching groups are added.  For  example,
     to  receive  all  ``comp''  groups, but only comp.sources.unix within the
     sources newsgroups, the following set of patterns can be used:
          comp.*,!comp.sources.*,comp.sources.unix
     There are three things to note about this example.  The first is that the
     trailing ``.*'' is required.  The second is that, again,  the  result  of
     the   last   match   is   the   most   important.    The  third  is  that
     ``comp.sources.*'' could be written as ``comp.sources*'' but  this  would
     not have the same effect if there were a ``comp.sources-only'' group.

     There is also a way to subscribe to a newsgroup negatively.  That  is  to
     say,  do  not  send  this  group even if the article is cross-posted to a
     subscribed newsgroup.  If the first character of a pattern is  an  atsign
     ``@'',  it  means that any article posted to a group matching the pattern
     will not be sent even though the article may be cross-posted to  a  group
     which is subscribed.  The same rules of precedence apply in that the last
     match is the one which counts.  For example, if you want to  prevent  all
     articles  posted  to any "alt.binaries.warez" group from being propagated
     even if it is cross-posted to another "alt" group or any other group  for
     that matter, then the following set of patterns can be used:
          alt.*,@alt.binaries.warez.*,misc.*
     If you reverse the alt.*  and  alt.binaries.warez.*  patterns,  it  would
     nullify  the  atsign because the result of the last match is the one that
     counts.  Using the above example, if an article is posted to one or  more
     of the alt.binaries.warez.* groups and is cross-posted to misc.test, then
     the article is not sent.

     See innd(8)  for details on the propagation of control messages.

     A subscription can be further modified  by  specifying  ``distributions''
     that  the  site should or should not receive.  The default is to send all
     articles to all sites that subscribe to any of the groups  where  it  has
     been  posted  ,  but  if  an  article  has  a Distribution header and any
     distribs are specified, then they are checked according to the  following
     rules:

     1.   If the Distribution header matches any of the  values  in  the  sub-
          field, then the article is sent.

     2.   If a distrib starts with an exclamation point, and  it  matches  the
          Distribution header, then the article is not sent.

     3.   If Distribution header does not match  any  distrib  in  the  site's
          entry, and no negations were used, then the article is not sent.

     4.   If Distribution header does not match  any  distrib  in  the  site's
          entry,  and  any distrib started with an exclamation point, then the
          article is sent.

     If an article has more than one distribution specified, then each one  is
     according  to  the  above  rules.   If any of the specified distributions
     indicate that the article should be sent, it is; if none do,  it  is  not
     sent  -- the rules are used as a ``logical or.''  It is almost definitely
     a mistake to have a single feed that specifies distributions  that  start
     with an exclamation point along with some that don't.

     Distributions are text words, not patterns; entries like ``*'' or ``all''
     have no special meaning.

     The flags parameter specifies  miscellaneous  parameters.   They  may  be
     specified  in  any  order;  flags  that take values should have the value
     immediately after the flag letter with no whitespace.   The  valid  flags
     are:

     <size
          An article will only be sent to the site if it  is  less  than  size
          bytes long.  The default is no limit.

     >size
          An article will only be sent to the site if it is greater than  size
          bytes long.  The default is no limit.

     Achecks
          An  article  will  only  be  sent  to  the  site  if  it  meets  the
          requirements  specified  in  the checks, which should be chosen from
          the following set:
               d       Distribution header required
               p       Do not check Path header for the sitename before
                    propagating (the exclusions are still checked).

     Bhigh/low
          If a site is being fed by a file, channel, or exploder (see  below),
          the  server  will  normally start trying to write the information as
          soon as  possible.   Providing  a  buffer  may  give  better  system
          performance  and  help  smooth  out overall load if a large batch of
          news comes in.  The value of the this flag  should  be  two  numbers
          separated  by  a  slash.  The first specifies the point at which the
          server can start draining the feed's  I/O  buffer,  and  the  second
          specifies  when to stop writing and begin buffering again; the units
          are bytes.  The default is to do no  buffering,  sending  output  as
          soon as it is possible to do so.

     Fname
          This flag specifies the name of the file that should be used  if  it
          is necessary to begin spooling for the site (see below).  If name is
          not  an  absolute  pathname,  it  is  taken  to   be   relative   to
          <config$_PATH_BATCHDIR>    (typically    /var/news/spool/out.going.)
          Then, if the destination is a  directory,  the  file  togo  in  that
          directory will be used as filename.

     Gcount
          If this flag is specified, an article will only be sent to the  site
          if it is posted to no more than count newsgroups.

     Hcount
          If this flag is specified, an article will only be sent to the  site
          if  it  has count or fewer sites in its Path line.  This flag should
          only be used as a rough guide because of the loose interpretation of
          the Path header; some sites put the poster's name in the header, and
          some sites that might logically be considered to be one  hop  become
          two  because  they put the posting workstation's name in the header.
          The default value for count is one.

     Isize
          The flag specifies the size of the internal buffer for a file  feed.
          If  there  are more file feeds then allowed by the system, they will
          be  buffered  internally  in  least-recently-used  order.   If   the
          internal buffer grows bigger then size bytes, however, the data will
          be written out to  the  appropriate  file.   The  default  value  is
          <config$SITE_BUFFER_SIZE> bytes (typically (16 * 1024 ) .)

     Nmodifiers
          The newsgroups that a site receives are modified  according  to  the
          modifiers, which should be chosen from the following set:
               m       Only moderated groups
               u       Only unmoderated groups

     Ssize
          If the amount of data queued for the site gets  to  be  larger  than
          size  bytes, then the server will switch to spooling, appending to a
          file  specified  by  the  ``F''  flag,  or  <config$_PATH_BATCHDIR>/
          sitename  (typically  /var/news/spool/out.going/  sitename)  if  the
          ``F'' flag is not specified.   Spooling  usually  happens  only  for
          channel or exploder feeds.

     Ttype
          This flag specifies the type of feed for the site.  Type should be a
          letter chosen from the following set:
               c       Channel
               f       File
               l       Log entry only
               m       Funnel (multiple entries feed into one)
               p       Program
               x       Exploder
          Each feed is described below in the  section  on  feed  types.   The
          default is Tf.

     Witems
          If a site is fed by file, channel, or exploder, this  flag  controls
          what  information  is  written.  If a site is fed by a program, only
          the asterisk (``*'') has any effect.  The  items  should  be  chosen
          from the following set:
               b       Size of the article in bytes
               f       Article's full pathname
               g       The newsgroup the article is in;
                    if cross-posted, then the first of the groups this
                    site gets
               m       Article's Message-ID
               n       Article's pathname relative to the spool directory
               p       The time the article was posted as seconds since epoch.
               s       The site that fed the article to the server;
                    from the Path header, when <config$IPADDR_LOG> is DONT,
                    otherwise the IP address.
               t       Time article was received as seconds since epoch
               *       Names of the appropriate funnel entries;
                    or all sites that get the article
               D       Value of the Distribution header;
                    ? if none present
               H       All headers
               N       Value of the Newsgroups header
               O       Overview data
               R       Information needed for replication
          More than one letter can be used; the entries will be separated by a
          space,  and  written  in the order in which they are specified.  The
          default is Wn.

          The ``H'' and ``O'' items are intended  for  use  by  programs  that
          create  news  overview databases.  If ``H'' is present, then the all
          the article's headers are written followed by a blank line.  An Xref
          header  (even  if  one  does  not appear in the filed article) and a
          Bytes header, specifying the article's size, will also  be  part  of
          the  headers.  If used, this should be the only item in the list; if
          preceeded by other items, however, a newline will be written  before
          the  headers.  The ``O'' generates input to the overchan(8)  program.
          It, too, should be the only item in the list.
          The asterisk has special meaning.  It expands to  a  space-separated
          list of all sites that received the current article.  If the site is
          the target of a funnel however (i.e., it is  named  by  other  sites
          which have a ``Tm'' flag), then the asterisk expands to the names of
          the funnel feeds that received the article.  If the site is fed by a
          program,  then  an asterisk in the param field will be expanded into
          the list of funnel feeds that received the article.  A site fed by a
          program  cannot  get  the site list unless it is the target of other
          ``Tm'' feeds.

     The interpretation of the param field depends on the type of feed, and is
     explained  in  more detail below in the section on feed types.  It can be
     omitted.

     The site named ME is special.  There should only be one such  entry,  and
     it  should  be  the  first  entry  in  the  file.   If the ME entry has a
     subscription list, then that  list  is  automatically  prepended  to  the
     subscription    list    of    all    other    entries.     For   example,
     ``*,!control,!junk,!foo.*''  can  be  used  to   set   up   the   initial
     subscription list for all feeds so that local postings are not propagated
     unless ``foo.* explicitly appears in the site's subscription list.   Note
     that  most  subscriptions should have ``!junk,!control'' in their pattern
     list; see the discussion of ``control  messages''  in  innd(8) .   (Unlike
     other news software, it does not affect what groups are received; that is
     done by the active(5)  file.)

     If the ME entry has a distribution  subfield,  then  only  articles  that
     match  the  distribution  list  are  accepted;  all  other  articles  are
     rejected.  A commercial news server, for example, might have  ``/!local''
     to reject local postings from other, misconfigured, sites.

FEED TYPES
     Innd provides four basic types of feeds: log, file, program, and channel.
     An  exploder  is a special type of channel.  In addition, several entries
     can feed into the same feed; these are funnel feeds,  that  refer  to  an
     entry  that  is  one  of the other types.  Note that the term ``feed'' is
     technically a misnomer, since the server does not transfer articles,  but
     reports that an article should be sent to the site.

     The simplest feed is one that is fed  by  a  log  entry.   Other  than  a
     mention  in  the  news  logfile,  no  data  is ever written out.  This is
     equivalent to a ``Tf'' entry writing to /dev/null except that no file  is
     opened.

     A site fed by a file is simplest type of  feed.   When  the  site  should
     receive  an  article,  one line is written to the file named by the param
     field.  If param is not an absolute pathname, it is taken to be  relative
     to  <config$_PATH_BATCHDIR>  (typically  /var/news/spool/out.going.)   If
     empty,  the   filename   defaults   to   <config$_PATH_BATCHDIR>/sitename
     (typically  /var/news/spool/out.going/sitename.)   This  name  should  be
     unique.

     When a site fed by a file is flushed (see ctlinnd), the  following  steps
     are  performed.  The script doing the flush should have first renamed the
     file.  The server tries to write out any buffered data, and  then  closes
     the  file.   The  renamed file is now available for use.  The server will
     then re-open the original file, which will now get created.

     A site fed by a program has a process spawned for every article that  the
     site  receives.   The param field must be a sprintf(3) format string that
     may have a single %s parameter, which will be given a  pathname  for  the
     article, relative to the news spool directory.  The full path name may be
     obtained by prefixing the %s  in  the  param  field  by  the  news  spool
     directory prefix.  Standard input will be set to the article or /dev/null
     if the article cannot be opened for some  reason.   Standard  output  and
     error  will  be set to the error log.  The process will run with the user
     and  group  ID  of  the  <config$_PATH_INNDDIR>  directory  (typically  .
     /var/news/run.)  Innd  will  try to avoid spawning a shell if the command
     has no shell meta-characters; this feature can be defeated by appending a
     semi-colon  to  the end of the command.  The full pathname of the program
     to be run must be specified; for security, PATH is not searched.

     If the entry is the target of a funnel, and if the ``W*'' flag  is  used,
     then  a  single  asterisk may be used in the param field where it will be
     replaced by the names of the sites that fed  into  the  funnel.   If  the
     entry  is  not  a  funnel,  or  if  the ``W*'' flag is not used, then the
     asterisk has no special meaning.

     Flushing a site fed by a program does no action.

     When a site is fed by a channel or exploder, the param  field  names  the
     process to start.  Again, the full pathname of the process must be given.
     When the site is to receive an article, the process receives  a  line  on
     its  standard  input  telling  it about the article.  Standard output and
     error, and the user and group ID of the all sub-process are set as for  a
     program feed, above.  If the process exits, it will be restarted.  If the
     process cannot be started, the server will spool input to  a  file  named
     <config$_PATH_BATCHDIR>/sitename                               (typically
     /var/news/spool/out.going/sitename.)  It  will  then  try  to  start  the
     process some time later.

     When a site fed by a channel or exploder is flushed,  the  server  closes
     down  its  end  of  the pipe.  Any pending data that has not been written
     will be spooled; see the description of the ``S'' flag, above.  No signal
     is  sent; it is up to the program to notice EOF on its standard input and
     exit.  The server then starts a new process.

     Exploders are a superset  of  channel  feeds.   In  addition  to  channel
     behavior, exploders can be sent command lines.  These lines start with an
     exclamation point, and their interpretation is up to the  exploder.   The
     following messages are generated automatically by the server:
          newgroup group
          rmgroup group
          flush
          flush site
     These messages are sent when the ctlinnd command  of  the  same  name  is
     received by the server.  In addition, the ``send'' command can be used to
     send an arbitrary  command  line  to  the  exploder  child-process.   The
     primary exploder is buffchan(8) .

     Funnel feeds provide a way of merging several site entries into a  single
     output  stream.   For a site feeding into a funnel, the param field names
     the actual entry that does the feeding.

     For more details on setting up different types of news feeds, see the INN
     installation manual.

EXAMPLES
          ##  Initial subscription list and our distributions.
          ME:*,!junk,!foo.*/world,usa,na,ne,foo,ddn,gnu,inet\
               ::
          ##  Feed all moderated source postings to an archiver
          source-archive!:!*,*sources*,!*wanted*,!*.d\
               :Tc,Wn:<config$_PATH_NEWSBIN>/archive -f -i \
                  /usr/spool/news.archive/INDEX
          ##  Watch for big postings
          watcher!:*\
               :Tc,Wbnm\
               :exec awk '$1 > 1000000 { print "BIG", $2, $3 }' >/dev/console
          ##  A UUCP feed, where we try to keep the "batching" between  4  and
          1K.
          ihnp4:/world,usa,na,ddn,gnu\
               :Tf,Wnb,B4096/1024:
          ##  Usenet as mail; note ! in funnel name to avoid Path conflicts.
          ##  Can't use ! in "fred" since it would like look a UUCP address.
          fred:!*,comp.sources.unix,comp.sources.bugs\
               :Tm:mailer!
          barney@bar.com:!*,news.software.nntp,comp.sources.bugs\
               :Tm:mailer!
          mailer!:!*\
               :W*,Tp:/usr/ucb/Mail -s "News article" *
          ##  NNTP feeds fed off-line via nntpsend or equivalent.
          feed1::Tf,Wnm:feed1.domain.name
          peer.foo.com:foo.*:Tf,Wnm:peer.foo.com
          ##  Real-time transmission.
          mit.edu:/world,usa,na,ne,ddn,gnu,inet\
               :Tc,Wnm:<config$_PATH_NEWSBIN>/nntplink -i stdin mit.edu
          ##  Two sites feeding into a hypothetical NNTP fan-out program:
          nic.near.net:\
               :Tm:nntpfunnel1
          uunet.uu.net/uunet:!ne.*/world,usa,na,foo,ddn,gnu,inet\
               :Tm:nntpfunnel1
          nntpfunnel1:!*\
               :Tc,Wmn*:<config$_PATH_NEWSBIN>/nntpfanout
          ##  A UUCP site that wants comp.* and moderated soc groups
          uucpsite!comp:!*,comp.*/world,usa,na,gnu\
               :Tm:uucpsite
          uucpsite!soc:!*,soc.*/world,usa,na,gnu\
               :Tm,Nm:uucpsite
          uucpsite:!*\
               :Tf,Wnb:/usr/spool/batch/uucpsite

     The last two sets of entries show how funnel  feeds  can  be  used.   For
     example, the nntpfanout program would receive lines like the following on
     its standard input:
          <123@litchi.foo.com> comp/sources/unix/888 nic.near.net uunet.uu.net
          <124@litchi.foo.com> ne/general/1003 nic.near.net
     Since the UUCP funnel is only destined for one site, the asterisk is  not
     needed and entries like the following will be written into the file:
          <qwe#37x@snark.uu.net> comp/society/folklore/3
          <123@litchi.foo.com> comp/sources/unix/888

HISTORY
     Written by Rich $alz  <rsalz@uunet.uu.net>  for  InterNetNews.   This  is
     revision 1.35, dated 1996/12/17.

SEE ALSO
     active(5) , buffchan(8) , ctlinnd(8) , innd(8) , wildmat(3) .

You can find a summary and links related to this topic
as part of the Mib Software Usenet RKT.