s-o-1036 June 1994
9.2. Article Acceptance And Propagation
When a relayer first receives an article, it must decide
whether to accept it. (This applies regardless of whether
the article arrived by itself or as part of a batch, and in
principle regardless of whether it originated as a local
posting or as traffic from another relayer.) In a cooperat-
ing subnet with well-controlled propagation paths, some of
the tests specified here MAY be delegated to centrally-
located relayers; that is, relayers that can receive news
ONLY via one of the central relayers might simplify accep-
tance testing based on the assumption that incoming traffic
has already passed the full set of tests at a central
relayer.
The wording that follows is based on a model in which arti-
cles arrive on a relayer's host before acceptance tests are
done. However, depending on the degree of integration of
the transport mechanisms and the relayer, some or all of
these tests MAY be done before the article is actually
transmitted, so that articles which definitely will not be
accepted need not be transmitted at all.
The wording that follows also specifies a particular order
for the acceptance tests. While this order is the obvious
one, the tests MAY be done in any order.
First, the relayer MUST verify that the article is a legal
news article, with all mandatory headers present with legal
contents.
NOTE: This check in principle is done by the first
relayer to see an article, so an article received
from another relayer should always be legal, but
there is enough old software still operational
that this cannot be taken for granted; see the
discussion of the Internet Robustness Principle in
section 9.1.
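As an illustration only, the presence half of this first test might be sketched as follows. The list of mandatory headers is an assumption here, since this section does not enumerate them, and the function name is hypothetical. The sketch checks only that each header is present and non-empty; it does not check that the contents are legal.

```python
from email import message_from_string

# Assumed set of mandatory headers; not enumerated in this section.
MANDATORY = ("Date", "From", "Message-ID", "Subject", "Newsgroups", "Path")

def missing_headers(article_text):
    """Return the mandatory headers that are absent or empty."""
    msg = message_from_string(article_text)
    return [h for h in MANDATORY if not (msg.get(h) or "").strip()]
```

A relayer built this way would presumably reject any article for which the returned list is non-empty, and would still need separate checks on the contents of each header.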
Second, the relayer MUST determine whether it has already
seen this article (identified by its message ID). This is
normally done by retaining a history of all article message
IDs seen in the last N days, where the value of N is decided
by the relayer's administrator but SHOULD be at least 7.
Since N cannot practically be infinite, articles whose Date
INTERNET DRAFT to be NEWS sec. 9.2
content indicates that they are older than N days are
declared "stale" and are deemed to have been seen already.
NOTE: This check is important because news propa-
gation topology is typically redundant, often
highly so, and it is not at all uncommon for a
relayer to receive the same article from several
neighbors. The history of already-seen message
IDs can get quite large, hence the desire to limit
its length... but it is important that it be long
enough that slowly-propagating articles are not
classed as stale. News propagation within the
Internet is normally very rapid, but when UUCP
links are involved, end-to-end delays of several
days are not rare, so a week is not a particularly
generous minimum.
NOTE: Despite generally more rapid propagation in
recent times, it is still not unheard-of for some
propagation paths to be very slow. This can
introduce the possibility of old articles arriving
again after they are gone from the history. Hence
the "stale" rule.
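The history and the "stale" rule described above can be sketched roughly as follows. This is a minimal in-memory illustration, not a prescribed implementation: the class and method names are hypothetical, and a real relayer would presumably keep the history on disk and parse the Date header itself.

```python
from datetime import datetime, timedelta, timezone

class History:
    """Hypothetical in-memory record of recently seen message IDs."""

    def __init__(self, retain_days=7):
        # N, the retention period; SHOULD be at least 7 days.
        self.retain = timedelta(days=retain_days)
        self.seen = {}  # message ID -> time first seen

    def already_seen(self, message_id, article_date, now):
        """Return True if the article is to be treated as a duplicate."""
        # Stale rule: an article older than N days is deemed seen,
        # because its ID may already have been dropped from the history.
        if now - article_date > self.retain:
            return True
        if message_id in self.seen:
            return True
        self.seen[message_id] = now
        return False

    def expire(self, now):
        """Forget message IDs first seen more than N days ago."""
        cutoff = now - self.retain
        self.seen = {m: t for m, t in self.seen.items() if t >= cutoff}
```

Note that the stale test is what makes expiry safe: once an ID has been forgotten, any late copy of that article is old enough to be rejected on its date alone.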
Third, the relayer MUST determine whether any of the arti-
cle's newsgroups are "subscribed to" by the host, i.e. fit a
description of what hierarchies or newsgroups the site wants
to receive.
NOTE: This check is significant because informa-
tion on what newsgroups a relayer wishes to
receive is often stored at its neighbors, who may
not have up-to-date information or may simplify
the rules for implementation reasons. As a hedge
against the possibility of missed or delayed new-
group control messages, relayers may wish to
observe a notion of a newsgroup subscription that
is independent of the list of newsgroups actually
known to the relayer. This would permit reception
and relaying of articles in newsgroups that the
relayer is not (yet) aware of, subject to more
general criteria indicating that they are likely
to be of interest.
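One way to keep the subscription test independent of the list of known newsgroups is to match against patterns rather than against a list of group names. The sketch below assumes shell-style wildcard patterns and a comma-separated Newsgroups header; the pattern syntax and the function name are illustrative assumptions, not something this Draft specifies.

```python
from fnmatch import fnmatchcase

def is_subscribed(newsgroups_header, patterns):
    """Return True if any of the article's newsgroups fits a pattern."""
    groups = [g.strip() for g in newsgroups_header.split(",") if g.strip()]
    return any(fnmatchcase(g, p) for g in groups for p in patterns)
```

With patterns such as "comp.*", an article in a newly created comp newsgroup would be accepted even if the newgroup control message had not yet arrived.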
Once an article has been accepted, it may be passed on to
other relayers. The fundamental news propagation rule is a
flooding algorithm: on receiving and accepting an article,
send it to all neighboring relayers not already in its path
list that are sent its newsgroup(s) and distribution(s).
NOTE: The path list's role in loop prevention may
appear relatively unimportant, given that looping
articles would typically be rejected as duplicates
anyway. However, the path list's role in
preventing superfluous transmissions is not triv-
ial. In particular, the path list is the only
thing that prevents relayer X, on receiving an
article from relayer Y, from sending it back to Y
again. (Indeed, the usual symptom of confusion
about relayer names is that incoming news loops
back in this manner.) The looping articles would
be rejected as duplicates, but doubling the commu-
nications load on every news transmission path is
not to be taken lightly!
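The flooding rule, including the path-list test, might be sketched as follows. The sketch assumes that path list entries are separated by "!", that each neighbor's feed is described by wildcard newsgroup patterns and a set of distributions, and that an article with no distribution passes the distribution test; all of these, and the names used, are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

def forward_targets(path, groups, distributions, neighbors):
    """Return the neighbors to which an accepted article is sent.

    neighbors maps a relayer name to a hypothetical feed description:
    {"groups": [patterns], "distributions": {names}}.
    """
    already = set(path.split("!"))  # relayers the article has visited
    targets = []
    for name, feed in neighbors.items():
        if name in already:
            continue  # never send an article back along its own path
        wants_group = any(fnmatchcase(g, p)
                          for g in groups for p in feed["groups"])
        wants_dist = (not distributions) or any(
            d in feed["distributions"] for d in distributions)
        if wants_group and wants_dist:
            targets.append(name)
    return targets
```

For example, an article that arrived from relayer y with the path list "y!origin!user" is not offered back to y, but is offered to every other neighbor whose feed covers its newsgroups and distributions.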
In general, relayers SHOULD NOT make propagation decisions
by "anticipation": relayer X, noting that the article's path
list already contains relayer Y, decides not to send it to
relayer Z because X anticipates that Z will get the article
by a better path. If that is generally true, then why is
there a news feed from X to Z at all? In fact, the "better
path" may be running slowly or may be down. News propaga-
tion is very robust precisely because some redundant trans-
mission is done "just in case". If it is imperative to
limit unnecessary traffic on a path, use of NNTP [rrr] or
ihave/sendme (see section 7.2) to pass articles only when
necessary is better than arbitrary decisions not to pass
articles at all.
Anticipation is occasionally justified in special cases.
Such cases should involve both (1) a cooperating subnet
whose propagation paths are well-understood and well-
monitored, with failures and slowdowns noticed and dealt
with promptly, and (2) a persistent pattern of heavy unnec-
essary traffic on a path that is either slow or costly. In
addition, there should be some reason why neither NNTP nor
ihave/sendme is suitable as a solution to the problem.