Obsoleted by: 1036 UNKNOWN
RFC 850                                                        June 1983

Standard for Interchange of USENET Messages

Mark R. Horton

[ This memo is distributed as an RFC only to make this information easily accessible to researchers in the ARPA community. It does not specify an Internet standard. ]

1. Introduction

This document defines the standard format for interchange of Network News articles among USENET sites. It describes the format for articles themselves, and gives partial standards for transmission of news. The news transmission is not entirely standardized in order to give a good deal of flexibility to the individual hosts to choose transmission hardware and software, whether to batch news, and so on.

There are five sections to this document. Section two defines the format. Section three defines the valid control messages. Section four specifies some valid transmission methods. Section five describes the overall news propagation algorithm.

2. Article Format

The primary consideration in choosing an article format is that it fit in with existing tools as well as possible. Existing tools include both implementations of mail and news. (The notesfiles system from the University of Illinois is considered a news implementation.) A standard format for mail messages has existed for many years on the ARPANET, and this format meets most of the needs of USENET. Since the ARPANET format is extensible, extensions to meet the additional needs of USENET are easily made within the ARPANET standard. Therefore, the rule is adopted that all USENET news articles must be formatted as valid ARPANET mail messages, according to the ARPANET standard RFC 822. This standard is more restrictive than the ARPANET standard, placing additional requirements on each article and forbidding use of certain ARPANET features. However, it should always be possible to use a tool expecting an ARPANET message to process a news article. In any situation where this standard conflicts with the ARPANET standard, RFC 822 should be considered correct and this standard in error.
An example message is included to illustrate the fields.

    Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
    Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
    Path: cbosgd!mhuxj!mhuxt!eagle!jerry
    From: jerry@eagle.uucp (Jerry Schwarz)
    Newsgroups: net.general
    Subject: Usenet Etiquette -- Please Read
    Message-ID: <642@eagle.UUCP>
    Date: Friday, 19-Nov-82 16:14:55 EST
    Followup-To: net.news
    Expires: Saturday, 1-Jan-83 00:00:00 EST
    Date-Received: Friday, 19-Nov-82 16:59:30 EST
    Organization: Bell Labs, Murray Hill

    The body of the article comes here, after a blank line.

Here is an example of a message in the old format (before the existence of this standard). It is recommended that implementations also accept articles in this format to ease upward conversion.

    From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
    Newsgroups: net.general
    Title: Usenet Etiquette -- Please Read
    Article-I.D.: eagle.642
    Posted: Fri Nov 19 16:14:55 1982
    Received: Fri Nov 19 16:59:30 1982
    Expires: Mon Jan 1 00:00:00 1990

    The body of the article comes here, after a blank line.

Some news systems transmit news in the "A" format, which looks like this:

    Aeagle.642
    net.general
    cbosgd!mhuxj!mhuxt!eagle!jerry
    Fri Nov 19 16:14:55 1982
    Usenet Etiquette - Please Read
    The body of the article comes here, with no blank line.

An article consists of several header lines, followed by a blank line, followed by the body of the message. The header lines consist of a keyword, a colon, a blank, and some additional information. This is a subset of the ARPANET standard, simplified to allow simpler software to handle it. The "From" line may optionally include a full name, in the format above, or use the ARPANET angle bracket syntax. To keep the implementations simple, other formats (for example, with part of the machine address after the close parenthesis) are not allowed. The ARPANET convention of continuation header lines (beginning with a blank or tab) is allowed.
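The header layout just described (keyword, colon, blank, value, with continuation lines starting with a blank or tab) can be illustrated with a small parser. This is only a sketch: the function name parse_article and the choice to return headers as an ordered list of pairs are assumptions made for illustration, not part of this standard.

```python
def parse_article(text):
    """Split an article into (headers, body).

    headers is a list of (keyword, value) pairs in their original order,
    so unrecognized headers can be passed through unchanged.
    """
    # Headers end at the first blank line; everything after it is the body.
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: fold it into the previous header's value.
            keyword, value = headers[-1]
            headers[-1] = (keyword, value + " " + line.strip())
        else:
            keyword, sep, value = line.partition(":")
            if not sep:
                raise ValueError("not a header line: %r" % line)
            headers.append((keyword, value.strip()))
    return headers, body
```

Keeping the headers in a list rather than a dictionary preserves their order, which matters because Relay-Version is required to come first.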
Certain headers are required, certain headers are optional. Any unrecognized headers are allowed, and will be passed through unchanged. The required headers are Relay-Version, Posting-Version, From, Date, Newsgroups, Subject, Message-ID, Path. The optional headers are Followup-To, Date-Received, Expires, Reply-To, Sender, References, Control, Distribution, Organization.

2.1 Required Headers

2.1.1 Relay-Version  This header line shows the version of the program responsible for the transmission of this article over the immediate link, that is, the program that is relaying the article from the next site. For example, suppose site A sends an article to site B, and site B forwards the article to site C. The message being transmitted from A to B would have a Relay-Version header identifying the program running on A, and the message transmitted from B to C would identify the program running on B. This header can be used to interpret older headers in an upward compatible way. Relay-Version must always be the first in a message; thus, all articles meeting this standard will begin with an upper case "R". No other restrictions are placed on the order of header lines.

The line contains two fields, separated by semicolons. The fields are the version and the full domain name of the site. The version should identify the system program used (e.g., "B") as well as a version number and version date. For example, the header line might contain

    Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP

This header should not be passed on to additional sites. A relay program, when passing an article on, should include only its own Relay-Version, not the Relay-Version of some other site. (For upward compatibility with older software, if a Relay-Version is found in a header which is not the first line, it should be assumed to be moved by an older version of news and deleted.)

2.1.2 Posting-Version  This header identifies the software responsible for entering this message into the network. It has the same format as Relay-Version.
It will normally identify the same site as the Message-ID, unless the posting site is serving as a gateway for a message that already contains a message ID generated by mail. (While it is permissible for a gateway to use an externally generated message ID, the message ID should be checked to ensure it conforms to this standard and to RFC 822.)
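As an illustration of the Relay-Version and Posting-Version format, the sketch below splits the value on semicolons and treats the first word of each field as its name. It also shows one way a relay might keep only its own Relay-Version and place it first. Both function names are hypothetical, and headers are assumed to be a list of (keyword, value) pairs.

```python
def parse_version_line(value):
    """Parse 'version B 2.10 2/13/83; site cbosgd.UUCP' into a dict."""
    fields = {}
    for part in value.split(";"):
        words = part.split()
        if not words:
            continue
        # The first word names the field ("version" or "site");
        # the remaining words are its value.
        fields[words[0].lower()] = " ".join(words[1:])
    return fields


def stamp_relay_version(headers, own_value):
    """Return headers with exactly one Relay-Version, ours, placed first.

    Any Relay-Version inherited from another site is dropped, wherever it
    appears in the list.
    """
    kept = [(k, v) for (k, v) in headers if k.lower() != "relay-version"]
    return [("Relay-Version", own_value)] + kept
```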
2.1.3 From  The From line contains the electronic mailing address of the person who sent the message, in the ARPA internet syntax. It may optionally also contain the full name of the person, in parentheses, after the electronic address. The electronic address is the same as the entity responsible for originating the article, unless the Sender header is present, in which case the From header might not be verified. Note that in all site and domain names, upper and lower case are considered the same, thus mark@cbosgd.UUCP, mark@cbosgd.uucp, and mark@CBosgD.UUcp are all equivalent. User names may or may not be case sensitive, for example, Billy@cbosgd.UUCP might be different from BillY@cbosgd.UUCP. Programs should avoid changing the case of electronic addresses when forwarding news or mail.

RFC 822 specifies that all text in parentheses is to be interpreted as a comment. It is common in ARPANET mail to place the full name of the user in a comment at the end of the From line. This standard specifies a more rigid syntax. The full name is not considered a comment, but an optional part of the header line. Either the full name is omitted, or it appears in parentheses after the electronic address of the person posting the article, or it appears before an electronic address enclosed in angle brackets. Thus, the three permissible forms are:

    From: mark@cbosgd.UUCP
    From: mark@cbosgd.UUCP (Mark Horton)
    From: Mark Horton <mark@cbosgd.UUCP>

Full names may contain any printing ASCII characters from space through tilde, with the exceptions that they may not contain parentheses "(" or ")", or angle brackets "<" or ">". Additional restrictions may be placed on full names by the mail standard, in particular, the characters comma ",", colon ":", and semicolon ";" are inadvisable in full names.

2.1.4 Date  The Date line (formerly "Posted") is the date, in a format that must be acceptable both to the ARPANET and to the getdate routine, that the article was originally posted to the network.
This date remains unchanged as the article is propagated throughout the network. One format that is acceptable to both is

    Weekday, DD-Mon-YY HH:MM:SS TIMEZONE

Several examples of valid dates appear in the sample article above. Note in particular that ctime format:

    Wdy Mon DD HH:MM:SS YYYY
is not acceptable because it is not a valid ARPANET date. However, since older software still generates this format, news implementations are encouraged to accept this format and translate it into an acceptable format.

The contents of the TIMEZONE field are currently subject to worldwide time zone abbreviations, including the usual American zones (PST, PDT, MST, MDT, CST, CDT, EST, EDT), the other North American zones (Bering through Newfoundland), European zones, Australian zones, and so on. Lacking a complete list at present (and unsure if an unambiguous list exists), authors of software are encouraged to keep this code flexible, and in particular not to assume that time zone names are exactly three letters long. Implementations are free to edit this field, keeping the time the same, but changing the time zone (with an appropriate adjustment to the local time shown) to a known time zone.

2.1.5 Newsgroups  The Newsgroups line specifies which newsgroup or newsgroups the article belongs in. Multiple newsgroups may be specified, separated by a comma. Newsgroups specified must all be the names of existing newsgroups, as no new newsgroups will be created by simply posting to them.

Wildcards (e.g., the word "all") are never allowed in a Newsgroups line. For example, a newsgroup "net.all" is illegal, although a newsgroup name "net.sport.football" is permitted.

If an article is received with a Newsgroups line listing some valid newsgroups and some invalid newsgroups, a site should not remove invalid newsgroups from the list. Instead, the invalid newsgroups should be ignored. For example, suppose site A subscribes to the classes "btl.all" and "net.all", and exchanges news articles with site B, which subscribes to "net.all" but not "btl.all". Suppose A receives an article with "Newsgroups: net.micro,btl.general". This article is passed on to B because B receives net.micro, but B does not receive btl.general. A must leave the Newsgroups line unchanged.
If it were to remove "btl.general", the edited header could eventually reenter the "btl.all" class, resulting in an article that is not shown to users subscribing to "btl.general". Also, followups from outside "btl.all" would not be shown to such users.
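The A-to-B example above can be sketched as a forwarding test that reads the Newsgroups line but never rewrites it. The matching rule used here is an assumption made for illustration: a pattern ending in ".all" is taken to cover every group under that prefix, and any other pattern must match exactly. The names matches and should_forward are hypothetical.

```python
def matches(pattern, group):
    """Assumed rule: 'net.all' covers every group beginning with 'net.'."""
    if pattern.endswith(".all"):
        # Strip "all" but keep the trailing dot, e.g. "net.all" -> "net."
        return group.startswith(pattern[:-3])
    return group == pattern


def should_forward(newsgroups_value, neighbor_patterns):
    """True if the neighbor receives at least one of the listed groups.

    Groups the neighbor does not receive are simply ignored; the
    Newsgroups value itself is left untouched by the caller.
    """
    groups = [g.strip() for g in newsgroups_value.split(",")]
    return any(matches(p, g) for g in groups for p in neighbor_patterns)
```

With these definitions, should_forward("net.micro,btl.general", ["net.all"]) is true, and the article would be sent to B with its Newsgroups line still reading "net.micro,btl.general".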
2.1.6 Subject  The Subject line (formerly "Title") tells what the article is about. It should be suggestive enough of the contents of the article to enable a reader to make a decision whether to read the article based on the subject alone. If the article is submitted in response to another article (e.g., is a "followup") the default subject should begin with the four characters "Re: " and the References line is required. (The user might wish to edit the subject of the followup, but the default should begin with "Re: ".)

2.1.7 Message-ID  The Message-ID line gives the article a unique identifier. The same message ID may not be reused during the lifetime of any article with the same message ID. (It is recommended that no message ID be reused for at least two years.) Message ID's have the syntax

    "<" "string not containing blank or >" ">"

In order to conform to RFC 822, the Message-ID must have the format

    "<" "unique" "@" "full domain name" ">"

where "full domain name" is the full name of the host at which the article entered the network, including a domain that host is in, and unique is any string of printing ASCII characters, not including "<", ">", or "@". For example, the "unique" part could be an integer representing a sequence number for articles submitted to the network, or a short string derived from the date and time the article was created. For example, a valid message ID for an article submitted from site ucbvax in domain Berkeley.ARPA would be "<4123@ucbvax.Berkeley.ARPA>". Programmers are urged not to make assumptions about the content of message ID fields from other hosts, but to treat them as unknown character strings. It is not safe, for example, to assume that a message ID will be under 14 characters, nor that it is unique in the first 14 characters.

The angle brackets are considered part of the message ID. Thus, in references to the message ID, such as the ihave/sendme and cancel control messages, the angle brackets are included.
White space characters (e.g., blank and tab) are not allowed in a message ID. All characters between the angle brackets must be printing ASCII characters.

2.1.8 Path  This line shows the path the article took to reach the current system. When a system forwards the message, it should add its own name to the list of systems in the Path line. The names may be separated by any punctuation character or characters, thus "cbosgd!mhuxj!mhuxt", "cbosgd, mhuxj, mhuxt", and "@cbosgd.uucp,@mhuxj.uucp,@mhuxt.uucp" and even "teklabs, zehntel, sri-unix@cca!decvax" are valid entries. (The latter path indicates a message that passed through decvax, cca, sri-unix, zehntel, and teklabs, in that order.) Additional names should be added from the left, for example, the most recently added name in the third example was "teklabs". Letters, digits, periods and hyphens are considered part of site names; other punctuation, including blanks, are considered separators.

Normally, the rightmost name will be the name of the originating system. However, it is also permissible to include an extra entry on the right, which is the name of the sender. This is for upward compatibility with older systems.

The Path line is not used for replies, and should not be taken as a mailing address. It is intended to show the route the message travelled to reach the local site. There are several uses for this information. One is to monitor USENET routing for performance reasons. Another is to establish a path to reach new sites. Perhaps the most important is to cut down on redundant USENET traffic by failing to forward a message to a site that is known to have already received it. In particular, when site A sends an article to site B, the Path line includes "A", so that site B will not immediately send the article back to site A. The site name each site uses to identify itself should be the same as the name by which its neighbors know it, in order to make this optimization possible.

A site adds its own name to the front of a path when it receives a message from another site. Thus, if a message with path A!X!Y!Z is passed from site A to site B, B will add its own name to the path when it receives the message from A, e.g., B!A!X!Y!Z.
If B then passes the message on to C, the message sent to C will contain the path B!A!X!Y!Z, and when C receives it, C will change it to C!B!A!X!Y!Z.

Special upward compatibility note: Since the From, Sender, and Reply-To lines are in internet format, and since many USENET sites do not yet have mailers capable of understanding internet format, it would break the reply capability to completely sever the connection between the Path header and the reply function. Thus, sites are required to continue to keep the Path line in a working reply format as much as possible, until January 1, 1984. It is recognized that the path is not always a valid reply string in older implementations, and no requirement to fix this problem is placed on implementations. However, the
existing convention of placing the site name and an "!" at the front of the path, and of starting the path with the site name, an "!", and the user name, should be maintained at least until 1984.

2.2 Optional Headers

2.2.1 Reply-To  This line has the same format as From. If present, mailed replies to the author should be sent to the name given here. Otherwise, replies are mailed to the name on the From line. (This does not prevent additional copies from being sent to recipients named by the replier, or on To or Cc lines.) The full name may be optionally given, in parentheses, as in the From line.

2.2.2 Sender  This field is present only if the submitter manually enters a From line. It is intended to record the entity responsible for submitting the article to the network, and should be verified by the software at the submitting site.

For example, if John Smith is visiting CCA and wishes to post an article to the network, using friend Sarah Jones' account, the message might read

    From: smith@ucbvax.uucp (John Smith)
    Sender: jones@cca.arpa (Sarah Jones)

If a gateway program enters a mail message into the network at site sri-unix, the lines might read

    From: John.Doe@CMU-CS-A.ARPA
    Sender: network@sri-unix.ARPA

The primary purpose of this field is to be able to track down articles to determine how they were entered into the network. The full name may be optionally given, in parentheses, as in the From line.

2.2.3 Followup-To  This line has the same format as Newsgroups. If present, follow-up articles are to be posted to the newsgroup(s) listed here. If this line is not present, followups are posted to the newsgroup(s) listed in the Newsgroups line, except that followups to "net.general" should instead go to "net.followup".

2.2.4 Date-Received  This line (formerly "Received") is in a legal USENET date format. It records the date and time that the article was first received on the local system.
If this line is present in an article being transmitted from one host to another, the receiving host should ignore it and replace it with the current date. Since this field is intended for local use only, no site is required to support it. However, no site should pass this field on to another site unchanged.
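The date layout from section 2.1.4 and the Date-Received rule above can be illustrated together. The sketch below formats a time as "Weekday, DD-Mon-YY HH:MM:SS TIMEZONE" and replaces any incoming Date-Received with a fresh one. The function names are hypothetical, the time zone is passed in as a plain string because this standard leaves the set of zone names open, and headers are assumed to be a list of (keyword, value) pairs.

```python
from datetime import datetime

# Spelled out here so the result does not depend on the process locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def usenet_date(moment, zone):
    """Format a datetime as 'Weekday, DD-Mon-YY HH:MM:SS TIMEZONE'."""
    return "%s, %d-%s-%02d %02d:%02d:%02d %s" % (
        WEEKDAYS[moment.weekday()], moment.day, MONTHS[moment.month - 1],
        moment.year % 100, moment.hour, moment.minute, moment.second, zone)


def stamp_date_received(headers, now, zone):
    """Drop any Date-Received sent by the other host and add our own."""
    kept = [(k, v) for (k, v) in headers if k.lower() != "date-received"]
    return kept + [("Date-Received", usenet_date(now, zone))]
```

For instance, usenet_date(datetime(1982, 11, 19, 16, 59, 30), "EST") yields the Date-Received value shown in the sample article of section 2.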
2.2.5 Expires  This line, if present, is in a legal USENET date format. It specifies a suggested expiration date for the article. If not present, the local default expiration date is used.

This field is intended to be used to clean up articles with a limited usefulness, or to keep important articles around for longer than usual. For example, a message announcing an upcoming seminar could have an expiration date the day after the seminar, since the message is not useful after the seminar is over. Since local sites have local policies for expiration of news (depending on available disk space, for instance), users are discouraged from providing expiration dates for articles unless there is a natural expiration date associated with the topic. System software should almost never provide a default Expires line. Leave it out and allow local policies to be used unless there is a good reason not to.

2.2.6 References  This field lists the message ID's of any articles prompting the submission of this article. It is required for all follow-up articles, and forbidden when a new subject is raised. Implementations should provide a follow-up command, which allows a user to post a follow-up article. This command should generate a Subject line which is the same as the original article, except that if the original subject does not begin with "Re: " or "re: ", the four characters "Re: " are inserted before the subject. If there is no References line on the original header, the References line should contain the message ID of the original article (including the angle brackets). If the original article does have a References line, the followup article should have a References line containing the text of the original References line, a blank, and the message ID of the original article.

The purpose of the References header is to allow articles to be grouped into conversations by the user interface program.
This allows conversations within a newsgroup to be kept together, and potentially users might shut off entire conversations without unsubscribing to a newsgroup. User interfaces may not make use of this header, but all automatically generated followups should generate the References line for the benefit of systems that do use it, and manually generated followups (e.g. typed in well after the original article has been printed by the machine) should be encouraged to include them as well.

2.2.7 Control  If an article contains a Control line, the article is a control message. Control messages are used for communication among USENET host machines, not to be read by users. Control messages are distributed by the same newsgroup mechanism as ordinary messages. The body of the Control header line is the message to the host.
For upward compatibility, messages that match the newsgroup pattern "all.all.ctl" should also be interpreted as control messages. If no Control: header is present on such messages, the subject is used as the control message. However, messages on newsgroups matching this pattern do not conform to this standard.

2.2.8 Distribution  This line is used to alter the distribution scope of the message. It has the same format as the Newsgroups line. User subscriptions are still controlled by Newsgroups, but the message is sent to all systems subscribing to the newsgroups on the Distribution line instead of the Newsgroups line. Thus, a car for sale in New Jersey might have headers including

    Newsgroups: net.auto,net.wanted
    Distribution: nj.all

so that it would only go to persons subscribing to net.auto or net.wanted within New Jersey. The intent of this header is to further restrict the distribution of a newsgroup, not to increase it. A local newsgroup, such as nj.crazy-eddie, will probably not be propagated by sites outside New Jersey that do not show such a newsgroup as valid. Wildcards in newsgroup names in the Distribution line are allowed. Followup articles should default to the same Distribution line as the original article, but the user can change it to a more limited one, or escalate the distribution if it was originally restricted and a more widely distributed reply is appropriate.

2.2.9 Organization  The text of this line is a short phrase describing the organization to which the sender belongs, or to which the machine belongs. The intent of this line is to help identify the person posting the message, since site names are often cryptic enough to make it hard to recognize the organization by the electronic address.

3. Control Messages

This section lists the control messages currently defined. The body of the Control header is the control message. Messages are a sequence of zero or more words, separated by white space (blanks or tabs).
The first word is the name of the control message; the remaining words are parameters to the message. The remainder of the header and the body of the message are also potential parameters; for example, the From line might suggest an address to which a response is to be mailed.
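The word-splitting rule above is simple enough to show directly. In this sketch, parse_control is a hypothetical helper that returns the message name and its parameters, and KNOWN_CONTROLS merely collects the message names defined in the subsections of this section.

```python
# Control message names defined in sections 3.1 through 3.7.
KNOWN_CONTROLS = {
    "cancel", "ihave", "sendme", "newgroup",
    "rmgroup", "sendsys", "senduuname", "version",
}


def parse_control(value):
    """Split a Control header body into (name, parameters).

    Words are separated by blanks or tabs.  A message of zero words
    yields (None, []).
    """
    words = value.split()
    if not words:
        return None, []
    return words[0], words[1:]
```

A site that queues messages for manual processing could use the same split and simply record the result instead of acting on it.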
Implementors and administrators may choose to allow control messages to be automatically carried out, or to queue them for manual processing. However, manually processed messages should be dealt with promptly.

3.1 Cancel

    cancel <message ID>

If an article with the given message ID is present on the local system, the article is cancelled. This mechanism allows a user to cancel an article after the article has been distributed over the network.

Only the author of the article or the local super user is allowed to use this message. The verified sender of a message is the Sender line, or if no Sender line is present, the From line. The verified sender of the cancel message must be the same as either the Sender or From field of the original message. A verified sender in the cancel message is allowed to match an unverified From in the original message.

3.2 Ihave/Sendme

    ihave <message ID list> <remotesys>
    sendme <message ID list> <remotesys>

This message is part of the "ihave/sendme" protocol, which allows one site (say "A") to tell another site ("B") that a particular message has been received on A. Suppose that site A receives article "ucbvax.1234", and wishes to transmit the article to site B. A sends the control message "ihave ucbvax.1234 A" to site B (by posting it to newsgroup "to.B"). B responds with the control message "sendme ucbvax.1234 B" (on newsgroup to.A) if it has not already received the article. Upon receiving the Sendme message, A sends the article to B.

This protocol can be used to cut down on redundant traffic between sites. It is optional and should be used only if the particular situation makes it worthwhile. Frequently, the outcome is that, since most original messages are short, and since there is a high overhead to start sending a new message with UUCP, it costs as much to send the Ihave as it would cost to send the article itself.

One possible solution to this overhead problem is to batch requests. Several message ID's may be announced or requested in one message.
If no message ID's are listed in the control message, the body of the message should be scanned for message ID's, one per line.
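The batching rule above (IDs on the control line, or else one per line in the body) might be handled as follows. This is a sketch under one stated assumption: the last parameter is always taken to be <remotesys>, since the example IDs in this section carry no angle brackets and so cannot be told apart from a site name by shape alone. The name parse_id_list is hypothetical.

```python
def parse_id_list(control_value, body):
    """Return (name, message_ids, remote_system) for ihave or sendme."""
    words = control_value.split()
    if not words or words[0] not in ("ihave", "sendme"):
        raise ValueError("not an ihave/sendme message: %r" % control_value)
    name, params = words[0], words[1:]
    # Assumption: the final parameter names the remote system.
    remote = params[-1] if params else None
    ids = params[:-1]
    if not ids:
        # No IDs on the control line: scan the body, one ID per line.
        ids = [line.strip() for line in body.splitlines() if line.strip()]
    return name, ids, remote
```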
3.3 Newgroup

    newgroup <groupname>

This control message creates a new newsgroup with the name given. Since no articles may be posted or forwarded until a newsgroup is created, this message is required before a newsgroup can be used. The body of the message is expected to be a short paragraph describing the intended use of the newsgroup.

3.4 Rmgroup

    rmgroup <groupname>

This message removes a newsgroup with the given name. Since the newsgroup is removed from every site on the network, this command should be used carefully by a responsible administrator.

3.5 Sendsys

    sendsys (no arguments)

The "sys" file, listing all neighbors and which newsgroups are sent to each neighbor, will be mailed to the author of the control message (Reply-to, if present, otherwise From). This information is considered public information, and it is a requirement of membership in USENET that this information be provided on request, either automatically in response to this control message, or manually, by mailing the requested information to the author of the message. This information is used to keep the map of USENET up to date, and to determine where netnews is sent.

The format of the file mailed back to the author should be the same as that of the "sys" file. This format has one line per neighboring site (plus one line for the local site), containing four colon separated fields. The first field has the site name of the neighbor, the second field has a newsgroup pattern describing the newsgroups sent to the neighbor. The third and fourth fields are not defined by this standard. A sample response:

    From cbosgd!mark Sun Mar 27 20:39:37 1983
    Subject: response to your sendsys request
    To: mark@cbosgd.UUCP
    Responding-System: cbosgd.UUCP

    cbosgd:osg,cb,btl,bell,net,fa,to,test
    ucbvax:net,fa,to.ucbvax:L:
    cbosg:net,fa,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews/cbosg
    cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
    sescent:net,fa,bell,btl,cb,to.sescent:F:/usr/spool/outnews/sescent
    npois:net,fa,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
    mhuxi:net,fa,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi

3.6 Senduuname

    senduuname (no arguments)

The "uuname" program is run, and the output is mailed to the author of the control message (Reply-to, if present, otherwise From). This program lists all uucp neighbors of the local site. This information is used to make maps of the UUCP network. The sys file is not the same as the UUCP L.sys file. The L.sys file should never be transmitted to another party without the consent of the sites whose passwords are listed therein.

It is optional for a site to provide this information. Some reply should be made to the author of the control message, so that a transmission error won't be blamed. It is also permissible for a site to run the uuname program (or in some other way determine the uucp neighbors) and edit the output, either automatically or manually, before mailing the reply back to the author. The file should contain one site per line, beginning with the uucp site name. Additional information may be included, separated from the site name by a blank or tab. The phone number or password for the site should NOT be included, as the reply is considered to be in the public domain. (The uuname program will send only the site name and not the entire contents of the L.sys file, thus, phone numbers and passwords are not transmitted.)

The purpose of this message is to generate and maintain UUCP mail routing maps. Thus, connections over which mail can be sent using the site!user syntax should be included, regardless of whether the link is actually a UUCP link at the physical level. If a mail router should use it, it should be included.
Since all information sent in response to this message is optional, sites are free to edit the list, deleting secret or private links they do not wish to publicise.

3.7 Version

    version (no arguments)

The name and version of the software running on the local system is to be mailed back to the author of the article (Reply-to if present, otherwise From).
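Looking back at the sendsys reply of section 3.5, a program that builds a USENET map needs to read the colon separated "sys" lines. The sketch below splits one such line into its four fields. The function name parse_sys_line and the dictionary keys are hypothetical; the third and fourth fields are returned uninterpreted because this standard does not define them, and missing trailing fields (as on the local site's line in the sample) are treated as empty.

```python
def parse_sys_line(line):
    """Split one 'sys' line into site, newsgroup patterns, and two
    fields whose meaning is left to the implementation."""
    # Split on at most three colons so any later colons stay in field 4.
    parts = line.rstrip("\n").split(":", 3)
    # Pad short lines so there are always exactly four fields.
    parts += [""] * (4 - len(parts))
    site, patterns, third, fourth = parts
    return {
        "site": site,
        "patterns": patterns.split(",") if patterns else [],
        "field3": third,
        "field4": fourth,
    }
```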
4. Transmission Methods

USENET is not a physical network, but rather a logical network resting on top of several existing physical networks. These networks include, but are not limited to, UUCP, the ARPANET, an Ethernet, the BLICN network, an NSC Hyperchannel, and a Berknet. What is important is that two neighboring systems on USENET have some method to get a new article, in the format listed here, from one system to the other, and once on the receiving system, processed by the netnews software on that system. (On UNIX systems, this usually means the "rnews" program being run with the article on the standard input.)

It is not a requirement that USENET sites have mail systems capable of understanding the ARPA Internet mail syntax, but it is strongly recommended. Since From, Reply-To, and Sender lines use the Internet syntax, replies will be difficult or impossible without an internet mailer. A site without an internet mailer can attempt to use the Path header line for replies, but this field is not guaranteed to be a working path for replies. In any event, any site generating or forwarding news messages must have an internet address that allows them to receive mail from sites with internet mailers, and they must include their internet address on their From line.

4.1 Remote Execution

Some networks permit direct remote command execution. On these networks, news may be forwarded by spooling the rnews command with the article on the standard input. For example, if the remote system is called "remote", news would be sent over a UUCP link with the command "uux - remote!rnews", and on a Berknet, "net -mremote rnews". It is important that the article be sent via a reliable mechanism, normally involving the possibility of spooling, rather than direct real-time remote execution. This is because, if the remote system is down, a direct execution command will fail, and the article will never be delivered.
If the article is spooled, it will eventually be delivered when both systems are up.

4.2 Transfer by Mail

On some systems, direct remote spooled execution is not possible. However, most systems support electronic mail, and a news article can be sent as mail. One approach is to send a mail message which is identical to the news message: the mail headers are the news headers, and the mail body is the news body. By convention, this mail is sent to the user "newsmail" on the remote machine.
One problem with this method is that it may not be possible to convince the mail system that the From line of the message is valid, since the mail message was generated by a program on a system different from the source of the news article. Another problem is that error messages caused by the mail transmission would be sent to the originator of the news article, who has no control over news transmission between two cooperating hosts and does not know who to contact. Transmission error messages should be directed to a responsible contact person on the sending machine.

A solution to this problem is to encapsulate the news article into a mail message, such that the entire article (headers and body) are part of the body of the mail message. The convention here is that such mail is sent to user "rnews" on the remote system. A mail message body is generated by prepending the letter "N" to each line of the news article, and then attaching whatever mail headers are convenient to generate. The N's are attached to prevent any special lines in the news article from interfering with mail transmission, and to prevent any extra lines inserted by the mailer (headers, blank lines, etc.) from becoming part of the news article. A program on the receiving machine receives mail to "rnews", extracting the article itself and invoking the "rnews" program. An example in this format might look like this:

    Date: Monday, 3-Jan-83 08:33:47 MST
    From: news@cbosgd.UUCP
    Subject: network news article
    To: rnews@npois.UUCP

    NRelay-Version: B 2.10 2/13/83 cbosgd.UUCP
    NPosting-Version: B 2.9 6/21/82 sask.UUCP
    NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
    NFrom: derek@sask.UUCP (Derek Andrew)
    NNewsgroups: net.test
    NSubject: necessary test
    NMessage-ID: <176@sask.UUCP>
    NDate: Monday, 3-Jan-83 00:59:15 MST
    N
    NThis really is a test. If anyone out there more than 6
    Nhops away would kindly confirm this note I would
    Nappreciate it. We suspect that our news postings
    Nare not getting out into the world.
     N

Using mail solves the spooling problem, since mail must always
be spooled if the destination host is down. However, it adds
more overhead to the transmission process (to encapsulate and
extract the article) and makes it harder for software to give
different priorities to news and mail.
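The "N" encapsulation described in this section can be sketched
in a few lines of Python. This is only an illustration under
stated assumptions, not the code used by any news implementation:
the function names encapsulate and extract are hypothetical, lines
are assumed to end in a single newline character, and the mail
headers are assumed to be separated from the body by one blank
line.

```python
def encapsulate(article: str, mail_headers: dict[str, str]) -> str:
    """Wrap a news article in a mail message, prefixing each article line with "N"."""
    header_block = "\n".join(f"{name}: {value}" for name, value in mail_headers.items())
    # Every line of the article, including empty ones, gets the "N" prefix,
    # so no article line can be mistaken for a mail header or a blank line.
    body = "\n".join("N" + line for line in article.split("\n"))
    return header_block + "\n\n" + body + "\n"


def extract(mail_message: str) -> str:
    """Recover the article from a mail message addressed to "rnews"."""
    # Assumption: the first blank line ends the mail headers.
    _, _, body = mail_message.partition("\n\n")
    # Lines without the "N" prefix were added by a mailer and are dropped.
    return "\n".join(line[1:] for line in body.split("\n") if line.startswith("N"))
```

Because extract keeps only the lines that begin with "N", blank
lines or other text inserted by an intermediate mailer do not end
up in the recovered article.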
4.3 Batching

Since news articles are usually short, and since a large number
of messages are often sent between two sites in a day, it may
make sense to batch news articles. Several articles can be
combined into one large article, using conventions agreed upon
in advance by the two sites. One such batching scheme is
described here; its use is still considered experimental.

News articles are combined into a script, separated by a header
of the form:

     #! rnews 1234

where 1234 is the length, in bytes, of the article. Each such
line is followed by an article containing the given number of
bytes. (The newline at the end of each line of the article is
counted as one byte, for purposes of this count, even if it is
stored as CRLF.) For example, a batch of articles might look
like this:

     #! rnews 374
     Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
     Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
     Path: cbosgd!mhuxj!mhuxt!eagle!jerry
     From: jerry@eagle.uucp (Jerry Schwarz)
     Newsgroups: net.general
     Subject: Usenet Etiquette -- Please Read
     Message-ID: <642@eagle.UUCP>
     Date: Friday, 19-Nov-82 16:14:55 EST

     Here is an important message about USENET Etiquette.
     #! rnews 378
     Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
     Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
     Path: cbosgd!mhuxj!mhuxt!eagle!jerry
     From: jerry@eagle.uucp (Jerry Schwarz)
     Newsgroups: net.followup
     Subject: Notes on Etiquette article
     Message-ID: <643@eagle.UUCP>
     Date: Friday, 19-Nov-82 17:24:12 EST

     There was something I forgot to mention in the last message.

Batched news is recognized because the first character in the
message is "#". The message is then passed to the unbatcher for
interpretation.
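As an illustration of the batching scheme, the following Python
sketch builds a batch and splits it apart again. It is a minimal
sketch, not the unbatcher of any real news system: the names
make_batch and unbatch are hypothetical, and articles are assumed
to be held as byte strings whose lines end in a single newline
byte, so that len() gives the count the header requires.

```python
def make_batch(articles: list[bytes]) -> bytes:
    """Concatenate articles, each preceded by a "#! rnews <length>" header line."""
    return b"".join(b"#! rnews %d\n" % len(article) + article for article in articles)


def unbatch(batch: bytes) -> list[bytes]:
    """Split a batch into articles using the byte counts in the header lines."""
    prefix = b"#! rnews "
    articles = []
    pos = 0
    while pos < len(batch):
        eol = batch.index(b"\n", pos)  # end of the header line
        header = batch[pos:eol]
        if not header.startswith(prefix):
            raise ValueError("expected a '#! rnews' header at byte %d" % pos)
        length = int(header[len(prefix):])
        start, end = eol + 1, eol + 1 + length
        if end > len(batch):
            raise ValueError("batch is shorter than its header claims")
        articles.append(batch[start:end])
        pos = end  # the next header, if any, begins right after the article
    return articles
```

Since the unbatcher advances by the stated byte count rather than
searching for the next header, an article body that happens to
contain a line beginning with "#! rnews" is not misread as a
separator.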
5. The News Propagation Algorithm

This section describes the overall scheme of USENET and the
algorithm followed by sites in propagating news to the entire
network. Since all sites are affected by incorrectly formatted
articles and by propagation errors, it is important for the
method to be standardized.

USENET is a directed graph. Each node in the graph is a host
computer; each arc in the graph is a transmission path from one
host to another host. Each arc is labelled with a newsgroup
pattern, specifying which newsgroup classes are forwarded along
that link. Most arcs are bidirectional, that is, if site A sends
a class of newsgroups to site B, then site B usually sends the
same class of newsgroups to site A. This bidirectionality is
not, however, required.

USENET is made up of many subnetworks. Each subnet has a name,
such as "net" or "btl". The special subnet "net" is defined to
be USENET, although the union of all subnets may be a superset
of USENET (because of sites that get local newsgroup classes but
do not get net.all). Each subnet is a connected graph, that is,
a path exists from every node to every other node in the subnet.
In addition, the entire graph is (theoretically) connected. (In
practice, some political considerations have caused some sites
to be unable to post articles reaching the rest of the network.)

An article is posted on one machine to a list of newsgroups.
That machine accepts it locally, then forwards it to all its
neighbors that are interested in at least one of the newsgroups
of the message. (Site A deems site B to be "interested" in a
newsgroup if the newsgroup matches the pattern on the arc from A
to B. This pattern is stored in a file on the A machine.) The
sites receiving the incoming article examine it to make sure
they really want the article, accept it locally, and then in
turn forward the article to all their interested neighbors. This
process continues until the entire network has seen the article.

An important part of the algorithm is the prevention of loops.
The above process would cause a message to loop along a cycle
forever. In particular, when site A sends an article to site B,
site B will send it back to site A, which will send it to site
B, and so on. One solution to this is the history mechanism.
Each site keeps track of all articles it has seen (by their
message ID) and whenever an article comes in that it has already
seen, the incoming article is discarded immediately. This
solution is sufficient to prevent loops, but additional
optimizations can be made to avoid sending articles to sites
that will simply throw them away.
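The effect of the history mechanism can be seen in a small
simulation. The Python sketch below is only an illustration: the
function flood and its arguments are hypothetical, every site is
assumed to appear as a key in the links table, and newsgroup
patterns are ignored so that every neighbor is treated as
interested.

```python
from collections import deque


def flood(links: dict[str, list[str]], origin: str, message_id: str) -> dict[str, int]:
    """Flood one article through the graph; return how many copies each site received."""
    history = {site: set() for site in links}   # message IDs each site has seen
    received = {site: 0 for site in links}
    history[origin].add(message_id)             # the posting site has seen its own article
    queue = deque((origin, neighbor) for neighbor in links[origin])
    while queue:
        sender, site = queue.popleft()
        received[site] += 1
        if message_id in history[site]:
            continue                            # already seen: discard, do not forward
        history[site].add(message_id)
        queue.extend((site, neighbor) for neighbor in links[site])
    return received


# Three sites in a line, with bidirectional links A <-> B <-> C.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(flood(links, "A", "<176@sask.UUCP>"))
```

The simulation terminates because each site forwards a given
message ID at most once. Sites A and B still receive a copy they
immediately discard, which is the waste the optimizations are
meant to reduce.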
One optimization is that an article should never be sent to a
machine listed in the Path line of the header. When a machine
name is in the Path line, the message is known to have passed
through the machine. Another optimization is that, if the
article originated on site A, then site A has already seen the
article. (Origination can be determined by the Posting-Version
line.)

Thus, if an article is posted to newsgroup "net.misc", it will
match the pattern "net.all" (where "all" is a metasymbol that
matches any string), and will be forwarded to all sites that
subscribe to net.all (as determined by what their neighbors send
them). These sites make up the "net" subnetwork. An article
posted to "btl.general" will reach all sites receiving
"btl.all", but will not reach sites that do not get "btl.all".
In effect, the article reaches the "btl" subnetwork. An article
posted to newsgroups "net.micro,btl.general" will reach all
sites subscribing to either of the two classes.
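The forwarding decision described here, pattern matching plus the
Path check, might be sketched in Python as follows. The function
names and the exact matching rule are assumptions made for
illustration: "all" is treated as a wildcard only when it forms a
whole dot-separated component of the pattern, the last element of
the Path line is assumed to be the poster's login name rather
than a machine, and the Posting-Version check is omitted. Real
news software may apply further rules.

```python
import re


def matches(newsgroup: str, pattern: str) -> bool:
    """Return True if the newsgroup matches a pattern such as "net.all"."""
    # Assumption: "all" stands for any string; other components match literally.
    parts = [".*" if part == "all" else re.escape(part) for part in pattern.split(".")]
    return re.fullmatch(r"\.".join(parts), newsgroup) is not None


def should_forward(newsgroups: str, path: str, neighbor: str, patterns: list[str]) -> bool:
    """Decide whether to send an article to a neighbor site."""
    # Optimization: never send to a machine already named in the Path line.
    machines = path.split("!")[:-1]  # drop the final element (the user name)
    if neighbor in machines:
        return False
    # Forward if any newsgroup of the article matches any pattern on the arc.
    return any(matches(group, pattern)
               for group in newsgroups.split(",")
               for pattern in patterns)
```

With these definitions, an article posted to
"net.micro,btl.general" is forwarded to a neighbor whose arc
carries only "btl.all", while an article whose Path already names
the neighbor is not sent back to it.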