Network News Transfer Protocol
Learn how to configure NNTP to bring newsgroups to your users.
July 9, 2001
Understanding and configuring NNTP in your organization
The Information Age offers many options for interacting with various electronic communities. You can subscribe to these communities to interact with peers, keep abreast of how others are using technologies, or simply get the latest sports scores. Email and Web portals aren't the only ways to obtain this information. Another established and standardized source is an Internet newsgroup. To use newsgroups effectively, you need to know how the Network News Transfer Protocol (NNTP) works and how to configure your server to use it.
NNTP and USENET
NNTP is the protocol for implementing USENET newsgroups, which are a hierarchy of discussion lists covering a wide array of topics. A newsgroup's name is similar to a domain name in that it uses the dot (.) notation to separate each level of the hierarchy (e.g., rec.food.drink.beer, rec.food.drink.coffee). Originally, USENET had nine root hierarchies, which Figure 1 shows, but as USENET has grown in popularity, countries, groups, and organizations have started more roots. Recently, I found more than 46,000 different newsgroups in which you can participate; a few groups appear at the bottom of Figure 1. Today, the top five roots, based on the number of subcategories, are alt, microsoft, comp, clari, and rec.
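The dotted naming scheme is easy to work with programmatically. The following is a minimal Python sketch, not part of any NNTP tool, that extracts the root hierarchy from a group name and counts groups per root. The function names are illustrative, and comp.lang.c is included only as a sample name alongside the groups mentioned in this article.

```python
from collections import Counter


def root_hierarchy(group_name: str) -> str:
    """Return the top-level (root) hierarchy of a dotted newsgroup name."""
    return group_name.split(".", 1)[0]


def count_by_root(group_names):
    """Count how many newsgroups fall under each root hierarchy."""
    return Counter(root_hierarchy(name) for name in group_names)


# Sample names; a real list would come from a server's active file.
groups = [
    "rec.food.drink.beer",
    "rec.food.drink.coffee",
    "comp.lang.c",
    "microsoft.public.exchange.connectivity",
]
counts = count_by_root(groups)
```

Running the same count over a full active file is one way to see which roots dominate a given feed.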
The discussion groups are hosted on a series of servers, and clients (i.e., newsreaders) connect to the servers to post or retrieve articles. NNTP defines the way in which the newsreaders interact with the server, as well as how the servers distribute articles among themselves. Unlike DNS, NNTP has no specific set of root servers. In the USENET world, servers are generally peers and have knowledge of one or two other servers with which they trade articles, as Figure 2 shows. As articles propagate, each server holds the postings for the newsgroups it carries, and no one server is considered the master source for the content. As new messages accumulate on a news server, the server periodically connects to its peer servers, then sends all new articles. The exchange of articles between servers is referred to as a newsfeed. The sidebar "The Two Types of NNTP Feeds," page 12, explains the mechanisms the servers use to exchange articles.
An example illustrates both the growth of USENET and the volume of traffic that circulates. Just a few years ago, storing a full newsfeed (i.e., one that includes all available newsgroups) for about 5 days required about 50GB of disk storage. Today, you might need 75GB or more per day for a full feed, depending on the amount of multimedia content in the postings.
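To put those figures side by side, here is a short worked calculation in Python. It uses only the numbers quoted above (50GB for about 5 days, and 75GB per day today); the function names are illustrative, and the 5-day retention period applied to the current rate is an assumption chosen for comparison.

```python
def daily_rate_gb(total_gb: float, days: int) -> float:
    """Average daily feed volume implied by a total stored over some days."""
    return total_gb / days


def storage_needed_gb(daily_gb: float, retention_days: int) -> float:
    """Disk space needed to hold a feed for the given retention period."""
    return daily_gb * retention_days


earlier_daily = daily_rate_gb(50, 5)                   # 10 GB per day a few years ago
current_daily = 75                                     # GB per day for a full feed today
growth = current_daily / earlier_daily                 # 7.5 times the earlier rate
five_day_store = storage_needed_gb(current_daily, 5)   # 375 GB for the same 5 days
```

In other words, keeping the same 5 days of a full feed now takes roughly 375GB instead of 50GB, before allowing for any further growth.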
Although a core set of servers contains the entire discussion hierarchy, news servers aren't required to store and provide access to all 46,000-plus newsgroups. The administrator of each server can determine which newsgroups to host and offer to newsreaders. The only catch is that a feeder server must carry the desired newsgroups. If the desired groups aren't available from one feed, you can usually acquire them by establishing a feed with a second source. Sometimes, this process is how the loops in Figure 2 form.
To illustrate how articles propagate, consider a message that originates as a post on Server A in Figure 2. Assume that the lines between the servers represent feed configurations that support bidirectional article propagation. Because Server A is connected to two peers, it propagates the post to Servers B and D. The message then flows to Server E over two routes: A-D-E and A-B-C-E. When a user generates a message for a newsreader to post, the newsreader creates a unique article ID for the message. When articles are sent from one server to another, the sending server transmits the article ID. The receiving server checks a history database to determine whether it has already received the message. If the article ID exists in the history database, the server rejects the message. In this example, if the article reaches Server E over the A-D-E route, when the same article travels by way of the A-B-C-E route, Server E will reject the article because Server E has the article ID in its history database. This fact doesn't mean that articles will always reach Server E over the A-D-E route. Factors such as the frequency and speed of the connections between the various peers influence the propagation path.
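The duplicate check can be illustrated with a small simulation. The Python sketch below is a simplified model, not how any particular news server is implemented: it wires up only the peer links named in the example (A-B, A-D, B-C, C-E, D-E), gives each server an in-memory set as its history database, and delivers offers in first-in, first-out order, which amounts to assuming that every link is equally fast. The article ID is a made-up value.

```python
from collections import deque

# Bidirectional peer links taken from the routes described in the text.
PEERS = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "E"],
    "D": ["A", "E"],
    "E": ["C", "D"],
}


def propagate(origin, article_id, peers):
    """Flood one article from origin; return (accepted, rejected) offers."""
    history = {server: set() for server in peers}  # per-server history database
    history[origin].add(article_id)
    accepted, rejected = [], []
    # Each queue entry is an offer: (sending server, receiving server).
    queue = deque((origin, peer) for peer in peers[origin])
    while queue:
        sender, receiver = queue.popleft()
        if article_id in history[receiver]:
            rejected.append((sender, receiver))  # already seen: reject
            continue
        history[receiver].add(article_id)
        accepted.append((sender, receiver))
        # Pass the article on to every peer except the one it came from.
        for peer in peers[receiver]:
            if peer != sender:
                queue.append((receiver, peer))
    return accepted, rejected


accepted, rejected = propagate("A", "<example-1@servera.example>", PEERS)
```

Under the equal-speed assumption, Server E accepts the article from Server D (the shorter route) and rejects the later offer from Server C. Changing the delivery order to model a slow A-D link would let the A-B-C-E copy win instead.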
To carry the illustration further, assume that Servers A through D carry only the original USENET root hierarchies and that Server H hosts the Microsoft newsgroups. If users on Server E want the original hierarchies and the Microsoft hierarchies, Server E can establish multiple feeds so that it can provide a broader spectrum of content. Because each feed determines exactly which hierarchies flow between servers, the Microsoft groups aren't propagated out to Servers D and C; likewise, the USENET content doesn't flow out to Server H.
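One way to picture per-feed filtering is as a list of wildcard patterns attached to each peer. The following Python sketch models Server E's feeds under that assumption. The pattern lists are illustrative: comp, rec, and sci stand in for the original hierarchies, and the wildcard syntax is Python's fnmatch rather than any server's configuration format.

```python
from fnmatch import fnmatchcase

# Hypothetical feed definitions for Server E: peer name -> group patterns.
FEEDS = {
    "C": ["comp.*", "rec.*", "sci.*"],
    "D": ["comp.*", "rec.*", "sci.*"],
    "H": ["microsoft.*"],
}


def peers_carrying(group, feeds):
    """Return the peers whose feed patterns match the given newsgroup."""
    return sorted(
        peer
        for peer, patterns in feeds.items()
        if any(fnmatchcase(group, pattern) for pattern in patterns)
    )


microsoft_peers = peers_carrying("microsoft.public.exchange.connectivity", FEEDS)
rec_peers = peers_carrying("rec.food.drink.beer", FEEDS)
```

A post to a microsoft group matches only the feed with Server H, so it never travels to Servers C or D, and a post to a rec group never travels to Server H.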
Planning to Host Newsfeeds
The first step in setting up a news server is to determine whether your ISP can provide you with a newsfeed and which newsgroups are available in that feed. Usually you will review what is called an active file to determine which newsgroups are available. An active file lists all the newsgroups that are active or available from an NNTP server. The Web sidebar "How to Interpret the NNTP Active File," http://www.exchangeadmin.com, InstantDoc ID 21476, explains the format of the active file. Many ISPs make the active file available on their FTP sites or email you a copy when you inquire about establishing a newsfeed. You might also find a copy on the Web site on which the ISP describes its NNTP service offerings.
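If you want to inspect an active file with a script, the sketch below shows one way to read it in Python. It assumes the conventional layout returned by the NNTP LIST command (group name, highest article number, lowest article number, and a posting flag, separated by spaces), optionally preceded by a 215 status line and followed by a line containing a single period. The sample article numbers are invented, and an active file from your ISP might differ in detail.

```python
from typing import NamedTuple


class ActiveEntry(NamedTuple):
    group: str  # newsgroup name
    high: int   # highest article number on the server
    low: int    # lowest article number on the server
    flag: str   # posting flag, e.g. y, n, or m


def parse_active(text):
    """Parse active-file text into a list of ActiveEntry records."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the status line, the terminating ".", and malformed lines.
        if len(fields) != 4 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        group, high, low, flag = fields
        entries.append(ActiveEntry(group, int(high), int(low), flag))
    return entries


SAMPLE = """215 list of newsgroups follows
rec.food.drink.beer 0000012345 0000012001 y
microsoft.public.exchange.connectivity 0000004321 0000004000 y
.
"""
entries = parse_active(SAMPLE)
```

Once the file is parsed, listing the group names or counting the groups under a given root takes only a line or two.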
The next step is to determine whether you're going to host a full feed or only selected discussions on your servers. The most important determining factors are the amount of disk space required to store the news content, the amount of bandwidth required to transfer that content from your ISP to your news server, and the appropriateness of the newsgroups.
The amount of disk space needed will vary depending on the type and quantity of groups you decide to host. For example, the alt.binaries groups will use more disk space than groups that contain mostly discussion text. How long you store the articles will also affect your disk requirements. Most people keep articles for between 5 and 10 days, but you can also set different expiration intervals for different discussion groups (e.g., set articles to expire every 2 days for groups with high posting volumes or message sizes and keep articles for groups with smaller posting volumes for 14 days). You probably won't know precisely the amount of bandwidth you'll need, because the bandwidth required depends on the number of posts made per day and the content of those posts. In most cases, you probably won't pull a full newsfeed but instead will choose specific discussions such as microsoft.public.exchange.connectivity or specific hierarchies such as comp or sci.
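A rough storage estimate follows directly from the daily volume of each group and its expiration interval. The Python sketch below applies the 2-day and 14-day intervals from the example above; the 7-day default, the pattern syntax, the group alt.binaries.example, and the daily volumes are all hypothetical values chosen for illustration.

```python
from fnmatch import fnmatchcase

# (pattern, days to keep articles); the first matching rule wins.
EXPIRY_RULES = [
    ("alt.binaries.*", 2),   # high volume, large messages
    ("microsoft.*", 14),     # lower volume, keep longer
]
DEFAULT_DAYS = 7             # assumed default, within the usual 5 to 10 days


def retention_days(group, rules=EXPIRY_RULES, default=DEFAULT_DAYS):
    """Return how many days articles in this group are kept."""
    for pattern, days in rules:
        if fnmatchcase(group, pattern):
            return days
    return default


def estimate_storage_mb(daily_mb_by_group):
    """Steady-state disk use: each group's daily volume times its retention."""
    return sum(mb * retention_days(group) for group, mb in daily_mb_by_group.items())


# Hypothetical daily volumes in MB.
daily_volumes = {
    "alt.binaries.example": 500,
    "microsoft.public.exchange.connectivity": 2,
    "comp.lang.c": 5,
}
total_mb = estimate_storage_mb(daily_volumes)  # 500*2 + 2*14 + 5*7 = 1063
```

Even with a short expiration interval, the single binaries group accounts for almost all of the estimated space, which is why per-group expiration is worth the effort.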
If you want to subscribe to a limited number of discussions, you can get a sense of what type of activity you'll see. If your ISP has configured a news server for newsreader connections, use a newsreader (e.g., Microsoft Outlook Express) to monitor the posting volume and content by connecting directly to the ISP's news server. If you're planning to host a large variety or number of discussions, you need to start out slowly and include only a few discussions in your initial feed. When you have a feel for the bandwidth and storage requirements, ask the ISP to expand the feed to include more discussion groups.
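If you would rather put numbers on the posting volume than eyeball it in a newsreader, one option is to compare the highest article number for each group at two points in time. The Python sketch below assumes you have recorded those numbers yourself (for example, from two copies of the ISP's active file) and that the server numbers articles sequentially within each group. The figures shown are invented.

```python
def posts_per_day(earlier_high, later_high, days_between):
    """Estimate average daily posts per group from two sets of high article numbers."""
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    rates = {}
    for group, high in later_high.items():
        if group in earlier_high:  # only groups present in both snapshots
            rates[group] = (high - earlier_high[group]) / days_between
    return rates


# Hypothetical snapshots taken 7 days apart.
monday = {"microsoft.public.exchange.connectivity": 4000}
next_monday = {"microsoft.public.exchange.connectivity": 4140}
rates = posts_per_day(monday, next_monday, 7)
```

This approach measures article counts only; you still need to sample message sizes to translate the rate into bandwidth and disk space.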
Next, you need to decide how your user community will access the news articles and whether to distribute the load by hosting newsgroups on several machines. With Exchange, common practice is to host newsgroups in public folders. Exchange 2000 Server gives you the option of hosting the newsgroups and content as folders and files on the file system, but public folders are your best choice because public folders let you index the content and provide for multiprotocol access with NNTP, Messaging API (MAPI), Outlook Web Access (OWA), and IMAP clients.
If your users access the newsgroups through the public folder hierarchy by using a MAPI client such as Outlook, they don't need to know which server hosts the content. If you choose to replicate the Internet newsgroups public folders to other servers, your users also receive the benefit of having redundant access. If your users connect through a newsreader such as Outlook Express or TIN (a popular UNIX newsreader), they don't have transparent or redundant access and will generally need to know the name of the server or servers hosting the content. If you decide to provide more than one news server by replicating public folders, you must consider the bandwidth required for replication of the folder content. As the Web sidebar "Using NNTP to Share Public Folders Between Exchange Organizations," http://www.exchangeadmin.com, InstantDoc ID 21475, explains, NNTP lets you do more than host USENET newsgroups.
Configuring a Newsfeed
You need to perform the same configuration tasks no matter which Exchange version you're using, but the way in which you accomplish each task differs. To configure a newsfeed, you must accomplish these tasks:
You need to create the hierarchy of newsgroup folders on the server to hold the articles.
If you want to use multiple feeds, you need to decide which feeds will support which groups, inbound and outbound.
You need to specify the ISP's news server and the feed type and configure any username and password necessary for authentication.
Exchange 2000 and Exchange Server 5.5 require slightly different procedures for setting up the newsfeeds. Both use wizards to prompt you for vital information and lead you through the configuration. Figure 3 shows the wizard page for defining the feed type in Exchange 5.5.
Exchange 2000 uses separate wizards to define the newsgroup hierarchy and the feeds to the ISP, whereas Exchange 5.5 uses one wizard to configure the feed information and create the newsgroups. Another way that the two versions differ is in how they let you create the newsgroup containers en masse in the Public Information Store. The Exchange 2000 wizard lets you create newsgroups only one at a time, but Microsoft provides a separate VBScript file called rgroup.vbs, which resides by default in %winnt%\system32\inetsrv and lets you import the active file (as a separate step) to create the public folders. The Exchange 5.5 wizard lets you specify an active file from which the wizard will create the necessary public folders. If you don't have the file, the wizard fetches one from your ISP's news server.
Regardless of which Exchange version you use to host your newsgroups, I recommend that you get the active file before you run the wizards. Edit the file so that it contains only the groups that you'll host. Web Table 1 on the Exchange & Outlook Administrator Web site shows the steps needed to configure newsfeeds that support both inbound and outbound traffic, assuming the same server will handle inbound and outbound traffic on the ISP's side.
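Trimming a large active file by hand is tedious, so a short script can help. The Python sketch below keeps only the lines whose group name matches a list of wildcard patterns. It assumes each line begins with the group name, passes a leading 215 status line and a terminating period line through untouched, and uses Python's fnmatch wildcard syntax. The patterns and sample lines are illustrative.

```python
from fnmatch import fnmatchcase


def trim_active(lines, wanted_patterns):
    """Keep the status line and only the groups matching the wanted patterns."""
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # drop blank lines
        name = fields[0]
        # Keep the 215 status line and any terminating "." line as-is.
        if name in ("215", ".") or any(
            fnmatchcase(name, pattern) for pattern in wanted_patterns
        ):
            kept.append(line)
    return kept


# Hypothetical active-file lines.
lines = [
    "215 list of newsgroups follows",
    "rec.food.drink.beer 10 1 y",
    "alt.binaries.example 99 1 y",
    "microsoft.public.exchange.connectivity 5 1 y",
]
trimmed = trim_active(lines, ["microsoft.*", "rec.food.*"])
```

Review the trimmed file before you feed it to a wizard or script, because a pattern that is too broad can quietly pull in far more groups than you intended.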
Some settings that you specify, such as the feed type, depend on your ISP's policies. In most cases, an ISP will let you establish only a push feed because push feeds let the ISP have more control over the load on its system.
Many ISPs provide you with at least two NNTP server names and define inbound and outbound roles for those servers. When you use the new newsfeed wizard to define your push/accept feed, you can specify that you want the designated hosts to handle both inbound and outbound traffic. You're creating two separate feeds: the push feed and the accept feed. Exchange 2000 groups the push and accept feeds into the same entry for easier administration and organization. If your ISP has separate servers for inbound and outbound feeds, you must configure separate feeds—one specifying only the inbound push server and the other specifying only the outbound accept server.
A few points about the configuration process are worth noting. First, you and the ISP must select the same groups on both sides of the feed. If you attempt to push to the ISP groups that it isn't expecting, the ISP's news server will reject posts to those groups.
Second, the Exchange 2000 newsfeed wizard prompts you to specify a server role as peer, master, or slave. You need master and slave roles when you have a bridgehead NNTP server communicating with your ISP and one or more NNTP servers to which newsreaders connect. In most cases, you specify the server role as a peer.
Third, when you use the rgroup.vbs script to create newsgroups in bulk, you must ensure that the active file is formatted correctly. The script looks for an NNTP command response code 215 on the first line of the file. The Web sidebar "How to Interpret the NNTP Active File" explains this code. If this code isn't present, the script won't add any groups. Also, be sure that you don't duplicate any group names in the file. If the script encounters a duplicate name, it will terminate, and the groups listed after the duplicate won't be added. You can restart the process by removing from the file the groups that have already been processed and rerunning the script. But remember to leave the 215 line at the top of the file.
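You can catch both problems before you run the script. The Python sketch below is a standalone pre-flight check, not part of rgroup.vbs. It reports a missing 215 code on the first line and any duplicated group names, and it assumes that each remaining line begins with the group name.

```python
def check_active_file(lines):
    """Return a list of problems found in the active-file lines (empty if none)."""
    problems = []
    # The first line must begin with the NNTP response code 215.
    if not lines or lines[0].split()[:1] != ["215"]:
        problems.append("line 1: response code 215 is missing")
    seen = set()
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Ignore blank lines, the status line, and a terminating "." line.
        if not fields or fields[0] in ("215", "."):
            continue
        name = fields[0]
        if name in seen:
            problems.append(f"line {number}: duplicate group {name}")
        seen.add(name)
    return problems


good = [
    "215 list of newsgroups follows",
    "rec.food.drink.beer 10 1 y",
    "rec.food.drink.coffee 20 1 y",
]
bad = [
    "rec.food.drink.beer 10 1 y",
    "rec.food.drink.beer 10 1 y",
]
good_problems = check_active_file(good)
bad_problems = check_active_file(bad)
```

An empty result means the file passes these two checks; it doesn't guarantee that the script will accept every line.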
Fourth, although you have configured a push/accept feed, your server, by default, is unsecured because other NNTP hosts can establish a pull feed from your server. You usually want to prevent a pull feed because pull feeds can create a significant burden on the server. In Exchange 2000, you can prevent pull feeds by clearing the Allow servers to pull articles check box on the NNTP virtual server's settings property page. To disable pull feeds in Exchange 5.5, you need to set a registry parameter. The Microsoft article "XFOR: Disabling Support for Pull Newsfeeds" (http://support.microsoft.com/support/kb/articles/q173/9/46.asp) explains the procedure.
Where to Go from Here
The sheer volume of data that you would store if you accepted a full feed is intimidating. Make sure that you understand what you're getting yourself into if you decide to accept a full feed. Don't forget about the bandwidth required to bring the news content into your site. A huge amount of data will be transmitted when the feed first starts. In almost all cases, you need to set expiration policies on the news content and carefully monitor the feed.
For links to background articles, see the Web sidebar "Useful Resources About NNTP," http://www.exchangeadmin.com, InstantDoc ID 21477. As with all processes, careful monitoring, management, and maintenance of the resource will keep your system running and your users happy.