The Basics of Index Server
Learn the basics of Microsoft Index Server, including catalogs, scopes, and adding or deleting directories.
June 9, 2000
Last month, I compared the features of Microsoft Index Server with those of the Microsoft FrontPage Wide Area Information Server (WAIS) search engine and briefly discussed the WAIS search engine's capabilities. This month, I talk about Index Server.
In contrast to the simplicity of the FrontPage WAIS search engine, Index Server has a lot more to offer, but it requires more planning and setup. That's especially true if you have more than one virtual server on your machine and you want users to be able to search each virtual server individually (e.g., in a hosting situation). For example, I host http://www.americandrivingsociety.org and http://www.ideva.com on the same server machine. When users search http://www.ideva.com, they don't see results from directories in http://www.americandrivingsociety.org.
One of my Microsoft IIS administrator friends told me that he used an Ouija board to set up Index Server: Apparently, it provided better answers than any other source he could find. I'll attempt to do better here. (If you're already familiar with the basics and want to read about more advanced topics, see Ken Spencer's Windows 2000 Magazine article "Indexing Service at Your Fingertips," July 2000.)
Documentationally Challenged
Index Server 2.0 comes with Windows 2000 and the Microsoft Windows NT 4.0 Option Pack, but it isn't part of the default installation. All you have to do to install Index Server is select it. However, the product documentation that I received in the Option Pack is for Index Server 1.0, not 2.0, which brings me to an Index Server challenge—the lack of documentation.
To find help about using Index Server, choose Start, Programs, Windows NT 4.0 Option Pack, Product Documentation. When you turn to the chapters about Index Server, you'll have a start—but only a start. To see what I mean, go to http://www.microsoft.com and search for Index Server. Chances are, you'll find only a glossy press release touting the enthralling new features of 2.0, a couple of articles about advanced topics, a few bug reports, and a slew of (very short) frequently asked questions. The bottom line is, getting started is hard if you don't already know something about Index Server.
Index Server Catalogs
Index Server lets you use catalogs to conduct searches both on local directories and on directories on other machines. A catalog is the highest level of organization in Index Server. It contains the index of the contents of the directories in its scope (i.e., the collection of directories that is indexed and searched as a unit); each entry in the scope is simply a reference to a directory that you want to include in the catalog.
For example, I keep a catalog called techarticles that covers all the directories in which my company stores technical articles (in effect, those directories are the catalog's scope). I have a search page that searches this catalog's index. Index Server lets me use any of a number of attributes (e.g., title, author, keywords) to search for articles.
You probably want to keep multiple catalogs to facilitate specific types of searches; you also want to be able to search across multiple catalogs. For this function, I heartily recommend Microsoft Site Server, which is built on Index Server and provides the next step up the functional chain. Site Server lets you create catalogs that span multiple Web sites and virtual servers; it also lets you search one or more catalogs at the same time. (For more information about Site Server Search, see Tim Huckaby, "Implementing Site Server Search on Your E-Commerce Site," June 2000.)
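To make the catalog-and-scope relationship concrete, here is a minimal Python sketch. It is a toy model, not Index Server's API: the Catalog class, its covers method, and the directory paths are all hypothetical, and only the catalog name techarticles comes from my setup.

```python
from dataclasses import dataclass, field
from pathlib import PureWindowsPath


@dataclass
class Catalog:
    """Toy model of a catalog: a name plus the directories in its scope."""
    name: str
    scope: list[str] = field(default_factory=list)  # directory paths

    def covers(self, file_path: str) -> bool:
        """Return True if file_path is a scope directory or lies inside one."""
        target = PureWindowsPath(file_path)
        return any(
            PureWindowsPath(d) == target or PureWindowsPath(d) in target.parents
            for d in self.scope
        )


# Hypothetical directories, chosen only for illustration.
techarticles = Catalog(
    name="techarticles",
    scope=[r"D:\articles\iis", r"D:\articles\sql"],
)

in_scope = techarticles.covers(r"D:\articles\iis\index-server\basics.htm")
out_of_scope = techarticles.covers(r"D:\marketing\brochure.doc")
```

In this model, a file is searchable through the catalog only when one of the scope directories is the file's ancestor, which is the sense in which a scope is "indexed and searched as a unit."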
The Index Server Services (ISS) snap-in for the Microsoft Management Console (MMC) lets you add directories to existing catalogs and create new catalogs. Index Server also has a set of HTML administration pages that give you virtual root information and index statistics. In addition, you'll find several sample search pages under Start, Programs, Windows NT 4.0 Option Pack, Index Server, Index Server Sample Query Form. These samples range from simple HTML query forms to complex Microsoft SQL Server ad hoc query builders. From these samples, you can build your own search pages.
Taming the Default Web Catalog
Index Server requires more administration than the FrontPage WAIS search engine, which is limited to the FrontPage web in which it's defined. In the beginning, the concept was to index a Web site, and that was the limit of the scope of a search. Index Server can index much more than just the contents of a Web site or virtual server, but a virtual server is still the logical starting point for most catalogs.
When you install Index Server, it automatically creates a catalog called Web. This default catalog tracks the contents of the machine's default Web site. (If you're running MMC 1.0, Index Server is in the MMC with the IIS snap-in. If you've upgraded to Index Server 2.0 and MMC 1.1, then you administer Index Server from its instance of the MMC.) Figure 1 shows all the directories that Index Server added to Web on my machine by default plus the two at the end on drive H (I'll talk about them later in the article).
To understand more about the Web catalog, right-click Web and select Properties. The Properties dialog box contains three tabs—Location, Web, and Generation. The Location tab gives you the name of the catalog (e.g., Web). You also see the name and size of the directory in which Index Server stores the catalog files. If you browse the directory, you find a Catalog.wci subdirectory that contains all the index files for this catalog. When you create a new catalog, one of the parameters you specify is the location of the storage directory that will keep your catalog files.
The Web tab (not to be confused with the name of this catalog) lets you specify whether you want the catalog to track a particular virtual root (i.e., virtual server) or news server. In this case, you're tracking the default Web site.
From the Generation tab, you specify whether you want to filter files with unknown extensions. In Index Server terms, to filter a file is to extract its contents for indexing. If you select the Filter Files with unknown extension check box, the Indexer attempts to index every file it finds in every directory in the scope, including files whose extensions aren't on its list of registered types. If you clear this check box, the Indexer ignores files whose extensions aren't on its list. The other option (my personal favorite) is to generate characterizations and set their maximum size. The characterization is the bit of text that appears in the hit list under the document title and is labeled Abstract, as Figure 2 shows.
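The characterization setting is easy to picture in code. The following Python sketch shows one plausible way to build an abstract from a document's text and cap it at a maximum size. The function name, the 320-character limit, and the word-boundary trimming are my assumptions for illustration; I'm not claiming this is how Index Server generates its abstracts.

```python
def characterize(text: str, max_size: int = 320) -> str:
    """Build an abstract: collapse whitespace, keep at most max_size characters."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= max_size:
        return collapsed
    # If the cut falls exactly between two words, keep the whole slice.
    if collapsed[max_size] == " ":
        return collapsed[:max_size]
    # Otherwise drop the partial last word so the abstract doesn't end mid-word.
    cut = collapsed[:max_size].rsplit(" ", 1)[0]
    return cut or collapsed[:max_size]


abstract = characterize("Index   Server\nindexes documents", max_size=14)
```

A larger maximum size gives users more context in the hit list at the cost of longer result pages.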
Adding Directories to the Scope
You can add directories to a catalog in either the ISS snap-in or the IIS snap-in in the MMC. To add a directory to a catalog in the ISS snap-in, right-click the Directories folder under the catalog (e.g., Web), then select New Directory. From the Add Directory dialog box, which Figure 3 shows, you can add to the scope virtual directories that aren't under the virtual root. In other words, the scope for a catalog contains all the nonexcluded virtual directories in the virtual server that it's tracking and any directories that you've added. But the directories of the virtual server that the catalog is tracking appear in the scope only if you've selected the Index this directory check box on the virtual server's Properties dialog box.
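The scope rules in the preceding paragraph can be sketched as a small Python function. This is a conceptual model under my own assumptions: the VirtualDirectory class, its field names, and the sample paths are hypothetical and don't correspond to any Index Server or IIS programming interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDirectory:
    path: str
    index_this_directory: bool = True  # models the IIS check box
    excluded: bool = False             # models a directory marked for exclusion


def effective_scope(server_dirs, added_dirs):
    """Directories a catalog would search under the rules described above."""
    from_server = [
        d.path for d in server_dirs
        if d.index_this_directory and not d.excluded
    ]
    # Explicitly added directories join the scope regardless of the
    # tracked virtual server's check boxes.
    return from_server + [p for p in added_dirs if p not in from_server]


# Hypothetical layout for illustration only.
default_site = [
    VirtualDirectory(r"C:\InetPub\wwwroot"),
    VirtualDirectory(r"C:\InetPub\wwwroot\_vti_bin", excluded=True),
    VirtualDirectory(r"C:\InetPub\iissamples", index_this_directory=False),
]
scope = effective_scope(default_site, added_dirs=[r"H:\shared\docs"])
```

Here the scope ends up with the Web root and the explicitly added drive H directory; the excluded directory and the one with indexing turned off drop out.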
Now I come back to the last two directories that you see in Figure 1. Notice that I've added virtual directories to this catalog that aren't in the default Web site—in fact, they aren't even on this machine. So, this index and its search capabilities can span the entire machine and can even include virtual directories from other machines. This capability is great for an intranet setting in which you can dedicate the entire machine to a set of related virtual servers, such as bug tracking, development, and support. However, it isn't helpful in a hosting scenario in which each virtual server wants to seem autonomous. Index Server can handle both of these scenarios.
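For the hosting scenario, the essential idea is one catalog per virtual server, with each site's search page querying only its own catalog. The Python sketch below models that idea with an in-memory dictionary. The host names come from my server; the document paths, the text, and the search function are hypothetical stand-ins, not Index Server query syntax.

```python
# Hypothetical per-site catalogs: each virtual server gets its own catalog,
# so a search on one site never touches the other site's documents.
CATALOGS = {
    "www.ideva.com": {
        "/articles/index-server.htm": "index server catalogs and scopes",
        "/articles/frontpage.htm": "frontpage search engine basics",
    },
    "www.americandrivingsociety.org": {
        "/events/calendar.htm": "driving events calendar",
        "/about/index.htm": "society history and search tips",
    },
}


def search(host: str, term: str) -> list[str]:
    """Return paths in the host's own catalog whose text contains term."""
    catalog = CATALOGS.get(host.lower(), {})
    needle = term.lower()
    return sorted(path for path, text in catalog.items() if needle in text)


ideva_hits = search("www.ideva.com", "search")
ads_hits = search("www.americandrivingsociety.org", "search")
```

Both sites contain a document that mentions "search," but each query returns only the hit from its own site's catalog, which is what makes each hosted site seem autonomous.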
I set up the Web catalog to track my machine's default Web site. The catalog covers everything in the default Web site, including all its virtual directories and therefore some of the most private directories on the server (e.g., INETPUB, IISADMIN, ISSADMIN, IISSamples). Notice also that the Site Server directories appear in the directory listing for Web. They appear because they're virtual directories under the default Web site.
From a performance perspective, Index Server can have a major effect on the server, especially if you have a lot of users using the search pages. Indexing goes on quietly behind the scenes whenever a change occurs. For production, I prefer not to keep the Web catalog (for both performance and privacy reasons), so I simply delete it.
To delete a catalog, stop Index Server, then right-click the catalog (e.g., Web) and select Delete. If you forget to stop Index Server, an information window appears telling you to stop Index Server to delete the catalog.
Deleting a directory from a scope. How to remove a directory from a catalog's scope is the question I receive most frequently about Index Server. If you explicitly added a virtual directory to the catalog with the ISS snap-in, you can remove it from the catalog by simply deleting it in the ISS snap-in. If the virtual directory is part of a Web site that you've indexed, you can remove it from the catalog by clearing the Index this directory check box for the virtual directory you want to remove.
Clearing the Index this directory check box has two results. First, if any excluded child directories existed under the directory for which you turned indexing off, you'll see the Inheritance Overrides dialog box, which Figure 4 shows. Because a catalog tracks only one virtual server, by default, all or most of the directories in the virtual root become part of the scope. In Index Server 2.0, you can mark directories for exclusion. To exclude these directories, add them as new directories in the IIS snap-in and select the Exclude check box. When the virtual directory is a FrontPage web, FrontPage automatically excludes those directories that are private (e.g., FrontPage marks private directories such as _vti_bin for exclusion so that Index Server excludes them from the scope). The Inheritance Overrides dialog box gives you the opportunity to change the default Excluded status on any directory it lists. (You also see this dialog box any time you change the Index status of a virtual directory or virtual server in the IIS snap-in.)
The second result is that Index Server begins to update its indexes to reflect your changes. For example, when I clear the Index this directory check box on the default Web site, all that is left are the two virtual directories that I added in the ISS snap-in. (You need to close and reopen the ISS snap-in to make the new scope show up.)
And Finally
What about all those other virtual directories in the machine's other virtual servers for which you've selected the Index this directory check box? Where do their indexes go? I'll tell you what I know: You can't search them unless they belong to the scope of a catalog. So, even if Index Server has indexed them, you'll have to add them to a catalog before you can search them. Now that you have read all this, adding a catalog to your virtual server and controlling its scope will be a snap! Go to http://localhost/iissamples/isssamples/default.htm on your local IIS server for detailed instructions about creating, configuring, moving, and deleting catalogs. (This URL appears on your server when you install Index Server from the Option Pack.)
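If you want to check for directories that are marked for indexing but can't be searched, the logic is simple enough to sketch in Python. This is a conceptual helper under my own assumptions: the function name, the catalog dictionary, and the paths are hypothetical, and you would have to collect the real directory lists yourself from the snap-ins.

```python
def uncatalogued(indexed_dirs, catalogs):
    """Return indexed directories that no catalog's scope lists."""
    # Compare case-insensitively, as Windows paths are not case-sensitive.
    covered = {d.lower() for scope in catalogs.values() for d in scope}
    return [d for d in indexed_dirs if d.lower() not in covered]


# Hypothetical data for illustration only.
catalogs = {
    "Web": [r"C:\InetPub\wwwroot"],
    "techarticles": [r"D:\articles"],
}
indexed = [r"C:\InetPub\wwwroot", r"D:\articles", r"E:\sites\ideva"]
orphans = uncatalogued(indexed, catalogs)
```

Any directory that the helper reports is a candidate for adding to an existing catalog or for a new catalog of its own.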
Next Month
Index Server provides a rich set of cataloging and searching options. But someone must do the work to create and maintain the catalogs and the search pages. Good catalogs require some planning and, of course, you must update them whenever you add new directories or Web sites to the server. In an intranet setting in which the document assets are important, building and maintaining good catalogs is definitely worth the effort.
Next month, I'll show you how to use Index Server's sample search pages and tie them to your catalog. I'll also discuss more advanced Index Server features and how you can use Site Server Search to access them easily.