Index Server and the FrontPage WAIS Search Engine
This month in the first article in our Index Server series, Marnie Hutcheson compares two Web site search engines and gives you some guidelines for choosing which engine is right for you.
May 9, 2000
Helping visitors find the information they're looking for on a Web site is crucial to the success of that site. Search engines let users query indexes of local and distributed content and databases and retrieve the requested data. The capabilities of a search engine and the cost to implement it are key factors in satisfying the search requirements of a Web site. The task of choosing a search engine that provides the right balance of function, features, and administrative overhead often falls to the administrator.
This month, I discuss two search engines that are available from Microsoft—Microsoft Index Server and the Microsoft FrontPage Wide Area Information Server (WAIS) search engine. Both search engines have their advantages and disadvantages. I present two scenarios to illustrate when you might want to use one or the other. I also discuss the features and functions of the FrontPage WAIS search engine, explain how it works, and describe how to configure IIS to let FrontPage use the WAIS search engine on a server running Index Server.
Index Server comes with both Windows 2000 (Win2K) and the Windows NT 4.0 Option Pack, and it has a snap-in for the Microsoft Management Console (MMC). The server lets you manage (i.e., add, change, and delete) directories in the index and create catalogs for specialized indexing. It also comes with some great sample search pages.
FrontPage comes with a built-in WAIS search engine that indexes Web content files and responds to client connections and queries by returning information on the files shared in the WAIS data directory. The server-side components of the WAIS search engine are installed when you install the FrontPage Server Extensions. You can find more information about the WAIS search engine at http://msdn.microsoft.com/isapi/msdnlib.idc?theurl=/library/winresource/dnwinnt/s7762.htm. You can also download the standalone version of the WAIS engine and the WAIS Toolkit, which includes the server's documentation, from this site. Table 1 provides a comparative overview of the two search engines' features.
Choosing the Right Search Engine for the Job
The selection of an appropriate search engine depends on how you've set up your Web domains. The following two scenarios describe when you would choose Index Server and when you would choose the FrontPage WAIS search engine.
Building a single-source document repository. In a setting in which the goal is a single-source document repository, you need to index many directories both on the server machine and across multiple server machines. You need a central index server that can index not only Web content (HTML), but other types of documents as well, such as Microsoft Word and Microsoft Excel documents. A catalog is an index of the contents of a set of directories. For example, I keep a catalog called techarticles, which contains all the directories in which my company stores technical articles. I have a search page that searches the indexes of this catalog. That page lets me use several attributes (e.g., title, author, keywords) to search for articles. You probably want to keep multiple catalogs to facilitate specific types of searches, but you also probably want to be able to search across all catalogs. Index Server lets you do both.
Index Server is a powerful indexing engine that lets you conduct searches across multiple domains on local and mapped directories. You can create catalogs that contain sets of directories. You can also search a catalog as an entity. For example, if you have a catalog for products and a catalog for technical support articles, you can structure search forms that let users search either or both catalogs.
However, when you install Index Server, it indexes private directories on the server (e.g., IIS Help files and samples). As a result, casual users, Web site authors, and potential intruders can find this information. If you don't want that kind of index available on your production server, you must remove all these directories from the Index Server default Web catalog. This situation highlights a serious disadvantage of using Index Server: You need to set up and maintain each catalog. You also need to update catalogs whenever you add new directories or Web sites to the server. In an intranet setting in which the document assets are important, it's worth the effort to build and maintain catalogs with Index Server. However, not all Web sites warrant this type of effort. Take, for example, a hosting server with 1000 or so small hosted Web sites. The cost of setting up and maintaining a catalog for each site would be too high for most small companies. The FrontPage WAIS search engine is a better tool for small companies that want search capability but can't afford the overhead of Index Server.
Creating autonomous Web sites. If you partition your server into multiple, distinct Web domains (as in an Internet hosting scenario), each Web domain needs to be a world of its own. In this type of setup, finding the time to manually build and maintain catalogs for each Web domain would be a problem. In addition, you don't want users who are searching one catalog in one Web site to receive search results from another catalog that happens to have its Web site on the same server. The built-in WAIS search engine in FrontPage automatically takes care of these problems. With the WAIS search engine, each Web site administrator manages the site's autonomous searches.
How the FrontPage WAIS Search Engine Works
The FrontPage 2000 WAIS search engine stores Web site catalogs that you create in the private directory _vti_txt\default.wti. These files aren't visible in FrontPage Explorer, but you can see them in Windows Explorer and in the IIS snap-in in the MMC. The search engine completely controls the content of this directory, so it will overwrite any edits you perform manually on these files the next time it indexes the site. The all.cat file has a listing of all the pages in the catalog for this FrontPage Web site. Sub-Web sites each have their own _vti_txt directories. You can also see other catalogs in this directory, such as _cusudi.cat, which is the catalog file for a discussion group that exists on this Web site. FrontPage and the WAIS search engine automatically maintain the discussion group catalog and its index as part of the FrontPage Discussion Group component. (The FrontPage Discussion Group Wizard creates a search page automatically for a discussion group when the Web site author creates the discussion group. The FrontPage Discussion Group Server Extensions cause the WAIS search engine to automatically reindex the discussion group every time there is a submission to it.)
When FrontPage uses the WAIS search engine, each FrontPage web and subweb has its own index. For example, adsresults2000 (http://www.americandrivingsociety.org/adsresults2000) is a subweb under the virtual server ads: If you use the search page in ads to conduct a search, it won't show files from the adsresults2000 subweb, even though adsresults2000 is also a subdirectory under ads.
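The per-web scoping described above can be pictured with a small toy model in Python. This is only a sketch of the idea, not how FrontPage implements its indexes: the function names (owning_web, build_indexes, search) and the page names are hypothetical, and the "search" is a plain substring match on the file path. Each file belongs to the deepest web root that contains it, and a query runs against one web's index only.

```python
from pathlib import PurePosixPath


def owning_web(path, web_roots):
    """Return the deepest web root that contains `path`."""
    target = PurePosixPath(path)
    containing = [
        root for root in web_roots
        if PurePosixPath(root) == target or PurePosixPath(root) in target.parents
    ]
    if not containing:
        raise ValueError("no web root contains %r" % path)
    return max(containing, key=lambda root: len(PurePosixPath(root).parts))


def build_indexes(files, web_roots):
    """Give every web and subweb its own list of files (its 'index')."""
    indexes = {root: [] for root in web_roots}
    for name in files:
        indexes[owning_web(name, web_roots)].append(name)
    return indexes


def search(indexes, web, term):
    """Toy query: match `term` against file paths in one web's index only."""
    return [name for name in indexes[web] if term.lower() in name.lower()]


if __name__ == "__main__":
    # "ads" and "adsresults2000" come from the article; the page names are made up.
    roots = ["/ads", "/ads/adsresults2000"]
    pages = ["/ads/index.htm", "/ads/search.htm", "/ads/adsresults2000/scores.htm"]
    idx = build_indexes(pages, roots)
    print(search(idx, "/ads", "scores"))
    print(search(idx, "/ads/adsresults2000", "scores"))
```

In this model, a search run from the parent web never sees scores.htm, even though the file sits in a subdirectory of the parent, because the subweb owns it.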
You can regenerate an index manually from the IIS snap-in in the MMC. Right-click the site, and select Task, Recalculate Web. FrontPage Web site authors can also force an index to update by selecting Recalculate Hyperlinks from the Tools menu in FrontPage Explorer.
Using WAIS to Create a Search Page for a FrontPage Web Site
You can use FrontPage Explorer to create a search page for a FrontPage Web site. To create a search page in FrontPage, choose File, New, Page. Select the Search Page from the list of available pages. Click the search form to select it, then right-click the selected form. When the Options dialog box opens, select Search Form Properties from the drop-down list to get to the dialog box that Screen 1 shows. You can set properties both for the search form and for the search results from this dialog box. The Search Form Properties are simply the labels that appear on the form when a user views that form. The Search Results properties, which Screen 2 shows, set the scope of the search and what information will appear in the results. When you initiate a query, FrontPage places the results on the search page below the search form, as Screen 3 shows.
One practical observation about this lightweight index engine: most FrontPage Web site authors aren't computer people, so they aren't accustomed to using naming conventions, content-management techniques, and so on. Consequently, the indexer can embarrass them by finding unlinked files in the Web domain (i.e., files in one of the directories on the FrontPage Web site). Take pains to warn authors to keep a clean house. When you have a search engine, a file that has no links to it still has the potential of showing up on a search results page.
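One way to help authors keep a clean house is to list the pages that nothing links to before the indexer finds them. The following Python sketch is an illustration, not a FrontPage tool: it walks a local copy of a Web site, collects href and src targets from every .htm and .html file, and reports pages that no other page references. The function name, the index.htm home page default, and the decision to skip directories whose names start with _vti_ are all assumptions, and the script follows only relative links.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect the href and src attribute values found in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_unlinked_pages(web_root, home_page="index.htm"):
    """Return HTML files under web_root that no page links to."""
    web_root = os.path.abspath(web_root)
    pages, linked = set(), set()
    for dirpath, dirnames, filenames in os.walk(web_root):
        # Assumption: _vti_* directories hold FrontPage metadata, not content.
        dirnames[:] = [d for d in dirnames if not d.startswith("_vti_")]
        for name in filenames:
            if not name.lower().endswith((".htm", ".html")):
                continue
            page = os.path.join(dirpath, name)
            pages.add(os.path.normcase(page))
            collector = LinkCollector()
            with open(page, encoding="utf-8", errors="replace") as handle:
                collector.feed(handle.read())
            for link in collector.links:
                parsed = urlparse(link)
                # Skip absolute URLs, root-relative links, and same-page anchors.
                if parsed.scheme or parsed.netloc or not parsed.path:
                    continue
                if parsed.path.startswith("/"):
                    continue
                target = os.path.join(dirpath, unquote(parsed.path))
                linked.add(os.path.normcase(os.path.normpath(target)))
    home = os.path.normcase(os.path.join(web_root, home_page))
    return sorted(page for page in pages if page not in linked and page != home)
```

Calling find_unlinked_pages on a local copy of the site returns the full paths of candidate orphan pages. Treat the list as a prompt for review rather than a delete list, because pages reached only through root-relative links, scripts, or forms will also appear in it.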
Index Server and the FrontPage WAIS Search Engine on the Same Machine
If you've never installed Index Server on the server machine, FrontPage automatically sets up the indexing directory and the WAIS search engine. If Index Server is already installed on a server when you install the FrontPage Server Extensions, the WAIS search engine still installs, but FrontPage uses Index Server by default. This default use of Index Server has caused a couple of problems in my business. First, Web site authors can still create search pages in FrontPage, but those pages don't work because FrontPage creates search pages that query only the WAIS search engine (which isn't the default if the server machine has Index Server running on it). You must create Index Server search pages and import them to the Web site before any searching can occur. Second, be advised that the way in which Index Server catalogs a site is completely different from the way the WAIS search engine catalogs a site.
When you install Index Server on a server machine and look at a Web site's properties in the IIS snap-in in the MMC, you'll see that the Index this directory check box is selected. Each new Web site, virtual server running FrontPage, and all the sub-Web sites will have the Index this directory property set automatically to on when you create the Web site. One problem is that this default doesn't specify which catalog Index Server places the directory in, which creates a lot of confusion for administrators, authors, and developers.
A situation came up recently in which I published a new Web site to a new IIS server running the Option Pack. I discovered after the site went live that the carefully tested search pages didn't work—they were looking for Index Server .idq files instead of WAIS files. During installation, Index Server had been installed by mistake, then uninstalled. Enough entries were left in the Registry to tell FrontPage to use Index Server instead of the WAIS server.
You can have FrontPage use the WAIS search engine instead of Index Server. If you want to use the FrontPage WAIS search engine on one or all of the FrontPage Web sites on your server, you have to change the FrontPage configuration variables. In FrontPage 2000, you make the variable changes in the Registry. Although I'll go over the steps for changing the Registry to enable the FrontPage WAIS engine on one or all servers, manipulating the Registry is risky. Thus, I recommend that you also read the Microsoft article "Using FrontPage 2000 WAIS Search Instead of Index Server" at http://support.microsoft.com/support/kb/articles/q201/5/24.asp. (If you're using FrontPage 98, you need to change the variables in the fpconfig.ini file. See the Microsoft article "How to Integrate FrontPage 98 and Index Server" at http://support.microsoft.com/support/kb/articles/q194/3/91.asp for information about changing these variables and the additional options you can set.)
Enabling the FrontPage WAIS engine on all servers. To enable the FrontPage WAIS search engine on all the virtual servers on your IIS server (you must have the FrontPage Server Extensions installed on those servers), follow these steps:
1. Choose Start, Run; type regedit.
2. Locate the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\All Ports Registry key.
3. Right-click All Ports, and select New, String Value.
4. For the name of the string, type noindexserver.
5. Double-click the string.
6. For the value of the string, type 1.
7. Exit regedit.
8. Recalculate the hyperlinks for each virtual server. (You can access the Recalculate Hyperlinks tool through the FrontPage client under the Tools menu. You can also access it in the MMC by right-clicking the virtual server, selecting Tasks, and then selecting Recalculate Web.)
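If you'd rather review the change as a file before applying it, the same string value can be written out as a .reg file and imported with regedit. The Python sketch below only generates the file's text. It assumes the REGEDIT4 text format, and the function and constant names are illustrative. It doesn't touch the Registry itself, and you should still back up the Registry and check the Microsoft article before importing anything.

```python
# Key from the steps above; the value name and data are the ones typed in regedit.
ALL_PORTS_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools"
    r"\Web Server Extensions\All Ports"
)


def reg_file_text(key, name, value):
    """Return REGEDIT4-format text that sets one string value under key."""
    lines = [
        "REGEDIT4",                          # header line of the text format
        "",                                  # blank line before the key section
        "[{}]".format(key),                  # the key to open or create
        '"{}"="{}"'.format(name, value),     # a string (REG_SZ) value
        "",                                  # makes the text end with a newline
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the text so it can be reviewed and saved as a .reg file by hand.
    print(reg_file_text(ALL_PORTS_KEY, "noindexserver", "1"))
```

After you import the file, you still need to recalculate the hyperlinks for each virtual server, as in the last manual step.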
Enabling the FrontPage WAIS engine on a specific server. If your server has many virtual servers, you first need to find out the W3SVC number of the virtual server you want to enable before you call up regedit.exe. You can find this number in several ways, the easiest of which is to go to the IIS snap-in in the MMC, right-click a virtual server, and select Properties. Then, click the Properties button next to the Active log format field at the bottom of the Web Site Properties dialog box, which opens the Extended Logging Properties dialog box for the log file format in effect for that server, as Screen 4 shows. The log file name at the bottom of the dialog box contains the W3SVC number. (In Screen 4, the number is 2.) You need this number to find the correct server instance in the Registry for this virtual server. When you know the number of the virtual server, follow these steps:
1. Choose Start, Run; type regedit.
2. Locate the Registry key for the server you want to change. For example, to change Port /LM/W3SVC/1, the first virtual server, find the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\Ports\Port /LM/W3SVC/1 Registry key.
3. Right-click Port /LM/W3SVC/1, select New, then select String Value.
4. For the name of the string, type noindexserver.
5. Double-click the string.
6. For the value of the string, type 1.
7. Repeat steps 1 through 6 for each virtual server you want to change.
8. Exit regedit.
9. Recalculate the hyperlinks for each virtual server.
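When you have many virtual servers to change, it helps to work out each Registry key name from the log file name before you open regedit. The Python sketch below pulls the W3SVC number out of a log file path and builds the matching key path in the form the steps above use. The function names and the sample log path are hypothetical. Check the resulting key against what regedit actually shows on your server.

```python
import re

# Parent key from the steps above; each virtual server has its own subkey.
PORTS_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools"
    r"\Web Server Extensions\Ports"
)


def instance_number(log_path):
    """Extract the W3SVC number from a log file name or path."""
    match = re.search(r"W3SVC(\d+)", log_path, re.IGNORECASE)
    if match is None:
        raise ValueError("no W3SVC number found in %r" % log_path)
    return int(match.group(1))


def port_key(instance):
    """Build the Registry key path for one virtual server instance."""
    return PORTS_KEY + "\\Port /LM/W3SVC/" + str(instance)


if __name__ == "__main__":
    # Hypothetical log file path, similar to the one shown in the dialog box.
    sample = r"C:\WINNT\System32\LogFiles\W3SVC2\exyymmdd.log"
    print(port_key(instance_number(sample)))
```

Running the functions over the log file names of the servers you plan to change gives you a checklist of keys to visit in regedit.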
In my experience, there is no problem having both search engines index a Web site. Index Server keeps its catalogs in a directory that you specify. Unlike FrontPage WAIS, Index Server usually stores all catalogs in one central directory, so no conflict occurs in the indexes. I've tested the simple Index Server Active Server Pages (ASP) search pages on my FrontPage Web sites and haven't had any problems.
Next Month
The FrontPage WAIS search engine is compact and fast but offers only simple text searches, and its scope is limited to one FrontPage Web site and its subdirectories. The search engine also indexes only text (.html) files. However, it's completely automatic. The WAIS search engine automatically creates and maintains the indexes in a private directory within the FrontPage web directory structure. In addition, any FrontPage Web site author can create a search page. After you've set up a page to run on the server, it has virtually no administrative overhead.
Index Server is a heavy-duty, industrial-strength search engine with huge scope capabilities. You can build catalogs of local and remote directories for it to index, and it can index a wide range of document types. You have a lot of options for creating query pages for your users to use in their searches. The downside is that you have to set up all this functionality, and the indexing process is large and can cause server slowdowns. Next month, I'll discuss the way in which Index Server creates its catalogs and the things you can add, change, and delete from those catalogs.