Search Google With ASP.NET
Use the Google Web Service and a proxy object to add Web-wide searching functionality to your Web apps quickly and easily.
October 30, 2009
XML has been one of the biggest marketing buzzwords in the technology world for the past few years. With the introduction of Web Services, though, XML has some competition in the buzzword category. The question is, with all of the hype surrounding XML and Web Services, are they really that useful?
The answer to this question depends upon how you use the technologies to solve real business problems, such as exchanging data between distributed sources. In this article, you'll see how you can use XML and Web Services to integrate data provided by Google.com into ASP.NET applications.
The Google Web Service API
If you haven't been to Google.com before, the site provides an excellent application that includes the capability of searching through billions of Web documents and newsgroup posts. In an effort to make the data Google.com archives available for others to consume, Google has developed a Web Service and has provided a proxy object written in C# that can be used in .NET applications. The Web Services Description Language (WSDL) document used to create the proxy is located at http://api.google.com/search/GoogleSearch.wsdl.
If you haven't worked with Web Service proxy objects in .NET before, they act as the middlemen between your .NET application and a remote Web Service and can serialize and deserialize Simple Object Access Protocol (SOAP) messages for you automatically. FIGURE 1 shows an example of a typical Web Service architecture and identifies the proxy object.
FIGURE 1: A typical Web Service architecture can involve several different technologies, including WSDL; Universal Description, Discovery, and Integration (UDDI); the Web Service itself; and one or more proxy objects used to integrate with the Web Service and send or receive SOAP messages.
The proxy object Google provides allows you to tie into Google's Web Service through three methods, shown in FIGURE 2.
Method | Description |
---|---|
doGoogleSearch() | Search through indexed Web content. |
doGetCachedPage() | Access pages Google caches during its Web crawling process. |
doSpellingSuggestion() | Provide spelling suggestions for specific text. |
FIGURE 2: The methods exposed by the Google Web Service allow you to search through the Google Webpage index, access complete Web pages cached by Google bots as the pages are parsed, and obtain spelling suggestions for search words.
Although you can create the proxy that is used to access these methods yourself using Visual Studio .NET or the WSDL.exe command-line utility, the proxy class (named GoogleSearchService.cs) provided by Google is ready to use right out of the box and already contains the additional helper classes used to access the Web Service. FIGURE 3 shows these additional classes.
    [System.Xml.Serialization.SoapTypeAttribute("GoogleSearchResult", "urn:GoogleSearch")]
    public class GoogleSearchResult {
        public bool documentFiltering;
        public string searchComments;
        public int estimatedTotalResultsCount;
        public bool estimateIsExact;
        public ResultElement[] resultElements;
        public string searchQuery;
        public int startIndex;
        public int endIndex;
        public string searchTips;
        public DirectoryCategory[] directoryCategories;
        public System.Double searchTime;
    }

    [System.Xml.Serialization.SoapTypeAttribute("ResultElement", "urn:GoogleSearch")]
    public class ResultElement {
        public string summary;
        public string URL;
        public string snippet;
        public string title;
        public string cachedSize;
        public bool relatedInformationPresent;
        public string hostName;
        public DirectoryCategory directoryCategory;
        public string directoryTitle;
    }

    [System.Xml.Serialization.SoapTypeAttribute("DirectoryCategory", "urn:GoogleSearch")]
    public class DirectoryCategory {
        public string fullViewableName;
        public string specialEncoding;
    }
FIGURE 3: Google provides a C# proxy class that can be used to tie into its Web Service. The proxy contains several custom classes, such as the GoogleSearchResult, ResultElement, and DirectoryCategory classes shown here, that can be used to interact with the Web Service.
The complete proxy object is available with this article's downloadable code (see the end of the article for details about downloading code). You also can download the proxy object from http://www.google.com/apis. To access it through Google, simply register with Google at the aforementioned URL. After registering, you will receive a unique key that must be used when calling the Web Service.
Search Google
Now that you've seen some of the functionality the Google Web Service provides, I'll explain how you can use the proxy object to perform a Web search. The first step is to compile the supplied proxy object using the csc.exe command-line compiler utility (alternatively, you can use Visual Studio .NET):
csc.exe /t:library /out:GoogleSearchService.dll /r:System.Web.dll GoogleSearchService.cs
(Note: you need to enter the above on a single command line.) Once the proxy is compiled into a .NET assembly, a new ASP.NET page can be created. First, you'll want to add a text box and button to the ASP.NET page so users can specify a keyword or phrase for which to search. When the button is clicked, your code needs to create an instance of the proxy class and call its doGoogleSearch method. This method accepts several different parameters. FIGURE 4 shows the key parameters.
Parameter | Description |
---|---|
key | Subscription key used to access the Google Web Service API. Visit http://www.google.com/apis/ to obtain a key. |
q | Query text sent to the Google Web Service, which is used to search Web pages in the index. The end user will supply this parameter value. |
maxResults | Number of results to return. The Web Service currently limits the maximum number to 10. |
start | Used to specify which record to start with when performing the search. By changing this number, paging functionality can be added. |
FIGURE 4: The doGoogleSearch method accepts several different parameters that are used by the Web Service to search through the Google Webpage index.
The doGoogleSearch method returns a GoogleSearchResult object (the code for this object is in FIGURE 3). This object exposes a collection of ResultElement objects through its resultElements property, which can be bound directly to a Web server control, such as the DataList. Each ResultElement object has specific properties that allow you to obtain the URL for each item, title, directory category, and additional information. The code to call the Web Service's doGoogleSearch method and bind the resulting data is shown in FIGURE 5.
    private void CallGoogleService(int record) {
        try {
            // Create a Google Search object
            GoogleSearchService s = new GoogleSearchService();
            // If you want to implement this service you
            // MUST get your own key from Google.
            // URL: http://www.google.com/apis/
            // (key is not declared in this listing; it is assumed to be
            // a field on the page class that holds your subscription key.)
            GoogleSearchResult r = s.doGoogleSearch(key, txtSearchText.Text,
                record, 10, false, "", false, "", "", "");
            // Make proper controls visible
            this.pnlResults.Visible = true;
            // Perform data binding
            dlResults.DataSource = r.resultElements;
            dlResults.DataBind();
            this.lblTotalRecords.Text = r.estimatedTotalResultsCount.ToString();
        }
        catch (Exception exp) {
            this.lblError.Text = exp.Message;
        }
    }
FIGURE 5: The doGoogleSearch method accepts several different parameters (see FIGURE 4) that are used to search the Google Webpage index. It returns a GoogleSearchResult object whose resultElements collection can be bound to standard ASP.NET Web server controls, such as a DataList.
As the data binding takes place between each ResultElement and the DataList, you must cast the bound DataItem to a ResultElement so you can access the appropriate properties. I have performed this cast using the standard <%# %> data-binding syntax, as shown in FIGURE 6.
<%# ((ResultElement)Container.DataItem).title %> ( Get Cached Page )
FIGURE 6: Binding the resultElements collection to a DataList control is accomplished by casting each data item in the collection that is being bound to a ResultElement object. Once the object is cast, its URL and title properties can be accessed and bound to the HyperLink control.
By casting the DataItem to a ResultElement, you can access the URL and title properties and bind them directly to the DataList control's child HyperLink control. The output generated by calling the doGoogleSearch method is shown in FIGURE 7.
FIGURE 7: This image shows the output generated by calling the Google Web Service search functionality using the doGoogleSearch method. Paging functionality has been added so an end user can page through multiple records.
Although I won't discuss the paging techniques I used in the ASP.NET page to allow the DataList to page through the Google results, the downloadable code for this article contains all of the necessary programming logic to accomplish this task.
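As a rough illustration of the arithmetic behind paging (not the logic from the downloadable code), the sketch below turns a page number into the value passed as the start argument. It assumes that start is a zero-based offset and relies on the 10-result limit noted in FIGURE 4. The GooglePaging class and its member names are hypothetical.

```csharp
using System;

// Hypothetical helper; not part of Google's proxy or the article's download.
public static class GooglePaging
{
    // Per FIGURE 4, the Web Service returns at most 10 results per call.
    public const int PageSize = 10;

    // Converts a one-based page number into the offset passed as the
    // start argument (assumed here to be zero-based).
    public static int StartIndexForPage(int pageNumber)
    {
        if (pageNumber < 1)
        {
            throw new ArgumentOutOfRangeException("pageNumber");
        }
        return (pageNumber - 1) * PageSize;
    }

    // Number of pages needed to display an estimated result count.
    public static int PageCount(int estimatedTotalResults)
    {
        if (estimatedTotalResults <= 0)
        {
            return 0;
        }
        return (estimatedTotalResults + PageSize - 1) / PageSize;
    }
}
```

A page could then call something like CallGoogleService(GooglePaging.StartIndexForPage(page)) when the user clicks a next or previous link. Because estimatedTotalResultsCount is only an estimate, the page count is best treated as approximate.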
Access Google's Cached Pages
Google indexes different Web pages in its searchable collection by sending out Web crawlers to walk through pages and find specific keywords. As the crawlers do this, Google caches a snapshot of the page being indexed. These cached versions of pages also can be accessed through the Google Web Service as a byte array by calling the doGetCachedPage method. Then, the returned byte array can be converted to a character array by using the System.Text namespace, and the resulting character array can be converted to a string, which then can be written out. The code to accomplish this is shown in FIGURE 8.
    public void dlResults_ItemCommand(Object sender, DataListCommandEventArgs e) {
        try {
            this.pnlResults.Visible = false;
            this.pnlCache.Visible = true;
            string url = ((HyperLink)e.Item.FindControl("hlLink")).NavigateUrl;
            System.Text.ASCIIEncoding enc = new System.Text.ASCIIEncoding();
            GoogleSearchService s = new GoogleSearchService();
            byte[] pageBytes = s.doGetCachedPage(key, url);
            char[] pageChars = enc.GetChars(pageBytes);
            this.lblCachedPage.Text = new String(pageChars);
        }
        catch (Exception exp) {
            this.lblCachedPage.Text = exp.Message;
        }
    }
FIGURE 8: The ItemCommand event can be used to capture events raised by controls, such as a LinkButton that is nested within a DataList. When the LinkButton is clicked, the Web Service proxy object is instantiated, and the doGetCachedPage method is called. The returned byte array is converted to a string.
The code shown in FIGURE 8 is executed when the LinkButton named lnkCachedURL within the DataList is clicked, causing the DataList control's ItemCommand event to be fired. The string that is created after converting the byte array is then written to a Label control named lblCachedPage in the ASP.NET page. FIGURE 9 shows an example of the output returned from calling the doGetCachedPage method.
FIGURE 9: Google caches pages through which it crawls and makes the data available through the doGetCachedPage() method. This figure shows a portion of a cached page.
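One detail worth noting about FIGURE 8: ASCIIEncoding replaces any byte above 127 with a question mark, so cached pages that contain accented or non-Latin characters may not display correctly. The sketch below shows one possible alternative that decodes the byte array in a single call with a caller-supplied encoding. The article does not say which encoding the Web Service uses, so the choice of encoding is an assumption you would need to verify, and the CachedPageText class is a hypothetical helper.

```csharp
using System;
using System.Text;

// Hypothetical helper; not part of Google's proxy or the article's download.
public static class CachedPageText
{
    // Decodes a cached-page byte array directly to a string. The encoding
    // is a parameter because the article does not state which one the
    // Web Service uses for cached pages.
    public static string Decode(byte[] pageBytes, Encoding encoding)
    {
        if (encoding == null)
        {
            throw new ArgumentNullException("encoding");
        }
        if (pageBytes == null || pageBytes.Length == 0)
        {
            return String.Empty;
        }
        return encoding.GetString(pageBytes);
    }
}
```

In the event handler, the two conversion lines could then become a single call such as CachedPageText.Decode(pageBytes, Encoding.UTF8), with UTF-8 being only a guess at a sensible default.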
Retrieve Spelling Suggestions
Aside from providing the ability to search the Google Webpage index and access cached pages, the Web Service also can make spelling suggestions for search keywords. This can be useful when an individual isn't exactly sure how to spell a particular word for which he or she wants to search.
The method within the proxy object that makes this possible is named doSpellingSuggestion and is easy to use. It takes two parameters: the Google key and the text for which you'd like spelling suggestions:
    private void lnkSpelling_Click(object sender, System.EventArgs e) {
        try {
            GoogleSearchService s = new GoogleSearchService();
            string suggestion = s.doSpellingSuggestion(key, this.txtSearchText.Text);
            // Only overwrite the search box when a suggestion came back
            if (suggestion != null && suggestion != String.Empty) {
                this.txtSearchText.Text = suggestion;
            }
        }
        catch {
            // Errors are ignored here, so the original text is left unchanged
        }
    }
For example, a user could type the text "humin," and the spelling suggestion service would return "human."
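The guard used in the click handler can also be pulled into a small function, which makes the "keep the original text unless a suggestion exists" rule easy to reuse. This is only a sketch: it assumes, as the handler's check implies, that the service returns null or an empty string when it has nothing to suggest, and the SpellingHelper class is hypothetical.

```csharp
using System;

// Hypothetical helper; not part of Google's proxy or the article's download.
public static class SpellingHelper
{
    // Returns the suggestion when one is available; otherwise returns
    // the text the user originally typed.
    public static string ChooseQuery(string original, string suggestion)
    {
        if (suggestion == null || suggestion.Length == 0)
        {
            return original;
        }
        return suggestion;
    }
}
```

With this in place, the handler body could assign txtSearchText.Text = SpellingHelper.ChooseQuery(txtSearchText.Text, suggestion) after calling doSpellingSuggestion.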
Google certainly could use other methods to expose the functionality discussed in this article. But by using XML and Web Services, data consumers are not required to write large amounts of code or even understand much about Web Services aside from how to use the proxy object and its associated methods.
Because Web Services are platform-neutral, a variety of consumers also can access the functionality Google exposes, with virtually any type of programming language. This allows the exchange of data between distributed applications to occur without resorting to manual data feeds or more complex technologies, such as Distributed Component Object Model, Common Object Request Broker Architecture, or Java Remote Method Invocation.
Fortunately for ASP.NET developers, Web Services are built directly into the .NET Framework, so we all have a powerful mechanism for building and consuming Web Services. For a live example of consuming the Google Web Service APIs from an ASP.NET page, visit http://www.xmlforasp.net/codeSection.aspx?csID=56.
The files referenced in this article are available for download
Dan Wahlin is the president of Wahlin Consulting, and he founded the XML for ASP.NET Developers Web site (http://www.XMLforASP.NET), which focuses on using XML and Web services in Microsoft's .NET platform. He is also a corporate trainer and speaker, and he teaches XML and ASP.NET training courses around the United States. Dan co-authored Professional Windows DNA (Wrox) and ASP.NET Tips and Tricks (SAMS), and he authored XML for ASP.NET Developers (SAMS).