Culling Web Pages with ActiveX

With Windows Scripting Host, you can access ActiveX components. Learn how to use an ActiveX object to create a WSH script that controls Internet Explorer (IE) and accesses content from two Web sites.

Robert Richardson

January 31, 1999



Many scripts in the Windows Scripting Host (WSH) user community focus on managing user accounts and resources. However, WSH scripts aren't limited to these functions. WSH is a powerful tool that lets you access scripting components across the entire range of Windows applications, including Internet Explorer (IE).

With WSH's support for Microsoft's Component Object Model (COM), you can also access ActiveX components. The Internet offers a wide array of ActiveX components, so you'll likely find an ActiveX object that performs the task you need to accomplish.

In this article, I'll show you how to use an ActiveX object to create a script that controls IE and accesses content from Web pages. (If you're unfamiliar with how to use ActiveX components, see "Using ActiveX Objects to Extend WSH's Functionality," January 1999.) Along the way, I'll discuss two workhorse Visual Basic Scripting Edition (VBScript) functions, InStr and Mid, that you can use to manipulate string variables.

The Task
I access certain Web pages frequently, because their content changes daily. To save time, I decided to create a script that goes to these Web pages, pulls specific snippets of text (i.e., news headlines), and places those snippets into a customized digest page within IE.

When I started this project, I wanted to write a script that loaded each Web page directly into IE, where the script would copy and paste the headlines into the digest page. However, I ran into a problem: I could use VBScript with IE objects to access the contents of a Web page for display, but the IE objects didn't include properties and methods for manipulating the text in that page.

To solve this problem, I had to use IE objects to display the script's output and an ActiveX object to read the Web pages. The result is the WebExample.vbs script in Listing 1. (You can download Listing 1 from http://www.winntmag.com/newsletter/scripting.)

The ActiveX component I used was Microsoft's Internet Transfer Control (ITC), an Object Linking and Embedding (OLE) custom control (OCX) file. This multipurpose tool ships with the Microsoft Office 97 Developer Edition and with most versions of Visual Basic and Visual Studio. Several commercial ActiveX components also provide HTTP services. If you don't have a Microsoft package that includes ITC, a third-party ActiveX component might better meet your needs. (For information about how to find and install OCX files, see "Using ActiveX Objects to Extend WSH's Functionality.")

Whether you use ITC or a similar ActiveX component, the approach and the code are similar. You need to set the stage, create the customized digest page, and fetch and copy the text.
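The fetch-and-copy step ultimately comes down to finding a marker in a long string and copying the text that follows it. The following is a minimal sketch of that idea using InStr and Mid, not the code from Listing 1. The page fragment, the tags used as markers, and the variable names startTag, endTag, and headline are hypothetical; a real script would fill the buffer variable from a Web page.

```vbscript
Option Explicit
Dim buffer, startTag, endTag, i, n, headline

' Hypothetical page fragment (an assumption for illustration only).
buffer = "<HTML><B>Sample headline text</B></HTML>"
startTag = "<B>"
endTag = "</B>"

' InStr returns the 1-based position of the match, or 0 if there is none.
i = InStr(buffer, startTag)
If i > 0 Then
    i = i + Len(startTag)             ' first character after the opening tag
    n = InStr(i, buffer, endTag)      ' search for the closing tag from there
    If n > 0 Then
        ' Mid copies (n - i) characters starting at position i.
        headline = Mid(buffer, i, n - i)
        WScript.Echo headline
    End If
End If
```

Checking InStr's return value against 0 before calling Mid keeps the script from copying garbage when a site changes its layout and the marker disappears.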

Setting the Stage
You begin WebExample.vbs with an Option Explicit statement, which requires that you declare your variables. You use Dim statements to declare six variables: oIE, oInetCtrl, i, n, length, and buffer. The prefix o in oIE and oInetCtrl indicates that they are object variables.

The i and n variables are shorthand for integer and number, respectively. Following the practice commonly used in C and C++, you typically use i and n variables to count or to remember where you are in an array or a loop. These variables don't have meaning in terms of the script's purpose. They simply represent a value.

The length variable represents a string's length. The buffer variable is a long string variable.
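As a sketch, the declarations described above might look like the following. The variable names come from the article; how Listing 1 groups them across Dim statements is an assumption.

```vbscript
Option Explicit          ' every variable must be declared before it is used

Dim oIE, oInetCtrl       ' object variables: IE and the HTTP ActiveX control
Dim i, n                 ' general-purpose counters and positions
Dim length               ' length of a string
Dim buffer               ' long string that holds the text being processed
```

With Option Explicit in place, a misspelled variable name produces a "Variable is undefined" error instead of silently creating a new, empty variable.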

Creating the Customized Digest Page
The code to create the digest page begins with the creation of an instance of the object you want to use. Specifically, you use VBScript's Set statement with the CreateObject function to create an instance of IE's top-level object ("InternetExplorer.Application") and assign it to the oIE variable.

Next, you use several IE object methods and properties to load a Web page, make that page visible, and write a header to it. To load a Web page, you use oIE's Navigate method. In this case, you must load a blank page, as the argument "about:blank" specifies, because you need to add text to it. If you were to put a URL as the argument, the Web page at that URL would load. However, as I mentioned previously, you wouldn't be able to read the contents of this page from within your program.

To make the IE window visible to users, you need to set oIE's Visible property to 1. A value of 0 means the page is invisible to users.
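The steps so far can be sketched as follows. This is a minimal illustration rather than the exact code in Listing 1, and it assumes a system on which IE's automation object is installed. The wait loop is an addition that isn't described in the article; it relies on IE's Busy property and on WScript.Sleep, which requires WSH 2.0 or later.

```vbscript
Option Explicit
Dim oIE

' Create an instance of IE's top-level object.
Set oIE = CreateObject("InternetExplorer.Application")

' Load a blank page that the script can write to.
oIE.Navigate "about:blank"

' Show the window; 0 would keep it hidden.
oIE.Visible = 1

' Assumption: pause until IE has finished loading the blank page.
Do While oIE.Busy
    WScript.Sleep 100
Loop
```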

To insert the header, you use oIE's Document property to access the Document object. You then use the Document object's writeln method to add the Customized Digest Page header. The writeln method automatically follows the text with a carriage return. You again use the writeln method to add the paragraph break tag (<P>) so that the digest page has a line break in it. (Web browsers ignore line breaks in the HTML source, so you need the tag to produce a visible break.)
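A minimal sketch of writing the header might look like this. The method name is spelled writeln, with a lowercase L. The block repeats the setup so that it runs on its own, the wait loop (using WScript.Sleep from WSH 2.0 or later) is an assumption, and the header is written as plain text because the markup that Listing 1 wraps around it isn't shown here.

```vbscript
Option Explicit
Dim oIE

Set oIE = CreateObject("InternetExplorer.Application")
oIE.Navigate "about:blank"
oIE.Visible = 1

' Assumption: wait for the blank page before touching its Document object.
Do While oIE.Busy
    WScript.Sleep 100
Loop

' writeln appends the text plus a carriage return to the page source.
oIE.Document.writeln "Customized Digest Page"

' The carriage return doesn't show in the browser, so add a <P> tag.
oIE.Document.writeln "<P>"
```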

When you write the header, keep in mind that the strings of text you'll write need to constitute an HTML formatted page. For the HTML purist, formatting the page involves writing in a document header between tags, including a
