Querying Active Directory Records
There are several ways to query Microsoft Active Directory (AD) from Microsoft SQL Server. One common method involves using the sp_add_linked_server stored procedure to create a linked server, then using T-SQL's OPENQUERY clause to directly query Active Directory. Another common method is to use an ADO.NET connection manager in a SQL Server Integration Services (SSIS) package so that you can create an ADO.NET connection to Active Directory, then query it. A third common method is to use a Script component in a SQL Server Integration Services package so that you can include a Visual Basic .NET script that uses the Directory Services .NET assembly to access and query Active Directory. This article explains how to perform each method and the results you can expect. Surprisingly, the results can differ, depending on the method you use and your environment.
August 11, 2009
Querying Active Directory (AD) from SQL Server can provide useful data. Three common methods to query AD are
Using a linked server
Using a ADO.NET connection manager in a SQL Server Integration Services (SSIS) package
Using a Script component in an SSIS package
However, these three methods don't always provide the same results. Let's walk through how to perform each of them and compare their results.
Linked Server Method
To query AD using a linked server, you first need to use the sp_add_linked_server stored procedure to create that server. You then need to use an OPENQUERY clause to query AD directly. You shouldn't import AD data into a table and query that table because the data might end up being outdated. For example, suppose you import AD data into a table for querying and shortly afterward the network administrator changes the groups that a user belongs to. The data in your table will be outdated and inaccurate. For accurate, up-to-date data, you need to directly query AD with OPENQUERY.
To query a linked server, the SQL Server service startup account must be running under a domain account and not a local system account. In addition, the account must have read access privileges to query AD. To verify the account settings, select Administrative Tools in the Control Panel, select Services, and double-click SQL Server (MSSQLSERVER) Service. Account settings are on the Log on tab. If you need to change the account, the service needs to be restarted in order for the changes to take effect.
After the account settings are in order, you can use the code in Listing 1 to create a linked server and query AD for user account information (e.g., first name, last name, distinguished name, username).
Listing 1: Querying AD Using a Linked Server |
---|
To use this code, you need to replace LDAP://DC=MyDomain,DC=com in callout A with the LDAP path to your AD domain. When I ran this code against my AD environment, the query returned 2,500 records.
ADO.NET Connection Manager Method
An ADO.NET connection manager allows an SSIS package to access data sources through a .NET provider. One of those providers is the OLE DB Provider for Microsoft Directory Services. You can use this provider to establish an ADO.NET connection to AD so that you can query AD for data. Here are the steps to create an SSIS package that uses an ADO.NET connection manager for this purpose:
Script Component Method
When you use the Script component method, this limit isn't present. The lack of a limit is due to one line in ActiveDirectory_Source_Script.txt:
searcher.PageSize = 1000
In fact, when I commented out this line and reran that SSIS package, only 2,500 records were returned. You can better control the behavior of AD queries when you have the various Microsoft .NET Framework classes, methods, and properties at your disposal. In this case, having the System.DirectoryServices.dll's PageSize property available is key. This property is available in the Script component method but not the linked server and ADO.NET connection manager methods.
Which AD query you should use depends on your situation. If you have a large AD database, using the Script component method is clearly best. However, if you have a small AD database, the linked server and ADO.NET connection manager methods might be better because they're easier to implement.
As I mentioned previously, you can find the ActiveDirectory_Source_Script in the "Download the Code" file at the top of the page. The file also contains sample SSIS packages for the ADO.NET connection manager and Script component methods as well as the code in Listings 1 and 2. I created and tested the linked server code and the two SSIS packages on SQL Server 2005 SP2 (x64). Note that ActiveDirectory_Source_Script.txt was developed for SSIS 2005. The script might need to be modified to work with SSIS 2008 because SSIS 2005 uses VSA 8.0 as the scripting engine for the Script component, whereas SSIS 2008 uses Microsoft Visual Studio Tools for Applications (VSTA) 2.0 as the scripting engine for the Script component.
Start the SQL Server Business Intelligence Development Studio (BIDS). On the File menu, select New, then Project. After making sure the project type is Business Intelligence Projects, select the Integration Services Project template. Enter your project's name.
Add a data flow. Click Toolbox on the View menu. Drag and drop the Data Flow Task icon from the Toolbox's Control Flow Items section to the Control Flow design surface. Double-click the newly added data flow to switch to the Data Flow tab.
Add a connection manager. On the SSIS menu, select New Connection, as Figure 1 shows. (If the SSIS menu options in Figure 1 aren't showing, click any item on the SSIS design surface to enable them.)
Figure 1: The SSIS menu in BIDs (click to enlarge)
In the Add SSIS Connection Manager dialog box, select ADO.NET and click Add. Select New to bring up the Connection Manager dialog box in Figure 2.Figure 2: Adding a connection manager
In the Provider drop-down box, navigate to .Net Providers for OleDbOLE DB Provider for Microsoft Directory Services. In the Server or File name field, enter your domain's name. Click Test Connection. If you can connect to AD, click OK twice to return back to the Data Flow design surface.Add a data source. Drag and drop the DataReader Source icon from the Toolbox's Data Flow Sources section to the Data Flow design surface. Double-click the data source to configure it. In the Connection Managers grid, select the newly added ADO.NET connection from the drop-down list. In the Component Properties grid, enter your AD query in the SqlCommand field. Listing 2 shows a sample query.
Listing 2: AD Query Used with DataReader Source
To use this code, you need to replace LDAP://DC=MyDomain,DC=com in callout A with the LDAP path to your AD domain. Click OK twice.Add a destination. In this example, the destination for the data returned from the AD query is a text file. Drag and drop the Flat File Destination icon from the Toolbox's Data Flow Destination section to the Data Flow design surface. Select the Source component. Drag and drop the green output arrow to the flat file destination, then double-click the flat file destination. In the Flat File Destination Editor, select New, which brings up the Flat File Format dialog box. Choose Delimited, and click OK. In the Flat File Connection Manager Editor that appears, enter C:AD_Output_ADONET.txt in the File name field. Select the Unicode check box because the OLE DB Provider for Microsoft Directory Services returns Unicode text. In the Text Qualifier field, enter a double quote ("). A double quote is used as a text qualifier because some of the AD property values might contain commas. Select the Column names in the first data row check box. The rest of the fields in the Flat File Connection Manager Editor can be left at their default values (e.g., having the row delimiter set to {CR}{LF} and the column delimiter set to a comma). Click OK. In the Flat File Destination Editor, click the Mappings option in the left pane and click OK.
Save the SSIS package by choosing Save Selected Items on the File menu.
Run the package by selecting Start Debugging on the Debug menu. The Data Flow tab will look similar to the one in Figure 3.
Figure 3: Sample results from the ADO.NET connection manager method
As you can see, this AD query also returned 2,500 records when I ran it.Click the Stop Debugging button to return back to editing the SSIS package.
Script components provide a way to include custom scripts in data flows in SSIS packages. In this case, a Script component is used to include a Visual Basic .NET script that uses the Directory Services .NET assembly (System.DirectoryServices.dll) to access and query AD. Here are the steps to create this SSIS package:
Follow steps 1 to 2 in the "ADO.NET Connection Manager Method" section.
Add two variables. The gLDAPQuery and gLDAPFilter variables store information that's used in the AD query. On the SSIS menu, select Variables. In the Variables dialog box, click the Add Variables button. First, add the gLDAPQuery, giving it a string data type. Set gLDAPQuery's value to the LDAP path to your AD domain (e.g., LDAP://DC=MyDomain,DC=com). Then, add the gLDAPFilter, giving it a string data type. Set gLDAPFilter's value to any LDAP filter that needs to be applied to the AD query, such as (&(objectCategory=Person)(objectClass=user)). With these variables, you don't need to modify the Script component when you want to change the AD domain or LDAP filter in an AD query. You simply change the variables' values.
Add a Script component. Drag and drop the Script Component icon from the Toolbox's Data Flow Transformations section to the Data Flow design surface. When you add the Script component, the Select Script Component Type dialog box will appear. Select Source because this Script component will be used as the source of the AD data. Click OK to add the Script component to the Data Flow design surface. Double-click the Script component to open the Script Transformation Editor, then select Inputs and Outputs in the left pane. In the middle pane, select Output 0. In the right pane, change the name from Output 0 to ActiveDirectoryOutput in the Common Properties grid so that the name is more meaningful.
Add the list of output columns that will be provided to rest of the transformation flow. For this example, you need to add a list of AD columns. In the middle pane, expand ActiveDirectoryOutput and select Output Columns. For each of the AD properties listed in Table 1, click the Add Column button and update the Name, DataType, and Length values accordingly.
Table 1: Output Column Values
Column Name
DataType
Length
Name
string[DT_STR]
200
GivenName
string[DT_STR]
200
DistinguishedName
string[DT_STR]
200
SN
string[DT_STR]
200
Mail
string[DT_STR]
150
SamAccountName
string[DT_STR]
150
EmployeeID
string[DT_STR]
50
ADsPath
string[DT_STR]
150
PropertiesList
string[DT_STR]
200
MemberOfList
string[DT_STR]
6000
As Figure 4 shows, you'll find the Name field in the Common Properties grid and the DataType and Length fields in the Data Type Properties grid.Figure 4: Adding output columns (click to enlarge)
Add the variable names in the Script Transformation Editor. Select Script in the left pane. In the Custom Properties grid, enter gLDAPQuery,gLDAPFilter in the ReadOnlyVariables field. (This needs to be entered exactly as shown.) Click the Design Script button to bring up the Microsoft Visual Studio for Applications (VSA) IDE.
Add a reference to the System.DirectoryServices.dll, which will be used to query AD. In the VSA IDE, right-click References in Project Explorer and select Add Reference. In the Add Reference dialog box, select System.DirectoryServices.dll from the .NET component list, as Figure 5 shows.
Figure 5: Adding a reference to the System.DirectoryServices.dll (click to enlarge)
Click Add, then OK.Replace the default script with the custom Visual Basic .NET script. Select Project Explorer on the View menu, and double-click ScriptMain to display the default Visual Basic .NET ScriptMain script. Replace the default script with the one in ActiveDirectory_Source_Script.txt. You can download ActiveDirectory_Source_Script.txt by clicking the 102200.zip hotlink. Note that ActiveDirectory_Source_Script.txt includes the statement Imports System.DirectoryServices because the script doesn't fully qualify the class names with their corresponding .NET namespaces. If you want to remove the Imports System.DirectoryServices statement, you would have to prefix all the DirectoryServices classes in the script with their namespace, which is System.DirectoryServices. For example, the line
Dim de As New DirectoryEntry
would have to be replaced with
Dim de As NewSystem.DirectoryServices.DirectoryEntry
Click Close & Return on the File menu to close the VSA Script Editor and return back to the SSIS design surface. Click OK.
Add a destination. Follow step 5 in the "ADO.NET Connection Manager Method" section. However, change the filename to C:AD_Output_Script.txt and don't select the Unicode check box.
Add a data viewer so that you can view the data when the SSIS package is running in design mode. In the Data Flow design surface, right-click the green output arrow and select Data Viewers, as Figure 6 shows.
Figure 6: Adding a data viewer
Click Add, then click OK twice. Note that if you decide to add new AD properties (i.e., add new output columns) after you've added the data viewer, you'll need to reconfigure it. To do so, right-click the green output arrow, select Data Viewers, and choose Configure. After you select the columns to view, click OK twice.Save the SSIS package by choosing Save Selected Items on the File menu.
Run the package in the SSIS Designer by pressing the F5 key. You'll see an output screen similar to the one in Figure 7.
Figure 7: Sample results from Script component method
The pipeline retrieves the AD data in sets of buffers. The data viewer's status bar displays the number of buffers used and total number of rows retrieved so far. The number of rows in the current buffer appears after "Rows displayed=". Pressing the Play button retrieves the next buffer. When I ran the package against my AD domain, 3,598 records were returned.
If you compare the number of records returned from each query method, you'll notice that the Script component method returned a lot more records than the other two methods (3,598 versus 2,500). The linked server and ADO.NET connection manager methods have a limit to the number of records that can be returned from AD. (The record limit is designed to prevent denial-of-service attacks and network overloading.) Although my limit was 2,500, the limit in your environment might differ, depending on your hardware, software, and network.
About the Author
You May Also Like