Finding Columns for Table Joins
A reader shares the stored procedures he uses when he needs to map an unfamiliar database to find columns to use for joins.
July 23, 2002
I wrote my first queries the hard way: I had no database schemas or models to work from and no graphical representations of table joins like those Enterprise Manager and other tools now provide. Whenever the company I work for purchased a SQL Serverbased application and asked me to design custom reports from this wealth of new data, my first task was to spend hours learning where that data might be buried in the application's several hundred tables.
Since then, Microsoft has added information schema views, which simplify accessing system-table data. In fact, Microsoft strongly recommends that developers not access system tables directly. But I still find that I can control certain tasks (e.g., searching for specific columns to join tables) better by typing multiline queries rather than clicking field selections and dragging joins with a mouse, as you do with the graphical tools. So when I need to map an unfamiliar database to find columns for joins, I rely on Query Analyzer and a couple of my own stored procedures. I've used these stored procedures while working on several kinds of projects, usually when I need to convert a database from one third-party application to another. In the conversions I've done, printed data dictionaries or schemas are often hard to come by.
The first stored procedure I use is a slightly modified version of the sp_columns system stored procedure. The default fields that sp_columns returns are TABLE_QUALIFIER (database name), TABLE_OWNER, TABLE_NAME, and COLUMN_NAME. But the only information I need from sp_columns is COLUMN_NAME. So, I recreated sp_columns, renamed it sp_fields, then removed the first three unnecessary fields so that the only information returned is COLUMN_NAME. In SQL Server 7.0 and later versions of Query Analyzer, the Object Browser pane eliminates much of the need for this stored procedure. However, I still use it because if I know the table name, typing the name is quicker for me than using the graphical Object Browser.
When I discover a field that might be useful for creating a join, I use my second stored procedure, sp_fieldtab, to determine which tables in my database might use that field name to join data elements. For example, I surmise by its name that the CustomerID field in the Northwind database is likely a field that joins tables. So, CustomerID is a field that I'd use as a parameter in my second stored procedure.
Listing 1 shows sp_fieldtab, which is a SELECT statement that simply returns every table and view that a particular field is in. This small stored procedure takes a field name as input. The information that sp_fieldtab returns helps narrow down the list of tables that contain a specific field, such as CustomerID. (Typically, database tables use the same field names in joins, although that's not a requirement.)
Sp_fieldtab has the added benefit of allowing LIKE as a field parameter, so you can use wildcards such as the percent sign (%) to return a broader range of field names. For example, if I input the CustomerID field as a parameter to sp_fieldtab, as the following EXEC statement shows:
EXEC sp_fieldtab customerid
I get a list of all the tables or views that contain the field CustomerID, as Figure 1 shows. If I input cust%, I get other field names such as CustomerName and CustomerTypeID. Armed with the information that Figure 1 shows, I can now make an educated stab at creating a query that joins the Customers table to the Orders table on CustomerID. Now, if I could just automate the process of writing all the conversion queries. Hey—I could build a query template, with variables set in sp_fieldtab. I'll send in that tip when I've worked it out.
—Rodney Landrum
[email protected]
About the Author
You May Also Like