Simplifying the Search
Performing searches on a database’s Name columns can be time-consuming, as Robert discovered. Readers describe how to make his searches case-sensitive and faster.
May 7, 2002
Congratulations to Peter Lin, software engineer at Energistic in Port Klang, Malaysia, and Walid Rifai, software developer for Engineering Design and Manufacturing in Beirut, Lebanon. Peter won first prize of $100 for the best solution to the May Reader Challenge, "Simplifying the Search." Walid won second prize of $50. Here’s a recap of the problem and the solution to the May Reader Challenge.
Problem
Robert programs for a company that uses SQL Server 2000 as its database platform. One of the company’s database tables contains a list of names and other related columns. The following statement creates the table and its relevant columns and sample data:
CREATE TABLE Test ( "ID" int PRIMARY KEY, "Name" varchar(500) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, "Description" varchar(2048) COLLATE SQL_Latin1_General_CP1_CI_AS null)INSERT INTO Test ( "ID", "Name", "Description" )VALUES( 1, 'This is a test name', 'Description for test name.' ) INSERT INTO Test ( "ID", "Name", "Description" )VALUES ( 2, 'Long name.' + replicate( '*', 400), 'Description for long name.' )INSERT INTO Test ( "ID", "Name", "Description" )VALUES ( 3, 'Another name', 'Description for another name.' )INSERT INTO Test ( "ID", "Name", "Description" )VALUES ( 4, 'Another Name', 'Description for another name.' )
Robert wants to optimize a query that is based primarily on the table’s Name column. Additionally, he wants to make the searches case sensitive only for this table and query. (All the searches that users perform on this table are equality searches.) Help Robert redesign the table and query so that he can efficiently achieve those two objectives.
Solution
Robert can use the new collation features in SQL Server 2000 to change the Name column’s collation to case-sensitive. First, to modify the table’s schema so that it allows case-sensitive data comparison, he can use the following ALTER TABLE statement:
ALTER TABLE Test ALTER column "Name" varchar(500) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
Next, Robert can use the CHECKSUM() or BINARY_CHECKSUM() functions to create a hash index on the Name column; creating the hash index will let him search the values more efficiently and keep the index small. He can create the hash index by defining a computed column for the CHECKSUM() value and indexing the column. To add the computed column and hash index, he can write the following statement:
ALTER TABLE Test ADD NameHash AS ( CHECKSUM( "Name" ) )SET ansi_nulls, ansi_warnings, ansi_padding, arithabort, concat_null_yields_null ONSET numeric_roundabort OFFCREATE INDEX Test_NameHash ON Test( NameHash )
Finally, Robert can modify his Name-column searches to include the hash index by using the CHECKSUM() function on the search value as well as the equality condition (which includes the COLLATE clause) on the Name column. The following query performs a simple match on a name value:
DECLARE @Name varchar(1024)SET @Name = 'Another name'SELECT * FROM Test WHERE "Name" = @Name COLLATE SQL_Latin1_General_CP1_CS_AS AND NameHash = CHECKSUM( @Name COLLATE SQL_Latin1_General_CP1_CS_AS
In this query, the variable’s collation has to be the same as the column’s collation because the server is configured for the default case-insensitive collation. Also, the query has to perform the additional check for the Name column because the CHECKSUM values aren’t necessarily unique and therefore different name values might have the same CHECKSUM value. Consequently, the query might encounter the same hash value for two different name values.
Now, test your SQL Server savvy in the June Reader Challenge, "Passing the Values" (below). Submit your solution in an email message to [email protected] by May 15. SQL Server MVP Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.
Here’s the challenge: As the database architect for a company that uses SQL Server 2000, Michael has been assigned the task of designing routines for auditing changes that the company’s Web application makes to the data. He decides to use triggers to implement the auditing routines and to use stored procedures for performing all data-manipulation operations in the application. Let’s simulate Robert’s scenario by using the following four CREATE statements to create two sample tables with relevant columns, a trigger, and a stored procedure:
CREATE TABLE Test ( i int primary key )CREATE TABLE TestAudit ( i int, Source varchar(10) )GO
CREATE TRIGGER AuditTestInsertON TestAFTER INSERTASBEGIN DECLARE @CallingProc varchar(128)-- Robert needs to initialize this variable DECLARE @Source varchar(30)-- Robert needs to initialize this variableINSERT INTO TestAudit ( i ) SELECT i from insertedENDGOCREATE PROCEDURE InsertTest ( @Source varchar(30), @i int )ASINSERT INTO test values( 1 )GO
For a specific business purpose, Michael wants to learn two kinds of information: From within the trigger code, he wants to find out the name of the stored procedure that fires the INSERT trigger, and from the audit trigger, he wants to discover the value of the @Source parameter that the application passes to the stored procedure each time the trigger fires. Michael expects that after the Web application calls the stored procedure with the following parameters:
EXEC InsertTest 'App1', 1
the audit table will contain the newly inserted row with its App1 source value. However, the row appears in the audit table without the information he needs, so Michael has to modify the code—preferably with minimal changes—so that it populates all the necessary columns. Help Michael design an efficient solution for passing values from the stored procedure to the trigger.
About the Author
You May Also Like