Enforcing Uniqueness Without the Unique Constraint
September 12, 2006
Congratulations to Steve Kass and Lay Chew. Steve won first prize of $100 for the best solution to the September Reader Challenge, "Fixing a Faulty Lookup Query." Lay won second prize of $50. Here’s a recap of the problem and the solution to the September Reader Challenge.
Jeff is a database developer who needs to develop an application that keeps track of Unicode strings in a database table. The Unicode strings are 40 characters long and include trailing blanks, which are significant for comparison purposes when looking up specific values in the table. Jeff uses the pubs sample database and creates the test table and data by using the following T-SQL batches:
USE pubsGOSELECT identity(int) AS id, CAST(au_lname AS nvarchar(40)) AS name1, CAST(au_fname AS nvarchar(20)) AS name2INTO _aFROM dbo.authorsGOCREATE CLUSTERED INDEX idx__a_name ON _a(name1)GOINSERT INTO _a VALUES(N'Test Name 1 ', N'Test Name 1')INSERT INTO _a VALUES(N'Test Name 1', N'Test Name 1')INSERT INTO _a SELECT a1.name1, a1.name2 FROM _a a1, _a a2GO
To test how he can look up specific items using the name1 column from the table, Jeff writes the following simple query:
DECLARE @n nvarchar(40)SET @n = N'Test Name 1 'SELECT *, datalength(name1) FROM _aWHERE name1 = @nGO
But the query doesn't give the results he expected; Jeff gets the entries with and without trailing blanks. For the sample data shown, the query returns rows that have name1 as Test Name 1 without a trailing blank and Test Name 1 with a trailing blank. However, Jeff expects the equality operator to perform an exact match thereby returning only the row with name1 as Test Name 1.
What is the flaw in the query? Why doesn't the equality operator consider the blanks significant? And how can Jeff modify the query so that he gets the expected results efficiently?
Solution:
Jeff realizes that comparison operators such as =, >=, and
DECLARE @n nvarchar(40)SET @n = N'Test Name 1 'SELECT *, datalength(name1) FROM _aWHERE name1 LIKE @nGO
However, if Jeff uses LIKE to perform the equality search, the search can be slow depending on the number of rows in the table and the execution plan. Alternatively, Jeff can add the predicate that uses LIKE as an additional search condition, as the following example query shows:
DECLARE @n nvarchar(40)SET @n = N'Test Name 1 'SELECT *, datalength(name1) FROM _aWHERE name1 = @nAND name1 LIKE @nGO
This query rewrite will efficiently use the index on the name1 column to look up the rows that match the searched name and using LIKE in the additional search condition ensures that the search will filter out values that don't contain trailing blanks.
Alternatively, Jeff can also rewrite the query using one of the approaches below:
DECLARE @n NVARCHAR(40)SET @n = N'Test Name 1 'SELECT *, DATALENGTH(name1) FROM _aWHERE name1 = @nAND DATALENGTH(name1) = DATALENGTH(@n)GO
DECLARE @n NVARCHAR(40)SET @n = N'Test Name 1 'SELECT *, DATALENGTH(name1) FROM _aWHERE name1 = @nAND name1 + N'*' = @n + N'*'GO
The first query checks the length in bytes of the column using DATALENGTH built-in function and the search value in addition to the equality predicate. The second query correctly concatenates a non-blank character at the end of the column & search value to eliminate the values that differ only in trailing blanks.
OCTOBER READER CHALLENGE:
Now, test your SQL Server savvy in the October Reader Challenge, "Enforcing Uniqueness Without the Unique Constraint " (below). Submit your solution in an email message to [email protected] by September 21. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.
Problem:
Paul is a database architect in a company that provides Web-based message services. He must design a schema to store and retrieve forum messages from the database, and he plans to use SQL Server 2000 as the database server. Among the tables that contain the messages is a table that has a unique-per-message identifier column. The table's schema is shown in the following code:
CREATE TABLE messages (msg_id INT NOT NULL PRIMARY KEY,msg_hdr VARCHAR(1024) NOT NULL)
Paul wants to enforce uniqueness on the msg_hdr column and tries to define a unique constraint on the column by using the following script:
ALTER TABLE messages ADD CONSTRAINT uq_messages_id UNIQUE(msg_hdr)
The ALTER TABLE statement produces the following warning message:
Warning! The maximum key length is 900 bytes. The index 'uq_messages_id' has maximum length of 1024 bytes. For some combination of large values, the insert/update operation will fail.
Using generated sample data for the table, Paul performs tests that reveal that the msg_hdr value might exceed 900 bytes, so he can't use the unique constraint approach. (A unique index in SQL Server enforces a unique constraint, and, as the warning message says, index keys are restricted to a maximum of 900 bytes.)
Help Paul to efficiently enforce uniqueness on the msg_hdr column in the Messages table without the unique constraint.
About the Author
You May Also Like