Data Tales #4: The Case of the Phantom Duplicate
Our intrepid DBA is attempting to enter a new value into a table and is getting an error due to violation of a unique constraint even though the value does not yet exist in the table. What could possibly be causing this?
November 3, 2015
Background
As very few people seem to have worked with SQLCMD mode, I thought I'd start this article with a little background.
SQL Server 2005 added SQLCMD as a new command line tool to replace OSQL. It was an OLE-DB based tool rather than an ODBC based tool and had a richer set of options. For example, you could execute operating system commands within your scripts.
SQL Server Management Studio also added the ability to execute queries in SQLCMD mode. This allowed other types of commands to be embedded within SQL scripts. For example, the following script would not normally run in a query window:
SELECT TOP(2) * FROM sys.tables;!!DIR
However, we could change the query mode to SQLCMD:
The !!DIR command now shows highlighted, and the script will execute:
There are a reasonable number of useful commands in SQLCMD mode:
[!!:]GO[count]!! :exit(statement):Quit:r :setvar :connect server[instance] [-l login_timeout] [-U user [-P password]]:on error [ignore|exit]:error |stderr|stdout:out |stderr|stdout
The command that I've found to be the most useful over the years is the ability to change which server commands are sent to. For example, if I am currently connected to SERVER1, I can execute the following script:
:CONNECT ProdServerSELECT @@SERVERNAME;
This script will connect to the server called ProdServer and run execute the command there. The server name can also be a SQL Server alias. Using aliases can help your scripts to be more independent of server names. For authentication, if Windows authentication is all that’s required, you don’t need anything else. Otherwise, for SQL Server authentication, you can supply the username and password in the statement.
Today's Case
OK, with the background out of the way, it's time to look at today's case. Our intrepid DBA (today we'll call her Mary) was managing a database called ReferenceData. The company had four versions of this database running in different environments: Development, Staging, UAC, and Production.
Periodically Mary needed to add new reference values into the same table on all four copies of the database. For example, she might have needed to add a new country to a Countries table. The world's newest country is South Sudan, so let's use that country as an example.
On each server, Mary needed to execute the following query:
INSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');
SQLCMD mode seemed like a good way to achieve this. Mary wrote the following script.
:CONNECT DevServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT StagingServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT UATServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT ProdServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');
When this script was executed from within the ReferenceData database on the DevServer, the following error was returned:
The table was defined as follows:
CREATE TABLE dbo.Countries( CountryID int IDENTITY(1,1) NOT NULL CONSTRAINT PK_dbo_Countries PRIMARY KEY, CountryName nvarchar(100) NOT NULL CONSTRAINT UQ_dbo_Countries_CountryNamesMustBeUnique UNIQUE, ShortISOCode nvarchar(2) NOT NULL CONSTRAINT UQ_dbo_Countries_ShortISOCodesMustBeUnique UNIQUE, LongISOCode nvarchar(3) NOT NULL CONSTRAINT UQ_dbo_Countries_LongISOCodesMustBeUnique UNIQUE, PhonePrefix nvarchar(10) NOT NULL);GO
Mary was very puzzled, as she was sure that this data was not already present in the table, but clearly the unique constraint on the LongISOCode column was failing. A quick check proved that she was correct:
SELECT * FROM dbo.CountriesWHERE LongISOCode = 'SSD';
Every time that she executed the script, she received an error telling her that the data was already there.
It was time to take a look at what was happening.
If ever I see primary key violations or unique constraint violations, there are normally only two possibilities:
The data that you are inserting is already in the table
Somehow, you are trying to insert the data more than once, possibly even in the same statement.
We knew that the first reason wasn't valid so it had to be the second problem. But how could the INSERT operations end up trying to insert the same value more than once?
The reason for this lies in how SQL Server Management Studio (SSMS) processes batches. A T-SQL script can comprise one or more batches. The GO statement isn't really a T-SQL statement but is a batch separator. So when you execute a T-SQL script such as:
SELECT @@VERSION;GOSELECT @@SERVERNAME;GOSELECT GETDATE();GO
SSMS doesn't send the entire script to the database engine. It locates all the GO separators and uses it to break up the script. It looks like the entire script is run at once but that's not what's happening. The SELECT @@VERSION; command is sent to the server and executed, then the SELECT @@SERVERNAME; command is sent to the server and executed, and finally the SELECT GETDATE(); command is sent to the server and executed.
It's important to note that the GO statement is never sent to SQL Server. It is not a T-SQL command. It has meaning only to the SSMS tool. In fact you can change it to a different value. Under the Tools menu, in Options, you can see:
I often mention my friend who loves Star Trek, and uses the word ENGAGE as his batch separator. His scripts look interesting but aren't much use to anyone else. (As an aside, an interesting prank for a new DBA is to set the separator to SELECT and try to watch them work out what's wrong).
Understanding that GO is implemented by SSMS and not by the SQL Server engine is important. It’s also why you can add a count to your statements:
Notice that the intellisense system doesn’t understand the count after the GO statement. All that is happening is that SSMS is sending the batch to SQL Server 100 times.
Now the problem in Mary's case is that the :CONNECT statement in SQLCMD isn't a batch separator, so executing the following script is a problem:
:CONNECT DevServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT StagingServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT UATServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');:CONNECT ProdServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');
What happens is that :CONNECT is also implemented by SSMS not by SQL Server and all these commands are considered to be a single batch. So the end result is that all four INSERT statements are executed against ProdServer.
I consider this behaviour to be both bizarre and unexpected. In Wikipedia, the Principle of Least Astonishment says that "If a necessary feature has a high astonishment factor, it may be necessary to redesign the feature." I have seen this problem in the past and suggested to the SQL Server team that it should be fixed (https://connect.microsoft.com/SQLServer/feedback/details/611144/sqlcmd-connect-to-a-different-server-should-be-an-implicit-batch-separator) but it was closed as "Won't Fix". If you agree, you might want to leave a comment to let them know as well.
So what Mary should have done was to execute the following command:
:CONNECT DevServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');GO:CONNECT StagingServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');GO:CONNECT UATServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');GO:CONNECT ProdServerINSERT ReferenceData.dbo.Countries (CountryName, ShortISOCode, LongISOCode, PhonePrefix) VALUES (N'South Sudan', N'SS', N'SSD', N'211');GO
That would have worked as expected.
About the Author
You May Also Like