Row-Concatenation Solutions Aren't All Equal

A reader's tests reveal that although various row-concatenation solutions produce the same results, the time it takes to get those results can vary significantly.

Readers

September 19, 2007

2 Min Read
ITPro Today logo


 Executive Summary:

There are many ways to concatenate row values in Microsoft SQL Server databases. With the hope of finding the most efficient row-concatenation solution, a SQL Server Magazine reader tested two common approaches: using a self-reference variable within a SELECT clause and using FOR XML. He found that although these solutions produce the same result, the time it takes to get those results varies significantly.

Let's say you have a table with a column namedcolor_name and you want to store all the rowvalues as a list of comma-delimited values. For example,if the column contains the row values of Yellow,Blue, and Red, the final result would look like:@cv = {Yellow,Blue,Red}

There are many ways to concatenate row values toget results like this. Although the various solutionsproduce the same result, the time it takes to getthose results can vary significantly. With the hopeof finding the most efficient row-concatenationsolution, I tested two common approaches: using aself-reference variable within a SELECT clause andusing FOR XML. See also, "Using T-SQL to Concatenate Data Into a Variable."

I began by running the SampleData.sql script inListing 1 to create a table named T1 and populateit with 50,000 rows. (If you run this code so thatyou can test the solutions on your system, note thatSampleData.sql takes a few minutes to run.)

Next, I turned on the Discard results after executionoption in SQL Server Management Studio(SSMS) so that the time it took to generate theoutput wouldn't be taken into consideration. Afterenabling that option, I cleared the data cache andran Solution1.sql, which Listing 2 shows, to concatenaterows by using a self-reference variable withina SELECT clause. It took 10 seconds for this codeto run on my system.

It's worth noting that some people prefer to usethe following two system-defined functions to concatenaterows:

DECLARE @cv varchar(MAX)
SELECT @cv =
COALESCE(@cv + ',', ') +
ISNULL(col1,'') FROM T1
SELECT @cv

This code took even longer to run on my system (16seconds) because of the functions' overhead.

I then tested Solution2.sql, which Listing 3 shows, to concatenate rows by using FOR XML.It took less than 1 second for this code to run onmy system, which is 10 times faster than the firstsolution.

The first solution is not only slower but alsoleads to undefined results. This is due to thekind of assignment used within the SELECTclause. As Microsoft's Eugene Zabokritski notes,"The results of an assignment in the SELECTlist... depend on what plan the optimizer choosesto execute the query, so technically speaking,the results are undefined." (For more informationabout undefined results, go to http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=126129.)

When you have a small number of row values, itprobably doesn't matter which row-concatenation solutionyou choose. However, when you're working withtables that contain a large amount of data, considerusing FOR XML to concatenate row values. See also, "Custom Aggregations: The Pivoting Technique."

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like