Solutions to T-SQL Sorting Challenge
Itzik provides solutions to last week’s T-SQL sorting challenge.
September 8, 2008
Last week I provided a T-SQL challenge involving sorting dot separated lists of values. You can find the challenge details here. I’d like to thank all those who sent solutions: Razvan Socol, Steve Kass, Pawel Potasinski, Peter DeBetta, Uri Dimant and bsteen.
I’ll discuss three types of solutions based on the versions of SQL Server that support them.
I’ll start with solutions that are supported as of SQL Server 2000. In SQL Server 2000 you can create a user defined function that accepts a string with the dot separated list of values as input and returns a string that is sorted “correctly” among others based on the requirements of the puzzle. For example, for each integer in the input string you can produce a fixed sized string value in the output string that would sort the same way the integer value would. The solution by Steve Kass implements this logic:
create function orderString(
@v varchar(500)
) returns varchar(6000) as begin
declare @result varchar(6000);
set @result = '';
declare @dotpos int;
while len(@v) > 0 begin
set @dotpos = charindex('.',@v);
if @dotpos = 0 set @dotpos = len(@v)+1;
set @result = @result+str(10000000000.0+substring(@v,1,@dotpos-1),11,0)
set @v = substring(@v,@dotpos+1,len(@v));
end;
return @result;
end;
go
select val from dbo.t1 order by dbo.orderString(val);
I used similar logic in my SQL Server 2000 compatible solution, only I produced a binary string as the sort string instead of a character one:
create function dbo.fn_sortval(@s as varchar(500))
returns varbinary(2500)
as
begin
declare @pos as int, @r as varbinary(2500), @element as int;
set @r = 0x;
set @s = @s + '.';
set @pos = charindex('.', @s);
while @pos > 0
begin
set @element = cast(left(@s, @pos - 1) as int);
set @r = @r +
case
when @element = -2147483648 then 0x0000000000
when sign(@element) = -1
then 0x01 + cast(2147483647 + @element as binary(4))
else 0x02 + cast(@element as binary(4))
end;
set @s = stuff(@s, 1, @pos, '');
set @pos = charindex('.', @s);
end
return @r;
end;
go
select val from dbo.t1 order by dbo.fn_sortval(val);
As for a SQL Server 2005 compatible solution, you start by creating a table function that accepts a string with a dot separated list of values and returns a table result with a row for each element. The function makes use of an auxiliary table of numbers that you can create and populate by running the following code:
-- Create and populate an auxiliary table of numbers
IF OBJECT_ID('dbo.Nums') IS NOT NULL DROP TABLE dbo.Nums
CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);
SET NOCOUNT ON;
DECLARE
@max AS INT,
@rc AS INT;
SET @max = 1000000;
SET @rc = 1;
BEGIN TRAN;
INSERT INTO dbo.Nums(n) VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
INSERT INTO dbo.Nums(n)
SELECT n + @rc FROM dbo.Nums;
SET @rc = @rc * 2;
END
INSERT INTO dbo.Nums(n)
SELECT n + @rc FROM dbo.Nums
WHERE n + @rc <= @max;
COMMIT TRAN;
The following code has the definition of the function that splits the string to its elements:
-- Function that splits array elements
CREATE FUNCTION dbo.fn_split(@arr AS VARCHAR(8000), @sep AS CHAR(1))
RETURNS TABLE
AS
RETURN
SELECT
(n-1) - LEN(REPLACE(LEFT(@arr, n-1), @sep, '')) + 1 AS pos,
SUBSTRING(@arr, n, CHARINDEX(@sep, @arr+@sep, n) - n) AS element
FROM dbo.Nums
WHERE n <= LEN(@arr) + 1
AND SUBSTRING(@sep+@arr, n, 1) = @sep;
GO
You can then use the split function in a subquery to split a string to its elements, and produce for each element a corresponding value that positions it correctly among others in terms of sorting. Then, with the FOR XML PATH option you can concatenate the individual sort elements to a single string. Here’s an example for how this can be done when the integers in the string are guaranteed to be nonnegative:
-- No negative values
select *
from dbo.t1
order by
(select right('000000000' + element, 10) as [text()]
from dbo.fn_split(val, '.')
order by pos
for xml path(''));
And here’s how you add some extra logic if you need to support negative values:
-- Support negative values
select *
from dbo.t1
order by
(select
case
when element = -2147483648 then '00000000000'
when element < 0
then '1' + right('000000000'
+ cast(2147483647 + element as varchar(10)), 10)
when element >= 0
then '2' + right('000000000' + element, 10)
end as [text()]
from dbo.fn_split(val, '.')
order by pos
for xml path(''));
In SQL Server 2008 things are drastically simpler. With slight adjustment of the dot separated list of values, you get the canonical string representation a HIERARCHYID value. Either replace all dots in the string with slashes and add a slash at the beginning and end of the string, or simply add a slash at the beginning and end of the string. Either way, convert the adjusted string to HIERARCHYID, and the values would naturally sort correctly based on the integer values in the string segments. Several people came with such a solution: Razvan Socol, Pawel Potasinski, Peter DeBetta and myself. Here are two examples based on this approach:
select val
from dbo.t1
order by cast('/' + replace(val, '.', '/') + '/' as hierarchyid);
select val
from dbo.t1
order by cast('/' + val + '/' as hierarchyid);
Cheers,
BG
About the Author
You May Also Like