Solutions to T-SQL Sorting Challenge

Itzik provides solutions to last week’s T-SQL sorting challenge.

Itzik Ben-Gan

September 8, 2008

17 Min Read
ITPro Today logo in a gray background | ITPro Today

Last week I provided a T-SQL challenge involving sorting dot separated lists of values. You can find the challenge details here. I’d like to thank all those who sent solutions: Razvan Socol, Steve Kass, Pawel Potasinski, Peter DeBetta, Uri Dimant and bsteen.

I’ll discuss three types of solutions based on the versions of SQL Server that support them.

I’ll start with solutions that are supported as of SQL Server 2000. In SQL Server 2000 you can create a user defined function that accepts a string with the dot separated list of values as input and returns a string that is sorted “correctly” among others based on the requirements of the puzzle. For example, for each integer in the input string you can produce a fixed sized string value in the output string that would sort the same way the integer value would. The solution by Steve Kass implements this logic:

create function orderString(

   @v varchar(500)

) returns varchar(6000) as begin

   declare @result varchar(6000);

   set @result = '';

   declare @dotpos int;

   while len(@v) > 0 begin

     set @dotpos = charindex('.',@v);

     if @dotpos = 0 set @dotpos = len(@v)+1;

     set @result = @result+str(10000000000.0+substring(@v,1,@dotpos-1),11,0)

     set @v = substring(@v,@dotpos+1,len(@v));

   end;

   return @result;

end;

go

 

select val from dbo.t1 order by dbo.orderString(val);

I used similar logic in my SQL Server 2000 compatible solution, only I produced a binary string as the sort string instead of a character one:

create function dbo.fn_sortval(@s as varchar(500))

  returns varbinary(2500)

as

begin

  declare @pos as int, @r as varbinary(2500), @element as int;

 

  set @r = 0x;

  set @s = @s + '.';

  set @pos = charindex('.', @s);

  while @pos > 0

  begin

    set @element = cast(left(@s, @pos - 1) as int);

 

    set @r = @r +

      case

        when @element = -2147483648 then 0x0000000000

        when sign(@element) = -1
          then 0x01 + cast(2147483647 + @element as binary(4))

        else 0x02 + cast(@element as binary(4))

      end;

 

    set @s = stuff(@s, 1, @pos, '');

    set @pos = charindex('.', @s);

  end

 

  return @r;

end;

go

 

select val from dbo.t1 order by dbo.fn_sortval(val);

As for a SQL Server 2005 compatible solution, you start by creating a table function that accepts a string with a dot separated list of values and returns a table result with a row for each element. The function makes use of an auxiliary table of numbers that you can create and populate by running the following code:

-- Create and populate an auxiliary table of numbers

IF OBJECT_ID('dbo.Nums') IS NOT NULL DROP TABLE dbo.Nums

 

CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);

 

SET NOCOUNT ON;

 

DECLARE

  @max AS INT,

  @rc AS INT;

 

SET @max = 1000000;

SET @rc = 1;

 

BEGIN TRAN;

  INSERT INTO dbo.Nums(n) VALUES(1);

 

  WHILE @rc * 2 <= @max

  BEGIN

    INSERT INTO dbo.Nums(n)

      SELECT n + @rc FROM dbo.Nums;

 

    SET @rc = @rc * 2;

  END

 

  INSERT INTO dbo.Nums(n)

    SELECT n + @rc FROM dbo.Nums

    WHERE n + @rc <= @max;

COMMIT TRAN;

 

The following code has the definition of the function that splits the string to its elements:

-- Function that splits array elements

CREATE FUNCTION dbo.fn_split(@arr AS VARCHAR(8000), @sep AS CHAR(1))

  RETURNS TABLE

AS

RETURN

  SELECT

    (n-1) - LEN(REPLACE(LEFT(@arr, n-1), @sep, '')) + 1 AS pos,

    SUBSTRING(@arr, n, CHARINDEX(@sep, @arr+@sep, n) - n) AS element

  FROM dbo.Nums

  WHERE n <= LEN(@arr) + 1

    AND SUBSTRING(@sep+@arr, n, 1) = @sep;

GO

You can then use the split function in a subquery to split a string to its elements, and produce for each element a corresponding value that positions it correctly among others in terms of sorting. Then, with the FOR XML PATH option you can concatenate the individual sort elements to a single string. Here’s an example for how this can be done when the integers in the string are guaranteed to be nonnegative:

-- No negative values

select *

from dbo.t1

order by

  (select right('000000000' + element, 10) as [text()]

   from dbo.fn_split(val, '.')

   order by pos

   for xml path(''));

And here’s how you add some extra logic if you need to support negative values:

 

-- Support negative values

select *

from dbo.t1

order by

  (select

     case

       when element = -2147483648 then '00000000000'

       when element < 0

         then '1' + right('000000000'
           + cast(2147483647 + element as varchar(10)), 10)

       when element >= 0

         then '2' + right('000000000' + element, 10)

     end as [text()]

   from dbo.fn_split(val, '.')

   order by pos

   for xml path(''));

In SQL Server 2008 things are drastically simpler. With slight adjustment of the dot separated list of values, you get the canonical string representation a HIERARCHYID value. Either replace all dots in the string with slashes and add a slash at the beginning and end of the string, or simply add a slash at the beginning and end of the string. Either way, convert the adjusted string to HIERARCHYID, and the values would naturally sort correctly based on the integer values in the string segments. Several people came with such a solution: Razvan Socol, Pawel Potasinski, Peter DeBetta and myself. Here are two examples based on this approach:

select val

from dbo.t1

order by cast('/' + replace(val, '.', '/') + '/' as hierarchyid);

 

select val

from dbo.t1

order by cast('/' + val + '/' as hierarchyid);

Cheers,

BG

 

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like