UDF Back Doors

Undocumented features of user-defined functions

Itzik Ben-Gan

March 22, 2004

10 Min Read
ITPro Today logo in a gray background | ITPro Today

Secrets are always intriguing, and programming back doors—secret entrances into application code—are no different. I'm covering T-SQL back doors lately for several reasons. First, programmers frequently use undocumented features, which you need to be familiar with in case you need to maintain code written by others. Second, some of the undocumented features are useful, and SQL Server might fully support them someday. And third, of course, the subject is exciting. In "T-SQL Back Doors" (February 2004, InstantDoc ID 41044), I started the discussion by looking at back doors related to stored procedures. This month, I continue the subject by delving into back doors involving user-defined functions (UDFs). I must stress again that these undocumented features are unsupported, so I don't recommend using them. Microsoft might change the functionality of undocumented features in future releases of SQL Server or even in a service pack. Let's start with the first UDF back door—creating your own special UDFs.

Special UDFs

Stored procedures whose names are prefixed with sp_ have a supported special functionality: They exist only in the master database, but you can invoke them from any database without prefixing their names with master. Microsoft introduced UDFs in SQL Server 2000 but didn't provide UDF support for this special functionality. Programmers certainly need such functionality, especially for functions that don't access objects in a specific database but instead implement some generic algorithm. One example of a useful generic scalar UDF (a UDF that returns a single value) is a function that removes all occurrences of a certain pattern from a character string. An example of a useful generic table-valued UDF (a UDF that returns a table) is a function that returns a table containing a requested number of consecutive integers. (This UDF is an alternative for a permanent auxiliary table of numbers, which I've used in many previous articles.)

The SQL Server installation program creates several UDFs that feature the special functionality. For example, the scalar function fn_serverid() returns from the sysservers system table the ID for a server name you provide to the function as an argument:

SELECT fn_serverid(server_name)

Besides being able to invoke this system-supplied function from any database without prefixing its name with master, you don't even need to specify the owner, unlike with regular scalar UDFs. For example, you use the following code to execute the table-valued function fn_helpcollations(), which returns a table containing all collation names and their descriptions:

SELECT * FROM ::fn_helpcollations()

Note that to invoke a special table-valued UDF, you must prefix its name with double colons.

The SQL Server installation program runs regular .sql script files to create the special UDFs, so you can create them, too. The characteristics of special UDFs are that

  • they're created in the master database

  • they have the owner system_function_schema

  • their names begin with fn_

  • their names are all lowercase, even in a case-insensitive environment

However, if you try creating a function that has all these characteristics, you'll get an error message saying that the owner system_function_schema doesn't exist or that you don't have permissions to create an object with that owner. The system_function_schema user does exist in the master database's sysusers table, but you're not usually allowed to use it. Investigating the installation script files produces the missing key—you need to turn on the allow updates server configuration option to get permission to create your special UDFs, specifying the system_function_schema user as the owner. But be sure to turn the allow updates option off immediately afterward so that SQL Server won't accept direct changes to system tables.

Run the script that Listing 1, page 23, shows to create the special scalar UDF fn_removepattern(), which you can use to remove all occurrences of a pattern from a given character string. The function accepts three arguments: @str is the input character string, @pattern is the pattern you want to locate, and @deletelength is the number of characters you want to delete from the positions the pattern is in. You need to format the @pattern argument the same way you format patterns for the LIKE predicate or the PAT- INDEX() function. (For details, see SQL Server Books OnlineBOL—under LIKE.)

The reason for requesting the @deletelength parameter and not having the function's code calculate the length of the searched pattern is that the pattern length doesn't necessarily match the length of the substrings that you want to remove. For example, you might want to locate all occurrences of the pattern '%bc%' and remove the two characters bc. In that case, you'd specify 2 in the @deletelength argument. The fn_removepattern() function uses the PATINDEX() function to locate the pattern within the input character string, then uses the STUFF() function to remove characters from it repeatedly until it doesn't find the pattern anymore. To test the fn_removepattern() function, run the following code from any database:

SELECT fn_removepattern('abcabcabcabcabc', '%bc%', 2)

You get 'aaaaa' as a result.

To use table columns as arguments for the function, run the following query:

SELECT CustomerID, Address,fn_removepattern(Address, '%[^0-9]%', 1) AS AddrNumsOnlyFROM Northwind.dbo.Customers

Examine the abridged results that Figure 1 shows and notice in the result column AddrNumsOnly that the function removed all the non-numeric characters from the Address column.

As an example of creating a table-valued special UDF, run the code that Listing 2 shows to create the fn_nums() function, which returns a table containing the requested number of consecutive integers, starting with 1. The function first inserts the value 1 into the table variable @Nums. It then enters a loop and uses an INSERT SELECT statement in each iteration. The statement takes the existing rows from @Nums, adds a factor equal to the current number of rows in @Nums to the column n, and inserts the result back into @Nums. This loop multiplies the number of rows in @Nums in each iteration, stopping when the value of column n reaches the requested number of integers. The INSERT loop is efficient because it involves very few INSERT statements. To test the fn_nums() function, run the following query, which requests a table containing the numbers 1 through 10:

SELECT * FROM ::fn_nums(10)

This query returns the results that Figure 2 shows.

UDFs That Modify Data

One main difference between a stored procedure and a function is that, by definition, stored procedures can change the state of the database (modify permanent data; create, alter, or drop permanent objects; and so on) and functions can't. Functions can only perform calculations and return the results. But some programmers have found a back door that lets them modify permanent data from within a UDF by using the OPENQUERY() rowset function. The OPENQUERY() function accepts a linked-server name and a character string as input arguments, submits the code to the linked server as pass-through code without parsing or examining it, and returns the results of running the code against the linked server to the calling program. OPENQUERY() is allowed within a UDF, and because SQL Server doesn't examine the character string you provided to the function as an argument, you can use OPENQUERY() to modify data on the linked server. You can even refer to your local server as a linked server by turning on the Data Access server option:

EXEC sp_serveroption [server_name], 'Data Access', true

As an example of a function that modifies data, I'll use a much-requested implementation of a custom sequence function. Suppose you have two invoice tables: Invoices1 holds invoices for domestic customers, and Invoices2 holds invoices for foreign customers. Run the following code to create both tables in the tempdb database:

USE tempdbCREATE TABLE Invoices1(invoiceid   int, datacol varchar(10) NOT   NULL)CREATE TABLE Invoices2(invoiceid   int, datacol varchar (10) NOT   NULL)

You want SQL Server to generate the invoiceid values similar to the way it generates IDENTITY values but across two tables instead of one. So, a new invoice inserted into either of the tables should receive the next incremented invoice ID. One technique that programmers use in such a case is to create a table that has just one column, which holds the last sequence value used. Run the following code to create the custom sequence table Seq and populate it with the seed sequence value:

CREATE TABLE Seq(val int NOT NULL   DEFAULT 0)INSERT INTO Seq DEFAULT VALUES

From your application code (or from a stored procedure or a trigger), you'd use the following code to increment the custom sequence value, then retrieve the new value and use it in an INSERT statement:

DECLARE @invid AS intUPDATE Seq SET @invid = val = val + 1INSERT INTO Invoices1   VALUES(@invid, 'a')

After running the preceding code, query the Seq and Invoices1 tables. Notice that the sequence value in Seq is now 1 and that the new invoice has 1 as its invoiceid value.

However, you can't use this technique in a multirow INSERT or UPDATE query because coming up with a multirow modification that increments a sequence for each row of the modification can be highly complex and inefficient. Suppose you wanted to issue an INSERT statement to assign new invoiceid values automatically. Usually, a programmer's first thought in this situation is to create a function that increments the sequence value and returns it to the invoking code, as Listing 3 shows. However, as I mentioned, you can't modify permanent data in the database from within a function, so Listing 3's code returns the following error:

Server: Msg 443, Level 16, State 2, Procedure fn_getinvid, Line 6Invalid use of 'UPDATE' within a function.

But if you use OPENQUERY() to perform the update, as Listing 4 shows, the UDF works. The fn_getinvid1 UDF uses the OPENQUERY() function to issue an UPDATE statement that increments the sequence value. Then, the UDF commits the transaction because, by default, OPENQUERY() opens an implicit transaction behind the scenes when submitting any code to the target server. Finally, the code in the OPENQUERY() function returns the new sequence value as a table containing a single column called newinvid and one row. The fn_getinvid1() UDF returns the new sequence value to the caller of the function.

After creating fn_getinvid1(), test it by running the multirow INSERT queries that Listing 5 shows against Invoices1 and Invoices2. Query both tables, and notice that all rows received different invoiceid values no matter which table they were in. To use the UDF to test a multirow update, run the following code:

UPDATE Invoices1 SET invoiceid =   dbo.fn_getinvid1()    WHERE datacol LIKE    'A%'

Then, select all rows from Invoices1, and notice that all rows containing a string starting with A in the datacol column received new invoice IDs.

The fn_getinvid1() function can run into concurrency problems when multiple processes invoke it at the same time. You can avoid these problems by using the IDENTITY property with the val column in the custom sequence table. Instead of updating the row, with each invocation, the function will insert a row, roll back the transaction, and return the value of SCOPE_IDENTITY()—a function that returns the last IDENTITY value inserted by the session in the current scope. (Routines that the calling code invokes aren't considered part of the code's scope.) The rollback trick spares you from having to keep lots of unused rows in the sequence table or from needing to delete its content periodically. Rollbacks don't affect SCOPE_IDENTITY(), so you effectively get what you need—a new sequence value—while the sequence table remains empty.

Run the code that Listing 6 shows to create a new sequence table that uses the IDENTITY column property for the sequence value and to create a second version of the function, fn_getinvid2(). You use this function the same way you use the fn_getinvid1() function, so you can test it by using similar INSERT and UPDATE queries that you used to test the first version.

Missing Features

This article covers only a couple of UDF-related back doors. The missing abilities to create special UDFs that you can invoke from any database and to modify data through UDFs would be useful as supported features in SQL Server. If you think that SQL Server should support certain features that are currently back doors only, send them to [email protected]. SQL Server program managers review requests and consider them for future releases of SQL Server.

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like