Strim Strips Out Non-Alphanumeric Characters

Although SQL Server offers a variety of string functions, it doesn't have a function that strips out non-alphanumeric characters. Here's a user-defined function to fill that gap.

Readers

November 19, 2007

3 Min Read
ITPro Today logo


SQL Server offers a variety of string functions, but it doesn’t have a function that strips out non-alphanumeric characters, such as a hyphen (-), an exclamation point (!), a percent sign (%), or a semicolon (;). I created a function named Strim, which is short for strip trim, to fill this gap.

Related: 8 T-SQL String Functions

The Strim function takes advantage of the fact that SQL Server “provides additional ordering guarantees beyond ANSI.” Specifically, a variable assignment in a query such as

SELECT @a = @a + colFROM Table ORDER BY col2

will be performed in the expected order.

Listing 2 shows the Strim function. This function takes one input parameter (@textin), which is limited to 8,000 characters. Strim first converts the string into a derived table labeled A. The derived table is cross-joined with a table of numbers named Tally to parse the string. Listing 3 shows the code that creates the Tally table. The WHERE clause at callout A in Listing 2 filters the intermediate output from the cross join. The clause uses ASCII codes to accept only numbers (ASCII codes 48 through 57), uppercase letters (ASCII codes 65 through 90), and lowercase letters (ASCII codes 97 through 122). The function then converts the intermediate output to a derived table labeled D. For example, if

@textin=”abcdefg”

table D’s rows would be:

i

letter

1

a

2

b

3

c

4

d

5

e

6

f

7

g

Finally, the string is reconstituted as abcdefg, assigned to @textout, and returned.

For comparison, let’s look at what happens if a string contains non-alphanumeric characters. If

@textin=”pubs-dev”

table D’s rows would be:

i

letter

1

p

2

u

3

b

4

s

5

d

6

e

7

v

and the string would be reconstituted as pubsdev.

I’ve used the Strim function to remove nonvisible characters in imported data (e.g., IBM EBCDIC data). Another use involves creating filenames for our database backup files. Our company uses Idera’s SQL Compliance Manager, which creates files with filenames such as SQLcompliance_MyDatabase.SQL.DOMAIN.COM. Our custom backup script uses the Strim function to strip out the unwanted characters in the database names when it creates the filenames for the backup files so that the filenames follow our standard format of myservername_ myinstancename_SQLcomplianceMyDatabaseSQLDOMAINCOM_ backuptype_yyyymmddhhmmss.bak. The Strim function works on SQL Server 2005, SQL Server 2000, and SQL Server 7.0.

Related: Conversions between Character and Binary Values in Katmai

—Louis Nguyen, DBA, Centex Homes

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like