T-SQL UDF Quickly Determines How Often a Substring Appears

To determine the number of occurrences of a substring within a string, a user-defined function (UDF) performs a simple arithmetic calculation.

Eli Leiba

October 24, 2007

2 Min Read
ITPro Today logo in a gray background | ITPro Today



I created a function, dbo.occurrences, that calculates how many times a substring appears in a source string. Rather than looping through the source string and checking whether the substring still exists in the part of the string that hasn’t been scanned yet, dbo.occurrences performs a simple arithmetic calculation. The number of occurrences of a substring within a string is expressed by the formula

Occur (s,sub ) =
{ len(s) – len (replace(s,sub,’’) }
/ len(sub) {len(sub) 0}

This formula calculates the number of occurrences of a string within a string by first replacing each substring inside the source string with an empty string. Next, the formula finds the difference between the length of the original source string and the length of the resulting string after you perform the replacement operation just described. Finally, the difference is divided by the length of the substring, providing the substring isn’t empty. Listing 1 shows how dbo.occurrences implements this formula.

To use dbo.occurrences, you pass in a source string and a substring as varchar(max) parameters. For example, if you wantto see how many times the letter i is in the string Eli Leiba is the king, you use the call

select dbo.occurrences
(‘Eli Leiba is the king’,’i’)

which produces the result of 4. If you want to see how many times the substring ll is in the string Hello tell me hello, you use the call

select dbo.occurrences
(‘Hello tell me hello’, ‘ll’)

which produces the result of 3. If you try other pairs of source strings and substrings, you’ll realize that this simple calculation does the job!

The Occurrences.sql file contains the dbo.occurrences function. You can download Occurrences.sql from SQL Server Magazine’s Web site. (Go to www .sqlmag.com, enter 96786 in the InstantDoc ID text box, then click the 96786.zip hotlink.) I tested this function on SQL Server 2005, SP1. To make it work on SQL Server 2000, you need to use varchar(8000) instead of varchar(max) for the two parameters’ type.

—Eli Leiba, Senior Application DBA, Israel Electric Company

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like