T-SQL Challenge - January 18, 2012

If you're looking for a good querying challenge, try to come up with your own solutions to computing PERCENTILE_CONT and PERCENTILE_DISC for a given inputargument @pct. Here's how standard SQL defines the computations when applied as ordered set functions:

Itzik Ben-Gan

January 18, 2012

2 Min Read
percentage sign in the clouds

If you're looking for a good querying challenge, try to come up with your own solutions to computing PERCENTILE_CONT and PERCENTILE_DISC for a given input argument @pct. Here's how standard SQL defines the computations when applied as ordered set functions:

  • PERCENTILE_DISC: By treating the group as a window partition of the CUME_DIST window function, using the specified ordering of the value expression as the window ordering, return the first value expression whose cumulative distribution value is greater than or equal to the argument.

  • PERCENTILE_CONT: By considering the pair of consecutive rows that are indicated by the argument, treated as a fraction of the total number of rows in the group, interpolate the value of the value expression evaluated for these rows.

The definition for PERCENTILE_DISC is straightforward. As for PERCENTILE_CONT, you need more specifics. The standard defines it like so:

If PERCENTILE_CONT is specified, then:

1. Let ROW0 be the greatest exact numeric value with scale 0 (zero) that is less than or equal to @pct*(N-1). Let ROWLIT0 be a literal representing ROW0.

2. Let ROW1 be the least exact numeric value with scale 0 (zero) that is greater than or equal to @pct*(N-1). Let ROWLIT1 be a literal representing ROW1.

3. Let FACTOR be an approximate numeric literal representing @pct*(N-1)-ROW0.

4. The result is the result of the scalar subquery:

 

( WITH TEMPTABLE(X, Y) AS(SELECTROW_NUMBER() OVER (ORDER BY WSP) - 1,TXCOLNAMEFROM TXANAME)SELECT CAST ( T0.Y + FACTOR * (T1.Y - T0.Y) AS DT )FROM TEMPTABLE T0, TEMPTABLE T1WHERE T0.ROWNUMBER = ROWLIT0AND T1.ROWNUMBER = ROWLIT1 ) 


You can use the ExamScores table as your input data and compare your results to the ones provided for the median calculation in Figure 1. Of course, the median is just an example where @pct is equal to 0.5; your solution should work for any input percent. The solutions should work in SQL Server versions prior to SQL Server 2012, meaning you can't use any of the new functions. I'll cover such solutions next month. Good luck!

 

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like