Use GROUP BY to Transform Tables

To transform tables, Brian Moran explores a handy trick that uses the GROUP BY clause.

Brian Moran

January 21, 2003

3 Min Read
ITPro Today logo in a gray background | ITPro Today


I'm new to SQL Server and T-SQL, and I'm trying to transform TableA, which is a custom table that describes any simple user-defined list or table, into TableB. TableA contains multiple rows with the same value for the column KeyField. TableB pivots this data so there is a single row for each value of KeyField. Figure 1 shows the structure of TableA, and Figure 2 shows the structure of TableB. The TableA column FieldName stores the name of the user-defined field in a list and can be an alphanumeric, numeric, or date field. For example, field X can hold only alphanumeric values, and field Z can store only date values. TableA is populated in realtime, but the number of rows added per year should be low. TableB needs to be populated as near to realtime as possible, but I can get away with a 30-second update interval.

A Experienced T-SQL developers might see a simple solution to this problem right away. However, SQL Server is attracting many converts who don't always have deep training in relational database query techniques. This problem lets us explore a handy trick that uses the GROUP BY clause.

Although the question is "how do I transform TableA into TableB," I'm going to assume that we don't actually need a genuine table for TableB. Based on the user's needs, a view or a simple query is a better solution for transforming the data. In general, generating an answer on the fly is more efficient than storing duplicate versions of the data in multiple tables. Rather than pre-storing the values, generating calculated values at runtime reduces the possibility of accidentally storing the wrong answer in a calculated field. In some cases, you might choose to store the answer as a table rather than a view if you find that the cost of generating the result set in the view is expensive and users will frequently access the view. However, you can easily use the view in this solution to populate a table if necessary. Listing 1 shows the table definitions and INSERT statements that build the model tables that the question describes.

Here's a simple query that generates the result set defined as TableB in the question:

SELECT  KeyFieldId  ,MAX(AlphaNumeric) AS X  ,MAX(JustNumeric) AS  Y  ,MAX(Datefield) AS ZFROM  OriginalGROUP BY  KeyFieldId

Most people know how to use GROUP BY to create a result set that aggregates many rows into a single row. Many T-SQL novices might try to write the query without the MAX() function, such as:

SELECT  KeyFieldId  ,AlphaNumeric AS X  ,JustNumeric AS  Y  ,Datefield AS ZFROM  OriginalGROUP BY  KeyFieldId

But this query won't work because you must include all nonaggregate columns in your GROUP BY clause. In this case, you won't get the correct answer if you include the AlphaNumeric and related columns in the GROUP BY clause. Many people forget that you can use an aggregate function such as MAX() to select a single value even when you have only one value to choose from. In this case, the value of column X in TableB should be Trays A10/6—that's the only value that exists for KeyField = 1 and FieldName = X. You'll never have more than one value for that combination, so it might not seem obvious to use a MAX() function, but it works in this case.

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like