## Column Aggregates

We all know how to use the aggregate functions MIN, MAX, COUNT, etc. to calculate aggregates across rows. For example, using MAX we can determine the maximum value for group of rows. But what if we want to see the maximum value for each row across columns? There is no aggregate function that accepts list of columns…

Let’s look at example to illustrate this. Here is sample table with four data columns:

``CREATE TABLE Foo ( keycol INT NOT NULL PRIMARY KEY, col1 INT NOT NULL, col2 INT NOT NULL, col3 INT NOT NULL, col4 INT NOT NULL); INSERT INTO Foo VALUES(1, 5, 0, 1, 10); INSERT INTO Foo VALUES(2, 0, 0, 3, 1);INSERT INTO Foo VALUES(3, 0, 0, 0, 0);INSERT INTO Foo VALUES(4, 9, 1, 22, 8);INSERT INTO Foo VALUES(5, 8, 8, 8, 8);``

Our goal is to calculate the maximum value for the four data columns for each row. With two or three columns it may be an easy task using a CASE expression. But adding more columns will make a very long and complex CASE expression. Is there an easier way?

One solution that provides a shortcut is utilizing the SQL Server capabilities to generate an XML result and then using XQuery to find the max value. Here is the query:

``-- max across columns with XQuerySELECT c.query('keycol').value('.', 'INT') AS keycol,       c.value('local-name(./*[data(.)=                max(../*[not(local-name(.)="keycol") ])])',                 'VARCHAR(20)') AS max_col,       c.value('max(*[not(local-name(.)="keycol")])', 'FLOAT') AS max_valFROM (SELECT keycol, col1, col2, col3, col4      FROM Foo      FOR XML PATH, TYPE) AS T(x)CROSS APPLY x.nodes('row') AS R(c);``

The tricky part here is the filtering of the key column from the evaluation for max. Here are the results:

``keycol  max_col  max_val------- -------- --------1       col4     102       col3     33       col1     04       col3     225       col1     8``

How about counting the number of none zero values across columns? Here is a solution for that using very similar query:

``-- count non-zero columns with XQuerySELECT c.query('keycol').value('.', 'INT') AS keycol,       c.value('count(*[not(local-name(.)="keycol")                 and not(.=0)])',                'INT')  AS cnt_non_zeroFROM (SELECT keycol, col1, col2, col3, col4      FROM Foo      FOR XML PATH, TYPE) AS T(x)CROSS APPLY x.nodes('row') AS R(c);``

And the results:

``keycol  cnt_non_zero------- ------------1       32       23       04       45       4``

This shows the power of XQuery to solve this problem in untraditional way. The catch? Look at the execution plan of the queries…

More efficient method to solve the first problem is to use unpivoting. Here is solution using the UNPIVOT operator (the same can be done with ANSI query using cross join with numbers table):

``-- max across columns with UNPIVOTSELECT keycol, col AS max_col, val AS max_valFROM (SELECT keycol, val, col,        ROW_NUMBER() OVER(PARTITION BY keycol ORDER BY val DESC, col) AS rkFROM FooUNPIVOT(val FOR col IN (col1, col2, col3, col4)) AS U) AS TWHERE rk = 1;``

And very similar solution using unpivoting to solve the second problem:

``-- count non-zero columns with UNPIVOTSELECT keycol, COUNT(NULLIF(val, 0)) AS cnt_non_zeroFROM FooUNPIVOT(val FOR col IN (col1, col2, col3, col4)) AS UGROUP BY keycol;``

## Unpivoting Multiple Columns

Unpivoting is the process of normalizing data. In earlier post I discussed unpivoting a single column. This demonstration is to show how to unpivot multiple columns.

The task is to normalize the following denormalized data, which represents product sales volumes by quarter:

``product_nbr qtr1   sales_qtr1  qtr2   sales_qtr2  qtr3   sales_qtr3  qtr4   sales_qtr4----------- ------ ----------- ------ ----------- ------ ----------- ------ -----------1           2008Q1 100         2008Q2 20          2008Q3 15          2008Q4 102           2008Q1 80          2008Q2 15          2008Q3 20          2008Q4 103           2008Q1 70          2008Q2 5           2008Q3 10          2008Q4 15``

Normalized data set should look like this:

``product_nbr qtr    sales----------- ------ -----------1           2008Q1 1001           2008Q2 201           2008Q3 151           2008Q4 102           2008Q1 802           2008Q2 152           2008Q3 202           2008Q4 103           2008Q1 703           2008Q2 53           2008Q3 103           2008Q4 15``

The first method uses CROSS JOIN with table with numbers (needs one number for each quarter) and CASE expressions to select the appropriate value (quarter or sales volume) for each quarter.

``SELECT product_nbr,        CASE n            WHEN 1 THEN qtr1            WHEN 2 THEN qtr2            WHEN 3 THEN qtr3            WHEN 4 THEN qtr4       END AS qtr,       CASE n            WHEN 1 THEN sales_qtr1            WHEN 2 THEN sales_qtr2            WHEN 3 THEN sales_qtr3            WHEN 4 THEN sales_qtr4       END AS salesFROM QuarterlySales AS SCROSS JOIN (SELECT 1 UNION  SELECT 2 UNION SELECT 3 UNION SELECT 4) AS Nums(n);``

Alsmost seems natural that we should be able to do the same much easier using the UNPIVOT operator (introduced in SQL Server 2005). However, one of the limitations of the UNPIVOT operator is that it works only with a single column. But because SQL Server allows multiple table operators in the FROM clause, we can use two UNPIVOT operators. The catch is that the second UNPIVOT operator applies to the virtual table results from the first unpivot operator. That requires using a little trick to extract and match the quarter from the results of each UNPIVOT operator in order to produce the final result. Here is the query to unpivot using the UNPIVOT operator.

``SELECT product_nbr, qtr, salesFROM  (SELECT product_nbr,          qtr1, sales_qtr1,          qtr2, sales_qtr2,          qtr3, sales_qtr3,         qtr4, sales_qtr4  FROM QuarterlySales) AS SUNPIVOT (qtr FOR qtrx IN (qtr1, qtr2, qtr3, qtr4)) AS U1UNPIVOT (sales FOR sales_qtrx IN (sales_qtr1, sales_qtr2,                            sales_qtr3, sales_qtr4)) AS U2WHERE RIGHT(sales_qtrx, 1) = RIGHT(qtrx, 1);``

## Unpivoting Data

Unpivoting data is a very useful technique for normalizing denormalized tables. It refers to the process of transposing multiple columns across into rows in a single column. In other words, taking the data below:

`sales_year  first_quarter  second_quarter   third_quarter   fourth_quarter----------  -------------  --------------   -------------   --------------2006        211203.50      381594.95        439187.00       503155.802007        231205.10      451101.25        601209.40       531907.30`

and converting to look like this:

`sales_year   sales_quarter  sales_amount-----------  -------------  ------------2006         1              211203.502006         2              381594.952006         3              439187.002006         4              503155.802007         1              231205.102007         2              451101.252007         3              601209.402007         4              531907.30`

Here are different methods to achieve that:

`-- Create sample tableCREATE TABLE QuarterlySales ( sales_year INT PRIMARY KEY, first_quarter DECIMAL(10, 2), second_quarter DECIMAL(10, 2), third_quarter DECIMAL(10, 2), fourth_quarter DECIMAL(10, 2)); -- Insert dataINSERT INTO QuarterlySales VALUES(2006, 211203.50, 381594.95, 439187.00, 503155.80); INSERT INTO QuarterlySales VALUES(2007, 231205.10, 451101.25, 601209.40, 531907.30); -- Method 1) Using UNIONSELECT sales_year,        1 AS sales_quarter,        first_quarter AS sales_amountFROM QuarterlySalesUNION ALLSELECT sales_year, 2, second_quarterFROM QuarterlySalesUNION ALLSELECT sales_year, 3, third_quarterFROM QuarterlySales UNION ALLSELECT sales_year, 4, fourth_quarterFROM QuarterlySalesORDER BY sales_year, sales_quarter; -- Method 2) Using cross join with table with numbersSELECT sales_year,       qtr AS sales_quarter,       CASE qtr          WHEN 1 THEN first_quarter         WHEN 2 THEN second_quarter         WHEN 3 THEN third_quarter         WHEN 4 THEN fourth_quarter       END AS sales_amountFROM QuarterlySalesCROSS JOIN(SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) AS Quarters(qtr)ORDER BY sales_year, sales_quarter; -- Method 3) Using the UNPIVOT operator in SQL Server 2005SELECT sales_year,       CASE sales_quarter         WHEN 'first_quarter' THEN 1         WHEN 'second_quarter' THEN 2         WHEN 'third_quarter' THEN 3         WHEN 'fourth_quarter' THEN 4       END AS sales_quarter,       sales_amount     FROM QuarterlySalesUNPIVOT(sales_amount FOR  sales_quarter IN  (first_quarter, second_quarter,   third_quarter, fourth_quarter)) AS UORDER BY sales_year, sales_quarter;`

It is good no note that unpivoting multiple columns (like amount1, date1, amount2, date2 to amount and date columns) is not supported by the SQL Server 2005 UNPIVOT operator. This can be achieved only by using the first two methods.