unpivot Archives - Tangra Inc.

Tag Archive for: unpivot

Column Aggregates

August 21, 2009/10 Comments/in SQL Server, XML/by Plamen Ratchev

We all know how to use the aggregate functions MIN, MAX, COUNT, etc. to calculate aggregates across rows. For example, using MAX we can determine the maximum value for group of rows. But what if we want to see the maximum value for each row across columns? There is no aggregate function that accepts list of columns…

Let’s look at example to illustrate this. Here is sample table with four data columns:

CREATE TABLE Foo (

 keycol INT NOT NULL PRIMARY KEY,

 col1 INT NOT NULL,

 col2 INT NOT NULL,

 col3 INT NOT NULL,

 col4 INT NOT NULL);

 

INSERT INTO Foo VALUES(1, 5, 0, 1, 10); 

INSERT INTO Foo VALUES(2, 0, 0, 3, 1);

INSERT INTO Foo VALUES(3, 0, 0, 0, 0);

INSERT INTO Foo VALUES(4, 9, 1, 22, 8);

INSERT INTO Foo VALUES(5, 8, 8, 8, 8);

Our goal is to calculate the maximum value for the four data columns for each row. With two or three columns it may be an easy task using a CASE expression. But adding more columns will make a very long and complex CASE expression. Is there an easier way?

One solution that provides a shortcut is utilizing the SQL Server capabilities to generate an XML result and then using XQuery to find the max value. Here is the query:

-- max across columns with XQuery

SELECT c.query('keycol').value('.', 'INT') AS keycol,

       c.value('local-name(./*[data(.)=

                max(../*[not(local-name(.)="keycol") ])][1])', 

                'VARCHAR(20)') AS max_col,

       c.value('max(*[not(local-name(.)="keycol")])', 'FLOAT') AS max_val

FROM (SELECT keycol, col1, col2, col3, col4

      FROM Foo

      FOR XML PATH, TYPE) AS T(x)

CROSS APPLY x.nodes('row') AS R(c);

The tricky part here is the filtering of the key column from the evaluation for max. Here are the results:

keycol  max_col  max_val

------- -------- --------

1       col4     10

2       col3     3

3       col1     0

4       col3     22

5       col1     8

How about counting the number of none zero values across columns? Here is a solution for that using very similar query:

-- count non-zero columns with XQuery

SELECT c.query('keycol').value('.', 'INT') AS keycol,

       c.value('count(*[not(local-name(.)="keycol") 

                and not(.=0)])', 

               'INT')  AS cnt_non_zero

FROM (SELECT keycol, col1, col2, col3, col4

      FROM Foo

      FOR XML PATH, TYPE) AS T(x)

CROSS APPLY x.nodes('row') AS R(c);

And the results:

keycol  cnt_non_zero

------- ------------

1       3

2       2

3       0

4       4

5       4

This shows the power of XQuery to solve this problem in untraditional way. The catch? Look at the execution plan of the queries…

More efficient method to solve the first problem is to use unpivoting. Here is solution using the UNPIVOT operator (the same can be done with ANSI query using cross join with numbers table):

-- max across columns with UNPIVOT

SELECT keycol, col AS max_col, val AS max_val

FROM (

SELECT keycol, val, col, 

       ROW_NUMBER() OVER(PARTITION BY keycol ORDER BY val DESC, col) AS rk

FROM Foo

UNPIVOT

(val FOR col IN (col1, col2, col3, col4)) AS U) AS T

WHERE rk = 1;

And very similar solution using unpivoting to solve the second problem:

-- count non-zero columns with UNPIVOT

SELECT keycol, COUNT(NULLIF(val, 0)) AS cnt_non_zero

FROM Foo

UNPIVOT

(val FOR col IN (col1, col2, col3, col4)) AS U

GROUP BY keycol;

Unpivoting Multiple Columns

February 28, 2009/20 Comments/in SQL Server/by Plamen Ratchev

Unpivoting is the process of normalizing data. In earlier post I discussed unpivoting a single column. This demonstration is to show how to unpivot multiple columns.

The task is to normalize the following denormalized data, which represents product sales volumes by quarter:

product_nbr qtr1   sales_qtr1  qtr2   sales_qtr2  qtr3   sales_qtr3  qtr4   sales_qtr4

----------- ------ ----------- ------ ----------- ------ ----------- ------ -----------

1           2008Q1 100         2008Q2 20          2008Q3 15          2008Q4 10

2           2008Q1 80          2008Q2 15          2008Q3 20          2008Q4 10

3           2008Q1 70          2008Q2 5           2008Q3 10          2008Q4 15

Normalized data set should look like this:

product_nbr qtr    sales

----------- ------ -----------

1           2008Q1 100

1           2008Q2 20

1           2008Q3 15

1           2008Q4 10

2           2008Q1 80

2           2008Q2 15

2           2008Q3 20

2           2008Q4 10

3           2008Q1 70

3           2008Q2 5

3           2008Q3 10

3           2008Q4 15

The first method uses CROSS JOIN with table with numbers (needs one number for each quarter) and CASE expressions to select the appropriate value (quarter or sales volume) for each quarter.

SELECT product_nbr, 

       CASE n

            WHEN 1 THEN qtr1

            WHEN 2 THEN qtr2

            WHEN 3 THEN qtr3

            WHEN 4 THEN qtr4

       END AS qtr,

       CASE n

            WHEN 1 THEN sales_qtr1

            WHEN 2 THEN sales_qtr2

            WHEN 3 THEN sales_qtr3

            WHEN 4 THEN sales_qtr4

       END AS sales

FROM QuarterlySales AS S

CROSS JOIN 

(SELECT 1 UNION 

 SELECT 2 UNION

 SELECT 3 UNION

 SELECT 4) AS Nums(n);

Alsmost seems natural that we should be able to do the same much easier using the UNPIVOT operator (introduced in SQL Server 2005). However, one of the limitations of the UNPIVOT operator is that it works only with a single column. But because SQL Server allows multiple table operators in the FROM clause, we can use two UNPIVOT operators. The catch is that the second UNPIVOT operator applies to the virtual table results from the first unpivot operator. That requires using a little trick to extract and match the quarter from the results of each UNPIVOT operator in order to produce the final result. Here is the query to unpivot using the UNPIVOT operator.

SELECT product_nbr, qtr, sales

FROM 

 (SELECT product_nbr, 

         qtr1, sales_qtr1, 

         qtr2, sales_qtr2, 

         qtr3, sales_qtr3,

         qtr4, sales_qtr4

  FROM QuarterlySales) AS S

UNPIVOT

 (qtr FOR qtrx IN (qtr1, qtr2, qtr3, qtr4)) AS U1

UNPIVOT

 (sales FOR sales_qtrx IN (sales_qtr1, sales_qtr2, 

                           sales_qtr3, sales_qtr4)) AS U2

WHERE RIGHT(sales_qtrx, 1) = RIGHT(qtrx, 1);

Unpivoting Data

November 23, 2007/10 Comments/in SQL Server/by Plamen Ratchev

Unpivoting data is a very useful technique for normalizing denormalized tables. It refers to the process of transposing multiple columns across into rows in a single column. In other words, taking the data below:

sales_year  first_quarter  second_quarter   third_quarter   fourth_quarter
----------  -------------  --------------   -------------   --------------
2006        211203.50      381594.95        439187.00       503155.80
2007        231205.10      451101.25        601209.40       531907.30

and converting to look like this:

sales_year   sales_quarter  sales_amount
-----------  -------------  ------------
2006         1              211203.50
2006         2              381594.95
2006         3              439187.00
2006         4              503155.80
2007         1              231205.10
2007         2              451101.25
2007         3              601209.40
2007         4              531907.30

Here are different methods to achieve that:

-- Create sample table
CREATE TABLE QuarterlySales (
 sales_year INT PRIMARY KEY,
 first_quarter DECIMAL(10, 2),
 second_quarter DECIMAL(10, 2),
 third_quarter DECIMAL(10, 2),
 fourth_quarter DECIMAL(10, 2));
 
-- Insert data
INSERT INTO QuarterlySales VALUES(2006, 211203.50, 381594.95, 439187.00, 503155.80); 
INSERT INTO QuarterlySales VALUES(2007, 231205.10, 451101.25, 601209.40, 531907.30);
 
-- Method 1) Using UNION
SELECT sales_year, 
       1 AS sales_quarter, 
       first_quarter AS sales_amount
FROM QuarterlySales
UNION ALL
SELECT sales_year, 2, second_quarter
FROM QuarterlySales
UNION ALL
SELECT sales_year, 3, third_quarter
FROM QuarterlySales 
UNION ALL
SELECT sales_year, 4, fourth_quarter
FROM QuarterlySales
ORDER BY sales_year, sales_quarter;
 
-- Method 2) Using cross join with table with numbers
SELECT sales_year,
       qtr AS sales_quarter,
       CASE qtr 
         WHEN 1 THEN first_quarter
         WHEN 2 THEN second_quarter
         WHEN 3 THEN third_quarter
         WHEN 4 THEN fourth_quarter
       END AS sales_amount
FROM QuarterlySales
CROSS JOIN
(SELECT 1 UNION ALL
 SELECT 2 UNION ALL
 SELECT 3 UNION ALL
 SELECT 4) AS Quarters(qtr)
ORDER BY sales_year, sales_quarter;
 
-- Method 3) Using the UNPIVOT operator in SQL Server 2005
SELECT sales_year,
       CASE sales_quarter
         WHEN 'first_quarter' THEN 1
         WHEN 'second_quarter' THEN 2
         WHEN 'third_quarter' THEN 3
         WHEN 'fourth_quarter' THEN 4
       END AS sales_quarter,
       sales_amount     
FROM QuarterlySales
UNPIVOT
(sales_amount FOR 
 sales_quarter IN 
 (first_quarter, second_quarter, 
  third_quarter, fourth_quarter)) AS U
ORDER BY sales_year, sales_quarter;

It is good no note that unpivoting multiple columns (like amount1, date1, amount2, date2 to amount and date columns) is not supported by the SQL Server 2005 UNPIVOT operator. This can be achieved only by using the first two methods.

Additional Resources:

Using PIVOT and UNPIVOT
http://msdn2.microsoft.com/en-us/library/ms177410.aspx

Tag Archive for: unpivot

Column Aggregates

Unpivoting Multiple Columns

Unpivoting Data

SITE MAP

SEARCH