Tag Archive for: cross tab

Dynamic Pivoting

SQL Server 2005 added the PIVOT operator which makes creating cross-tab queries very easy. However, as of this writing none of the SQL Server versions has built-in support for dynamic pivoting. The PIVOT operator only provides basic pivoting capabilities on a static list of values. In practice it is very often needed to perform this for dynamic list of values.

Here is one solution for dynamic pivoting that uses the ability in SQL Server 2005/2008 to concatenate row values using FOR XML PATH with blank element name. This method performs a query on the distinct values to pivot and creates a column list based on that. Then the resulting column list is used in a dynamic query utilizing the PIVOT operator and executed as dynamic SQL.

In this scenario the goal is to pivot order amounts by month.

-- Table with orders

CREATE TABLE Orders (

  order_id INT NOT NULL PRIMARY KEY,

  order_date DATETIME NOT NULL

            DEFAULT CURRENT_TIMESTAMP,

  amount DECIMAL(8, 2) NOT NULL DEFAULT 0

        CHECK (amount >= 0));

 

-- Sample data

INSERT INTO Orders

(order_id, order_date, amount)

SELECT 1, '20070101', 10.50

UNION ALL

SELECT 2, '20070126', 12.50

UNION ALL

SELECT 3, '20070130', 12.00

UNION ALL

SELECT 4, '20070214', 13.75

UNION ALL

SELECT 5, '20070220', 10.00

UNION ALL

SELECT 6, '20070306', 15.00

UNION ALL

SELECT 7, '20070310', 17.50

UNION ALL

SELECT 8, '20070329', 20.00;

 

-- Build list of column values to pivot

DECLARE @cols NVARCHAR(1000);

SELECT @cols =

STUFF((SELECT N'],[' + year_month

       FROM (SELECT DISTINCT CONVERT(NCHAR(7), order_date, 126)

            FROM Orders) AS O(year_month)

       ORDER BY year_month

       FOR XML PATH('')

      ), 1, 2, '') + N']';

 

-- Build dynamic SQL query for pivoting  

DECLARE @sql NVARCHAR(2000);

SET @sql =

N'SELECT order_year, ' + @cols +

N'FROM (SELECT DATEPART(yyyy, order_date) AS order_year, ' +

            N'CONVERT(NCHAR(7), order_date, 126) AS year_month, ' +

            N'amount ' +

      N'FROM Orders) AS O ' +

N'PIVOT ' +

N'(SUM(amount) FOR year_month IN (' + @cols + N')) AS P;';

 

EXEC(@sql);

 

/*

 

Results:

 

order_year 2007-01 2007-02 2007-03

----------- -------- -------- --------

2007        35.00    23.75    52.50

 

*/

This query can be further improved by using the QUOTENAME function to prevent any possibility of SQL injection.

Pivoting data in SQL Server

Very often there is a need to pivot (cross-tab) normalized data for some reporting purposes. While this is best done with reporting tools (Excel is one example with powerful pivoting capabilities), sometimes it needs to be done on the database side.

The discussion here is limited to static pivoting (that is when the values to pivot are known). Dynamic pivoting is accomplished by using dynamic SQL and is based on the same techniques, just dynamically building the pivoting query.

Here are a few methods and techniques to implement that in SQL Server.

Given the following data:

 OrderId    OrderDate  Amount
----------- ---------- ------
1 2007-01-01 10.50
2 2007-01-26 12.50
3 2007-01-30 12.00
4 2007-02-14 13.75
5 2007-02-20 10.00
6 2007-03-06 15.00
7 2007-03-10 17.50
8 2007-03-29 20.00

We would like to return this result set:

 OrderYear   Jan   Feb   Mar
----------- ----- ----- -----
2007 35.00 23.75 52.50

Creating the sample Orders table:

CREATE TABLE Orders (

 order_id INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,

 order_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,

 amount DECIMAL(8, 2) NOT NULL DEFAULT 0

        CHECK (amount >= 0));

 

INSERT INTO Orders

  (order_date, amount)

SELECT '20070101', 10.50

UNION ALL

SELECT '20070126', 12.50

UNION ALL

SELECT '20070130', 12.00

UNION ALL

SELECT '20070214', 13.75

UNION ALL

SELECT '20070220', 10.00

UNION ALL

SELECT '20070306', 15.00

UNION ALL

SELECT '20070310', 17.50

UNION ALL

SELECT '20070329', 20.00;

Using CASE

CASE can be very handy for pivoting. It allows writing flexible logic and to transform the data as needed. Also, it is a very efficient method as it requires a one pass scan of the data.

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(CASE WHEN DATEPART(m, order_date) = 1

                THEN amount ELSE 0 END) AS 'Jan',

       SUM(CASE WHEN DATEPART(m, order_date) = 2

                THEN amount ELSE 0 END) AS 'Feb',

       SUM(CASE WHEN DATEPART(m, order_date) = 3

                THEN amount ELSE 0 END) AS 'Mar'

FROM Orders

GROUP BY DATEPART(yyyy, order_date);

Using Matrix table

This method requires creating a matrix table with 1 for the significant columns and 0 for columns to be ignored. Then joining the matrix table to the data table based on the pivoting column will produce the results:

CREATE TABLE MonthMatrix (

  month_nbr INT NOT NULL PRIMARY KEY

            CHECK (month_nbr BETWEEN 1 AND 12),

  jan INT NOT NULL DEFAULT 0

      CHECK (jan IN (0, 1)),

  feb INT NOT NULL DEFAULT 0

      CHECK (feb IN (0, 1)),

  mar INT NOT NULL DEFAULT 0

      CHECK (mar IN (0, 1)),

  apr INT NOT NULL DEFAULT 0

      CHECK (apr IN (0, 1)),

  may INT NOT NULL DEFAULT 0

      CHECK (may IN (0, 1)),

  jun INT NOT NULL DEFAULT 0

      CHECK (jun IN (0, 1)),

  jul INT NOT NULL DEFAULT 0

      CHECK (jul IN (0, 1)),

  aug INT NOT NULL DEFAULT 0

      CHECK (aug IN (0, 1)),

  sep INT NOT NULL DEFAULT 0

      CHECK (sep IN (0, 1)),

  oct INT NOT NULL DEFAULT 0

      CHECK (oct IN (0, 1)),

  nov INT NOT NULL DEFAULT 0

      CHECK (nov IN (0, 1)),

  dec INT NOT NULL DEFAULT 0

      CHECK (dec IN (0, 1)));

 

-- Populate the matrix table

INSERT INTO MonthMatrix (month_nbr, jan) VALUES (1, 1);

INSERT INTO MonthMatrix (month_nbr, feb) VALUES (2, 1);

INSERT INTO MonthMatrix (month_nbr, mar) VALUES (3, 1);

INSERT INTO MonthMatrix (month_nbr, apr) VALUES (4, 1);

INSERT INTO MonthMatrix (month_nbr, may) VALUES (5, 1);

INSERT INTO MonthMatrix (month_nbr, jun) VALUES (6, 1);

INSERT INTO MonthMatrix (month_nbr, jul) VALUES (7, 1);

INSERT INTO MonthMatrix (month_nbr, aug) VALUES (8, 1);

INSERT INTO MonthMatrix (month_nbr, sep) VALUES (9, 1);

INSERT INTO MonthMatrix (month_nbr, oct) VALUES (10, 1);

INSERT INTO MonthMatrix (month_nbr, nov) VALUES (11, 1);

INSERT INTO MonthMatrix (month_nbr, dec) VALUES (12, 1);

 

-- Use the matrix table to pivot

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(amount * jan) AS 'Jan',

       SUM(amount * feb) AS 'Feb',

       SUM(amount * mar) AS 'Mar'

FROM Orders AS O

JOIN MonthMatrix AS M

  ON DATEPART(m, O.order_date) = M.month_nbr

GROUP BY DATEPART(yyyy, order_date);

The David Rozenshtein method

This method is based on the formula: 1 – ABS(SIGN(x – y)). This formula returns 1 when x = y, and 0 otherwise. In a way it mimics the matrix table approach.

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 1)))) AS 'Jan',

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 2)))) AS 'Feb',

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 3)))) AS 'Mar'

FROM Orders

GROUP BY DATEPART(yyyy, order_date);

The PIVOT operator in SQL Server 2005

SQL Server 2005 provides built-in pivoting mechanism via the PIVOT operator.

SELECT OrderYear,

       [1] AS 'Jan',

       [2] AS 'Feb',

       [3] AS 'Mar'

FROM (SELECT DATEPART(yyyy, order_date),

            DATEPART(m, order_date),

            amount

      FROM Orders) AS O (OrderYear, month_nbr, amount)

PIVOT

(SUM(amount) FOR month_nbr IN ([1], [2], [3])) AS P;

There are some other methods to implement pivoting (like using subqueries, multiple joins, or the APPLY operator in SQL Server 2005) but I am not showing examples of those as I do not find them practical, especially with large number of values to pivot.

Additional Resources:

Using PIVOT and UNPIVOT
http://msdn2.microsoft.com/en-us/library/ms177410.aspx