Hierarchies with CTEs

Common table expressions (CTEs) have many applications. However, one of their capabilities to implement recursive queries is very useful for navigating and manipulating hierarchies.

Here is one brief example of utilizing that. Given a table with employees and their managers represented as adjacency list, provide list of employees that report to particular manager, ordered by the natural hierarchy order of listing each employee under the corresponding manager.

-- Create sample table with employees and managers

CREATE TABLE Employees(

 employee_nbr INT NOT NULL PRIMARY KEY,

 employee_name VARCHAR(35),

 manager_nbr INT NULL REFERENCES Employees(employee_nbr));

 

INSERT INTO Employees VALUES (1, 'John Doe', NULL);

INSERT INTO Employees VALUES (2, 'James Brown', 1);

INSERT INTO Employees VALUES (3, 'Keith Green', NULL);

INSERT INTO Employees VALUES (4, 'Peter Roth', 2);

INSERT INTO Employees VALUES (5, 'Hans Gruber', 2);

INSERT INTO Employees VALUES (6, 'Kris Evans', 4);

INSERT INTO Employees VALUES (7, 'Jeff Colleman', NULL);

 

-- Use recursive CTE to build binary and charachter paths

WITH EmployeeHierarchy

AS

(SELECT employee_nbr, employee_name, manager_nbr,

        CAST(employee_nbr AS VARBINARY(MAX)) AS bpath,

        CAST('.' +

            CAST(employee_nbr AS VARCHAR(4)) +

            '.' AS VARCHAR(MAX)) AS cpath

 FROM Employees

 WHERE manager_nbr IS NULL

 UNION ALL

 SELECT E.employee_nbr, E.employee_name, E.manager_nbr,

        H.bpath + CAST(E.employee_nbr AS BINARY(4)),

        H.cpath + CAST(E.employee_nbr AS VARCHAR(4)) + '.'

 FROM Employees AS E

 JOIN EmployeeHierarchy AS H

   ON E.manager_nbr = H.employee_nbr)

SELECT employee_nbr, employee_name, manager_nbr, bpath, cpath

FROM EmployeeHierarchy

WHERE cpath LIKE '%.2.%' -- filter all employees for manager 2

ORDER BY bpath;          -- order by natural hierarchy path

 

-- Results

employee_nbr employee_name  manager_nbr bpath                              cpath

------------ -------------- ----------- ---------------------------------- ----------

2            James Brown    1           0x0000000100000002                .1.2.

4            Peter Roth    2           0x000000010000000200000004        .1.2.4.

6            Kris Evans    4           0x00000001000000020000000400000006 .1.2.4.6.

5            Hans Gruber    2           0x000000010000000200000005        .1.2.5.

The method above creates two manager/employee paths: a binary path that preserves the natural ordering at each level, which can be used to sort; a character path that can be used to filter by manager.

Parameter Sniffing

What is “parameter sniffing”? When a stored procedure is first executed SQL Server looks at the input parameters and uses this as guidance to build the query plan. This is known as “parameter sniffing”.

This is good as long as the input parameters for the first invocation are typical for future invocations. But if that is not the case this will cause performance problems.

For example, a procedure is supposed to retrieve all rows for customer orders with non-clustered index on the customer column. If the first invocation returns a small set of orders it may be most efficient to use index seek. Further invocations may be for large set of orders, but the first cached plan with index seek will be used resulting in poor performance (instead of using a scan).

Here is one example stored procedure and different methods to handle parameter sniffing.

CREATE PROCEDURE GetCustomerOrders

 @customerid NCHAR(5)

AS

BEGIN

 

    SELECT orderid, customerid, orderdate, shippeddate

    FROM Orders

    WHERE customerid = @customerid;

 

END

Replace parameters with local variables

This solution is based on assigning the stored procedure parameters to local variables and then using the local variables in the query. This works because SQL Server is not sniffing local variables and using the local variables in place of parameters forces plan generated based on statistics (in effect this disables parameter sniffing).

CREATE PROCEDURE GetCustomerOrders

 @customerid NCHAR(5)

AS

BEGIN

 

    DECLARE @local_customerid NCHAR(5);

 

    SET @local_customerid = @customerid;

 

    SELECT orderid, customerid, orderdate, shippeddate

    FROM Orders

    WHERE customerid = @local_customerid;

 

END

Execute using WITH RECOMPILE

This solution forces recompile of the stored procedure on each run, that way forcing a fresh query plan for the current parameters. Note that this will recompile all statements inside the stored procedure.

EXEC GetCustomerOrders @customerid = N'CACYK' WITH RECOMPILE;

Query hint RECOMPILE

SQL Server 2005 offers the new query hint RECOMPILE which will force recompilation of the individual query. This method is better than the prior method because recompilation will affect only one statement and all other queries in the stored procedure will not be recompiled.

CREATE PROCEDURE GetCustomerOrders

 @customerid NCHAR(5)

AS

BEGIN

 

    SELECT orderid, customerid, orderdate, shippeddate

    FROM Orders

    WHERE customerid = @customerid

    OPTION (RECOMPILE);

 

END

Query hint OPTIMIZE FOR

Another new query hint in SQL Server 2005 is OPTIMIZE FOR. It allows specifying a constant that will be used to optimize the query plan instead of the variable. This could be useful if it is known that particular selective value is frequently used to invoke the stored procedure. However, any other parameter value will suffer the same performance problems.

CREATE PROCEDURE GetCustomerOrders

 @customerid NCHAR(5)

AS

BEGIN

 

    SELECT orderid, customerid, orderdate, shippeddate

    FROM Orders

    WHERE customerid = @customerid

    OPTION (OPTIMIZE FOR (@customerid = N'CACYK'));

 

END

Note: SQL Server 2008 adds a new option to specify “OPTION (OPTIMIZE FOR UNKNOWN)”. This specifies that the query optimizer will use statistical data instead of the initial value to determine the value for query optimization.

Plan Guides

Plan guides in SQL Server 2005 provide the opportunity to optimize a query without changing the actual code of the query. This is especially useful when dealing with third party vendor applications where access to code may not be available. A plan guide allows associating query hints with a query without changing the query.

EXEC sp_create_plan_guide

    @name = N'SolveParameterSniffing',

    @stmt = N'SELECT orderid, customerid, orderdate, shippeddate

               FROM Orders

               WHERE customerid = @customerid',

    @type = N'OBJECT',

    @module_or_batch = N'GetCustomerOrders',

    @params = NULL,

    @hints = N'OPTION (RECOMPILE)';

USE PLAN query hint
Another plan stability feature in SQL Server 2005 is the USE PLAN “xml_plan” query hint, which allows forcing the use of a specific plan every time the query is run.

Additional Resources:

Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx