## Shortest Path for Friend Connections

An interesting problem to solve is finding relation paths in contact management systems. This is a limited case of the Dijkstra’s algorithm for finding the shortest path in a graph. Here we always have cost of 1 for each path and it is a two-way relation between the nodes. To put this in human readable format, the problem is to find the distance between friends, based on relationships defined. If A is friend with B and then B is friend with C, the path between A and C is A->B->C with distance 2.

Here is one solution using recursive CTEs in SQL Server 2005. The method is based on building relationship paths between all connected nodes and then searching the path for friend connections. If this searching if frequent, then the path can be materialized in a column.

`-- Sample table with dataCREATE TABLE Contacts ( c_from CHAR(1),  c_to CHAR(1), PRIMARY KEY (c_from, c_to)); INSERT    INTO ContactsSELECT    'A', 'B' UNION ALLSELECT    'B', 'D' UNION ALLSELECT    'C', 'A' UNION ALLSELECT    'C', 'E' UNION ALLSELECT    'G', 'C' UNION ALLSELECT    'B', 'G' UNION ALLSELECT    'F', 'D' UNION ALLSELECT    'E', 'F'; -- Table to store pathsCREATE TABLE Paths (c_path VARCHAR(200) PRIMARY KEY); -- Recursive CTE to populate the pathsWITH PathCTE AS(SELECT c_from, c_to,        CAST('.' + CAST(c_from AS VARCHAR(8)) + '.' +         CAST(c_to AS VARCHAR(8)) + '.' AS VARCHAR(200)) AS c_path FROM Contacts AS C1 UNION ALL SELECT C.c_from, C.c_to,        CAST(P.c_path + C.c_to + '.' AS VARCHAR(200)) FROM PathCTE AS P JOIN Contacts AS C   ON P.c_to = C.c_from WHERE P.c_path NOT LIKE '%.' +                     CAST(C.c_from AS VARCHAR(10)) +                     '.' +                     CAST(C.c_to AS VARCHAR(10)) +                     '.%')INSERT INTO PathsSELECT c_path FROM PathCTE; -- Show all paths between B and DSELECT c_pathFROM PathsWHERE c_path LIKE '.B.%'  AND c_path LIKE '%.D.'; -- Shortest path distance, longest path distance, and number of pathsSELECT MIN(LEN(c_path) - LEN(REPLACE(c_path, '.', '')) - 2) AS shortest_distance,       MAX(LEN(c_path) - LEN(REPLACE(c_path, '.', '')) - 2) AS longest_distance,       COUNT(*) AS paths_cntFROM PathsWHERE c_path LIKE '.B.%'  AND c_path LIKE '%.D.'; -- Resultsc_path--------------.B.D..B.G.C.A.B.D..B.G.C.E.F.D.  shortest_distance  longest_distance  paths_cnt-----------------  ----------------  -----------1                  5                 3`

It is good to note that this method does not make effort to avoid reusing paths to reach a destination. If needed this can be handled by additional condition in the recursive CTE.

## Column Alias Based on Variable

Although formatting output belongs to the reporting or front-end interface, occasionally there could be the need to change a column alias based on variable. Since T-SQL does not support proving variable as column alias in a query, here are two methods to handle that.

`CREATE TABLE Foo ( keycol INT PRIMARY KEY, datacol CHAR(1)) INSERT INTO Foo VALUES (1, 'a')INSERT INTO Foo VALUES (2, 'b') DECLARE @column_alias VARCHAR(30)SET @column_alias = 'new_title' -- 1). Using dynamic SQLDECLARE @sql NVARCHAR(200) SET @sql = 'SELECT keycol, datacol AS ' + @column_alias + ' FROM Foo' EXEC sp_executesql @sql -- 2). Using results table and renaming the columnCREATE TABLE Results ( keycol INT PRIMARY KEY, datacol CHAR(1))  INSERT INTO ResultsSELECT keycol, datacolFROM Foo EXEC sp_rename 'Results.datacol', @column_alias, 'COLUMN' SELECT * FROM Results`

## Performing UPSERT in T-SQL

Very often there is the need to check if a key value exists to perform an update, and if it does not exist to insert new data. The upcoming SQL Server 2008 provides the MERGE statement (MERGE actually allows to do more: simultaneous UPDATE, INSERT and/or DELETE operations on the table), but until it is released we have to wait.

Here is just one way to implement in the current versions of T-SQL.

`CREATE TABLE Foo ( keycol INT PRIMARY KEY, datacol CHAR(1) NOT NULL); -- Sample dataINSERT INTO Foo VALUES (1, 'a');INSERT INTO Foo VALUES (2, 'b');INSERT INTO Foo VALUES (4, 'd'); -- New values to insert/updateDECLARE @key INT;DECLARE @data CHAR(1); -- New key, will perform insertSET @key = 3;SET @data = 'c'; BEGIN TRAN -- Try updateUPDATE Foo WITH (SERIALIZABLE)SET datacol = @dataWHERE keycol = @key; -- If no rows updated then must be new value, perform insertIF @@ROWCOUNT = 0 INSERT INTO Foo VALUES (@key, @data); COMMIT -- Existing key, will perform updateSET @key = 4;SET @data = 'x'; BEGIN TRAN -- Try updateUPDATE Foo WITH (SERIALIZABLE)SET datacol = @dataWHERE keycol = @key; -- If no rows updated then must be new value, perform insertIF @@ROWCOUNT = 0 INSERT INTO Foo VALUES (@key, @data); COMMIT SELECT keycol, datacol FROM Foo;`

The SERIALIZABLE hint here is very important to avoid deadlocks.

## Table Value Constructors in SQL Server 2008

One of the new features of SQL Server 2008 is the support for table value constructors (part of ANSI SQL). Here are a couple quick examples of using them.

`-- Populate sample table CREATE TABLE Foo (  keycol INT PRIMARY KEY,  datacol VARCHAR(30));`

In the past, populating table rows was done like this:

`INSERT INTO Foo VALUES (1, 'Books');INSERT INTO Foo VALUES (2, 'CDs');INSERT INTO Foo VALUES (3, 'DVDs');  -- or  INSERT INTO Foo (keycol, datacol)SELECT 1, 'Books'UNION ALLSELECT 2, 'CDs'UNION ALLSELECT 3, 'DVDs'; -- or using on the fly SELECT keycol, datacolFROM ( SELECT 1, 'Books'       UNION ALL        SELECT 2, 'CDs'       UNION ALL       SELECT 3, 'DVDs') AS Foo (keycol, datacol);`

Here is how the same can be done with SQL Server 2008 table value constructors:

`INSERT INTO Foo (keycol, datacol)VALUES (1, 'Books'), (2, 'CDs'), (3, 'DVDs'); -- or using on the fly SELECT keycol, datacolFROM ( VALUES (1, 'Books'),              (2, 'CDs'),              (3, 'DVDs') ) AS Foo (keycol, datacol); -- and CTE version WITH Foo (keycol, datacol)AS( SELECT *    FROM (  VALUES  (1, 'Books'),                    (2, 'CDs'),                    (3, 'DVDs') ) AS F (keycol, datacol))SELECT keycol, datacolFROM Foo;`

Another interesting option is to derive a row value from a subquery, like this:

`INSERT INTO Foo (keycol, datacol)VALUES ((SELECT MAX(keycol) + 1 FROM Foo), 'Tapes');`

Still not possible, but maybe in the next version (or Service Pack) we can see vector expressions in predicates and UPDATE, like:

`SELECT keycol, datacolFROM FooWHERE (keycol, datacol) IN (SELECT keycol, datacol FROM Foo2); -- or UPDATE FooSET (keycol, datacol) = (SELECT keycol, datacol FROM Foo2);`

## Calculating Work Hours

Calculating work hours is a very frequent need in scheduling, billing, payroll, and time and attendance applications. Normally this is best done using an auxiliary Calendar table which provides easy way for calculation and flexibility to account for holidays and other events. However, sometimes the requirements are simplified and the point of interest is only the work hours between two dates, accounting only for weekends, without holidays. Here is one way to perform this calculation using the date and time functions in SQL Server. Note that for this example the week start day is logically set to Monday by adding (@@DATEFIRST – 1) to the start date.

`CREATE FUNCTION dbo.CalcWorkHours ( @start_date DATETIME, @end_date DATETIME) RETURNS INTASBEGIN  RETURN  (SELECT ((total_days / 7) * 5 + total_days % 7 -            CASE WHEN 6 BETWEEN start_weekday AND end_weekday                THEN 1 ELSE 0 END -           CASE WHEN 7 BETWEEN start_weekday AND end_weekday                THEN 1 ELSE 0 END) * 8    FROM (SELECT total_days, start_weekday,                 start_weekday + total_days % 7 - 1          FROM (SELECT DATEDIFF(day, @start_date, @end_date) + 1,                       DATEPART(WEEKDAY, @start_date + @@DATEFIRST - 1)               ) AS T(total_days, start_weekday)         ) AS D(total_days, start_weekday, end_weekday)  );ENDGO SELECT dbo.CalcWorkHours('20080401', '20080421')  /* -- Result120 */`

## Unique Column with Multiple NULLs

A very frequent need is to define unique column that allows multiple NULL values. Since the UNIQUE constraint considers two NULL values the same, it allows only a single NULL value.

Here is one solution for this problem using a view filtering on non NULL values with UNIQUE index on the view.

`CREATE TABLE dbo.Foo ( keycol INT NULL, datacol CHAR(1)); GO CREATE VIEW UniqueFoo WITH SCHEMABINDINGASSELECT keycol, datacolFROM dbo.FooWHERE keycol IS NOT NULL; GO CREATE UNIQUE CLUSTERED INDEX ix_UniqueFooON UniqueFoo(keycol);GO -- OK, multiple NULL values allowedINSERT INTO dbo.Foo VALUES(1, 'a');INSERT INTO dbo.Foo VALUES(NULL, 'b');INSERT INTO dbo.Foo VALUES(NULL, 'c'); GO -- Error, attempt to insert duplicate keycol value INSERT INTO dbo.Foo VALUES(1, 'd'); /* Msg 2601, Level 14, State 1, Line 2Cannot insert duplicate key row in object 'dbo.UniqueFoo' with unique index 'ix_UniqueFoo'.The statement has been terminated. */`

In SQL Server 2008 this can be accomplished much easier using UNIQUE filtered index.

`CREATE UNIQUE NONCLUSTERED INDEX ix_Foo  ON Foo (keycol)  WHERE keycol IS NOT NULL;`

## Auxiliary Tables

Auxiliary tables (also referred as utility or helper tables) are a great tool for many needs in SQL. The good part is they are portable between systems, easy to create and normally do not take much storage. There are many benefits: faster SQL, simplified problem solutions, provide flexibility and options that are not easy to replicate with pure SQL code, and in many cases will help to avoid procedural coding and offer a set based way.

Here are samples for creating the two most common auxiliary tables: table with numbers and calendar table, as well as a couple brief examples of utilizing them.

`-- Create table structureCREATE TABLE Numbers ( nbr INT NOT NULL   CONSTRAINT df_numbers_nbr   DEFAULT 1   CONSTRAINT ck_numbers_nbr   CHECK (nbr > 0)   CONSTRAINT pk_numbers   PRIMARY KEY); -- Populate via cross joining CTEs-- Very fast, can be implemented as UDFWITH Num1 (n) AS (SELECT 1 UNION ALL SELECT 1),Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y),Num3 (n) AS (SELECT 1 FROM Num2 AS X, Num2 AS Y),Num4 (n) AS (SELECT 1 FROM Num3 AS X, Num3 AS Y),Num5 (n) AS (SELECT 1 FROM Num4 AS X, Num4 AS Y),Num6 (n) AS (SELECT 1 FROM Num5 AS X, Num5 AS Y),Nums (n) AS(SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num6)INSERT INTO Numbers (nbr)SELECT n FROM NumsWHERE n <= 1000000; -- Populate via generating digits and multiplication-- for tens, hundreds, thousands, etc.WITH Digits (n) AS( SELECT 0 UNION SELECT 1 UNION SELECT 2  UNION SELECT 3 UNION SELECT 4 UNION SELECT 5  UNION SELECT 6 UNION SELECT 7 UNION SELECT 8  UNION SELECT 9),Nums (n) AS( SELECT VII.n * 1000000 +         VI.n * 100000 +         V.n * 10000 +         IV.n * 1000 +         III.n * 100 +         II.n * 10 +          I.n  FROM Digits AS I   CROSS JOIN Digits AS II  CROSS JOIN Digits AS III  CROSS JOIN Digits AS IV  CROSS JOIN Digits AS V  CROSS JOIN Digits AS VI  CROSS JOIN Digits AS VII)INSERT INTO Numbers (nbr)SELECT n FROM NumsWHERE n BETWEEN 1 AND 1000000; -- One example of using table with numbers-- to parse a list string to table format-- with index for each elementDECLARE @list VARCHAR(100);DECLARE @delimiter CHAR(1); SET @list = 'Mon,Tue,Wed,Thu,Fri,Sat,Sun';SET @delimiter = ','; -- Split the list based on delimiterSELECT SUBSTRING(@list, nbr,                  CHARINDEX(@delimiter,                            @list + @delimiter, nbr) - nbr) AS list_value,       nbr + 1 - LEN(REPLACE(LEFT(@list, nbr), @delimiter, '')) AS list_indexFROM NumbersWHERE SUBSTRING(@delimiter + @list, nbr, 1) = @delimiter  AND nbr < LEN(@list) + 1; /* -- Resultslist_value list_index---------- -----------Mon        1Tue        2Wed        3Thu        4Fri        5Sat        6Sun        7  */ -- Another example to clean non-numeric characters from string-- The table with numbers is used to slice the string to characters-- then non-numeric characters are eliminated and the string-- is concatenated back using FOR XML PATH with blank element nameDECLARE @string VARCHAR(30); SET @string = '01234ab23-97z@'; SELECT SUBSTRING(@string, nbr, 1) FROM NumbersWHERE nbr <= LEN(@string)  AND SUBSTRING(@string, nbr, 1) LIKE '[0-9]'FOR XML PATH(''); /* -- Result012342397 */ -- Calendar table structureCREATE TABLE Calendar ( cdate DATETIME NOT NULL    CONSTRAINT df_calendar_cdate    DEFAULT DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), 0)   CONSTRAINT ck_calendar_cdate   CHECK (cdate = DATEADD(day, DATEDIFF(day, 0, cdate), 0))   CONSTRAINT pk_calendar    PRIMARY KEY, cweek INT NOT NULL   CONSTRAINT df_calendar_cweek   DEFAULT 1   CONSTRAINT ck_calendar_cweek   CHECK (cweek BETWEEN 1 AND 53), cweekday INT NOT NULL   CONSTRAINT df_calendar_cweekday   DEFAULT 1   CONSTRAINT ck_calendar_cweekday   CHECK (cweekday BETWEEN 1 AND 7), cyear INT NOT NULL   CONSTRAINT df_calendar_cyear   DEFAULT 1900   CONSTRAINT ck_calendar_cyear   CHECK (cyear >= 1900), cquarter INT NOT NULL   CONSTRAINT df_calendar_cquarter   DEFAULT 1   CONSTRAINT ck_calendar_cquarter   CHECK (cquarter BETWEEN 1 AND 4), cmonth INT NOT NULL   CONSTRAINT df_calendar_cmonth   DEFAULT 1   CONSTRAINT ck_calendar_cmonth   CHECK (cmonth BETWEEN 1 AND 12), cday INT NOT NULL   CONSTRAINT df_calendar_cday   DEFAULT 1   CONSTRAINT ck_calendar_cday   CHECK (cday BETWEEN 1 AND 31), cday_of_year INT NOT NULL   CONSTRAINT df_calendar_cday_of_year   DEFAULT 1   CONSTRAINT ck_calendar_cday_of_year   CHECK (cday_of_year BETWEEN 1 AND 366), cmonth_name VARCHAR(9) NOT NULL   CONSTRAINT df_calendar_cmonth_name   DEFAULT 'N/A', cday_name VARCHAR(9) NOT NULL   CONSTRAINT df_calendar_cday_name   DEFAULT 'N/A', holiday VARCHAR(35) NOT NULL   CONSTRAINT df_calendar_holiday   DEFAULT 'N/A');  -- Calculate number of days for 10 years from '20080101'SELECT DATEDIFF(day, '20080101', '20180101') AS days;  /* -- Result3653 */ -- Insert 3653 days, which is 10 years-- Another use for table with numbersINSERT INTO Calendar (cdate)SELECT DATEADD(dd, N.nbr - 1, '20080101')FROM Numbers AS NWHERE N.nbr <= 3653; -- Populate calendar columns-- First day of the week is logically set to Monday-- Can be enhanced to have column for -- ISO week as it differs from internal SQL Server -- week date partUPDATE CalendarSET cweek = DATEPART(wk, cdate - DATEPART(dw, cdate + @@DATEFIRST - 1) + 1),     cweekday = DATEPART(dw, cdate + @@DATEFIRST - 1),     cyear = DATEPART(yyyy, cdate),    cquarter = DATEPART(qq, cdate),    cmonth = DATEPART(mm, cdate),    cday = DATEPART(day, cdate),    cday_of_year = DATEPART(dy, cdate),    cmonth_name = DATENAME(mm, cdate),    cday_name = DATENAME(dw, cdate); -- Add some fixed date official holidays-- See http://en.wikipedia.org/wiki/List_of_holidays_by_country-- for full list by country -- New Year's DayUPDATE CalendarSET holiday = 'New Year''s Day'WHERE cday = 1  AND cmonth = 1; -- Christmas EveUPDATE CalendarSET holiday = 'Christmas Eve'WHERE cday = 24  AND cmonth = 12; -- Christmas DayUPDATE CalendarSET holiday = 'Christmas Day'WHERE cday = 25  AND cmonth = 12; -- New Year's EveUPDATE CalendarSET holiday = 'New Year''s Eve'WHERE cday = 31  AND cmonth = 12; -- Get work days for year 2008-- Note that not all official holidays are defined-- and this is only counting non-weekend days-- and existing holidaysSELECT COUNT(*) AS work_daysFROM CalendarWHERE cyear = 2008  AND cweekday NOT IN (6, 7) -- 6 = Saturday; 7 = Sunday  AND holiday = 'N/A'; /* -- Resultwork_days-----------258 */`

To get ISO week calculation use the function example IsoWeek in SQL Server Books Online under CREATE FUNCTION:
http://msdn2.microsoft.com/en-us/library/ms186755.aspx

In SQL Server 2008 the DATEPART function has been enhanced and has date part parameter ISO_WEEK, which supports ISO 8601 week numbering:
http://msdn2.microsoft.com/en-us/library/ms174420(SQL.100).aspx

On SQL Server 2008 the calendar table can be modeled using the new DATE data type which will eliminate the need of constraints to handle the time portion:
http://technet.microsoft.com/en-us/library/bb630352(SQL.100).aspx

## Prevent Table Drop

Since there is no direct permission in SQL Server to drop a table, here are two techniques to implement that.

The first one is based on creating a view on the table with option SCHEMABINDING. When the SCHEMABINDING option is used the table cannot be modified in a way that will affect the view definition, as well as it cannot be dropped unless the view is dropped first.

The second method is using the new DDL triggers in SQL Server 2005. Defining a trigger for DROP_TABLE with rollback in the body will not allow dropping tables.

`CREATE TABLE dbo.Foo ( keycol INT PRIMARY KEY, datacol CHAR(1)); GO  -- Using view with SCHEMABINDINGCREATE VIEW DoNotDropFoo WITH SCHEMABINDINGASSELECT keycol, datacolFROM dbo.Foo; GO -- Attempt to drop table FooDROP TABLE dbo.Foo; GO Msg 3729, Level 16, State 1, Line 3Cannot DROP TABLE 'dbo.Foo' because it is being referenced by object 'DoNotDropFoo'.  -- Using DDL triggerCREATE TRIGGER DoNotDropTables ON DATABASEFOR DROP_TABLEAS  RAISERROR ('Cannot drop tables!', 10, 1);  ROLLBACK; GO -- Attempt to drop table FooDROP TABLE dbo.Foo; GO Cannot drop tables!Msg 3609, Level 16, State 2, Line 3The transaction ended in the trigger. The batch has been aborted.`

## Unpivoting Data

Unpivoting data is a very useful technique for normalizing denormalized tables. It refers to the process of transposing multiple columns across into rows in a single column. In other words, taking the data below:

`sales_year  first_quarter  second_quarter   third_quarter   fourth_quarter----------  -------------  --------------   -------------   --------------2006        211203.50      381594.95        439187.00       503155.802007        231205.10      451101.25        601209.40       531907.30`

and converting to look like this:

`sales_year   sales_quarter  sales_amount-----------  -------------  ------------2006         1              211203.502006         2              381594.952006         3              439187.002006         4              503155.802007         1              231205.102007         2              451101.252007         3              601209.402007         4              531907.30`

Here are different methods to achieve that:

`-- Create sample tableCREATE TABLE QuarterlySales ( sales_year INT PRIMARY KEY, first_quarter DECIMAL(10, 2), second_quarter DECIMAL(10, 2), third_quarter DECIMAL(10, 2), fourth_quarter DECIMAL(10, 2)); -- Insert dataINSERT INTO QuarterlySales VALUES(2006, 211203.50, 381594.95, 439187.00, 503155.80); INSERT INTO QuarterlySales VALUES(2007, 231205.10, 451101.25, 601209.40, 531907.30); -- Method 1) Using UNIONSELECT sales_year,        1 AS sales_quarter,        first_quarter AS sales_amountFROM QuarterlySalesUNION ALLSELECT sales_year, 2, second_quarterFROM QuarterlySalesUNION ALLSELECT sales_year, 3, third_quarterFROM QuarterlySales UNION ALLSELECT sales_year, 4, fourth_quarterFROM QuarterlySalesORDER BY sales_year, sales_quarter; -- Method 2) Using cross join with table with numbersSELECT sales_year,       qtr AS sales_quarter,       CASE qtr          WHEN 1 THEN first_quarter         WHEN 2 THEN second_quarter         WHEN 3 THEN third_quarter         WHEN 4 THEN fourth_quarter       END AS sales_amountFROM QuarterlySalesCROSS JOIN(SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) AS Quarters(qtr)ORDER BY sales_year, sales_quarter; -- Method 3) Using the UNPIVOT operator in SQL Server 2005SELECT sales_year,       CASE sales_quarter         WHEN 'first_quarter' THEN 1         WHEN 'second_quarter' THEN 2         WHEN 'third_quarter' THEN 3         WHEN 'fourth_quarter' THEN 4       END AS sales_quarter,       sales_amount     FROM QuarterlySalesUNPIVOT(sales_amount FOR  sales_quarter IN  (first_quarter, second_quarter,   third_quarter, fourth_quarter)) AS UORDER BY sales_year, sales_quarter;`

It is good no note that unpivoting multiple columns (like amount1, date1, amount2, date2 to amount and date columns) is not supported by the SQL Server 2005 UNPIVOT operator. This can be achieved only by using the first two methods.

`SET SHOWPLAN_TEXT ON;GO DECLARE @Employees TABLE (emp_nbr INT PRIMARY KEY NONCLUSTERED,                          emp_name VARCHAR(35),                          ssn CHAR(11) UNIQUE CLUSTERED);  INSERT INTO @Employees VALUES(1, 'Jim Brown', '987-11-0234');INSERT INTO @Employees VALUES(2, 'Chris Defoe', '987-13-9899');INSERT INTO @Employees VALUES(3, 'Keith Goldwin', '986-31-1543'); SELECT emp_nbr, emp_name, ssnFROM @EmployeesWHERE emp_nbr = 2; SELECT emp_nbr, emp_name, ssnFROM @EmployeesWHERE ssn LIKE '998-13%'; --Results  |--Nested Loops(Inner Join, OUTER REFERENCES:([ssn]))       |--Index Seek(OBJECT:(@Employees), SEEK:([emp_nbr]=(2)) ORDERED FORWARD)       |--Clustered Index Seek(OBJECT:(@Employees), SEEK:([ssn]=[ssn]) LOOKUP ORDERED FORWARD)   |--Clustered Index Seek(OBJECT:(@Employees), SEEK:([ssn] >= '998-12þ' AND [ssn] < '998-14'), WHERE:([ssn] like '998-13%') ORDERED FORWARD)`