Tag Archive for: database

Bulk Loading Images in SQL Server

Loading images and any binary files (like Adobe PDF documents) to a database is not always the best option but sometimes needed. In SQL Server 2005 this process is simplified a lot with the BULK option of OPENROWSET. Here is an example of inserting image data into a VARBINARY(MAX) column (the same applies to loading an Adobe PDF file or any other binary file):

CREATE TABLE Foobar (

  image_data VARBINARY(MAX));

 

INSERT INTO Foobar (image_data)

SELECT image_data

FROM OPENROWSET(

        BULK N'C:image.jpg',

        SINGLE_BLOB) AS ImageSource(image_data);

Note the use of the SINGLE_BLOB option.

Keeping History Data in SQL Server

Very often there is the need to archive data on daily basis. Just had that question today and here is trimmed down solution based on a recent project.

The task is to keep history data on daily basis of changed data only (that is only rows that have changed for the last day). The scenario is a Loan table with loan number and loan amount. If there is change in any of the columns in the table, that change needs to be inserted into a history table. Further, the history table should be able to provide a convenient way to query the latest data (most recent updated row) for reporting purposes.

Here is the code to that solution:

-- Create sample tables.

 

CREATE TABLE Loans (

  loan_nbr INTEGER NOT NULL PRIMARY KEY,

  loan_amount DECIMAL(12, 2) DEFAULT 0.0 NOT NULL);

 

CREATE TABLE LoansHistory (

  loan_nbr INTEGER NOT NULL,

  loan_amount DECIMAL(12, 2) NOT NULL,

  change_date DATETIME

              DEFAULT CAST(DATEDIFF(day, 0, CURRENT_TIMESTAMP) AS DATETIME)

              NOT NULL,

  PRIMARY KEY (loan_nbr, change_date),

  archive_date DATETIME NULL,

  CHECK (change_date < archive_date));

 

GO

 

-- Sample data.

 

INSERT INTO Loans VALUES (1, 100.00);

INSERT INTO Loans VALUES (2, 150.00);

INSERT INTO Loans VALUES (3, 120.00);

INSERT INTO Loans VALUES (4, 160.00);

 

INSERT INTO LoansHistory VALUES (1, 100.00, '20070501', NULL);

INSERT INTO LoansHistory VALUES (2, 95.00, '20070503', NULL);

INSERT INTO LoansHistory VALUES (3, 80.00, '20070506', '20070508');

INSERT INTO LoansHistory VALUES (3, 85.00, '20070508', NULL);

 

GO

 

-- View to use for latest loans.

 

CREATE VIEW CurrentLoans (loan_nbr, loan_amount)

AS

SELECT loan_nbr, loan_amount

FROM LoansHistory

WHERE archive_date IS NULL;

 

GO

 

-- Insert changed rows.

 

-- SQL Server 2005.

 

INSERT INTO LoansHistory (loan_nbr, loan_amount)

SELECT loan_nbr, loan_amount

FROM Loans

EXCEPT

SELECT loan_nbr, loan_amount

FROM CurrentLoans;

 

-- SQL Server 2000.

 

INSERT INTO LoansHistory (loan_nbr, loan_amount)

SELECT loan_nbr, loan_amount

FROM ( SELECT 'Live' AS Source,

              loan_nbr,

              loan_amount 

       FROM Loans 

       UNION ALL

       SELECT 'History' AS Source,

              loan_nbr,

              loan_amount 

       FROM CurrentLoans ) AS L

GROUP BY loan_nbr, loan_amount

HAVING COUNT(*) = 1

   AND MIN(Source) = 'Live';

 

-- Update archive date for old history.

-- Runs in the same transaction as the insert of new history.

 

UPDATE LoansHistory

SET archive_date = CAST(DATEDIFF(day, 0, CURRENT_TIMESTAMP) AS DATETIME)

WHERE archive_date IS NULL

  AND EXISTS (SELECT *

              FROM LoansHistory AS H

              WHERE H.loan_nbr = LoansHistory.loan_nbr

                AND H.change_date > LoansHistory.change_date);

 

GO

 

-- Use the view for all reporting purposes.

 

SELECT loan_nbr, loan_amount

FROM CurrentLoans

ORDER BY loan_nbr;

 

-- Loans as of 2007-05-08.

 

SELECT loan_nbr, loan_amount

FROM LoansHistory

WHERE change_date <= '20070508'

  AND COALESCE(archive_date, CURRENT_TIMESTAMP) > '20070508';

The process is fairly simple with keeping two dates: change date for when the last change occurred (and that is the date when the row was inserted in history), and archive date which indicates that this row is archived and there is another more recent change. If the archive data is NULL it indicates that this is the most recent change. To simplify the querying process both dates have time portion set to midnight.

Cleaning Data with Recursive CTE

SQL Server 2005 added a great new feature: Common Table Expressions (CTE). And even better than that – recursive CTEs. That provides a new powerful tool to solve many SQL problems. One of the areas where recursive CTEs shine is the hierarchical data management.

Here is another side of the recursive CTEs – utilizing them for some common tasks like cleaning data. The problem: a table has a column with values that have invalid characters. The task is to replace all those invalid characters with a space. Unfortunately the REPLACE function does not support pattern matching and each character in the column has to be verified individually and replaced if it falls in the invalid range. The solution below utilizes a recursive CTE to walk though the ACSII table of characters and to replace the invalid characters in the column values.

-- Create test table.

 

CREATE TABLE Foobar (

  key_col INT PRIMARY KEY,

  text_col NVARCHAR(100));

 

-- Populate sample data.

 

INSERT INTO Foobar VALUES (1, N'ABC!@#%DEFgh');

INSERT INTO Foobar VALUES (2, N'~!102WXY&*()_Z');

 

-- Perform the cleanup with recursive CTE.

 

WITH Clean (key_col, text_col, ch)

AS

(SELECT key_col,

        REPLACE(text_col, CHAR(255), ' '),

        255

 FROM Foobar

 UNION ALL

 SELECT key_col,

        CASE WHEN

            CHAR(ch - 1) NOT LIKE '[A-Z]'

            THEN REPLACE(text_col, CHAR(ch - 1), ' ')

            ELSE text_col END,

        ch - 1

 FROM Clean

 WHERE ch > 1)

SELECT key_col, text_col

FROM Clean

WHERE ch = 1

OPTION (MAXRECURSION 255);

On a side note – the recursive CTEs are not the best performers. Also, by default a CTE allows only 100 levels of recursion. The MAXRECURSION hint can be used to set higher level (a value between 0 and 32767; setting to 0 will remove the limit). Be aware that settings MAXRECURSION to 0 may create an infinite loop.

Here is a different method using utility table with numbers and FOR XML PATH, which is more effective:

WITH Clean (key_col, text_col)

AS

(SELECT key_col, REPLACE(CAST(

        (SELECT CASE

                  WHEN SUBSTRING(text_col, n, 1) LIKE '[A-Z]'

                  THEN SUBSTRING(text_col, n, 1)

                  ELSE '.'

                END

        FROM (SELECT number

               FROM master..spt_values

               WHERE type = 'P'

                AND number BETWEEN 1 AND 100) AS Nums(n)

        WHERE n <= LEN(text_col)

        FOR XML PATH('')) AS NVARCHAR(100)), '.', ' ')

 FROM Foobar)

SELECT key_col, text_col

FROM Clean;

Dates and Date Ranges in SQL Server

One of the most common tasks when working with data is to select data for a specific date range or a date. There are two issues that arise: calculating the date range and trimming the time portion in order to select the full days. Below are a few techniques to show how this can be done.

First, a look at the internals of the DATETIME data type in SQL Server. Its internal representation is as two 4-byte values. The first value represents the number of days since 01/01/1900. The second number is the number of milliseconds since midnight. Here is how those two values can be converted to VARBINARY and then to INT to extract the days and milliseconds portions.

-- The internal representation of a datetime is two 4-byte values

SELECT CURRENT_TIMESTAMP AS 'Today',

       CAST(CURRENT_TIMESTAMP AS VARBINARY(8)) AS 'Two 4-byte Internal',

       SUBSTRING(CAST(CURRENT_TIMESTAMP AS VARBINARY(8)), 1 , 4)

           AS 'Days Since 1900-01-01',

       SUBSTRING(CAST(CURRENT_TIMESTAMP AS VARBINARY(8)), 5 , 4)

           AS 'Milliseconds Since Midnight',

       CAST(SUBSTRING(CAST(CURRENT_TIMESTAMP AS VARBINARY(8)), 1 , 4) AS INT)

           AS 'Days Represented as INT',

       CAST(SUBSTRING(CAST(CURRENT_TIMESTAMP AS VARBINARY(8)), 5 , 4) AS INT)

           AS 'Milliseconds Represented as INT';

Below are a few basic date calculations that show the power of the DATETIME functions and how to calculate key dates that can be used to extract data:

-- Basic dates and date calculations

SELECT CAST('19000101 00:00' AS SMALLDATETIME) AS 'Min SMALLDATETIME',

       CAST('20790606 23:59' AS SMALLDATETIME) AS 'Max SMALLDATETIME',

       CAST('1753-01-01T00:00:00.000' AS DATETIME) AS 'Min DATETIME',

       CAST('9999-12-31T23:59:59.997' AS DATETIME) AS 'Max DATETIME',

       CAST(0 AS DATETIME) AS 'Base SQL Server DATETIME',

       CURRENT_TIMESTAMP AS 'Current Date/Time',

       DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)

           AS 'Current Date/Time as Number (days since 1900-01-01)',

       CAST(DATEDIFF(DAY, 0, CURRENT_TIMESTAMP) AS DATETIME)

           AS 'Today as Date',

       CAST(DATEDIFF(DAY, -1, CURRENT_TIMESTAMP) AS DATETIME)

           AS 'Tomorrow as Date',

       CAST(DATEDIFF(DAY, 1, CURRENT_TIMESTAMP) AS DATETIME)

           AS 'Yesterday as Date',

       DATEADD(YEAR, DATEDIFF(YEAR, 0, CURRENT_TIMESTAMP)-1, 0)

           AS 'First Day of Last Year',

       DATEADD(YEAR, DATEDIFF(YEAR, 0, CURRENT_TIMESTAMP), 0)

           AS 'First Day of This Year',

       DATEDIFF(DAY, 0, DATEADD(YEAR, -1, CURRENT_TIMESTAMP))

           AS 'Today One Year Ago as Number',

       CAST(DATEDIFF(DAY, 0, DATEADD(YEAR, -1, CURRENT_TIMESTAMP)) AS DATETIME)

           AS 'Today One Year Ago as Date',

       DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)

           AS 'First Day of Current Month',

       DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0)

           AS 'First Day of Next Month',

       DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) - 1, 0)

           AS 'First Day of Prior Month',

       DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0) - 1

           AS 'Last Day of Current Month',

       DATEADD(DAY, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP), '01:30:00')

           AS '1:30 am today',

       CONVERT(VARCHAR(10), CURRENT_TIMESTAMP, 101) + ' 11:30:00.000'

           AS '11:00 am today',

       CONVERT(VARCHAR(10), CURRENT_TIMESTAMP, 101) + ' 17:00:00.000'

           AS '5:00 pm today',

       DATEADD(WEEK, DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP), 0)

           AS 'First Day of This Week',

       DATEADD(WEEK, DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP) - 1, 0)

           AS 'First Day of Last Week',

       DATEADD(WEEK, DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP) + 1, 0)

           AS 'First Day of Next Week';

Let’s implement some date range searches to see this in action. Given table Orders with DATETIME data type column order_date, here are a few range selections:

-- Get Orders for Last Month

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(MONTH,

                      DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) - 1, 0)

  AND order_date < DATEADD(MONTH,

                      DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0);

 

-- Get Orders for Current Month

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(MONTH,

                      DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)

  AND order_date < DATEADD(MONTH,

                      DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0);

 

-- Get Year to Date Orders

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(YEAR,

                      DATEDIFF(YEAR, 0, CURRENT_TIMESTAMP), 0)

  AND order_date < DATEDIFF(DAY, -1, CURRENT_TIMESTAMP);

 

-- Get Month to Date Orders

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(MONTH,

                      DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)

  AND order_date < DATEDIFF(DAY, -1, CURRENT_TIMESTAMP);

 

-- Get Last Year's Orders

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(YEAR,

                      DATEDIFF(YEAR, 0, CURRENT_TIMESTAMP)-1, 0)

  AND order_date < DATEADD(YEAR,

                    DATEDIFF(YEAR, 0, CURRENT_TIMESTAMP), 0);

 

-- Get Today's Orders

SELECT order_date

FROM Orders

WHERE order_date >= DATEDIFF(DAY, 0, CURRENT_TIMESTAMP)

  AND order_date < DATEDIFF(DAY, -1, CURRENT_TIMESTAMP);

 

-- Get Yesterday's Orders

SELECT order_date

FROM Orders

WHERE order_date >= DATEDIFF(DAY, 1, CURRENT_TIMESTAMP)

  AND order_date < DATEDIFF(DAY, 0, CURRENT_TIMESTAMP);

 

-- Get Today's Orders Between 9:00 am And 11:00 am

SELECT order_date

FROM Orders

WHERE order_date BETWEEN

           CONVERT(VARCHAR(10), CURRENT_TIMESTAMP, 101) + ' 09:00:00.000'

       AND CONVERT(VARCHAR(10), CURRENT_TIMESTAMP, 101) + ' 11:00:00.000';

 

-- Or

SELECT order_date

FROM Orders

WHERE order_date BETWEEN

           DATEADD(DAY, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP), '09:00:00')

       AND DATEADD(DAY, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP), '11:00:00');

 

-- Get Orders for Last Week

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(WEEK,

                      DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP) - 1, 0)

  AND order_date < DATEADD(WEEK,

                    DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP), 0);

 

-- Get Orders for Current Week

SELECT order_date

FROM Orders

WHERE order_date >= DATEADD(WEEK,

                      DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP), 0)

  AND order_date < DATEADD(WEEK,

                    DATEDIFF(WEEK, 0, CURRENT_TIMESTAMP) + 1, 0);

 

-- Get Orders for One Year Back From Current Date

SELECT order_date

FROM Orders

WHERE order_date >= DATEDIFF(DAY, 0,

                      DATEADD(YEAR, -1, CURRENT_TIMESTAMP))

  AND order_date < DATEDIFF(DAY, -1, CURRENT_TIMESTAMP);

There could be many more variations based on the same techniques to calculate dates. All those techniques use the datetime functions DATEADD and DATEDIFF to trim the time portion of the date in order to get accurate results. The key to this is the behavior of DATEDIFF to calculate difference between boundaries (that is when crossing between years, months, days, minutes, etc.). For example, difference in days between May 3, 2006 at 11:59 PM and May 4, 2006 at 00:01 AM returns one day, even it is only a couple minutes. Similar, calculating difference in years between Dec 31, 2006 and Jan 1, 2007 returns one year, even it is only a day or less. With this in mind it is very easy to accomplish the timing effect. Calculating difference one level higher trims the lower level. For example, to trim second calculate difference in minutes, to trim hours (or time in general) calculate difference in days, difference between months will trim days, and so on.

The same trimming can be done by converting to a string and then back to a date. But that is a slower and lengthy way of doing it. Here is a brief example of both methods:

-- Trim time portion

DECLARE @date DATETIME;

SET @date = CURRENT_TIMESTAMP;

 

-- Convertmg to string and back to date

SELECT CONVERT(DATETIME, CONVERT(CHAR(10), @date, 101)) ;

 

-- Using datetime functions

SELECT DATEADD(DAY, DATEDIFF(DAY, 0, @date),0);

As a bonus, here is how to convert time stored as a decimal number (like 5.25, 5.50, or 5.75) to a time format (like 5:15, 5:30, or 5:45):

-- Format Time From 5.25 to 5:15

-- Decimal Format to Time Format

DECLARE @time DECIMAL(5, 2);

SET @time = 5.25;

 

SELECT CONVERT(CHAR(5), DATEADD(ss, @time * 3600, 0), 108);

Just for clarification: the CURRENT_TIMESTAMP function returns the current date and time. This function is the ANSI SQL equivalent to the GETDATE function.

Pivoting data in SQL Server

Very often there is a need to pivot (cross-tab) normalized data for some reporting purposes. While this is best done with reporting tools (Excel is one example with powerful pivoting capabilities), sometimes it needs to be done on the database side.

The discussion here is limited to static pivoting (that is when the values to pivot are known). Dynamic pivoting is accomplished by using dynamic SQL and is based on the same techniques, just dynamically building the pivoting query.

Here are a few methods and techniques to implement that in SQL Server.

Given the following data:

 OrderId    OrderDate  Amount
----------- ---------- ------
1 2007-01-01 10.50
2 2007-01-26 12.50
3 2007-01-30 12.00
4 2007-02-14 13.75
5 2007-02-20 10.00
6 2007-03-06 15.00
7 2007-03-10 17.50
8 2007-03-29 20.00

We would like to return this result set:

 OrderYear   Jan   Feb   Mar
----------- ----- ----- -----
2007 35.00 23.75 52.50

Creating the sample Orders table:

CREATE TABLE Orders (

 order_id INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,

 order_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,

 amount DECIMAL(8, 2) NOT NULL DEFAULT 0

        CHECK (amount >= 0));

 

INSERT INTO Orders

  (order_date, amount)

SELECT '20070101', 10.50

UNION ALL

SELECT '20070126', 12.50

UNION ALL

SELECT '20070130', 12.00

UNION ALL

SELECT '20070214', 13.75

UNION ALL

SELECT '20070220', 10.00

UNION ALL

SELECT '20070306', 15.00

UNION ALL

SELECT '20070310', 17.50

UNION ALL

SELECT '20070329', 20.00;

Using CASE

CASE can be very handy for pivoting. It allows writing flexible logic and to transform the data as needed. Also, it is a very efficient method as it requires a one pass scan of the data.

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(CASE WHEN DATEPART(m, order_date) = 1

                THEN amount ELSE 0 END) AS 'Jan',

       SUM(CASE WHEN DATEPART(m, order_date) = 2

                THEN amount ELSE 0 END) AS 'Feb',

       SUM(CASE WHEN DATEPART(m, order_date) = 3

                THEN amount ELSE 0 END) AS 'Mar'

FROM Orders

GROUP BY DATEPART(yyyy, order_date);

Using Matrix table

This method requires creating a matrix table with 1 for the significant columns and 0 for columns to be ignored. Then joining the matrix table to the data table based on the pivoting column will produce the results:

CREATE TABLE MonthMatrix (

  month_nbr INT NOT NULL PRIMARY KEY

            CHECK (month_nbr BETWEEN 1 AND 12),

  jan INT NOT NULL DEFAULT 0

      CHECK (jan IN (0, 1)),

  feb INT NOT NULL DEFAULT 0

      CHECK (feb IN (0, 1)),

  mar INT NOT NULL DEFAULT 0

      CHECK (mar IN (0, 1)),

  apr INT NOT NULL DEFAULT 0

      CHECK (apr IN (0, 1)),

  may INT NOT NULL DEFAULT 0

      CHECK (may IN (0, 1)),

  jun INT NOT NULL DEFAULT 0

      CHECK (jun IN (0, 1)),

  jul INT NOT NULL DEFAULT 0

      CHECK (jul IN (0, 1)),

  aug INT NOT NULL DEFAULT 0

      CHECK (aug IN (0, 1)),

  sep INT NOT NULL DEFAULT 0

      CHECK (sep IN (0, 1)),

  oct INT NOT NULL DEFAULT 0

      CHECK (oct IN (0, 1)),

  nov INT NOT NULL DEFAULT 0

      CHECK (nov IN (0, 1)),

  dec INT NOT NULL DEFAULT 0

      CHECK (dec IN (0, 1)));

 

-- Populate the matrix table

INSERT INTO MonthMatrix (month_nbr, jan) VALUES (1, 1);

INSERT INTO MonthMatrix (month_nbr, feb) VALUES (2, 1);

INSERT INTO MonthMatrix (month_nbr, mar) VALUES (3, 1);

INSERT INTO MonthMatrix (month_nbr, apr) VALUES (4, 1);

INSERT INTO MonthMatrix (month_nbr, may) VALUES (5, 1);

INSERT INTO MonthMatrix (month_nbr, jun) VALUES (6, 1);

INSERT INTO MonthMatrix (month_nbr, jul) VALUES (7, 1);

INSERT INTO MonthMatrix (month_nbr, aug) VALUES (8, 1);

INSERT INTO MonthMatrix (month_nbr, sep) VALUES (9, 1);

INSERT INTO MonthMatrix (month_nbr, oct) VALUES (10, 1);

INSERT INTO MonthMatrix (month_nbr, nov) VALUES (11, 1);

INSERT INTO MonthMatrix (month_nbr, dec) VALUES (12, 1);

 

-- Use the matrix table to pivot

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(amount * jan) AS 'Jan',

       SUM(amount * feb) AS 'Feb',

       SUM(amount * mar) AS 'Mar'

FROM Orders AS O

JOIN MonthMatrix AS M

  ON DATEPART(m, O.order_date) = M.month_nbr

GROUP BY DATEPART(yyyy, order_date);

The David Rozenshtein method

This method is based on the formula: 1 – ABS(SIGN(x – y)). This formula returns 1 when x = y, and 0 otherwise. In a way it mimics the matrix table approach.

SELECT DATEPART(yyyy, order_date) AS OrderYear,

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 1)))) AS 'Jan',

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 2)))) AS 'Feb',

       SUM(amount * (1 - ABS(SIGN(DATEPART(m, order_date) - 3)))) AS 'Mar'

FROM Orders

GROUP BY DATEPART(yyyy, order_date);

The PIVOT operator in SQL Server 2005

SQL Server 2005 provides built-in pivoting mechanism via the PIVOT operator.

SELECT OrderYear,

       [1] AS 'Jan',

       [2] AS 'Feb',

       [3] AS 'Mar'

FROM (SELECT DATEPART(yyyy, order_date),

            DATEPART(m, order_date),

            amount

      FROM Orders) AS O (OrderYear, month_nbr, amount)

PIVOT

(SUM(amount) FOR month_nbr IN ([1], [2], [3])) AS P;

There are some other methods to implement pivoting (like using subqueries, multiple joins, or the APPLY operator in SQL Server 2005) but I am not showing examples of those as I do not find them practical, especially with large number of values to pivot.

Additional Resources:

Using PIVOT and UNPIVOT
http://msdn2.microsoft.com/en-us/library/ms177410.aspx

Passing a Variable to an IN List

Every once in a while there is a need to do something like this:

SELECT person_id, person_name

FROM MyUsers

WHERE person_id IN (@search_list);

And @search_list contains some form of a delimited list. However, this is not a supported syntax and will fail. Here is one solution to this problem:

-- Create the test table

CREATE TABLE MyUsers (

 person_id INT PRIMARY KEY,

 person_name VARCHAR(35));

 

-- Insert sample data

INSERT INTO MyUsers VALUES (1327, 'Joe');

INSERT INTO MyUsers VALUES (1342, 'John F.');

INSERT INTO MyUsers VALUES (1411, 'Mary');

INSERT INTO MyUsers VALUES (1345, 'Nancy');

INSERT INTO MyUsers VALUES (1366, 'Greg');

INSERT INTO MyUsers VALUES (1367, 'Jeff');

INSERT INTO MyUsers VALUES (1368, 'Chris');

INSERT INTO MyUsers VALUES (1369, 'John M.');

INSERT INTO MyUsers VALUES (1370, 'Peggy');

INSERT INTO MyUsers VALUES (1371, 'Samuel');

INSERT INTO MyUsers VALUES (1372, 'Tony');

INSERT INTO MyUsers VALUES (1373, 'Lisa');

INSERT INTO MyUsers VALUES (1374, 'Tom');

INSERT INTO MyUsers VALUES (1375, 'Dave');

INSERT INTO MyUsers VALUES (1376, 'Peter');

INSERT INTO MyUsers VALUES (1377, 'Jason');

INSERT INTO MyUsers VALUES (1378, 'Justin');

INSERT INTO MyUsers VALUES (1379, 'Oscar');

 

DECLARE @search_list VARCHAR(100);

 

DECLARE @delimiter CHAR(1);

 

SELECT @search_list = '1327,1342,1411',

       @delimiter = ',';

 

-- Get the users based on the delimited variable list

SELECT person_id, person_name

FROM MyUsers

WHERE person_id IN

    (SELECT SUBSTRING(string, 2, CHARINDEX(@delimiter, string, 2) - 2)

      FROM (SELECT SUBSTRING(list, n, LEN(list))

            FROM (SELECT @delimiter + @search_list + @delimiter) AS L(list),

                (SELECT ROW_NUMBER() OVER (ORDER BY person_id)

                  FROM MyUsers) AS Nums(n)

            WHERE n <= LEN(list)) AS D(string)

      WHERE LEN(string) > 1

        AND SUBSTRING(string, 1, 1) = @delimiter)

ORDER BY person_id;


-- Results person_id person_name
-------- ------------------------------
1327 Joe
1342 John F.
1411 Mary

Notes:

– This algorithm is based on walking the delimited string (could be in a table column or a variable) and parsing it into a table format that can feed the IN list.

– The use of ROW_NUMBER assumes you are using SQL Server 2005. However, you can accomplish the same if you have SQL Server 2000. What this subquery does (SELECT ROW_NUMBER() OVER (ORDER BY person_id) FROM MyUsers) is simply generating a table with numbers. In SQL Server 2000 you can create a temp table and generate sequential numbers. The only requirement is that you have more numbers than the length of the string to parse (in this case the length of ‘1327,1342,1411’). These numbers are used as an index in the walk process.

SQL Server 2005 Service Pack 2

The long awaited SQL Server 2005 SP2 is released! This is important for a few reasons, not the least of which that Vista requires SP2 to run SQL Server.

It is an all inclusive service pack, which means it includes everything from service pack 1. The installation requires 1.9 GB of disk space available to download and install SP2. Upon installation it does take approximately 1 GB of disk space on the system drive.

The following table shows the version numbers that correspond to versions of SQL Server 2005:

Version Number  SQL Server 2005 Version

--------------  ---------------------------------
9.00.1399 SQL Server 2005 (initial version)
9.00.2047 SQL Server 2005 SP1
9.00.3042 SQL Server 2005 SP2
9.00.3050 SQL Server 2005 SP2a
9.00.3152 SQL Server 2005 SP2 with KB933097

The SQL Server 2005 version can be verified in SQL Server Management Studio Object Explorer when connected to the instance, or by executing the following query against a Database Engine instance:

SELECT SERVERPROPERTY('ProductVersion');

To avoid reboot after the installation it is recommended to stop the Windows Management Instrumentation and SQL Server FullText Search services.

Everyone has its preferences when it comes to new features. Here is the list of my top 10 features in this service pack:

1. VARDECIMAL – This new storage format, known as vardecimal storage format, stores decimal and numeric data as variable length columns and can be enabled or disabled at a table level on new or existing tables. If you have declared a decimal or numeric column with high precision but most values in the column do not require many digits of precision, you can potentially save a large amount of the disk space that is needed to store the table. A new stored procedure is provided to estimate the reduction in average row size with the new storage format.

2. Maintenance Plans – Maintenance plans are now supported by the SQL Server Database Services installation. Before SP2, you were required to install SQL Server 2005 Integration Services (SSIS) to run maintenance plans on a server-only installation.

3. Custom Reports – Users can now integrate their own management reports into Management Studio.

4. Linked servers – The New Linked Server dialog box now includes a Test Connection button. The Linked Servers node in Object Explorer now displays system catalogs and user database objects. As soon as the linked server is created, you can expand its node to show tables and views on the remote server.

5. Script generation – You can control how objects are scripted from the Object Explorer using the new Tools Options script dialog box. This dialog box also controls the defaults for the generate Script Wizard. Also, in the Generate Script Wizard you can now specify that the scripted objects include a DROP statement before the CREATE statement. Scripting of objects into separate files is now possible.

6. Maintenance Plan Schedules and Multi-Server – The Maintenance Plan Designer supports multiple subplans for a maintenance plan. Each subplan can have an independent schedule for maintenance tasks. Also, maintenance plans now support multi-server environments, logging to remote servers, and multiple schedules.

7. Reporting Services Integration with SharePoint – You can integrate a report server instance with Windows SharePoint Services 3.0 or Microsoft Office 2007 SharePoint Server to store, secure, access, and manage report server items from a SharePoint site. Integration features are provided jointly through SP2 and a special Reporting Services Add-in that you download and install on an instance of the SharePoint technology you are using. The new Report Viewer Web Part is included in the Reporting Services Add-in that you install on a SharePoint technology instance.

8. Graphical Show Plans – The spacing between nodes is reduced to display more plan information.

9. SQL Server Management Studio Connections – Fewer connection prompts when you open T-SQL scripts. Management Studio determines the connection of the current context.

10. Copy Database Wizard – The new wizard can discover many of the issues with dependent objects to improve reliability. A verbose logging file is created during operation and can be examined to diagnose problems with the operation. Improved reliability for online database transfers using SMO method: Stored procedures, Jobs, Packages, Messages, Endpoints, LOGINs; Permissions (explicit GRANT/DENY).

Additional resources:

How to obtain the latest service pack for SQL Server 2005
http://support.microsoft.com/kb/913089

SQL Server 2005 Service Pack 2 Download
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/servicepacks/sp2.mspx

SQL Server 2005 Express with SP2 Download
http://msdn.microsoft.com/vstudio/express/sql/download/

What’s New in SQL Server 2005 SP2
http://download.microsoft.com/download/2/b/5/2b5e5d37-9b17-423d-bc8f-b11ecd4195b4/WhatsNewSQL2005SP2.htm

Readme Files for SQL Server 2005 SP2 and SQL Server 2005 SP2 Express Editions and Tools
http://www.microsoft.com/downloads/details.aspx?FamilyID=eb05d099-8a66-45f6-84ce-4888760d2af8&DisplayLang=en

Readme for Microsoft SQL Server 2005 Service Pack 2
http://download.microsoft.com/download/2/B/5/2B5E5D37-9B17-423D-BC8F-B11ECD4195B4/ReadmeSQL2005SP2.htm

Press Release: Microsoft Releases SQL Server 2005 Service Pack 2
http://www.microsoft.com/presspass/press/2007/feb07/02-19SQLSP2PR.mspx

A list of the bugs that are fixed in SQL Server 2005 Service Pack 2
http://support.microsoft.com/default.aspx?scid=kb;en-us;921896

JumpstartTV Video by Brian Knight: Overview and Installing SQL Server 2005 SP2
http://www.jumpstarttv.com/Media.aspx?vid=70

Feature Pack for Microsoft SQL Server 2005 – February 2007
http://www.microsoft.com/downloads/details.aspx?familyid=50b97994-8453-4998-8226-fa42ec403d17&displaylang=en

SQL Server 2005 Samples and Sample Databases (February 2007)
http://www.microsoft.com/downloads/details.aspx?familyid=E719ECF7-9F46-4312-AF89-6AD8702E4E6E&displaylang=en

SQL Server 2005 Books Online
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx

Reporting Services Add-in for Microsoft SharePoint
http://www.microsoft.com/downloads/details.aspx?FamilyId=1e53f882-0c16-4847-b331-132274ae8c84&DisplayLang=en

Data Mining Add-ins for Microsoft Office 2007
http://www.microsoft.com/downloads/details.aspx?FamilyId=7c76e8df-8674-4c3b-a99b-55b17f3c4c51&DisplayLang=en

Microsoft SQL Server 2005 Service Pack 2 issue: Cleanup tasks run at different intervals than intended
http://support.microsoft.com/kb/933508

Cumulative hotfix package (build 3152) for SQL Server 2005 Service Pack 2
http://support.microsoft.com/kb/933097/

SQL Server 2005 Compact Edition

While using SQL Server Compact Edition for mobile applications makes perfect sense, using it for regular desktop applications may not be the best choice. Here is a summary of pros and cons of using the Compact Edition with desktop applications.

Pros:

– Very small footprint (only 7-8 DLLs needed, no services installed, runs as in-process rather than separate service). It takes only 1.8 MB installed.
– Easy deployment. While it can be deployed using the provided Microsoft Installer file, deployment can be as simple as shipping and installing the Compact Edition DLLs to the application directory. However, this type of installation will not provide the automatic updates via Windows Update (it provides automatic updates only if installed via the MSI file).
– Support for T-SQL.
– If installed by deploying the DLLs to the application directory then there is no need for administrative user privileges (only the MSI install requires administrative privileges to register DLLs).
– Support for Remote Data Access (RDA) and ADO.NET Sync Framework. This provides convenience and flexibility to synchronize and transfer data between Compact Edition database and SQL Server databases.
– Support for different file extensions, which can help to secure the database file.
– Allows storing the database file on a network share.
– Higher safety and security because it is code free (no support for T-SQL procedural code).
– Encryption of the database file (as well as password protection) provides data security.

Cons:

– Procedural T-SQL code is not supported (stored procedures, views, triggers).
– The T- SQL language is limited (for example, no TOP, IF, ROW_NUMBER).
– No ODBC driver (although the existing ADO.NET and OLE DB data providers are sufficient from programming stand point).
– Missing role based security.
– No distributed transactions support.
– No native XML and XQuery support (XML is stored as NTEXT).
– Only 256 user connections (although this can hardly be viewed as a drawback because of the intended usage as a desktop database).

After all, the SQL Server Compact Edition is a great addition to the SQL Server family and has many benefits for particular applications. Still, to deliver flexible and rich desktop database applications, something like SQL Server 2005 Express may be a better choice.

Libraries:

The following libraries are installed on the desktop with the SQL Server Compact Edition MSI file:

– SQLCESE30.DLL
– SQLCEQP30.DLL
– SQLCEME30.DLL
– SQLCEOLEDB30.DLL (OLE DB data provider)
– SQLCECA30.DLL
– SQLCECOMPACT30.DLL
– SQLCEER30xx.DLL
– System.Data.SqlServerCe.dll (ADO.NET data provider)

Using the OLE DB provider does not require the .NET Framework, while using the ADO.NET provider requires .NET 2.0 or later (since it is using the SQL Server Native Client).

How-to:

– To connect to SQL Server Compact Edition using SQL Server Management Studio, on the login screen in SSMS you select “SQL Server Mobile” for Server type. Then browse to the database file or create a new database.

– The connection string for using in applications looks like this:
“data source=MyDatabase.sdf; ssce: mode=Exclusive;”

Here is a sample code for connecting:

SqlCeEngine eng = new SqlCeEngine();

eng.LocalConnectionString = "data source=MyDatabase.sdf; ssce: mode=Exclusive;";

Additional resources:

SQL Server 2005 Compact Edition Home:
http://www.microsoft.com/sql/editions/compact/default.mspx

SQL Server 2005 Compact Edition How-to Tutorials:
http://msdn2.microsoft.com/en-us/sql/bb219480.aspx

Convert Tree Structure From Nested Set Into Adjacency List

Tree structures are often represented in nested set model or adjacency list model. In the nested set model each node has a left and right, where the root will always have a 1 in its left column and twice the number of nodes in its right column. On the other side the adjacency list model uses a linking column (child/parent) to handle hierarchies.

Sometimes there is a need to convert a nested set model into an adjacency list model. Here is one example of doing that:

CREATE TABLE NestedSet (

 node CHAR(1) NOT NULL PRIMARY KEY,

 lf INT NOT NULL,

 rg INT NOT NULL);

 

INSERT INTO NestedSet VALUES ('A', 1, 8);

INSERT INTO NestedSet VALUES ('B', 2, 3);

INSERT INTO NestedSet VALUES ('C', 4, 7);

INSERT INTO NestedSet VALUES ('D', 5, 6);

 

CREATE TABLE AdjacencyList (

 node CHAR(1) NOT NULL PRIMARY KEY,

 parent CHAR(1) NULL);

 

INSERT INTO AdjacencyList

SELECT A.node,

       B.node AS parent

FROM NestedSet AS A

LEFT OUTER JOIN NestedSet AS B

  ON B.lf = (SELECT MAX(C.lf)

            FROM NestedSet AS C

            WHERE A.lf > C.lf

               AND A.lf < C.rg);


-- Results
node parent
------ --------
A NULL
B A
C A
D C

Additional resources:

Book: “Trees and Hierarchies in SQL for Smarties” by Joe Celko

Adjacency List Model
http://www.sqlsummit.com/AdjacencyList.htm

Trees in SQL
http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=1266295

DST 2007 and SQL Server

Beginning in 2007, daylight saving time (DST) will be extended in the United States. DST will start three weeks earlier on March 11, 2007, and end one week later on November 4, 2007. This results in a new DST period that is four weeks longer than previously observed.

How that does affect SQL Server users? Since the SQL Server engine takes time from the Windows operating system, SQL Server users will not be directly affected. As long as their Windows system is updated with the appropriate patches the SQL Server engine will be fine. However, users that utilize the Notification Services for SQL Server will be affected. They need to apply the update script from this KB article: http://support.microsoft.com/?id=931815.

To determine if Notification Services is installed users can check Add/Remove Programs for SQL Server 2000 (since Notification Services is installed as a separate product) or certain registry keys can be verified for both SQL Server 2000 and 2005. Also, the existence of the NS services or databases named like NS* will be an indication for existence of Notification Services. See the section “How to determine whether Notification Services is installed” in the above KB article for more details.

Important notes:

– The update script needs to be applied only if a Notification Services instance has been deployed, since only the NS application instance database is affected. By default when Notification Services is installed it installs only the binary files (which do not need to be updated).
– New instances created after SQL Server 2005 SP2 is installed do not need to be patched (however, SQL Server SP2 does not fix existing instances, so they have to be patched).

Additional resources:

2007 time zone update for Microsoft Windows operating systems
http://support.microsoft.com/kb/928388/

Daylight Saving Time Help and Support Center
http://support.microsoft.com/gp/cp_dst

2007 time zone update for SQL Server 2005 Notification Services and for SQL Server 2000 Notification Services
http://support.microsoft.com/?id=931815

How to prepare SQL Server 2005 and SQL Server 2000 for changes to daylight saving time in 2007
http://support.microsoft.com/?kbid=931975

February 2007 cumulative time zone update for Microsoft Windows operating systems
http://support.microsoft.com/kb/931836/