Posts

Paging with Ranking Functions

Paging through result sets is a very common need in applications. It provides a convenient way to deliver small chunks of data to the client application, minimizing network traffic and allowing end users to browse data in page size format.

There are different methods to accomplish the paging, both on client and server side. The introduction of the ranking functions in SQL Server 2005 (and SQL Server 2008) provides another efficient tool to implement paging.

The following example demonstrates paging utilizing the ROW_NUMBER function. Here it helps to generate sequence for each row ordered by the loan date column, and the sequence is later used to split the result set into pages.

CREATE TABLE Loans (

  loan_nbr INT NOT NULL PRIMARY KEY,

  loan_date DATETIME NOT NULL,

  loan_amount DECIMAL(12, 2) NOT NULL

              DEFAULT 0.0,

  customer_nbr INT NOT NULL,

  loan_type CHAR(1) NOT NULL

            DEFAULT 'P'

            CHECK (loan_type IN ('P'-- personal

                                'B')) -- business

  );

 

INSERT INTO Loans

VALUES (1, '20080801', 12000.00, 2, 'P'),

       (2, '20080805', 15700.00, 1, 'B'),

       (3, '20080610', 12000.00, 3, 'P'),

       (4, '20080401', 5000.00, 1, 'P'),

       (5, '20080715', 25000.00, 4, 'B'),

       (6, '20080610', 25000.00, 5, 'P'),

       (7, '20080501', 1000.00, 6, 'P'),

       (8, '20080810', 6000.00, 7, 'B'),

       (9, '20080815', 2000.00, 8, 'B'),

       (10, '20080815', 1000.00, 9, 'P'),

       (11, '20080715', 5500.00, 10, 'P'),

       (12, '20080615', 1000.00, 11, 'B'),

       (13, '20080820', 6000.00, 12, 'P'),

       (14, '20080510', 28000.00, 6, 'B'),

       (15, '20080815', 2000.00, 10, 'P'),

       (16, '20080810', 1500.00, 8, 'P'),

       (17, '20080817', 10000.00, 10, 'B'),

       (18, '20080816', 2500.00, 9, 'P');  

 

-- Paging

DECLARE @page_size INT = 5;

DECLARE @page_nbr  INT = 4;

 

WITH LoansRanked (loan_date, loan_amount, loan_type, seq)

AS

(SELECT loan_date, loan_amount, loan_type,

        ROW_NUMBER() OVER (ORDER BY loan_date, loan_nbr)

FROM Loans)

SELECT loan_date, loan_amount, loan_type, seq

FROM LoansRanked

WHERE seq > (@page_nbr - 1) * @page_size

  AND seq <= @page_nbr * @page_size;

 

/*

 

Results (4th page which contains only 3 rows):

 

loan_date               loan_amount   loan_type seq

----------------------- ------------- --------- ----

2008-08-16 00:00:00.000     2500.00 P         16

2008-08-17 00:00:00.000     10000.00 B         17

2008-08-20 00:00:00.000     6000.00 P         18

 

*/

SQL Server 2008 Launch Event

With SQL Server 2008 just released, there is a lot of interest of what it offers. On Friday, September 12, you can hear about all the new features of SQL Server 2008 in a full day of free training brought to you by JSSUG, Idea Integration and Pragmatic Works. Each session will dive deeply into new features from a BI, DBA and developer perspective around SQL Server 2008. To resister go to http://www.sqlsaturday.com and select the SQL Server 2008 Launch Event.

I will be presenting at this event the following session:

– T-SQL Enhancements in SQL Server 2008

Code Camp

On August 23, 2008 is the Jax Code Camp IV in Jacksonville. A full day of coding and free training on different technologies: WCF, LINQ, Cold Fusion, Flex/Air, Silverlight, Ruby on Rails, .Net Compact Framework, SharePoint, SQL Server 2008, and more. To sign up and see the full schedule visit the Code Camp website at http://www.jaxcodecamp.com/.

I will be presenting three sessions:

– Advanced SQL Querying Techniques
– SQL Injection
– SQL Server 2008 for Developers

Top 10 T-SQL Enhancements in SQL Server 2008

As SQL Server 2008 is getting in the advances phases of release, a lot has been published about new features and changes. There have been a few enhancements to Transact SQL that will be a great help for both developers and administrators. Here is list of the top 10 in no particular order, with links to detailed examples. In other words:

SELECT TOP(10) t_sql_enhancements

FROM SQLServer2008

ORDER BY CHECKSUM(NEWID());

• Delighters: http://pratchev.blogspot.com/2008/04/delighters-in-sql-server-2008.html
• Row constructors: http://pratchev.blogspot.com/2008/01/table-value-constructors-in-sql-server.html
• MERGE: http://pratchev.blogspot.com/2008/03/upsert-and-more-with-merge.html
• Composable DML: http://pratchev.blogspot.com/2008/04/composable-dml.html
• Table valued parameters: http://pratchev.blogspot.com/2008/04/table-valued-parameters.html
• Filtered indexes: http://pratchev.blogspot.com/2008/04/filtered-indexes.html
• Sparse columns: http://pratchev.blogspot.com/2008/04/sparse-columns.html
• Hierarchy ID: http://pratchev.blogspot.com/2008/05/hierarchies-in-sql-server-2008.html
• Date and Time data types: http://pratchev.blogspot.com/2008/05/new-date-and-time-data-types.html
• FILESTREAM data type: http://pratchev.blogspot.com/2008/05/filestream-data-type.html

Here are a few more that did not make the list (casualty of random ordering) but are still of importance:

• Spatial data (GEOMETRY and GEOGRAPHY): http://pratchev.blogspot.com/2008/06/spatial-support-in-sql-server.html
• Grouping Sets: http://msdn.microsoft.com/en-us/library/bb522495(SQL.100).aspx
• Table hints: http://msdn.microsoft.com/en-us/library/bb510478(SQL.100).aspx
• Star join query optimizations: http://technet.microsoft.com/en-us/magazine/cc434693(TechNet.10).aspx
• T-SQL debugger: http://blogs.msdn.com/buckwoody/archive/2008/04/25/sql-server-2008-management-improvements-t-sql-debugger.aspx

Spatial Support in SQL Server

SQL Server 2008 adds new spatial data types and methods for storing and handling spatial data. The two new data types are GEOMETRY and GEOGRAPHY. This new functionality provides great capabilities to process spatial data.

Here is one brief example to demonstrate the GEOGRAPHY data type and one of the related methods. The code below finds if a geographical location defined by latitude and longitude coordinates is within a geographical region defined by a polygon.

DECLARE @point GEOGRAPHY;

DECLARE @polygon GEOGRAPHY;

 

SET @point = geography::Parse('POINT(49.274138 73.098562)');

SET @polygon = geography::Parse('POLYGON((47.0 90.0, 47.0 73.0, 50.0 52.0, 50.0 54.0, 47.0 90.0))');

 

SELECT @polygon.STIntersects(@point);

Additional resources:

Working with Spatial Data
http://msdn.microsoft.com/en-us/library/bb933876(SQL.100).aspx

Geometry Data Type Method Reference
http://msdn.microsoft.com/en-us/library/bb933973(SQL.100).aspx

Geography Data Type Method Reference
http://msdn.microsoft.com/en-us/library/bb933802(SQL.100).aspx

Working with Spatial Indexes
http://msdn.microsoft.com/en-us/library/bb895265(SQL.100).aspx

FILESTREAM Data Type

The new FILESTREAM data type in SQL Server 2008 offers a new way of storing unstructured data (images, documents, video, etc.). Up until now the two standard ways to store such data have been either in the database as BLOB objects or as files outside of the database. FILESTREAM provides the best of both worlds, storing unstructured data in the file system while maintaining transactional consistency with other structured data.

Here is a summary of the new capabilities offered by FILESTREAM:

• Implemented as VARBINARY(MAX)
• No limitation of 2 GB
• Stored in the file system
• Fast read access
• Good for objects larger than ~ 1 – 2 MB
• Encryption is not supported
• Setting to NULL deletes the BLOB data
• Transactional consistency
• Windows access via the OpenSqlFilestream APIs (read/write; no delete or rename)
• Uses Windows cache, not using SQL Server buffer pool, more memory for queries
• A ROWGUIDCOL column is required to use FILESTREAM data with Win32 APIs

Enabling FILESTREAM:

EXEC sp_filestream_configure

    @enable_level = 3,

    @share_name = "FileStreamShare";

Levels:

0 – Disabled. This is the default value
1 – Enabled only for Transact-SQL access
2 – Enabled only for Transact-SQL and local file system access
3 – Enabled for Transact-SQL, local file system access, and remote file system access

Creating database supporting FILESTREAM and table with FILESTREAM column:

CREATE DATABASE BlobDatabase ON PRIMARY

(NAME = Blob, FILENAME = 'C:DataBlob.mdf'),

FILEGROUP FSGroup CONTAINS FILESTREAM

(NAME = BlobFS, FILENAME = 'C:DataFileStream')

LOG ON ( NAME = BlobLog, FILENAME = 'C:DataBlob.ldf');

 

 

CREATE TABLE BlobDatabase.dbo.Images (

 image_id UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,

 image_desc VARCHAR(35),

 image_data VARBINARY(MAX) FILESTREAM NULL);

Update (August 20, 2008)

There have been some changes in the RTM version of SQL Server 2008. To enable FILESTREAM in the release version go to SQL Server Configuration Manager, open the Properties of the instance, on the FILESTREAM tab select Enable FILESTREAM for Transact-SQL access. Then open a query and execute the following:

EXEC sp_configure filestream_access_level, 2;

RECONFIGURE;

The stored procedure sp_filestream_configure used in the prerelease versions has been removed. Now the sp_configure procedure is used as demonstrated above. The new access level values are as follows:

0 – Disables FILESTREAM support for this instance (default)
1 – Enables FILESTREAM for Transact-SQL access
2 – Enables FILESTREAM for Transact-SQL and Win32 streaming access

In addition, one note from the field – turning off the 8.3 name generation on the NTFS volume using the command prompt utility fsutil helps a lot on performance.

Additional resources:

FILESTREAM Overview
http://msdn.microsoft.com/en-us/library/bb933993(SQL.100).aspx

Designing and Implementing FILESTREAM Storage
http://msdn.microsoft.com/en-us/library/bb895234(SQL.100).aspx

New Date and Time Data Types

SQL Server 2008 introduces enhancements to the existing date and time data types. The most important changes are the addition of separate DATE and TIME data types. The DATETIME2 data type adds more precision to the existing DATETIME data type, and DATETIMEOFFSET provides support for time zones. Here is a list of the new data types:

New data types:

• DATE (0001-01-01 through 9999-12-31)
• TIME (00:00:00.0000000 through 23:59:59.9999999)
• DATETIME2 (fraction 0 through 7)
• DATETIMEOFFSET (time zone awareness)

There are also a few new functions to support those new data types. Here is a list of the added date and time functions:

New functions:

• SYSDATETIME
• SYSDATETIMEOFFSET
• SYSUTCDATETIME
• SWITCHOFFSET
• TODATETIMEOFFSET

Follows a brief example to illustrate the use of the new date and time data types as well as the new functions.

-- The range for the DATE datatype is from 0001-01-01 through 9999-12-31

DECLARE @date DATE;

 

SET @date = CURRENT_TIMESTAMP;

 

PRINT @date;

 

-- The range for the TIME data type is 00:00:00.0000000 through 23:59:59.9999999

DECLARE @time TIME;

 

SET @time = CURRENT_TIMESTAMP;

 

PRINT @time;

 

-- Fraction 0 through 7

DECLARE @datetime2 DATETIME2(7);

 

SET @datetime2 = CURRENT_TIMESTAMP;

 

PRINT @datetime2;

 

SET @datetime2 = SYSDATETIME();

 

PRINT @datetime2;

 

-- Timezone offset range: -14:00 through +14:00

-- Date range: 0001-01-01 through 9999-12-31

-- Time range: 00:00:00 through 23:59:59.9999999

DECLARE @datetimeoffset DATETIMEOFFSET(7);

 

SET @datetimeoffset = CURRENT_TIMESTAMP;

 

PRINT @datetimeoffset;

 

SET @datetimeoffset = SYSDATETIMEOFFSET();

 

PRINT @datetimeoffset;

 

-- Functions

SELECT SYSDATETIME(),

       SYSDATETIMEOFFSET(),

       SYSUTCDATETIME(),

       CURRENT_TIMESTAMP,

       GETDATE(),

       GETUTCDATE();

 

-- New ISO week option for week number calculation      

SELECT DATEPART(ISO_WEEK, CURRENT_TIMESTAMP);   

 

-- Switch between time zones

SELECT SYSDATETIMEOFFSET(),

       SWITCHOFFSET(SYSDATETIMEOFFSET(), '+06:00'); --Paris

 

-- Set date and time offset based on DATETIME value

SELECT CURRENT_TIMESTAMP,

       TODATETIMEOFFSET(CURRENT_TIMESTAMP, '-01:00');

 

-- Convert DATETIME to DATE and TIME      

SELECT CONVERT(DATE, CURRENT_TIMESTAMP),

       CONVERT(TIME, CURRENT_TIMESTAMP);

Additional resources:

Using Date and Time Data
http://msdn.microsoft.com/en-us/library/ms180878(SQL.100).aspx

Date and Time Data Types and Functions
http://msdn.microsoft.com/en-us/library/ms186724(SQL.100).aspx

Hierarchies in SQL Server 2008

Graphs, tree algorithms and structures have been used for long time in databases to solve hierarchy related problems. Adjacency list, nested sets, materialized path, and other hybrid methods offer different capabilities to help.

SQL Server 2008 adds a new feature to help with modeling hierarchical relationships: the HIERARCHYID data type. It provides compact storage and convenient methods to manipulate hierarchies. In a way it is very much like optimized materialized path. In addition the SqlHierarchyId CLR data type is available for client applications.

While HIERARCHYID has a lot to offer in terms of operations with hierarchical data, it is important to understand a few basic concepts:

– HIERARCHYID can have only a single root (although easy to work around by adding sub-roots)
– It does not automatically represent a tree, the application has to define the relationships and enforce all rules
– The application needs to maintain the consistency

Here is a one example of employee hierarchy to illustrate the usage of HIERARCHYID and the related methods for manipulation of hierarchies.

CREATE TABLE Employees (

 emp_id INT NOT NULL PRIMARY KEY,

 emp_name VARCHAR(35),

 manager_id INT REFERENCES Employees(emp_id),

 org_chart_path HIERARCHYID);

 

/*

 

The primary key prevents cyclic paths.

Another way is using a CHECK constraint.

 

CHECK (org_chart_path.ToString() NOT LIKE '%/' + CAST(emp_id AS VARCHAR(10)) + '/_%')

 

*/

 

-- Insert the top level manager as hierarchy root

INSERT INTO Employees

VALUES (1, 'Jeff Brown', NULL, hierarchyid::GetRoot());

 

-- Insert John who reports to the top level manager                           

INSERT INTO Employees

VALUES (2, 'John Doe', 1,

           (SELECT hierarchyid::GetRoot().GetDescendant(NULL, NULL)

            FROM Employees));

 

 

-- Insert Peter at the same level as John

DECLARE @mgr HIERARCHYID = (SELECT org_chart_path

                            FROM Employees

                            WHERE emp_name = 'John Doe');

 

INSERT INTO Employees

VALUES (3, 'Peter Hanna', 1,

           (SELECT hierarchyid::GetRoot().GetDescendant(@mgr, NULL)

            FROM Employees

            WHERE org_chart_path = hierarchyid::GetRoot()));

 

-- Insert Richard as reporting to John

INSERT INTO Employees

VALUES (4, 'Richard Burns', 2,

           hierarchyid::Parse('/1/1/')); -- Also: CAST('/1/1/' AS HIERARCHYID)

 

SELECT emp_id, emp_name,

       manager_id, org_chart_path,

       org_chart_path.GetAncestor(1) AS emp_manager,

       hierarchyid::GetRoot() AS top_manager,

       org_chart_path.GetDescendant(NULL, NULL) AS emp_descendant,

       org_chart_path.GetLevel() AS emp_level,

       org_chart_path.ToString() AS emp_org_path

FROM Employees;

 

/*

 

emp_id emp_name         manager_id org_chart_path emp_manager   top_manager emp_descendant emp_level emp_org_path

------ ---------------- ----------- --------------- ------------- ------------ --------------- --------- ------------

1     Jeff Brown       NULL        0x             NULL         0x           0x58            0         /

2     John Doe         1           0x58            0x            0x           0x5AC0         1         /1/

3     Peter Hanna     1           0x68            0x            0x           0x6AC0         1         /2/

4     Richard Burns    2           0x5AC0         0x58         0x           0x5AD6         2         /1/1/

 

*/

 

-- Move Richard to report to Peter

DECLARE @new_mgr HIERARCHYID = (SELECT org_chart_path

                                FROM Employees

                                WHERE emp_name = 'Peter Hanna');

 

UPDATE Employees

SET org_chart_path = org_chart_path.Reparent(org_chart_path.GetAncestor(1),

                                            @new_mgr)

WHERE emp_name = 'Richard Burns';

 

SELECT emp_id, emp_name,

       manager_id, org_chart_path,

       org_chart_path.GetAncestor(1) AS emp_manager,

       hierarchyid::GetRoot() AS top_manager,

       org_chart_path.GetDescendant(NULL, NULL) AS emp_descendant,

       org_chart_path.GetLevel() AS emp_level,

       org_chart_path.ToString() AS emp_org_path

FROM Employees;   

 

/*

 

emp_id emp_name        manager_id org_chart_path    emp_manager   top_manager emp_descendant   emp_level emp_org_path

------- --------------- ----------- ----------------- ------------- ------------ ---------------- --------- ------------

1       Jeff Brown     NULL        0x                NULL         0x           0x58             0         /

2       John Doe        1           0x58             0x            0x           0x5AC0           1         /1/

3       Peter Hanna     1           0x68             0x            0x           0x6AC0           1         /2/

4       Richard Burns   2           0x6AC0            0x68         0x           0x6AD6           2         /2/1/

 

*/

From the above example it is very easy to see the similarity between materialized path and HIERARCHYID when the HIERARCHYID is converted to the character format using the ToString() method. Converting hierarchy from traditional parent/child format to HIERARCHYID is simple using recursive CTEs (very similar to building a materialized path).

Note:

This code has been tested with SQL Server 2008 CTP 6 (February 2008). As of SQL Server 2008 Release Candidate 0 (June 2008) the method Reparent() has been replaced with the method GetReparentedValue(). It is called using the same parameters and returns the same value.

Additional resources:

Using hierarchyid Data Types
http://msdn.microsoft.com/en-us/library/bb677173(SQL.100).aspx

Working with hierarchyid Data
http://msdn.microsoft.com/en-us/library/bb677212(SQL.100).aspx

Sparse Columns

Sparse columns are another addition to the SQL Server 2008 tool belt. They offer a good trade-off for many applications: taking no space if they are empty and more space if they have data. In other words they optimize storage for NULL values.

Sparse columns are just like ordinary columns with a few limitations. They are defined with the SPARSE keyword and there is no difference on how they are used in data manipulation statements.

Here are the details:

Pros:
• Storing NULL in a sparse column takes up no space at all
• Up to 30,000 columns
• To any external application the column will behave the same
• Sparse columns fit well with filtered indexes to index on non-NULL values

Cons:
• If a sparse column has data it takes 4 more bytes than a normal column
• Not all data types can be sparse: TEXT, NTEXT, IMAGE, TIMESTAMP, user-defined data types, GEOMETRY, GEOGRAPHY, and VARBINARY(MAX) with the FILESTREAM attribute
• Computed columns cannot be sparse
• Cannot have default values

Here is example of creating and using sparse columns.

CREATE TABLE Survey (

  survey_nbr INT NOT NULL PRIMARY KEY,

  survey_desc VARCHAR(30),

  survey_info1 VARCHAR(30) SPARSE NULL,

  survey_info2 VARCHAR(30) SPARSE NULL,

  survey_info3 VARCHAR(30) SPARSE NULL,

  survey_info4 VARCHAR(30) SPARSE NULL);

 

INSERT INTO Survey(survey_nbr, survey_desc, survey_info1)

VALUES(1, 'Survey 1', 'some data'),

      (2, 'Survey 2', 'good');

Tables with sparse columns can have defined column set to manipulate all sparse columns as a structure. The column set is untyped XML column and is treated as any other XML column.

Here are the characteristics of column sets:

• Defined as XML data type column
• Not stored, like computed column, but updatable
• Only one column set per table
• Cannot be changed
• Cannot be added if the table already has sparse columns
• Limited by XML data size limit of 2 GB (all sparse columns in a row cannot exceed 2 GB)
• Cannot define constraints or default values
• Allows to update or insert sparse column values
• On update all sparse columns are updated, when no value provided sets the column to NULL

Follows example of creating a column set and inserting data into sparse columns using the column set.

CREATE TABLE Survey (

  survey_nbr INT NOT NULL PRIMARY KEY,

  survey_desc VARCHAR(30),

  survey_info1 VARCHAR(30) SPARSE NULL,

  survey_info2 VARCHAR(30) SPARSE NULL,

  survey_info3 VARCHAR(30) SPARSE NULL,

  survey_info4 VARCHAR(30) SPARSE NULL,

  survey_set XML column_set FOR ALL_SPARSE_COLUMNS);

 

-- Insert using the regular sparse columns

INSERT INTO Survey(survey_nbr, survey_desc, survey_info1)

VALUES(1, 'Survey 1', 'some data'),

      (2, 'Survey 2', 'good');

 

-- Insert using the column set

INSERT INTO Survey(survey_nbr, survey_desc, survey_set)

VALUES(3, 'Survey 3', '<survey_info3>data 3</survey_info3><survey_info4>answer 4</survey_info4>');

Any mistyping of column name in the XML column set or providing invalid data type value will result in error as when manipulating directly the columns.

Resources:

Using Sparse Columns
http://technet.microsoft.com/en-us/library/cc280604(SQL.100).aspx

Using Column Sets
http://technet.microsoft.com/en-us/library/cc280521(SQL.100).aspx

Table Valued Parameters

Passing multiple rows of data to stored procedures has always intrigued application developers. Many different methods have been used, more evolving around XML or some form of concatenated list. Those approaches require a method for parsing before the data can be used in normalized table format.

SQL Server 2008 adds another option: table valued parameters. This allows sending a table as parameter to the stored procedure. The current implementation still has shortcomings because it is a read-only parameter, and it cannot be used as output parameter.

Here is one example to demonstrate using table valued parameters to pass a table as stored procedure input.

CREATE TABLE Loans (

 loan_nbr INT PRIMARY KEY,

 loan_date DATETIME,

 loan_amount DECIMAL(15, 2));

 

-- User-defined table type

CREATE TYPE LoanTable

AS TABLE (

 loan_nbr INT PRIMARY KEY,

 loan_date DATETIME,

 loan_amount DECIMAL(15, 2));

 

GO

 

-- Procedure with table valued parameter

-- Must use the READONLY clause

CREATE PROCEDURE InsertLoans

 @Loans LoanTable READONLY

AS

 

 INSERT INTO Loans

 SELECT loan_nbr, loan_date, loan_amount

 FROM @Loans;

 

GO

 

-- Declare table variable of the new type

DECLARE @NewLoans LoanTable;

 

-- Initialize the table variable with data

INSERT INTO @NewLoans

VALUES (1, '20080101', 10000.00),

       (2, '20080203', 15000.00),

       (3, '20080315', 25000.00);

 

 

-- Insert new loans using the table variable as parameter

EXEC InsertLoans @Loans = @NewLoans;

The really cool part here is that table valued parameters are fully supported on the client side via ADO.NET (.NET Framework 3.5 or later) . That means we could easily have a data bound control like grid on the client side that can be used to feed a table parameter to update database table via a stored procedure. Here is a look at what that client code will be.

DataTable loans;

 

loans = new DataTable();

loans.Columns.Add("loan_nbr", typeof(int));

loans.Columns.Add("loan_date", typeof(System.DateTime));

loans.Columns.Add("loan_amount", typeof(decimal));

 

using(SqlCommand cmd = new SqlCommand("InsertLoans", conn)

{

  cmd.CommandType = CommandType.StoredProcedure;

  cmd.Parameters.AddWithValue("Loans", loans);

  cmd.ExecuteNonQuery();

}

Resources:

Table-Valued Parameters
http://msdn2.microsoft.com/en-us/library/bb510489(SQL.100).aspx