Predicates in SQL

One of the most confusing topics for developers coming from traditional languages like C, C++, Java, C#, and VB to SQL, is the evaluation of predicates. In most languages predicate logic is evaluated left to right. SQL is a different story. It follows the concept of all-at-once operations. This means all logical predicates will be evaluated at the same time. In practice, the SQL engine is free to process the logical expressions in any order it finds appropriate and cost effective to retrieve the data.

Below is example to illustrate this behavior. Given a table with account policies, where a policy may be represented only as numeric values for certain accounts, and a mix of alpha-numeric characters for other accounts. Knowing that account 2 has policies that can be represented only as numeric values, the goal is to retrieve policies that are greater than 50000.

An attempt to add a first predicate to filter on account 2 and then a second one to select policies greater than 50000 will fail. Even if the query is formed with a derived table to extract first all policies for account 2 and then the outer query filters on the policy, it will not work. The derived table is expanded in the main query plan and a single query plan is produced. Then the query engine is free to push up and down the predicates in the plan as it finds efficient.

CREATE TABLE AccountPolicies (

 account_nbr INT,

 policy VARCHAR(20),

 PRIMARY KEY (account_nbr, policy));

 

INSERT INTO AccountPolicies VALUES(1, 'P1000234');

INSERT INTO AccountPolicies VALUES(1, 'P1020256');

INSERT INTO AccountPolicies VALUES(2, '1001');

INSERT INTO AccountPolicies VALUES(2, '5002');

INSERT INTO AccountPolicies VALUES(2, '50001');

INSERT INTO AccountPolicies VALUES(2, '50005');

INSERT INTO AccountPolicies VALUES(2, 'P50005');

 

-- Second predicate may be evaluated first

-- resulting in conversion error

SELECT account_nbr, policy

FROM AccountPolicies

WHERE account_nbr = 2

  AND CAST(policy AS INT) > 50000;

 

-- The derived table is expended in the

-- query plan and predicates can be pushed

-- up or down in the plan

SELECT account_nbr, policy

FROM (SELECT account_nbr, policy

      FROM AccountPolicies

      WHERE account_nbr = 2) AS P

WHERE CAST(policy AS INT) > 50000;

One way to solve this is to use a CASE expression to force the desired evaluation because WHEN clauses in CASE are evaluated in order. This solution not only demonstrates how to handle the evaluation process, but adds verification that only policies with numeric values are casted (as usual business rules change and later account 2 may be allowed to create policies with alpha-numeric characters).

-- Enforce sequence of evaluating conditions

-- and cast only valid values

SELECT account_nbr, policy

FROM AccountPolicies

WHERE CASE

        WHEN account_nbr = 2

        THEN CASE

               WHEN policy NOT LIKE '%[^0-9]%'

               THEN CASE

                      WHEN CAST(policy AS INT) > 50000

                      THEN 'True'

                      ELSE 'False'

                    END

               ELSE 'False'

            END            

        ELSE 'False'

      END = 'True';

10 replies
  1. Anonymous
    Anonymous says:

    Is the following code incorrect

    ;WITH CTETest AS
    (
    SELECT account_nbr, policy FROM AccountPolicies WHERE account_nbr = 2
    )

    SELECT account_nbr, policy
    FROM CTETest
    WHERE CAST(policy AS INT) > 50000;

    Reply
  2. Plamen Ratchev
    Plamen Ratchev says:

    Yes, this code the same problem as the derived table example. Here in a similar manner the CTE is expanded in the main query plan and a single query plan is produced. Then the optimizer may chose to evaluate first the predicate CAST(policy AS INT) > 50000 which will result in error.

    Reply
  3. Vijay Chand
    Vijay Chand says:

    Sir,

    The only reason why you are getting error is because of this –> INSERT INTO AccountPolicies VALUES(2, 'P50005');

    The account_nbr=2 but policy is alphnumeric. Explicit cast of alphanumeric to numeric is less forgiving than the implicit cast that sql engine performs.

    Also, try using both the queries without the explicit Cast(policy as int) function as sql has an implicit cast function.

    On broader perspective, there is a logic to the way sql engine processes predicates. This logic is determinate logic. Without this data retrieval would be a hell-of-a-task.

    Cheers!

    Vijay Chand

    Reply
  4. Mohammad
    Mohammad says:

    Hi,
    >>
    WHEN clauses in CASE are evaluated in order
    <<

    Is IF statement same with CASE for evaluating? Or optimizer make a decision for trying which Boolean statement first evaluate (in terms of cost-based)

    Reply
  5. Plamen Ratchev
    Plamen Ratchev says:

    IF is a control of flow statement and not an expression like CASE. As such it cannot be used in queries and it is not part of the execution plan, so evaluation is different.

    Reply
  6. Plamen Ratchev
    Plamen Ratchev says:

    Kevin,

    The double negative guarantees that only numeric characters are allowed (in a way saying the value is not like not a number). If you use the LIKE predicate as you posted then it allows non numeric characters.

    Reply

Trackbacks & Pingbacks

  1. […] loan type value which will result in conversion error. I described this in more detail in my post Predicates in SQL.     To fix this you may be tempted to try a different approach: using table […]

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *