Case statement not correctly matching expected values

I’m trying to generate some randomized data, and I’ve been using newid() to seed functions since it is called once for every row and is guaranteed to return a different result each time. However I’m frequently getting values that are somehow not equal to any integers in the expected range.

I’ve tried a few variations, including a highly upvoted one, but they all result in the same issue. I’ve put it into a script that shows the problem:

  • How to show blank record in sql if duplicate rows exists
  • Dynamic Column Name by Year
  • Connection established. Data is not collected from DB2 using SSMA
  • SQL Server: pivoting across multiple colums
  • determining the character set of a table / database?
  • SQL formatting standards
  • declare @test table (id uniqueidentifier)
    insert into @test
    select newid() from sys.objects
    
    select 
        floor(rand(checksum(id)) * 4),
        case isnull(floor(rand(checksum(id)) * 4), -1)
            when 0 then 0
            when 1 then 1
            when 2 then 2
            when 3 then 3
            when -1 then -1
            else 999
        end,
        floor(rand(checksum(newid())) * 4),
        case isnull(floor(rand(checksum(newid())) * 4), -1)
            when 0 then 0
            when 1 then 1
            when 2 then 2
            when 3 then 3
            when -1 then -1
            else 999
        end
    from @test
    

    I expect the results to always be in the range 0 to 3 for all four columns. When the unique identifiers are retrieved from a table, the results are always correct (first two columns.) Similarly, when they’re output on the fly they’re also correct (third column.) But when they’re compared on the fly to integers in a case statement, it often returns a value outside the expected range.

    Here’s an example, these are the first 20 rows when I ran it just now. As you can see there are ‘999’ instances in the last column that shouldn’t be there:

    0   0   3   1
    3   3   3   1
    0   0   3   3
    3   3   2   999
    1   1   2   999
    3   3   2   1
    2   2   0   999
    0   0   0   0
    3   3   2   0
    1   1   3   999
    3   3   0   999
    2   2   2   2
    1   1   3   0
    2   2   3   0
    3   3   1   999
    0   0   1   999
    3   3   1   1
    0   0   0   3
    3   3   0   999
    0   0   1   0
    

    At first I thought maybe the type coercion was different than I expected, and the result of rand() * int was a float not an int. So I wrapped it all in floor to force it to be an int. Then I thought perhaps there’s an odd null value creeping in, but with my case statement a null would be returned as -1, and there are none.

    I’ve run this one two different SQL Server 2012 SP1 instances, both give the same sort of results.

  • How do I pass field name as a parameter into a stored procedure
  • How can I improve this database model?
  • Weird result coverting from varchar to varbinary and converting back — sql server 2008
  • Explicitly drop temp table or let SQL Server handle it
  • Not Nullable fields in SQL Server Still Considered Required Fields in ASP.NET MVC
  • Install C# desktop app with sql server DB on client machine
  • One Solution collect form web for “Case statement not correctly matching expected values”

    In the fourth column, isnull(floor(rand(checksum(newid())) * 4), -1) is being evaluated up to five times for each row. Once for each branch of the case. On each call the values can be different. So it can return 2, not match 1, 3 not match 2, 1 not match 3, 3 not match 4 fall to the else and return 999.

    This can be seen if you get the execution plan, and look at the XML, there is a line [whitespace added.]:

    <ScalarOperator ScalarString="
    CASE WHEN isnull(floor(rand(checksum(newid()))*(4.000000000000000e+000)),(-1.000000000000000e+000))=(0.000000000000000e+000) THEN (0) 
        ELSE CASE WHEN isnull(floor(rand(checksum(newid()))*(4.000000000000000e+000)),(-1.000000000000000e+000))=(1.000000000000000e+000) THEN (1) 
            ELSE CASE WHEN isnull(floor(rand(checksum(newid()))*(4.000000000000000e+000)),(-1.000000000000000e+000))=(2.000000000000000e+000) THEN (2) 
                ELSE CASE WHEN isnull(floor(rand(checksum(newid()))*(4.000000000000000e+000)),(-1.000000000000000e+000))=(3.000000000000000e+000) THEN (3) 
                    ELSE CASE WHEN isnull(floor(rand(checksum(newid()))*(4.000000000000000e+000)),(-1.000000000000000e+000))=(-1.000000000000000e+000) THEN (-1) 
                        ELSE (999) 
                    END 
                END 
            END 
        END 
    END
    ">
    

    Placing the expression in a CTE seems to keep the recomputes from happening:

    ; WITH T AS (SELECT isnull(floor(rand(checksum(newid())) * 4), -1) AS C FROM @Test)
    SELECT CASE C
            when 0 then 0
            when 1 then 1
            when 2 then 2
            when 3 then 3
            when -1 then -1
            else 999 END
    FROM T
    
    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.