SELECT TOP N with UNION and ORDER BY

Given the example below, why does using using UNION ALL with two SELECT TOP 5 statements in Query 2 below seemingly not respect the ORDER BY clause?

Query 1 returns the expected results but does not include the required union.
Query 2 demonstrates the unexpected behavior.
Query 3 is the workaround I’m currently using to get the desired results.

  • SSIS: Just started getting a “Key not valid for use in specified state.” error on my scheduled SSIS package
  • How would you design your database to allow user-defined schema
  • CROSS APPLY with table valued function restriction performance
  • I cannot add views when reverse engineering my database in Visio
  • alternatives to REPLACE on a text or ntext datatype
  • Hashing web user password in ASP.NET vs SQL CLR
  • CREATE TABLE #T1 ([ID] int IDENTITY(1,1), [Description] varchar(100), [Inactive] bit);
    CREATE TABLE #T2 ([ID] int IDENTITY(1,1), [Description] varchar(100), [Inactive] bit);
    
    INSERT INTO #T1([Description], [Inactive]) VALUES ('One', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Two', 0);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Three', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Four', 0);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Five', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Six', 0);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Seven', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Eight', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Nine', 1);
    INSERT INTO #T1([Description], [Inactive]) VALUES ('Ten', 0);
    
    -- Query 1, works as expected giving all 4 records with Inactive = 0 plus one more
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T1
    ORDER BY [Inactive], [Description]; 
    
    -- Query 2, does not work as expected, as only 2 of the Inactive = 0 records are present
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T1
    UNION ALL
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T2
    ORDER BY [Inactive], [Description]; 
    
    -- Query 3, Workaround to produce desired results
    WITH T1 AS (
        SELECT TOP 5 [ID], [Description], [Inactive]
        FROM #T1
        ORDER BY [Inactive], [Description]
    ),
    T2 AS (
        SELECT TOP 5 [ID], [Description], [Inactive]
        FROM #T2
        ORDER BY [Inactive], [Description]
    )
    SELECT [ID], [Description], [Inactive] FROM T1
    UNION ALL
    SELECT [ID], [Description], [Inactive] FROM T2
    ORDER BY [Inactive], [Description]; 
    
    DROP TABLE #T1;
    DROP TABLE #T2;
    

    Obviously, the workaround is working for me but I’d like to understand why Query 2 does not do what I expected it to. In case you’re wondering why I am bothering with the empty table #T2, the results are actually limited by a WHERE clause in my production example – however leaving it empty here serves the purpose of providing a comparable example without bothering to populate it.

    If you do populate #T2 with the following, I find the results to be just as strange – Query 2 only gives four results with Inactive = 0.

    INSERT INTO #T2([Description], [Inactive]) VALUES ('Eleven', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Twelve', 0);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Thirteen', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Fourteen', 0);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Fifteen', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Sixteen', 0);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Seventeen', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Eighteen', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Nineteen', 1);
    INSERT INTO #T2([Description], [Inactive]) VALUES ('Twenty', 0);
    

    I have run this script with the same results on SQL Server 2014 and SQL Server 2008 R2.

  • SQL Server Management Studio - Adding/Moving Columns require drop and re-create?
  • Is there a difference between SQL Server Express (2012) and LocalDB?
  • Getting a strange error when Try to save a SSRS report
  • How to remove white space characters from a string in SQL Server
  • convert string to date in sql server
  • intermittent problem occurring with a table that is only used for selection
  • 3 Solutions collect form web for “SELECT TOP N with UNION and ORDER BY”

    A Union query works thus: execute the queries, then apply the order by clause. So with

    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T1
    UNION ALL
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T2
    ORDER BY [Inactive], [Description]; 
    

    you select five arbitrarily chosen records from #T1 plus five arbitrarily chosen records from #T2 and then you order these. So you need subqueries or with clauses. E.g.:

    SELECT * FROM
    (
      (
        SELECT TOP 5 [ID], [Description], [Inactive]
        FROM #T1
        ORDER BY [Inactive], [Description]
      )
      UNION ALL
      (
        SELECT TOP 5 [ID], [Description], [Inactive]
        FROM #T2
        ORDER BY [Inactive], [Description]
      )
    ) t;
    

    So your workaround is not a workaround at all, but the proper query.

    You should move the whole UNION ALL within a subquery:

    SELECT *
    FROM (SELECT TOP 5 [ID], [Description], [Inactive] 
          FROM #T1 
          ORDER BY [Inactive], [Description]
          UNION ALL
          SELECT TOP 5 [ID], [Description], [Inactive] 
          FROM #T2 
          ORDER BY [Inactive], [Description]) T3
    ORDER BY [Inactive], [Description];
    GO
    
    ID | Description | Inactive
    -: | :---------- | :-------
     4 | Four        | False   
     6 | Six         | False   
    10 | Ten         | False   
     2 | Two         | False   
     8 | Eight       | True    
    

    There is a major difference between query #2 and #3 – the ORDER BY clause on #T1.

    The thumb rule is -in a SQL query if you don’t specify the ORDER BY cause, “TOP” will just return random set of records. After you get these “random” rows, the sorting comes into play. And hence there is no “proper” sorting taking place.

    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.