Selecting SUM of TOP 2 values within a table with multiple GROUP in SQL

I’ve been playing with sets in SQL Server 2000 and have the following table structure for one of my temp tables (#Periods):

    RestCTR     HoursCTR    Duration    Rest
    ----------------------------------------
    1           337         2           0
    2           337         46          1
    3           337         2           0
    4           337         46          1
    5           338         1           0
    6           338         46          1
    7           338         2           0
    8           338         46          1
    9           338         1           0
    10          339         46          1
    ...

What I’d like to do is to calculate the Sum of the 2 longest Rest periods for each HoursCTR, preferably using sets and temp tables (rather than cursors, or nested subqueries).

  • How to resolve Invalid Object Name “Table Name” in C#?
  • Find the difference between top 2 rows
  • SQLServer RCSI - ReadOnly query resulting in deadlock
  • T-SQL Job fail with Incorrect syntax near ' '. (Error 102)
  • How can SQLServer notify a vb.net application of an event?
  • Case Statement with different operator in Where Condition
  • Here’s the dream query that just won’t work in SQL (no matter how many times I run it):

    Select HoursCTR, SUM ( TOP 2 Duration ) as LongestBreaks
    FROM #Periods
    WHERE Rest = 1
    Group By HoursCTR    
    

    The HoursCTR can have any number of Rest periods (including none).

    My current solution is not very elegant and basically involves the following steps:

    1. Get the max duration of rest, group by HoursCTR
    2. Select the first (min) RestCTR row that returns this max duration for each HoursCTR
    3. Repeat step 1 (excluding the rows already collected in step 2)
    4. Repeat step 2 (again, excluding rows collected in step 2)
    5. Combine the RestCTR rows (from step 2 and 4) into single table
    6. Get SUM of the Duration pointed to by the rows in step 5, grouped by HoursCTR

    If there are any set functions that cut this process down, they would be very welcome.

  • Subquery returning more then one value but it shouldn't
  • UPDATE datatype image in SQL-Table
  • Set variable from user selection in SQL Server
  • SQL Cursors…Any use cases you would defend?
  • SET statement always runs regardless of IF ELSE condition
  • SQL Server - Given a Set of Columns, finding missing combinations within the Set
  • 3 Solutions collect form web for “Selecting SUM of TOP 2 values within a table with multiple GROUP in SQL”

    The best way to do this in SQL Server is with a common table expression, numbering the rows in each group with the windowing function ROW_NUMBER():

    WITH NumberedPeriods AS (
      SELECT HoursCTR, Duration, ROW_NUMBER() 
        OVER (PARTITION BY HoursCTR ORDER BY Duration DESC) AS RN
      FROM #Periods
      WHERE Rest = 1
    )
    SELECT HoursCTR, SUM(Duration) AS LongestBreaks
    FROM NumberedPeriods
    WHERE RN <= 2
    GROUP BY HoursCTR
    

    edit: I’ve added an ORDER BY clause in the partitioning, to get the two longest rests.


    Mea culpa, I did not notice that you need this to work in Microsoft SQL Server 2000. That version doesn’t support CTE’s or windowing functions. I’ll leave the answer above in case it helps someone else.

    In SQL Server 2000, the common advice is to use a correlated subquery:

    SELECT p1.HoursCTR, (SELECT SUM(t.Duration) FROM 
        (SELECT TOP 2 p2.Duration FROM #Periods AS p2
         WHERE p2.HoursCTR = p1.HoursCTR 
         ORDER BY p2.Duration DESC) AS t) AS LongestBreaks
    FROM #Periods AS p1
    

    SQL 2000 does not have CTE’s, nor ROW_NUMBER().
    Correlated subqueries can need an extra step when using group by.

    This should work for you:

    SELECT 
        F.HoursCTR,
        MAX (F.LongestBreaks) AS LongestBreaks -- Dummy max() so that groupby can be used.
    FROM
        (
            SELECT 
                Pm.HoursCTR, 
                (
                    SELECT 
                        COALESCE (SUM (S.Duration), 0)    
                    FROM 
                        (
                            SELECT TOP 2    T.Duration
                            FROM            #Periods    AS T
                            WHERE           T.HoursCTR  = Pm.HoursCTR 
                            AND             T.Rest      = 1
                            ORDER BY        T.Duration  DESC
                        ) AS S
                 ) AS LongestBreaks
            FROM
                #Periods AS Pm
        ) AS F
    GROUP BY
        F.HoursCTR
    

    Unfortunately for you, Alex, you’ve got the right solution: correlated subqueries, depending upon how they’re structured, will end up firing multiple times, potentially giving you hundreds of individual query executions.

    Put your current solution into the Query Analyzer, enable “Show Execution Plan” (Ctrl+K), and run it. You’ll have an extra tab at the bottom which will show you how the engine went about the process of gathering your results. If you do the same with the correlated subquery, you’ll see what that option does.

    I believe that it’s likely to hammer the #Periods table about as many times as you have individual rows in that table.

    Also – something’s off about the correlated subquery, seems to me. Since I avoid them like the plague, knowing that they’re evil, I’m not sure how to go about fixing it up.

    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.