Sum of minutes between multiple date ranges
The scenario is users specify when they are available, these specified times can overlap each other. I’m trying to get the total time they are available for. Example with SQL Fiddle:
--Available-- ID userID availStart availEnd 1 456 '2012-11-19 16:00' '2012-11-19 17:00' 2 456 '2012-11-19 16:00' '2012-11-19 16:50' 3 456 '2012-11-19 18:00' '2012-11-19 18:30' 4 456 '2012-11-19 17:30' '2012-11-19 18:10' 5 456 '2012-11-19 16:00' '2012-11-19 17:10' 6 456 '2012-11-19 16:00' '2012-11-19 16:50'
The output should be 130 minutes:
1: 60 2: 0 as falls inside 1 3: 30 4: 30 as the last 10 mins is covered by 3 5: 10 as first 60 mins is covered by 1 6: 0 as falls inside 1
I can get the total overlapping minutes, however this is more than the SUM of the available minutes:
Any ideas how I can achieve this?
EDIT 21st Nov 12: Thanks so far for everyone’s solutions – in a way I’m happy to see this wasn’t an ‘easy’ query to write.
EDIT 23rd Nov 12: This is all great work. Internally here we’re thinking it might be best to ensure users cannot enter overlapping times (eg forcing them to amend an existing entry)!
5 Solutions collect form web for “Sum of minutes between multiple date ranges”
Gordon Linoff has a CTE based answer
I’ve done some performance analysis on all the working algorithms
Blank values mean it took too long. This is tested on a single Core i7 X920 @2GHz chip, backed by a couple of SSDs. The only index created was a cluster on UserID, AvailStart. If you think you can improve any of the performance, let me know.
This CTE version was worse than linear, SQL Server can’t do the RN = RN + 1 join in an efficient way. I rectified this with a hybrid approach below, where I save and index the first CTE into a table variable. This still takes ten times as much IO as the cursor based approach.
With OrderedRanges as ( Select Row_Number() Over (Partition By UserID Order By AvailStart) AS RN, AvailStart, AvailEnd From dbo.Available Where UserID = 456 ), AccumulateMinutes (RN, Accum, CurStart, CurEnd) as ( Select RN, 0, AvailStart, AvailEnd From OrderedRanges Where RN = 1 Union All Select o.RN, a.Accum + Case When o.AvailStart <= a.CurEnd Then 0 Else DateDiff(Minute, a.CurStart, a.CurEnd) End, Case When o.AvailStart <= a.CurEnd Then a.CurStart Else o.AvailStart End, Case When o.AvailStart <= a.CurEnd Then Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End Else o.AvailEnd End From AccumulateMinutes a Inner Join OrderedRanges o On a.RN = o.RN - 1 ) Select Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes
After doing some performance analysis, here’s a hybrid CTE/table variable version that performs better than anything except the cursor based approach
Create Function dbo.AvailMinutesHybrid(@UserID int) Returns Int As Begin Declare @UserRanges Table ( RN int not null primary key, AvailStart datetime, AvailEnd datetime ) Declare @Ret int = Null ;With OrderedRanges as ( Select Row_Number() Over (Partition By UserID Order By AvailStart) AS RN, AvailStart, AvailEnd From dbo.Available Where UserID = @UserID ) Insert Into @UserRanges Select * From OrderedRanges ;With AccumulateMinutes (RN,Accum, CurStart, CurEnd) as ( Select RN, 0, AvailStart, AvailEnd From @UserRanges Where RN = 1 Union All Select o.RN, a.Accum + Case When o.AvailStart <= a.CurEnd Then 0 Else DateDiff(Minute, a.CurStart, a.CurEnd) End, Case When o.AvailStart <= a.CurEnd Then a.CurStart Else o.AvailStart End, Case When o.AvailStart <= a.CurEnd Then Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End Else o.AvailEnd End From AccumulateMinutes a Inner Join @UserRanges o On a.RN + 1 = o.RN ) Select @Ret = Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes Option (MaxRecursion 0) Return @Ret End
Here’s another way of doing it with a cursor. I feel this techinique should be adaptable to a CTE, but I can’t figure out how to do it
The method is to arrange each range by start time
Then we build a range that coalesces ranges in order, until we find
a range that doesn’t overlap our coalesced range.
We then calculate how many minutes are in the coalesced range, and remember this
We carry on with the next ranges, again coalesing any that overlap.
We accumulate minutes each time we get a non overlapping start point
At the end we add the accumulated minutes onto the length of the last range
It’s fairly easy to see that because of the order, once a range is distinct from
what’s gone before then no further ranges could overlap what’s gone before, as their
start dates are all greater.
Declare @UserID int = 456, @CurStart datetime, -- our current coalesced range start @CurEnd datetime, -- our current coalsced range end @AvailStart datetime, -- start or range for our next row of data @AvailEnd datetime, -- end of range for our next row of data @AccumMinutes int = 0 -- how many minutes so far accumulated by distinct ranges Declare MinCursor Cursor Fast_Forward For Select AvailStart, AvailEnd From dbo.Available Where UserID = @UserID Order By AvailStart Open MinCursor Fetch Next From MinCursor Into @AvailStart, @AvailEnd Set @CurStart = @AvailStart Set @CurEnd = @AvailEnd While @@Fetch_Status = 0 Begin If @AvailStart <= @CurEnd -- Ranges Overlap, so coalesce and continue Begin If @AvailEnd > @CurEnd Set @CurEnd = @AvailEnd End Else -- Distinct range, coalesce minutes from previous range Begin Set @AccumMinutes = @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) Set @CurStart = @AvailStart -- Start coalescing a new range Set @CurEnd = @AvailEnd End Fetch Next From MinCursor Into @AvailStart, @AvailEnd End Select @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) As TotalMinutes Close MinCursor Deallocate MinCursor;
The main problem is that you can have chains of overlapping entries, so you need to combine an indefinite amount of times to remove all the overlap – this is more suited to a procedural method than SQL. But if you would prefer to not use temporary tables, here’s a CTE method – keep in mind that CTEs can only recurse a given number of times, so if you have any particularly long chains, it will fail.
WITH MergedAvailable AS ( SELECT Available.UserID, Available.AvailStart, MAX(Available.AvailEnd) AS AvailEnd FROM Available WHERE ( SELECT COUNT(*) FROM Available AS InnerAvailable WHERE InnerAvailable.AvailStart < Available.AvailStart AND InnerAvailable.AvailEnd >= Available.AvailStart ) = 0 GROUP BY Available.UserID, Available.AvailStart UNION ALL SELECT MergedAvailable.UserID, MergedAvailable.AvailStart, LongestExtensionToAvailableInterval.NewIntervalEnd FROM MergedAvailable CROSS APPLY GetLongestExtensionToAvailableInterval(MergedAvailable.UserID, MergedAvailable.AvailStart, MergedAvailable.AvailEnd) AS LongestExtensionToAvailableInterval WHERE LongestExtensionToAvailableInterval.NewIntervalEnd IS NOT NULL ) SELECT SUM(DATEDIFF(MINUTE, FinalAvailable.AvailStart, FinalAvailable.AvailEnd)) AS MinsAvailable FROM ( SELECT MergedAvailable.UserID, MergedAvailable.AvailStart, MAX(MergedAvailable.AvailEnd) AS AvailEnd FROM MergedAvailable GROUP BY MergedAvailable.UserID, MergedAvailable.AvailStart ) AS FinalAvailable
This table function is required:
CREATE FUNCTION GetLongestExtensionToAvailableInterval ( @UserID int, @CurrentIntervalStart datetime, @CurrentIntervalEnd datetime ) RETURNS TABLE AS RETURN SELECT MAX(Available.AvailEnd) AS NewIntervalEnd FROM Available WHERE Available.UserID = @UserID AND Available.AvailStart > @CurrentIntervalStart AND Available.AvailStart <= @CurrentIntervalEnd AND Available.AvailEnd > @CurrentIntervalEnd
The general idea is that it starts from all ranges where the start of the range isn’t overlapping anything, and then with every recursion it extends the current range to the furthest extent of the currently overlapping ranges. The table function is needed to determine the furthest extent, as recursing sections of CTEs are not allowed to included plain aggregates.
With the data you’ve provided, the starting rows are:
456 2012-11-19 16:00 2012-11-19 17:10 456 2012-11-19 17:30 2012-11-19 18:10
The only row which ends up being added via the recursion is:
456 2012-11-19 17:30 2012-11-19 18:30
For the sake of the example, say you had a row with ID 7 which went from 18:20 to 19:20. Then there would be a second recursion which brought back the row:
456 2012-11-19 17:30 2012-11-19 19:20
So while the query will get to the start and end of each overlapping range, it will also be bringing back all the intermediate stages. This is why we need to take the aggregate maximum end date for each start date after the CTE, to remove them.
Condition t1.availStart > t2.availEnd OR t1.availEnd < t2.availStart check period which will never be crossed. If it is crossed then minimum availStart or maximum availEnd esle availStart or availEnd.
Probably more than one crossing period. In your case it 16:00:00 - 17:10:00 includes the ranges:16:00:00 - 16:50:00, 16:00:00 - 16:50:00, 16:00:00 - 17:00:00, 16:00:00 - 17:10:00 17:30:00 - 18:30:00 includes the ranges:17:30:00 - 18:10:00, 18:00:00 - 18:30:00
UPDATE 21.11.2012; 30.11.2012; 04.01.2013
CREATE FUNCTION dbo.Overlap ( @availStart datetime, @availEnd datetime, @availStart2 datetime, @availEnd2 datetime ) RETURNS TABLE RETURN SELECT CASE WHEN @availStart >= @availEnd2 OR @availEnd <= @availStart2 THEN @availStart ELSE CASE WHEN @availStart > @availStart2 THEN @availStart2 ELSE @availStart END END AS availStart, CASE WHEN @availStart >= @availEnd2 OR @availEnd <= @availStart2 THEN @availEnd ELSE CASE WHEN @availEnd > @availEnd2 THEN @availEnd ELSE @availEnd2 END END AS availEnd ;WITH cte AS ( SELECT userID, availStart, availEnd, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY AvailStart) AS Id FROM dbo.test53 ), cte2 AS ( SELECT Id, availStart, availEnd FROM cte WHERE Id = 1 UNION ALL SELECT c.Id, o.availStart, o.availEnd FROM cte c JOIN cte2 ct ON c.Id = ct.Id + 1 CROSS APPLY dbo.Overlap(c.availStart, c.availEnd, ct.availStart, ct.availEnd) AS o ) SELECT TOP 1 SUM(DATEDIFF(minute, availStart, MAX(availEnd))) OVER() FROM cte2 GROUP BY availStart
Demo on SQLFiddle
Create Table #Available ( ID int not null primary key, UserID int not null, AvailStart datetime not null, AvailEnd datetime not null ) Insert Into #Available (ID,UserID, AvailStart, AvailEnd) Values (1,456, '2012-11-19 16:00', '2012-11-19 17:00'), (2,456, '2012-11-19 16:00', '2012-11-19 16:50'), (3,456, '2012-11-19 18:00', '2012-11-19 18:30'), (4,456, '2012-11-19 17:30', '2012-11-19 18:10'), (5,456, '2012-11-19 16:00', '2012-11-19 17:10'), (6,456, '2012-11-19 16:00', '2012-11-19 16:50'), (7,457, '2012-11-19 16:00', '2012-11-19 17:10'), (8,457, '2012-11-19 16:00', '2012-11-19 16:50'); Select Distinct UserID into #users from #Available Create Table #mins(UserID int,atime datetime,aset tinyint ) Declare @start Datetime Declare @end Datetime Select @start=min(AvailStart),@end=max(AvailEnd) from #Available While @start<@end begin insert into #mins(UserID,atime) Select UserID ,@Start from #users Select @start=DateAdd(mi,1,@start) end update #mins set aset=1 from #Available where atime>=AvailStart and atime<Availend and #mins.UserID = #Available.UserID select UserID,SUM(aset) as [Minutes] from #mins Group by UserID Drop table #Available Drop table #mins Drop table #users