Most recent entry from two tables
I have a SQL Server 2000 DB with an old table and a new table that together hold over 20,000,000 records. The two tables have exactly the same structure, but were split due to performance issues. I am not the DB admin; I just need data out of it and have been given DBReader rights on it.
I need to retrieve the most recent record for each client, appid and event from whichever table has the most recent entry for it. Does anyone have any ideas about the best method for this? I have tried using a union, but the query takes over two hours to complete. I was thinking of using a join instead, but I’m not sure of the best approach.
5 solutions collected from the web for “Most recent entry from two tables”
You will have to use a UNION, but if the two tables contain no duplicate rows between them, consider using a UNION ALL, which will be faster.
Also ensure that you have the correct indexes on the tables for this kind of query.
Why not perform the query on each table, union the results, and repeat the query on the union?
If you’re using a plain “UNION”, then that could cause issues. UNION ensures that its output contains no duplicates, which generally requires sorting or hashing the entire dataset.
UNION ALL, on the other hand, just returns all rows from both sides.
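The difference is easy to see on a trivial input. This is only a minimal sketch using constant rows, not the tables from the question:

```sql
-- UNION removes duplicates, so the two identical rows collapse into one.
SELECT 1 AS n
UNION
SELECT 1;

-- UNION ALL skips the duplicate check and returns both rows.
SELECT 1 AS n
UNION ALL
SELECT 1;
```

On two single-row inputs the extra work is negligible; on roughly 20,000,000 rows the duplicate check is what forces the sort or hash over the whole dataset.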
If this is just a one-off job and you only have two tables, just run a ‘most-recent-entry’ query on the two tables separately. Then do a UNION ALL of the two result sets and use GROUP BY and MAX to leave only the most recent. In SQL:
SELECT ClientID, AppID, Event, MAX(MaxModTime) AS MaxModTime
FROM (
    -- latest entry per key in the first table
    SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime
    FROM table1
    GROUP BY ClientID, AppID, Event
    UNION ALL
    -- latest entry per key in the second table
    SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime
    FROM table2
    GROUP BY ClientID, AppID, Event
) Q
GROUP BY ClientID, AppID, Event
You can improve the speed of such a query by having a composite index on (ClientID, AppID, Event) on both tables or, where possible, a clustered index on (ClientID, AppID, Event, ModTime).
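As a rough sketch, the indexes could be created along these lines. The index names are made up for illustration, and table1 and table2 are the placeholder table names from the query above. Creating indexes needs more than read-only rights, so with DBReader access this would have to be requested from the DB admin.

```sql
-- Composite index on the grouping columns (hypothetical index names).
CREATE INDEX IX_table1_Client_App_Event
    ON table1 (ClientID, AppID, Event);

CREATE INDEX IX_table2_Client_App_Event
    ON table2 (ClientID, AppID, Event);

-- Alternative, only if the table has no clustered index yet
-- (a table can have just one): cluster on the grouping columns plus ModTime.
-- CREATE CLUSTERED INDEX CIX_table1_Latest
--     ON table1 (ClientID, AppID, Event, ModTime);
```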
For performance, I suggest inserting ClientID, AppID, Event, and MAX(ModTime) from the old table into a temporary table, appending ClientID, AppID, Event, and MAX(ModTime) from the new table to the same temporary table, and then querying ClientID, AppID, Event, and MAX(ModTime) from the temporary table.
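A minimal sketch of the temporary-table approach might look like this. The names #Latest and MaxModTime are hypothetical, and table1 and table2 stand in for the old and new tables. Event is included in the grouping because the question asks for the latest entry per client, appid and event. SELECT ... INTO is used so that the column types are copied from the source table rather than guessed.

```sql
-- Step 1: latest entry per key from the old table, creating the temp table.
SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime
INTO #Latest
FROM table1
GROUP BY ClientID, AppID, Event;

-- Step 2: append the latest entry per key from the new table.
INSERT INTO #Latest (ClientID, AppID, Event, MaxModTime)
SELECT ClientID, AppID, Event, MAX(ModTime)
FROM table2
GROUP BY ClientID, AppID, Event;

-- Step 3: keep only the most recent of the (at most two) rows per key.
SELECT ClientID, AppID, Event, MAX(MaxModTime) AS MaxModTime
FROM #Latest
GROUP BY ClientID, AppID, Event;

-- Clean up when finished.
DROP TABLE #Latest;
```

Local temporary tables live in tempdb, so this should normally work without write access to the source database, though that depends on how the server is configured.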