Most recent entry from two tables

I have a SQL Server 2000 database with an old table and a new table that together hold over 20,000,000 records. The two tables have exactly the same structure, but were split due to performance issues. I am not the DB admin; I just need to get data out of it and have been given DBReader rights on it.

OldTable:
ClientID,
AppID,
ModTime,
Event

NewTable:
ClientID,
AppID,
ModTime,
Event

    I need to retrieve the most recent record for each combination of ClientID, AppID and Event, from whichever table has the most recent entry for it. Does anyone have ideas about the best method for this? I have tried using a union, but the query takes over two hours to complete. I was thinking of using a join instead, but I’m not sure of the best approach.

    Thanks!

    5 solutions collected from the web for “Most recent entry from two tables”

    You will have to use a UNION, but if the two tables have no rows in common, consider using UNION ALL, which will be faster because it skips duplicate elimination.

    Also ensure that you have the correct indexes on the tables for this kind of query.

    Why not perform the query on each table, union the results, and repeat the query on the union?

    If you’re using a plain “UNION”, then that could be the cause of the slowness. UNION ensures that its output contains no duplicates, which generally requires sorting or hashing the entire dataset.

    UNION ALL, on the other hand, just returns all rows from both sides.
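As a small illustration of the difference, here is a sketch that uses literal values instead of the real tables (the column alias `n` is made up for the example):

```sql
-- UNION removes duplicate rows, so this returns a single row containing 1.
SELECT 1 AS n
UNION
SELECT 1;

-- UNION ALL keeps every row from both sides, so this returns two rows.
SELECT 1 AS n
UNION ALL
SELECT 1;
```

On 20 million rows, the duplicate-removal step in the first form is what forces the server to sort or hash everything before it can return a result.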

    If this is just a one-off job and you only have two tables, just run a ‘most-recent-entry’ query on each of the two tables separately. Then do a UNION ALL of the two result sets and use GROUP BY and MAX to keep only the most recent entry per group. In SQL:

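    -- table1 and table2 stand for the old and new tables.
    -- Each inner query reduces its table to one row per (ClientID, AppID, Event);
    -- the outer query then keeps the later of the two per-table maxima.
    SELECT ClientID, AppID, Event, MAX(MaxModTime) AS ModTime FROM (
        SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime FROM table1
        GROUP BY ClientID, AppID, Event
        UNION ALL
        SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime FROM table2
        GROUP BY ClientID, AppID, Event
    ) Q
    GROUP BY ClientID, AppID, Event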
    

    You can improve the speed of such a query by having a composite index on (ClientID, AppID, Event) on both tables or, if possible, a clustered index on (ClientID, AppID, Event, ModTime).
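A sketch of what those indexes might look like, assuming the tables are named OldTable and NewTable as in the question. The index names are made up for illustration, and creating an index needs more than read-only rights, so the DB admin would probably have to run these:

```sql
-- Hypothetical index names; composite nonclustered index on the grouping columns.
CREATE INDEX IX_OldTable_Client_App_Event
    ON OldTable (ClientID, AppID, Event);

CREATE INDEX IX_NewTable_Client_App_Event
    ON NewTable (ClientID, AppID, Event);

-- Alternative, only if the table has no clustered index yet
-- (a table can have at most one):
-- CREATE CLUSTERED INDEX CIX_OldTable_Latest
--     ON OldTable (ClientID, AppID, Event, ModTime);
```

Whether the clustered variant is an option depends on how the tables are already indexed, which the question does not say.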

    For performance, I suggest inserting ClientID, AppID, Event, and MAX(ModTime) from the old table into a temporary table, appending ClientID, AppID, Event, and MAX(ModTime) from the new table to the same temporary table, and then querying ClientID, AppID, Event, and MAX(ModTime) from the temporary table.
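The temporary-table approach could be sketched roughly as follows. The names `#Latest` and `MaxModTime` are made up for the example, the table names OldTable and NewTable come from the question, and Event is included in the grouping because the question asks for the latest record per client, application and event:

```sql
-- Step 1: per-group maxima from the old table go into a new temp table.
SELECT ClientID, AppID, Event, MAX(ModTime) AS MaxModTime
INTO #Latest
FROM OldTable
GROUP BY ClientID, AppID, Event;

-- Step 2: append the per-group maxima from the new table.
INSERT INTO #Latest (ClientID, AppID, Event, MaxModTime)
SELECT ClientID, AppID, Event, MAX(ModTime)
FROM NewTable
GROUP BY ClientID, AppID, Event;

-- Step 3: keep the later of the two maxima for each group.
SELECT ClientID, AppID, Event, MAX(MaxModTime) AS ModTime
FROM #Latest
GROUP BY ClientID, AppID, Event;

-- Clean up.
DROP TABLE #Latest;
```

Local temporary tables live in tempdb, so this usually works without write access to the source database, though that depends on how the server is configured.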
