What can replace a left join in a view so I can have an indexed view?

I have normalized tables in a database, and to denormalize them I created a view from two of the tables. When I tried to create a clustered index on the view, SQL Server wouldn’t let me, because the view was created with a left outer join. I used a left join because I want the null values to show up in the resulting view, much as was suggested in this earlier post.

Question on join where one column one side is null

The table structure and relationships are very similar to those described in the above link.

    I seem to have hit a wall here: I can’t convert my left join into an inner join, as that would exclude all records with null values in any of the joined columns. My questions are:

    1. Why is indexing not allowed on views with outer or self joins?
    2. Are there any performance hits with this kind of un-indexed view?
    3. Does anyone know of a workaround for this problem?

    I just finished a SQL Server course yesterday, so I don’t know how to proceed. I would appreciate any comments. Cheers.

5 solutions collected from the web for “What can replace a left join in a view so I can have an indexed view?”

    There is a “workaround” here that involves checking for NULL in the join and keeping a row in the table that represents the NULL value.

    The row that represents NULL

    INSERT INTO Father (Father_id, Father_name) VALUES (-255, 'No father')
    

    The join

    JOIN [dbo].[son] s on isnull(s.father_id, -255) = f.father_id
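
    Put together, the workaround might look like the sketch below. It is not taken from the linked article: the view name, the index name, and the son_id and son_name columns are assumptions made for illustration. Indexed views require SCHEMABINDING, two-part object names, and a unique clustered index as the first index.

    ```sql
    -- Hypothetical schema: dbo.Father(Father_id, Father_name) and
    -- dbo.son(son_id, son_name, father_id NULL). The -255 placeholder row must exist in Father.
    create view dbo.vSonFather
    with schemabinding
    as
    select s.son_id, s.son_name, f.Father_id, f.Father_name
    from dbo.Father f
    join dbo.son s on isnull(s.father_id, -255) = f.Father_id;
    go

    -- Each son matches exactly one Father row, so son_id is unique in the view.
    create unique clustered index IX_vSonFather on dbo.vSonFather (son_id);
    go

    -- When reading, map the placeholder back to NULL.
    select son_id,
           son_name,
           nullif(Father_id, -255) as Father_id,
           case when Father_id = -255 then null else Father_name end as Father_name
    from dbo.vSonFather with (noexpand);
    ```

    The noexpand hint asks the optimizer to read the view's own index rather than expanding the view into its base tables.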
    

    Here is an alternative. You want a materialized view of the A’s that have no match in B. That isn’t directly available, so instead materialize two views: one of all A’s and one of only the A’s with B’s. Then get the A’s without B’s by taking A except B. This can be done efficiently:

    Create two materialized views, mA and mAB (edit: mA could just be the base table).
    mA omits the join between A and B, so it contains every A, including those records WITHOUT matches in B.
    mAB joins A and B, so it contains only the A’s with B’s and excludes those records WITHOUT matches in B.

    To get all A’s without matches in B, mask out those that match:

    with ids as (
      select matchId from mA with (index (pk_matchid), noexpand)
      except
      select matchId from mAB with (index (pk_matchid), noexpand)
    )
    select * from mA a join ids b on a.matchId = b.matchId;
    

    This should yield a left anti semi join against both of your clustered indexes to get the ids, and a clustered index seek to get the data you are looking for out of mA.

    Essentially, what you are running into is the basic rule that SQL is much better at dealing with data that IS there than data that ISN’T. By materializing two sources, you gain some compelling set-based options. You have to weigh the cost of these views against those gains yourself.

    I don’t think there is a good workaround. What you can do about this is to create a real table from the view and set indexes on that. This can be done by a stored procedure that is called regularly when data is updated.

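    -- SELECT ... INTO fails if the target table already exists, so drop the old copy first
    if object_id('<REAL_TABLE>', 'U') is not null
        drop table <REAL_TABLE>

    select *
    into <REAL_TABLE>
    from <VIEW>
    
    create clustered index <INDEX_THE_FIELD> on <REAL_TABLE>(<THE_FIELD>)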
    

    But this approach is only worthwhile if the data isn’t updated every few seconds.

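    Logically you are making two separate queries. ‘A LEFT JOIN B’ is just shorthand for ‘(A JOIN B) UNION ALL (the rows of A with no match in B, padded with nulls)’.

    The first query is table A inner joined to table B. This gets an indexed view, since this is where all the heavy lifting is done.

    The second query is just table A where any of the join columns are null. This assumes every non-null join value has a match in B, for example through a foreign key; otherwise select the rows of A that have no match in B. Make a view that produces the same output columns as the first query and pads them with nulls.

    Just union the two results with UNION ALL before returning them, since the two sets cannot overlap. No need for a workaround.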
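
    A minimal sketch of this split, borrowing the Father/son tables from the first answer. The view names, the index name, and the son_id and son_name columns are assumptions for illustration. It also assumes father_id is a foreign key, so the only unmatched rows are those where it is null.

    ```sql
    -- Part 1: the inner join, materialized as an indexed view.
    create view dbo.vSonWithFather
    with schemabinding
    as
    select s.son_id, s.son_name, f.Father_id, f.Father_name
    from dbo.son s
    join dbo.Father f on f.Father_id = s.father_id;
    go

    create unique clustered index IX_vSonWithFather on dbo.vSonWithFather (son_id);
    go

    -- Part 2: an ordinary view that adds the unmatched rows, padded with nulls.
    create view dbo.vSonAll
    as
    select son_id, son_name, Father_id, Father_name
    from dbo.vSonWithFather with (noexpand)
    union all
    select s.son_id, s.son_name, null, null
    from dbo.son s
    where s.father_id is null;
    go
    ```

    Queries then read from dbo.vSonAll and get the same rows a left join from son to Father would return, while the join itself is served from the indexed view.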

    I’ll work on an answer to 1, but for now:

    [2]. The view will perform no better and no worse than the equivalent query on the underlying tables. All the usual advice applies about having covering indexes, preferably an index on the joined columns, etc.
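
    As a rough illustration of that advice, an index on the join column of the child table that also covers the selected columns might look like this. The table and column names are assumptions borrowed from the Father/son example in another answer, not from the question.

    ```sql
    -- Hypothetical: dbo.son(son_id, son_name, father_id) joined to dbo.Father on father_id.
    -- The key column supports the join; the included column lets the query avoid lookups.
    create nonclustered index IX_son_father_id
    on dbo.son (father_id)
    include (son_name);
    ```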

    [3]. There’s no real workaround. Most of the restrictions on indexed views exist for very good reasons, once you dig into them.

    I’d just create the view, generally, and do no more, unless there was a specific performance problem.

    I’ll try to add an answer for 1 once I’ve reconstructed it in my own mind.
