Friday, December 2, 2011

SQL Server: Using EXCEPT and INTERSECT to compare tables

Previously, I wrote that UNION ALL (combined with a GROUP BY) is a really quick and easy way to compare two tables. You don't need to worry about NULLs, the code is fairly short and easy to follow, and you can view exceptions from both tables at the same time.

Well, as of SQL Server 2005, we have another option: using EXCEPT and INTERSECT. And these are even easier!

To return all rows in table1 that do not exactly match a row in table2, we can just use EXCEPT like this:

select * from table1 except select * from table2

To return all rows in table2 that do not exactly match a row in table1, we reverse the EXCEPT:

select * from table2 except select * from table1

And to return all rows in table1 that match exactly what is in table2, we can use INTERSECT:

select * from table1 intersect select * from table2
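
As a rough illustration of the three queries above, here is a minimal sketch that uses two table variables with made-up data (@table1 and @table2 and their contents are hypothetical stand-ins, not real tables). It also shows that EXCEPT and INTERSECT treat two NULLs as equal when comparing rows, which an ordinary equality join does not:

```sql
-- Hypothetical sample data; @table1 and @table2 stand in for table1 and table2.
declare @table1 table (id int, name varchar(20))
declare @table2 table (id int, name varchar(20))

insert into @table1 select 1, 'Ann' union all select 2, null union all select 3, 'Cy'
insert into @table2 select 1, 'Ann' union all select 2, null union all select 3, 'Cyd'

-- Rows in @table1 with no exact match in @table2: only (3, 'Cy').
-- (2, NULL) is not returned, because EXCEPT treats the two NULLs as equal.
select * from @table1 except select * from @table2

-- Rows present in both: (1, 'Ann') and (2, NULL).
select * from @table1 intersect select * from @table2

-- For contrast, an equality join returns only (1, 'Ann'),
-- since NULL = NULL does not evaluate to true.
select a.* from @table1 a join @table2 b on a.id = b.id and a.name = b.name
```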

In all of the above examples, the columns must match between the two tables, of course: both sides need the same number of columns, in the same order, with compatible data types.

We can also return a listing of all rows from either table that do not match completely by using UNION ALL to combine the results of both EXCEPT queries:

select 'table1' as tblName, * from
(select * from table1
except
select * from table2) x

union all

select 'table2' as tblName, * from
(select * from table2
except
select * from table1) x

And we can now write a very simple stored procedure that compares any two tables (assuming the schemas match, of course) like this:

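create procedure CompareTables @table1 varchar(100), @table2 varchar(100)
as
-- Fail early with a clear message if either name does not resolve to an object.
if object_id(@table1) is null or object_id(@table2) is null
begin
    raiserror('Both parameters must name an existing table or view.', 16, 1)
    return
end

-- The names are concatenated straight into the statement, so call this
-- only with trusted table names, never with raw user input.
declare @sql varchar(8000)
set @sql = 'select ''' + @table1 + ''' as tblName, * from
(select * from ' + @table1 + '
except
select * from ' + @table2 + ') x
union all
select ''' + @table2 + ''' as tblName, * from
(select * from ' + @table2 + '
except
select * from ' + @table1 + ') x'

exec(@sql)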
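
Calling the procedure might look something like the sketch below. The two PriceList tables and their rows are hypothetical, invented only for illustration, and the script assumes CompareTables has already been created in the current database:

```sql
-- Hypothetical tables for illustration; assumes CompareTables already exists.
create table dbo.PriceList_Old (sku int primary key, price money)
create table dbo.PriceList_New (sku int primary key, price money)

insert into dbo.PriceList_Old select 100, 9.99 union all select 101, 4.50
insert into dbo.PriceList_New select 100, 9.99 union all select 101, 4.75 union all select 102, 1.25

-- Expect one row labelled with the first table (sku 101 at 4.50) and
-- two rows labelled with the second (sku 101 at 4.75, sku 102 at 1.25).
exec CompareTables 'dbo.PriceList_Old', 'dbo.PriceList_New'

-- Clean up the sample tables.
drop table dbo.PriceList_Old
drop table dbo.PriceList_New
```

The tblName column tells you which side each unmatched row came from, so a changed row shows up twice: once with its old values and once with its new ones.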


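Of course, both tables should have primary keys in place. EXCEPT and INTERSECT return distinct rows, so if a table contains duplicate rows, a difference in the number of copies will not show up in the results, and it makes little logical sense to ask which rows match.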
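
To see why duplicates are a problem, consider this small sketch with two keyless table variables (@a and @b are hypothetical names with made-up values). The extra copy of a row goes unreported:

```sql
-- Hypothetical keyless tables; @a holds the value 1 twice, @b holds it once.
declare @a table (val int)
declare @b table (val int)

insert into @a select 1 union all select 1 union all select 2
insert into @b select 1 union all select 2

-- Returns no rows: the extra copy of 1 in @a goes unreported,
-- because EXCEPT compares distinct rows.
select * from @a except select * from @b

-- Returns 1 and 2, each once, even though 1 appears twice in @a.
select * from @a intersect select * from @b
```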

So, EXCEPT and INTERSECT are pretty handy. Does anyone else have suggestions for where these operators can make things shorter, quicker or more efficient compared to older (pre-SQL Server 2005) methods?
