How to Select Every Row Where Column Value is NOT Distinct

I need to run a select statement that returns all rows where the value of a column is not distinct (e.g. EmailAddress).

For example, if the table looks like below:

  • Importing a CSV file using BULK INSERT command into SQL Server table
  • Sargable queries using ISNULL in TSQL
  • Can I return a byte array from a SQL Server VarBinary column using a parameterized query?
  • What's the maximum number of rows a single SQL Server table can store?
  • How to load files according to Created Date in Windows command shell via SQL Server's xp_cmdshell
  • Can I use one .mdf file with multiple DBMSes (SQL Server 2008 and 2012)?
  • CustomerName     EmailAddress
    Aaron            aaron@gmail.com
    Christy          aaron@gmail.com
    Jason            jason@gmail.com
    Eric             eric@gmail.com
    John             aaron@gmail.com
    

    I need the query to return:

    Aaron            aaron@gmail.com
    Christy          aaron@gmail.com
    John             aaron@gmail.com (peer edited from john@gmail.com)
    

    I have read many posts and tried different queries to no avail. The query that I believe should work is below. Can someone suggest an alternative or tell me what may be wrong with my query?

    select EmailAddress, CustomerName from Customers
    group by EmailAddress, CustomerName
    having COUNT(distinct(EmailAddress)) > 1
    

    7 Solutions collect form web for “How to Select Every Row Where Column Value is NOT Distinct”

    This is significantly faster than the EXISTS way:

    SELECT [EmailAddress], [CustomerName] FROM [Customers] WHERE [EmailAddress] IN
      (SELECT [EmailAddress] FROM [Customers] GROUP BY [EmailAddress] HAVING COUNT(*) > 1)
    

    The thing that is incorrect with your query is that you are grouping by email and name, that forms a group of each unique set of email and name combined together and hence

    aaron and aaron@gmail.com
    christy and aaron@gmail.com
    john and aaron@gmail.com
    

    are treated as 3 different groups rather all belonging to 1 single group.

    Please use the query as given below :

    select emailaddress,customername from customers where emailaddress in
    (select emailaddress from customers group by emailaddress having count(*) > 1)
    

    How about

    SELECT EmailAddress, CustomerName FROM Customers a
    WHERE Exists ( SELECT emailAddress FROM customers c WHERE a.customerName != c.customerName AND a.EmailAddress = c.EmailAddress)
    
    select CustomerName,count(1) from Customers group by CustomerName having count(1) > 1
    

    Just for fun, here’s another way:

    ;with counts as (
        select CustomerName, EmailAddress,
          count(*) over (partition by EmailAddress) as num
        from Customers
    )
    select CustomerName, EmailAddress
    from counts
    where num > 1
    

    Rather than using sub queries in where condition which will increase the query time where records are in huge.

    I would suggest to use Inner Join as better option to this problem.

    Considering same table this could give the result

    SELECT EmailAddress, CustomerName FROM Customers as a 
    Inner Join Customers as b on a.CustomerName <> b.CustomerName and a.EmailAddress = b.EmailAddress
    

    For still better results I would suggest you to use CustomerID or any unique field of your table. Duplication of CustomerName is possible.

    Well there is a slight change to find the non distinct rows..

    SELECT EmailAddress, CustomerName FROM Customers WHERE EmailAddress NOT IN
    (SELECT EmailAddress FROM Customers GROUP BY EmailAddress HAVING COUNT(*) > 1)

    MS SQL Server is a Microsoft SQL Database product, include sql server standard, sql server management studio, sql server express and so on.