Finding duplicated data across one or more columns in a database table

A few months ago I posted a little query about finding duplicate rows in a database table. I’m revisiting this because I helped out Doogie with a similar query last night but with some complications.

Let’s start with the original simple scenario of checking duplicates in a single column.

Some example data, a Users table:

+----+----------------+
| Id | Email          |
+----+----------------+
|  1 | joe@bloggs.com |
|  2 | joe@bloggs.com |
|  3 | joe@bloggs.com |
|  4 | jane@doe.com   |
|  5 | jane@doe.com   |
|  6 | john@doe.com   |
+----+----------------+

You can see that joe@bloggs.com and jane@doe.com have been duplicated. This could have been prevented by putting a unique index on the Email column.

So to find what emails have duplicates in our table:

Continue reading Finding duplicated data across one or more columns in a database table

Finding duplicate rows in the database

NOTE: This post has been re-visited in Finding duplicated data across one or more columns in a database table

Information gleaned from Microsoft KB article: How to remove duplicate rows from a table in SQL Server

When you allow duplicated rows in a database that aren’t expected and shouldn’t be allowed, it’s a flag saying that you need . . . → Read More: Finding duplicate rows in the database