So your application has been humming along live for a couple of months and suddenly you find that one of your database tables contains duplicate records. You review the table structure and find that it does not have a unique constraint defined on a field (or group of fields) that should be unique. So how can these duplicates be eliminated?
I've encountered many solutions for joining a table to itself to identify and delete groups of duplicates, but on large (1GB+) tables this will cause a full table scan (unless you run it in batches) and could be very difficult on a production server. Another option is process the deletions through a temporary table and then rename the tables. So after you've eliminated the duplicates you still need to add the unique constraint to the table; wouldn't it be better if you could handle everything in one shot?
MySQL includes an option for that in the ALTER TABLE syntax; ALTER IGNORE TABLE. The IGNORE option works a bit like 'INSERT IGNORE INTO table1 SELECT * FROM table2...' in that the first record remains in the table but subsequent records are silently dropped.
Here's a "fun" example of how this all works: