MongoDB: Find and remove duplicates without MapReduce

The following command will find all emails that exist more than once in a collection called users:

db.users.aggregate([
 {$group: {_id: "$email", count: {$sum: 1}}},
 {$match: {count: {$gt: 1} }} 
])

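The following command will create a unique index on email and delete the duplicates. Note that the dropDups option only works on MongoDB versions before 3.0. It was removed in 3.0, and later versions no longer honor it, so building the unique index fails as long as duplicates exist.

db.users.createIndex( {email: 1}, {unique: true, dropDups: true} )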
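
On servers where dropDups is not available, the duplicates have to be deleted by hand before the unique index can be built. The following is a minimal sketch of one way to do that with the same aggregation approach, not a definitive implementation. The helper name removeDuplicateEmails is hypothetical, and it assumes it is fine to keep whichever document the $group stage happens to list first for each email. Documents that have no email field at all are grouped together under null and would also be reduced to a single document, so filter those out first if that is not what you want.

```javascript
// Sketch: keep one document per duplicated email and delete the rest.
// `coll` is assumed to be a collection handle such as db.users in the mongo shell.
function removeDuplicateEmails(coll) {
  let removed = 0;
  coll.aggregate([
    // Collect the _id of every document that shares the same email.
    { $group: { _id: "$email", ids: { $push: "$_id" }, count: { $sum: 1 } } },
    // Only keep the groups that contain duplicates.
    { $match: { count: { $gt: 1 } } }
  ]).forEach(function (group) {
    // Keep the first _id in the group and delete all the others.
    const extraIds = group.ids.slice(1);
    const result = coll.deleteMany({ _id: { $in: extraIds } });
    removed += result.deletedCount;
  });
  return removed;
}

// Usage in the shell (assumes the users collection from above):
// removeDuplicateEmails(db.users);
// db.users.createIndex({ email: 1 }, { unique: true });
```

Once the helper has run, the plain unique index without dropDups should build, provided no new duplicates were inserted in the meantime.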