Near-Duplicate Image Detection [closed]

There has been a lot of research on image search and similarity measures, and it’s not an easy problem. In general, a single int won’t be enough to determine whether two images are very similar: too many unrelated images will map to the same value, so you’ll have a high false-positive rate.

Since so much research has been done, though, it’s worth looking at some of it. For example, this paper (PDF) gives a compact image fingerprinting algorithm that is suitable for finding duplicate images quickly and without storing much data. It seems like the right approach if you want something robust.

If you’re looking for something simpler, but definitely more ad-hoc, this SO question has a few decent ideas.
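
To give a feel for what a simple, ad-hoc approach can look like, here is a minimal sketch of an "average hash". This is a swapped-in illustration, not the algorithm from the paper above and not necessarily anything the linked question proposes. It assumes the image is already decoded into a 2D grid of grayscale values; the names `shrink`, `average_hash`, `hamming`, and `near_duplicate`, the 8x8 grid, and the threshold of 5 bits are all hypothetical choices for this example.

```python
from typing import List, Sequence, Tuple

# Assumption: an image is a row-major grid of grayscale values.
Image = Sequence[Sequence[float]]


def shrink(img: Image, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    h, w = len(img), len(img[0])
    if h < size or w < size:
        raise ValueError("image is smaller than the target grid")
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Image, size: int = 8) -> Tuple[int, ...]:
    """Return size*size bits: 1 where a cell is brighter than the mean cell."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    return tuple(1 if v > mean else 0 for v in cells)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def near_duplicate(img_a: Image, img_b: Image, max_distance: int = 5) -> bool:
    """Treat images as near-duplicates if few fingerprint bits differ.

    max_distance is an arbitrary illustrative threshold; tune it on real data.
    """
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance


if __name__ == "__main__":
    base = [[x * 16 for x in range(16)] for _ in range(16)]  # horizontal gradient
    brighter = [[v + 10 for v in row] for row in base]       # same image, brighter
    inverted = [[255 - v for v in row] for row in base]      # very different image
    print(near_duplicate(base, brighter), near_duplicate(base, inverted))
```

Note that the fingerprints are compared by Hamming distance against a threshold rather than by exact equality, which is what lets small changes such as a uniform brightness shift still count as a match. Expect this kind of hash to be much less robust than a properly researched fingerprinting algorithm, for example against cropping or rotation.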
