[kwlug-disc] Image Comparison

Chris Irwin chris at chrisirwin.ca
Tue Aug 11 20:32:52 EDT 2009


Does anybody know of a way to mass-compare JPEG files by image
content rather than by file sum or name? I've got two sets of several
thousand images I need to sort through. Basically, I want to find the
unique files in directories A and B and delete the duplicates from B.

The issues are that there ARE duplicated file names within each
set ("mom.jpg", for example), so I can't do a simple `find | uniq` combo.
File sums also differ because tags are stored in the EXIF data, so
something like fslint won't work: it just compares sums computed over
the whole file.
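
One idea that might work, sketched below in Python: checksum each JPEG
while skipping its metadata segments, so that two files differing only
in EXIF tags get the same sum. This assumes the tagging tool rewrites
only the metadata and leaves the compressed image data untouched, which
I have not verified for f-spot. The function names (jpeg_content_digest,
duplicates_in_b, and the helpers) are made up for illustration, and a
malformed file simply raises ValueError.

```python
import hashlib
import os
import struct


def jpeg_content_digest(data):
    """Return a SHA-1 hex digest of JPEG bytes, ignoring APPn and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    digest = hashlib.sha1()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("malformed JPEG: no marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before a marker; step over it.
            pos += 1
            continue
        if marker == 0xDA:
            # Start of scan: hash the compressed image data through end of file.
            digest.update(data[pos:])
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError("malformed JPEG: bad segment length")
        # APP0-APP15 (0xE0-0xEF) hold EXIF/XMP/JFIF metadata; 0xFE is a comment.
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:pos + 2 + length])
        pos += 2 + length
    return digest.hexdigest()


def jpeg_paths(root):
    """Yield paths of .jpg/.jpeg files under root (case-insensitive match)."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith((".jpg", ".jpeg")):
                yield os.path.join(dirpath, name)


def file_digest(path):
    """Read one file and return its metadata-independent digest."""
    with open(path, "rb") as fh:
        return jpeg_content_digest(fh.read())


def duplicates_in_b(dir_a, dir_b):
    """List files under dir_b whose image data also appears under dir_a."""
    seen = {file_digest(p) for p in jpeg_paths(dir_a)}
    return [p for p in jpeg_paths(dir_b) if file_digest(p) in seen]
```

Calling duplicates_in_b("A", "B") only lists candidates from B; nothing
is deleted, so the list can be reviewed first. Files that were
re-encoded or resized would not match with this approach.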

The whole story (if you want more information) is that I recently
imported about 6000 of my uncle's photos into my photo library
(I'm using f-spot). During import, f-spot copies photos into a dated
folder hierarchy:

	Pictures/YYYY/MM/DD/original_file_name

Before I got rid of the files on my USB key, I decided to check the
total counts. The tag I created for these photos contains 5112 photos,
but there were 6529 files originally, a difference of 1417.

f-spot DOES do duplicate detection, so it is possible that I had some
photos already and it skipped importing them. But I only had 2000 photos
to begin with and the overlap would be much less than 1400 photos.

There are a manageable number of non-JPEG files that probably didn't
import (PCX, etc.) that I can sort through manually. I just don't want to
manually compare ~6300 JPEGs.

-- 
Chris Irwin <chris at chrisirwin.ca>