
BLU Discuss list archive



Re: Help with finding duplicate photos



 
Tom Haskins-Vaughan writes: 

> Hi, here's the scenario. I have a directory, /home/photos and in that 
> folder are lots and lots of photos in many different subfolders. Now, 
> since I'm paying by the gig to back these up remotely I'd like to make 
> sure that there are no duplicates. If I can get a list of duplicate 
> filenames, I'm happy to go through and check that they're actually 
> duplicates manually, but I'm not too good on the command line. 
> 
> Any help would be greatly appreciated. 

This comes to mind: 

perl -MFile::Find -e '
  # First, walk every directory given on the command line and
  # build a hash keyed on the bare filename; each value is a
  # list of all the full paths that share that name.
  find(sub {
         # $_ is the bare filename
         # $File::Find::name is the full path (absolute only if
         #   the starting directory was given as absolute)
         # %allfiles is the hash
         return unless -f $_;  # skip directories, including "."
         push(@{$allfiles{$_}}, $File::Find::name);
       },
       @ARGV);  # the directories listed after the script

   # Now print every group of paths whose filename appears
   # more than once, with a blank line between groups.
   for my $name (sort keys %allfiles) {
     my @paths = @{$allfiles{$name}};
     next unless @paths > 1;
     print "$_\n" for @paths;
     print "\n";
   }' /home/photos /home/some-other-photo-dir

You can probably just paste this into your shell and it will Just 
Work. 

I did type this on the command line.  Ha ha.  Only serious. 
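
If matching on filename alone turns out to be too coarse (cameras love to
reuse names), a variant along the same lines could group files by content
instead. This is only a sketch, not something tested against a real photo
collection: the function name dupes_by_content is made up for illustration,
and it assumes the Digest::MD5 module that ships with Perl. A matching digest
is a strong hint that two files are identical, but a manual look before
deleting anything still seems wise.

```shell
# Hypothetical helper: group files by MD5 digest of their content
# rather than by filename. Pass one or more directories as arguments.
dupes_by_content() {
  perl -MFile::Find -MDigest::MD5 -e '
    find(sub {
           return unless -f $_;              # regular files only
           open(my $fh, "<", $_) or return;  # skip unreadable files
           binmode($fh);
           my $sum = Digest::MD5->new->addfile($fh)->hexdigest;
           close($fh);
           # %bysum maps a digest to every path with that content
           push(@{$bysum{$sum}}, $File::Find::name);
         },
         @ARGV);

    # print each group of identical files, blank line between groups
    for my $sum (sort keys %bysum) {
      my @paths = sort @{$bysum{$sum}};
      next unless @paths > 1;
      print "$_\n" for @paths;
      print "\n";
    }' "$@"
}
```

It would be called the same way as the one-liner above, for example
dupes_by_content /home/photos /home/some-other-photo-dir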

Hope this helps, 

--kevin 
-- 
GnuPG ID: B280F24E             Don't you know there ain't no devil, 
alumni.unh.edu!kdc             there's just God when he's drunk? 
                                 -- Tom Waits 


 

