
Thread: Help find duplicate files


  1. #1
    Join Date
    Oct 2005
    Location
    Al Ain
    Beans
    8,942

    Re: Help find duplicate files

    Ugh... I can feel your pain.

    You should read up on the 'find' utility. Between find, md5sum and sort, you should be able to get a list of filenames sorted by checksum. Then you can look for duplicates in that list.

    Find has the ability to recurse into a directory tree and call a utility or a script on each file in the tree.

    Something like this:
    Code:
find /photos -type f -name '*.jp*' -exec md5sum {} \; | sort > filelist
    I haven't tried the above, but it may(!) do something like what you need - at least it should give you an idea. Copy some files to a different spot and run your experiments there, before letting it go on the real data.
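Following that advice, the pipeline can be tried on a scratch directory first. The sketch below is an untested variant under the assumption that GNU md5sum and uniq are available: `uniq -w32 --all-repeated=separate` compares only the 32-character checksum and prints every member of each group of files that hash the same, so you get just the duplicates rather than the whole sorted list.

```shell
# Hedged sketch: build a throwaway directory, hash every .jp* file,
# sort by checksum, and keep only groups whose checksum repeats.
dir=$(mktemp -d)
printf 'same bytes'  > "$dir/a.jpg"
printf 'same bytes'  > "$dir/b.jpg"
printf 'other bytes' > "$dir/c.jpg"

# -w32 compares only the md5 field; --all-repeated=separate prints
# each duplicate group in full, blank-line separated.
dups=$(find "$dir" -type f -name '*.jp*' -exec md5sum {} \; \
         | sort \
         | uniq -w32 --all-repeated=separate)
echo "$dups"
rm -rf "$dir"
```

Here a.jpg and b.jpg should come out as a duplicate pair while c.jpg is dropped.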

    Cheers,

    Herman

  2. #2
    Join Date
    Sep 2006
    Beans
    2,914

    Re: Help find duplicate files

    Code:
md5sum *.* | sort -u | awk 'x[$1]++'
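For anyone puzzling over the awk part: `x[$1]++` is 0 (false) the first time a checksum appears and positive (true) on every later occurrence, so the first copy of each file is skipped and only the later duplicates are printed. A minimal demonstration on fake checksum input:

```shell
# x[$1]++ evaluates to 0 (false) on the first sighting of a checksum
# and to a positive number (true) afterwards, so only repeats print.
printf 'aaa file1\naaa file2\nbbb file3\n' | awk 'x[$1]++'
# prints: aaa file2
```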

  3. #3
    Join Date
    Jan 2008
    Beans
    4,757

    Re: Help find duplicate files

As much as I admire the elegance of it, what about the other files (that is, the remaining duplicates in each set)?
    And hidden ones?

    Shouldn't that be mentioned too?

    There is an app called fdupes which can do this for you.

    Code:
    fdupes $PWD
    [EDIT]
It also has a command-line option to ignore empty files
    Code:
    fdupes -n $PWD
    Regards
    Iain
    Last edited by ibuclaw; August 5th, 2008 at 08:01 PM.

  4. #4
    Join Date
    May 2008
    Location
    Salem, WV, US
    Beans
    18
    Distro
    Kubuntu 9.10 Karmic Koala

    Re: Help find duplicate files

    Quote Originally Posted by tinivole View Post
    As much as I admire the elegance of it, what about the other file/s (as in the other duplicates)?
    And hidden ones?

    Shouldn't that be mentioned too?

    There is an app called fdupes which can do this for you.

    Code:
    fdupes $PWD
    [EDIT]
    Just tried out the Intrepid version, and it has a command argument to ignore empty files
    Code:
    fdupes -n $PWD
    Regards
    Iain
    Thanks, this saved me a lot of time. This worked just about perfectly:
    Code:
    fdupes -r -d /home/digitalhead/Photos
That probably turned an all-day job into a ten-minute one, counting the scan and my picking which copies to keep. It's just too bad there isn't a syntax to automatically select which ones to preserve, such as [capletter][capletter]######.[capext] or [capletter]#######.[capext], or to automatically delete files with "-1" between the filename and extension, or lower-case filenames. Oh well, the job was done in minutes rather than hours.
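The pattern-based selection wished for above isn't an fdupes feature, but one workaround is to take fdupes' plain one-path-per-line listing, grep out the unwanted variants (here, names with "-1" before the extension), and only feed them to rm after a dry run. In this sketch the fdupes output is simulated with printf, and the paths and the "-1" rule are purely illustrative:

```shell
# Hypothetical sketch (not an fdupes option): from simulated
# `fdupes -r` output (one path per line, groups separated by blank
# lines), keep only names carrying "-1" before the extension.
sel=$(printf '%s\n' \
        '/p/IMG_0001.JPG' '/p/IMG_0001-1.JPG' '' \
        '/p/IMG_0002.JPG' '/p/img_0002-1.jpg' \
      | grep -- '-1\.[A-Za-z]*$')

# Dry run: print the rm commands instead of running them. Drop the
# "echo" only after inspecting the list.
echo "$sel" | while IFS= read -r f; do echo rm -- "$f"; done
```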

    Now, what's the easiest way to update the f-spot database to remove the old files? Just delete ~/.gnome2/f-spot/photos.db and re-import without copying to ~/Photos, or is there an easier (quicker) way? It takes a long time to do the initial import of all these pictures.
