
Thread: Question About Diff

  1. #1
    Join Date
    Sep 2008
    Beans
    274

    Question About Diff

    Ey guys, I am having trouble understanding how diff of directories works. I have a directory:
    Code:
    $ ls -aR
    .				.FreezeFileSettings.txt		folder2
    ..				.FreezeFileSettings_fromBin.txt
    .DS_Store			folder1
    
    ./folder1:
    .	..	happy	subdir1
    
    ./folder1/subdir1:
    .		..		hallel.txt
    
    ./folder2:
    .	..	sad	subdir2
    
    ./folder2/subdir2:
    .		..		bombakz		hallel.txt	subsub2
    
    ./folder2/subdir2/subsub2:
    .	..	touched
    Here are the contents of each file:
    Code:
    $ head -3 folder1/happy 
    $ head -3 folder2/sad 
    $ head -3 folder1/subdir1/hallel.txt 
    jewish
    $ head -3 folder2/subdir2/bombakz 
    iiiiii
    $ head -3 folder2/subdir2/hallel.txt 
    wombatz
    jewish
    $ head -3 folder2/subdir2/subsub2/touched 
    hummers
    
    hithere
    This is what I get when I diff from terminal:
    Code:
    diff folder1 folder2
    Only in folder1: happy
    Only in folder2: sad
    Only in folder1: subdir1
    Only in folder2: subdir2
    ...and when I do it from python:
    Code:
    >>> import filecmp, os
    >>> cmp = filecmp.dircmp(os.getcwd()+'/folder1',os.getcwd()+'/folder2/')
    >>>
    >>>
    >>> cmp.report()
    diff /.../testarea/folder2/
    Only in /.../testarea/folder1 : ['happy', 'subdir1']
    Only in /.../testarea/folder2/ : ['sad', 'subdir2']
    >>>
    >>>
    >>>
    >>> cmp.report_full_closure()
    diff /.../testarea/folder2/
    Only in /.../testarea/folder1 : ['happy', 'subdir1']
    Only in /.../testarea/folder2/ : ['sad', 'subdir2']
    >>>
    >>>
    >>> cmp.right_only
    ['sad', 'subdir2']
    >>>
    >>>
    >>> cmp.left_only
    ['happy', 'subdir1']
    >>>
    >>>
    >>> cmp.diff_files
    []
    My question is, why isn't diff going all the way down the directories? Thanks.

    Edit: Oh, and what would you recommend for checking whether two directories are identical? I want the function to recurse all the way down each directory in case there are changes in the very last subdirectory. What is the most foolproof method for this? Thanks again.
    Last edited by StunnerAlpha; October 29th, 2009 at 10:48 AM.

  2. #2
    Join Date
    May 2006
    Beans
    1,787

    Re: Question About Diff

    Quote Originally Posted by StunnerAlpha View Post
    My question is, why isn't diff going all the way down the directories? Thanks.
    Does "diff -r" do what you want?

  3. #3
    Join Date
    Sep 2008
    Beans
    274

    Re: Question About Diff

    Quote Originally Posted by Arndt View Post
    Does "diff -r" do what you want?
    Nope, I just get this:
    Code:
    $ diff -r folder1 folder2
    Only in folder1: happy
    Only in folder2: sad
    Only in folder1: subdir1
    Only in folder2: subdir2
    Thanks for the response.

  4. #4
    Join Date
    May 2006
    Beans
    1,787

    Re: Question About Diff

    Quote Originally Posted by StunnerAlpha View Post
    Nope, I just get this:
    Code:
    $ diff -r folder1 folder2
    Only in folder1: happy
    Only in folder2: sad
    Only in folder1: subdir1
    Only in folder2: subdir2
    Thanks for the response.
    Do you want the comparison to go down and compare the contents of folder1/subdir1 and folder2/subdir2? If they were named folder1/subdir and folder2/subdir, it would. As it is, their names are different, so the comparison stops there.

    Do you know that each directory contains at most one subdirectory? Then the operation is well defined, but I don't think 'diff' can do it.
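To illustrate the point above: both `diff -r` and `filecmp.dircmp` pair entries by name, so they only descend into subdirectories that exist under the same name on both sides. Here is a minimal sketch of a recursive "are these trees identical?" check built on that behaviour. The function name `trees_identical` is made up for this example, and it assumes that "identical" means same names and same file contents at every level.

```python
import filecmp
import os


def trees_identical(left, right):
    """Return True if both trees hold the same names at every level and
    every same-named file has the same contents.

    Note: dircmp skips a small default ignore list
    (filecmp.DEFAULT_IGNORES), so those entries are not compared.
    """
    cmp = filecmp.dircmp(left, right)

    # Names found on one side only, or names whose type differs between
    # the two sides (file vs. directory), mean the trees differ.
    if cmp.left_only or cmp.right_only or cmp.common_funny:
        return False

    # shallow=False compares file contents rather than trusting os.stat().
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, cmp.common_files, shallow=False
    )
    if mismatch or errors:
        return False

    # Recurse only into directories that share a name on both sides.
    return all(
        trees_identical(os.path.join(left, name), os.path.join(right, name))
        for name in cmp.common_dirs
    )
```

With the layout from the first post, `trees_identical("folder1", "folder2")` would return False at the top level, because `happy`/`sad` and `subdir1`/`subdir2` are one-sided names and nothing below them is ever paired up.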

  5. #5
    Join Date
    Sep 2008
    Beans
    274

    Re: Question About Diff

    Quote Originally Posted by Arndt View Post
    Do you want the comparison to go down and compare the contents of folder1/subdir1 and folder2/subdir2? If they were named folder1/subdir and folder2/subdir, it would. As it is, their names are different, so the comparison stops there.

    Do you know that each directory contains at most one subdirectory? Then the operation is well defined, but I don't think 'diff' can do it.
    Ey man, to answer your question: no, I don't know that each directory will contain at most one subdir. I really need something that will recursively compare two directories as well as the files within them (I want folder1/ compared with folder2/, then folder1/subdir1 compared with folder2/subdir2, and so on...). If diff cannot do it, is there any other utility that can help me out? Is there anything in the Python libs that can help me here? Thanks.

  6. #6
    Join Date
    May 2006
    Beans
    1,787

    Re: Question About Diff

    Quote Originally Posted by StunnerAlpha View Post
    Ey man, to answer your question: no, I don't know that each directory will contain at most one subdir. I really need something that will recursively compare two directories as well as the files within them (I want folder1/ compared with folder2/, then folder1/subdir1 compared with folder2/subdir2, and so on...). If diff cannot do it, is there any other utility that can help me out? Is there anything in the Python libs that can help me here? Thanks.
    Then I think the task isn't well-defined yet. If you have these directories:

    folder1/subdir1
    folder1/subdir3
    folder1/morefiles

    folder2/subdir2
    folder2/subdir3

    then which directories do you want compared to which?

  7. #7
    Join Date
    Sep 2008
    Beans
    274

    Re: Question About Diff

    Quote Originally Posted by Arndt View Post
    Then I think the task isn't well-defined yet. If you have these directories:

    folder1/subdir1
    folder1/subdir3
    folder1/morefiles

    folder2/subdir2
    folder2/subdir3

    then which directories do you want compared to which?
    All I want is a boolean value based on whether the two directories are identical or not. In the example you gave me, folder1/subdir1 would be compared with both folder2/subdir2 and folder2/subdir3 to see if either of those folders is the same as folder1/subdir1, and then folder1/subdir3 would be compared with both subdirs of folder2 again. But once the program comes to folder1/morefiles it should return False, since "morefiles" (whether it be a dir or files) is not present in both directories in the same location.

    I don't want comparisons between the contents of files. I only want to compare the structure of the directories, meaning folders and contents of folders (filenames don't have to match, but there needs to be the same number of files in one directory as in the other).

    Hmm... this is becoming a little too far-fetched eh? I had best make an algorithm to do this myself, right?

  8. #8
    Join Date
    May 2006
    Beans
    1,787

    Re: Question About Diff

    Quote Originally Posted by StunnerAlpha View Post
    All I want is a boolean value based on whether the two directories are identical or not. In the example you gave me, folder1/subdir1 would be compared with both folder2/subdir2 and folder2/subdir3 to see if either of those folders is the same as folder1/subdir1, and then folder1/subdir3 would be compared with both subdirs of folder2 again. But once the program comes to folder1/morefiles it should return False, since "morefiles" (whether it be a dir or files) is not present in both directories in the same location.

    I don't want comparisons between the contents of files. I only want to compare the structure of the directories, meaning folders and contents of folders (filenames don't have to match, but there needs to be the same number of files in one directory as in the other).

    Hmm... this is becoming a little too far-fetched eh? I had best make an algorithm to do this myself, right?
    So the comparison of folder1 and folder2 should return False, since they have different numbers of subdirectories?

    Then the task seems to be to check whether there exists some way of renaming all node names in one tree T1 so that it becomes identical to another tree T2. There may be code written for that already, in some tree and graph manipulation library. A naive implementation could become exponential. But maybe the problem is NP-complete, I don't know.

  9. #9
    Join Date
    May 2006
    Beans
    1,787

    Re: Question About Diff

    Quote Originally Posted by Arndt View Post
    So the comparison of folder1 and folder2 should return False, since they have different numbers of subdirectories?

    Then the task seems to be to check whether there exists some way of renaming all node names in one tree T1 so that it becomes identical to another tree T2. There may be code written for that already, in some tree and graph manipulation library. A naive implementation could become exponential. But maybe the problem is NP-complete, I don't know.
    Interesting algorithm. The complexity is apparently n*log(n), so there is no need to worry. Google for "tree isomorphism" (assuming I haven't misunderstood your requirements).
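For anyone searching on "tree isomorphism": one common approach for rooted trees is to give every node a canonical encoding built from the sorted encodings of its children, so two trees are isomorphic exactly when their root encodings match. The sketch below assumes a tree is written as a nested list (a node is the list of its children, a leaf is an empty list); the names `canonical` and `isomorphic` are invented for this example, and this plain string version is not tuned to reach the complexity mentioned above.

```python
def canonical(tree):
    """Return a parenthesised string that is identical for all trees of
    the same shape, regardless of the order in which children appear."""
    # Encode every child first, then sort so that child order is ignored.
    return "(" + "".join(sorted(canonical(child) for child in tree)) + ")"


def isomorphic(t1, t2):
    """True if the two rooted trees have the same shape."""
    return canonical(t1) == canonical(t2)


# Same shape, children listed in a different order.
t1 = [[], [[], []]]
t2 = [[[], []], []]
# Same number of nodes as t1, but a different shape.
t3 = [[[]], [[]]]

print(isomorphic(t1, t2))  # True
print(isomorphic(t1, t3))  # False
```

No node names appear anywhere in the encoding, which is why nothing has to be renamed to do the comparison.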

  10. #10
    Join Date
    Sep 2008
    Beans
    274

    Re: Question About Diff

    Quote Originally Posted by Arndt View Post
    So the comparison of folder1 and folder2 should return False, since they have different numbers of subdirectories?

    Then the task seems to be to check whether there exists some way of renaming all node names in one tree T1 so that it becomes identical to another tree T2. There may be code written for that already, in some tree and graph manipulation library. A naive implementation could become exponential. But maybe the problem is NP-complete, I don't know.
    What do you mean by "NP-complete"? Or the task could be to just check whether each item is a file or a directory and compare the number of each with the other directory, rather than renaming and making a mess of things, because I want to preserve everything the way it is if possible. If I renamed things to do the comparison, I would have to rename each file back to the name it initially had. Oh, and thanks for checking the big-O for the algorithm. I am thinking of tackling this myself; if you think this is foolish, let me know. I don't think it should be too hard.
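A rough sketch of the counting idea described in this post, which reads the two directory trees without renaming or modifying anything. It assumes "same structure" means: the same number of non-directory entries in each directory, and subdirectories that can be paired up so that each pair has the same structure in turn. The names `shape` and `same_structure` are hypothetical, and hidden files such as `.DS_Store` are counted like any other file.

```python
import os


def shape(path):
    """Describe a directory as (file_count, sorted shapes of its subdirs).

    File and directory names are ignored on purpose; only counts and
    nesting are recorded.
    """
    n_files = 0
    sub_shapes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sub_shapes.append(shape(entry.path))
            else:
                n_files += 1
    # Sorting makes the result independent of listing order and of names.
    return (n_files, tuple(sorted(sub_shapes)))


def same_structure(a, b):
    """True if the two directory trees have the same shape."""
    return shape(a) == shape(b)
```

For the example layout in the first post (leaving aside the hidden files in the parent directory), `same_structure("folder1", "folder2")` would be False: `subdir1` holds one file, while `subdir2` holds two files plus a subdirectory.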
