Nope, you missed my update after more testing:
The 37G is the "Total Usage" size and 947M the "New Usage" size (not old and new).
I wonder if, although all hard-links for a file are considered equal...
...that some hard-links are considered more equal than others
It could be as simple as deciding that the top hard-link in the table is considered the original.
That is: in normal usage, the totals reported by du are for files which are the top hard link in the table of hard links pointing to a file.
If a file is hard-linked but is not the top link in the table, it is not counted.
If you delete the top hard link, the links all move up one; what was the second link becomes the top hard link and is counted by du.
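One way to poke at this idea is to create a file with two hard links, delete the "original" name, and see what survives. This is only a sketch using throwaway temp files; the exact du figure depends on your filesystem's block size:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
dd if=/dev/zero of=orig bs=1K count=100 status=none   # ~100K file
ln orig link                                          # second hard link to the same data
rm orig                                               # delete the "top" link
cat link > /dev/null                                  # the data is still reachable
du -sk .                                              # still roughly 100K, charged once
cd /; rm -rf "$tmp"
```

Whichever name you delete, the remaining link keeps the data alive, and du charges the blocks exactly once either way.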
Last edited by SilverWave; November 21st, 2010 at 10:44 PM.
Except that I don't think there is actually a table of links anywhere. "stat filename" shows the total number of links to filename, but not any information about each link. Is there any other command besides "stat" that might reveal any more information about the file or its links?
As I understand it, the count of links acts like a reference count in an object-oriented language. An object is kept alive as long as the count of references to it remains above zero. When the last link is deleted, the object is added to the garbage collector's list, and its resources (memory or disk blocks) are returned to the system as soon as the garbage collector gets around to it.
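On the "any other command" question: stat only reports the count, but you can enumerate the other names by searching for the same inode with find -inum. A quick illustration with throwaway files:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
echo data > f1
ln f1 f2                                # second hard link
stat -c 'links=%h inode=%i' f1          # link count is 2
stat -c 'links=%h inode=%i' f2          # same inode, same count - the names are equals
find . -inum "$(stat -c %i f1)"         # lists every name pointing at that inode
cd /; rm -rf "$tmp"
```

Note there is no "first" or "second" link recorded anywhere: both directory entries point at the same inode, and only the inode carries the count.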
Ah ha! We have a winner!
How does du determine which hard link to disregard?
Originally Posted by Dennis Williamson (quoting POSIX - du):
Files with multiple links shall be counted and written for only one entry. The directory entry that is selected in the report is unspecified.
14.2 du: Estimate file space usage
If two or more hard links point to the same file, only one of the hard links is counted. The file argument order affects which links are counted, and changing the argument order may change the numbers that du outputs.
Nice research, Silver. I did some more tests (see below), and my hypothesis that du is simply looking at the link counts on each file is wrong. I'm leaning now toward the explanation that du keeps track of every file it has already counted. That would explain why it seems to run slow on very large link trees. There could be in memory some large collection of inodes for files already counted.
Here is the tree I set up for testing. File f1 was originally in dir1, f2 in dir2, and f3 in dir3. The links were added later. By listing the arguments to du in various sequences, we can see that the order in which the files and links were created is irrelevant. So I'm still sticking with the hypothesis that there is nothing more than a link count associated with each file (no ordered list of links, etc.)
Other observations relevant to RLB:
Code:
david@david-desktop:~/test$ ll dir1
total 20
drwxr-xr-x 2 david david  4096 2010-11-21 23:36 ./
drwxr-xr-x 5 david david  4096 2010-11-21 23:38 ../
-rwxr--r-- 3 david david 12110 2010-11-21 23:37 f1*
david@david-desktop:~/test$ ll dir2
total 32
drwxr-xr-x 2 david david  4096 2010-11-21 23:42 ./
drwxr-xr-x 5 david david  4096 2010-11-21 23:38 ../
-rwxr--r-- 3 david david 12110 2010-11-21 23:37 f1*
-rwxr--r-- 2 david david 12110 2010-11-21 23:38 f2*
david@david-desktop:~/test$ ll dir3
total 44
drwxr-xr-x 2 david david  4096 2010-11-21 23:51 ./
drwxr-xr-x 5 david david  4096 2010-11-21 23:38 ../
-rwxr--r-- 3 david david 12110 2010-11-21 23:37 f1*
-rwxr--r-- 2 david david 12110 2010-11-21 23:38 f2*
-rwxr--r-- 1 david david 12110 2010-11-21 23:38 f3*
david@david-desktop:~/test$ du -c dir1 dir2 dir3
16      dir1
16      dir2
16      dir3
48      total
david@david-desktop:~/test$ du -c dir3 dir2 dir1
40      dir3
4       dir2
4       dir1
48      total
david@david-desktop:~/test$ du -c dir3 dir3 dir3
40      dir3
16      dir3
16      dir3
72      total
david@david-desktop:~/test$ du dir2 dir3 dir1
28      dir2
16      dir3
4       dir1
1) If you list a directory twice, the total shown by -c is incorrect.
2) The size shown for the second and later listings includes files that are linked just once, even if those files have already been counted.
I have found the explanation.
Here we go:
Code:
du -sh folder1 folder1 folder2 folder3
The first size for folder1 is for files with single links and for files with multiple links*.
*I am tempted to say all files with multiple links, but you could have 2 hard links to a file in the same folder. Only the first will be counted.
The second size for folder1 is for files with single links only. (du has remembered the inodes for files with multiple links and does not show them again).
The third size for folder2 is for files with single links and for any files with multiple links that haven't been reported on previously.
The Rules are:
- du notes files with multiple links and will only show them once.
- du will always show files with only one link.
- The total reported with -c is simply the sum of the sizes listed.
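The rules can be reproduced with a throwaway tree along these lines (a sketch: folder names are arbitrary, the exact numbers depend on your filesystem's block size, and the handling of repeated arguments has varied between coreutils versions):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
mkdir folder1 folder2
dd if=/dev/zero of=folder1/multi  bs=1K count=64 status=none
dd if=/dev/zero of=folder1/single bs=1K count=32 status=none
ln folder1/multi folder2/multi-link   # second hard link, in folder2

# Rule 1: the multiply-linked file is charged only to the first argument
# that reaches it, so folder2 shows little more than the directory itself.
# Rule 3: the -c "total" line is simply the sum of the lines above it.
du -sk -c folder1 folder2
cd /; rm -rf "$tmp"
```

Swapping the argument order moves the 64K charge from folder1's line to folder2's, but the -c total stays the same, which is exactly what the POSIX wording ("the directory entry that is selected in the report is unspecified") allows.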
Well that was fun
Thanks to alexandru; Dave Sherohman; Dennis Williamson; Nimmy Lebby and Slartibartfast on superuser.com for the help.
Hi there,
I am relatively new to Ubuntu and just set up an Ubuntu Maverick server as my file and web server. This script seems great, but when I run it (after sudo -i) I get a ton of these error messages:
rsync: failed to hard-link /media/Data/Backup_Jonas/Server/current-0/lib32/libpcre.so with lib32/libpcre.so: Function not implemented (38)
/media/Data/Jonas_Backup/Server is the destination directory.
Any idea what I am doing wrong?
Cheers,
Jonas