Thread: gnuddrescue utility script

  1. #21
    Join Date
    Oct 2012
    Beans
    4

    Re: gnuddrescue utility script

    I don't know how clean or quick it is, but I have mocked up a quick program in C. Very little error handling, okay security if run as an unprivileged user, etc.

    Unfortunately, your attachment did not include a bitmap file to test it on, but I did a quick unit test with what you had in the readme.txt and an additional test where the last cluster is used also, to make sure it handled the edge case.

    I am emailing you my test cases, the program, and the resulting outputs.
    Last edited by shimavak; October 22nd, 2012 at 08:40 PM.

  2. #22
    Join Date
    Jul 2009
    Beans
    38

    Re: gnuddrescue utility script

    Oops, I was supposed to include a small sample bitmap file that I was using. I have updated the attachment to include it. But I think you did what I needed already! As a matter of fact, I think you fixed a bug that I didn't realize was affecting the length in the last line of output. And it is so much faster: my bash script took 3.589s, yours 0.004s. Awesome!


  3. #23
    Join Date
    Oct 2012
    Beans
    4

    Re: gnuddrescue utility script

    Glad to help,

    I was also thinking of a method which might improve the speed a bit more, but it would depend heavily on whether a typical bitmap file is completely random or well ordered, with large sections either filled or empty.

    The thought is that I am checking each bit, but the comparison could be done 4 times faster if we expect most bytes to either be FF or 00. We could check the byte before beginning to process it and if it is FF or 00 handle it differently. However, if it is a nearly random distribution, this will cause a slowdown, as we would have to do two extra comparisons every 8 loops which will nearly always fail.

    I tried running it on a 32MB bitmap generated from urandom (corresponding to a 1TiB HDD with 4096-byte blocks) and the run took only 1.5 minutes, most of which was disk writing because it was good random data. Actually, large bitmaps which do have random data may be a problem: the program was compiled without large file support, so my output file dies at 2GB. I have recompiled with LFS and made some minor bug fixes to handle such large files.

    In my test with a large random file, the change doesn't cost any time and produces identical results, so it may be faster on real files. Either way, I am sure no one will mind a minute. I can also add a status display if needed (it wouldn't take much effort to dump an update every few cycles).

  4. #24
    Join Date
    Jul 2009
    Beans
    38

    Re: gnuddrescue utility script

    It would seem we think alike. I was actually going to do something like that for my bash script, but when I realized how slow it really was, I decided to stick to the very basics when asking for help. I would think that a real hard drive would have many areas of the bitmap that are either FF or 00, especially if any sort of defrag ever happens. I will try some speed tests on a real bitmap file at some point in the future when I get some free time. If you find the time to add some sort of status display that lets the user know that something is happening, that would be great. Everyone likes to see progress and not a seemingly locked-up program.

    This utility is based on the idea that many people have large hard drives in their computers and use very little of the space. It remains to be seen whether this will end up creating very large ddrescue log files that are not efficient to process. I am sure it will not be the best for every scenario, but how well it works can only be found by testing in the real world.

  5. #25
    Join Date
    Oct 2012
    Beans
    4

    Re: gnuddrescue utility script

    I just ran a couple of tests with real bitmaps from 40GB drives and 250GB drives in various states of fullness and found that my fear was correct. With all of the extra conditionals to be checked each loop, it takes a statistically significant amount longer to run if we check for 0x00 and 0xFF.

    However, I then ran both versions on a real 1TB partition bitmap: the version with the 0x00/0xFF check completed in an average of 0.392 seconds (n=50, 95% CI 0.387s-0.398s), whereas the normal one completed in an average of 1.130 seconds (n=50, 95% CI 1.106s-1.155s).

    Either way, for normal systems it really doesn't seem to matter in any appreciable way, and adding a status indicator would just slow it down (it has to do a comparison every loop to see if it is time to display an update), and you wouldn't even get to see it. The log files, by the way, are still quite small, so it should really be helpful: for the 1TB drive the log was only 3.3MiB, so nothing to worry about. It occurs to me that there may be a problem with using 32-bit integers to store the position, but that won't happen with drives smaller than 16TiB.

    I did run a test switching them to unsigned long long but it does cost in time (0.542s [0.534s-0.552s] and 1.675s [1.662s-1.687s] respectively), so that is something to consider. Then again, it would not cost anything when compiled for 64 bit...

    Anyway, let me know if there are any other things you might like out of it. Or, of course, you can play around with all of those things yourself. As I mentioned, all you should need are the gcc and libc6-dev packages; compile it with:

    Code:
    gcc -D_FILE_OFFSET_BITS=64 -O3 -Wall processbitmap.c -o processbitmap
    Last edited by shimavak; October 24th, 2012 at 07:10 PM.

  6. #26
    Join Date
    Jul 2009
    Beans
    38

    Re: gnuddrescue utility script

    I am sorry that I have not had time to do any tests, or anything else for that matter. I have been sick all week, and work decided that since I am sick it is a good time to make me work crazy overtime and try to kill me. If I am lucky I might be able to spend some time on it this Sunday.

    I do have one question about compiling. A few days ago I did play around with making and compiling a simple C program. I know what the -Wall switch does, but what do -D_FILE_OFFSET_BITS=64 and -O3 do?

  7. #27
    Join Date
    Oct 2012
    Beans
    4

    Re: gnuddrescue utility script

    The -D option [D]efines a macro, in this case setting _FILE_OFFSET_BITS to 64. Macros in C let you select between alternative pieces of code at compile time instead of at run time. In this case, _FILE_OFFSET_BITS lets us use 64-bit file addressing on a 32-bit system, allowing us to create files larger than 2GiB.

    The -O3 option tells the compiler to be very aggressive in optimizing the code for run-time speed. It makes debugging very difficult if there are memory leaks, etc., but as this is a very simple program, there will not be any issues like that. You could compile without -O3 and it would work just fine, but the optimization may make it ever so slightly faster. It can introduce some really strange errors, but usually only if you aren't writing standards-compliant code in the first place.

    A good resource for GCC -O options:
    http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
    Last edited by shimavak; October 25th, 2012 at 01:38 PM.

  8. #28
    Join Date
    Jul 2009
    Beans
    38

    Re: gnuddrescue utility script

    I just had a few minutes to do a quick test on a real 250GB drive with both versions and did see that your second version was slower by a bit. But it was still under 1/4 second for both in my virtual world running 2 operating systems. I guess I was looking at that 1.5 minutes you had from a random input when I was still thinking about a progress indicator. Your program is so fast compared to my bash script that now I KNOW that I MUST learn C.

    At some point in time in the hopefully near future I will try to get an alpha version of the real working utility posted. I will give you credit in it and include your source in the download. Just wondering if you had any specifics as to how you would like credit given.

  9. #29
    Join Date
    Jul 2009
    Beans
    38

    ddrntfsbitmap Utility to read only used portion of NTFS disk

    Alpha version of ddrntfsbitmap released. It seems to work, but could really use some real world testing with report of results. I wouldn't use it on any critical data until after you have tested it to see how it works. Includes a help.txt file which I encourage everyone to read first.
    Last edited by maximus57; October 30th, 2012 at 08:14 PM.

  10. #30
    Join Date
    Oct 2012
    Beans
    3

    Re: gnuddrescue utility script

    I'm interested in including this in Parted Magic by default when you guys think it's ready for the public to use. =D>
