Page 3 of 5 FirstFirst 12345 LastLast
Results 21 to 30 of 42

Thread: Could something like this be done or did I watch too many movies?

  1. #21
    Join Date
    Dec 2004
    Location
    Manchester
    Beans
    2,081
    Distro
    Ubuntu Studio 13.04 Raring Ringtail

    Re: Could something like this be done or did I watch too many movies?

    Quote Originally Posted by koenn View Post
    2- hash/fingerprint functions, maybe ?
    Theoretically, each file produces a unique hash.
    So if the server sends you a hash of the file you requested, all you have to do is recreate a file that, given the function, creates the same hash. You then know that the file you created is identical to the one on the server.

    Unfortunately, creating the correct file may be difficult.
    hashes are not unique, it is just that hash collisions are rare (and for a good hash algorithm it is hard to find the collision).

    for example suppose you take a 1MB file (8 million bits) and a 25 bit hash. there are 2^8000000 possible 1MB files (in normal notations this is a number with 2.4 million digits, so it wont fit in this post). there 2^256 possible hash values (only 11579208923731619542357098500868790785326998466564 0564039457584007913129639936). So for each possible has there must be many possible files.

    for roughly the same reason a lossless compression algorithm can't guarantee to be able to compress every file.

    if you take the
    AAABB BBCCC CCDEE (15 "bits") -> 3A 4B 5C D 2E (9 "bits")
    example from above, you can see that some inputs (actually most) can't be compressed
    ABDED CADEA DCAEC (15 "bits") -> 1A 1B 1D 1E 1D 1C 1A 1D 1E 1A 1D 1C 1A 1E 1C (30 "bits")

    compression only works if there is redundancy in the input that can squeezed out.

  2. #22
    Join Date
    Aug 2005
    Beans
    5,991

    Re: Could something like this be done or did I watch too many movies?

    I think you watched to many movies. You can't really create something out of nothing.

  3. #23
    Join Date
    Nov 2006
    Location
    Belgium
    Beans
    3,007
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: Could something like this be done or did I watch too many movies?

    Quote Originally Posted by ssam View Post
    hashes are not unique, it is just that hash collisions are rare (and for a good hash algorithm it is hard to find the collision).

    for example suppose you take a 1MB file (8 million bits) and a 25 bit hash. there are 2^8000000 possible 1MB files (in normal notations this is a number with 2.4 million digits, so it wont fit in this post). there 2^256 possible hash values (only 11579208923731619542357098500868790785326998466564 0564039457584007913129639936). So for each possible has there must be many possible files.
    Well, I said "theoretically" - and even so, I don't know enough math to be conclusive. So yes, it's "lossy" in that the file you get is not necessarily identical to the file you requested.

    In any case, what you describe, although it's correct, is merely implementation. As you demonstrate, the collision probability will depend on the size of the hash in relation to the key space (how large are the files vs how large are the hashes). With large hashes and relatively small source files, you could get (near) zero collision.

    Whether that would at all be practical, I don't know. Is there a reasonable change that sufficiently large hashes would be so big that, overall, you don't gain anything ? Probably, yes. Do I think it's likely you can save on transfer cost with something like "fuzzy rsync" or similar techniques ? Again, yes.

  4. #24
    Join Date
    Sep 2005
    Location
    Dallas, Tx
    Beans
    266
    Distro
    Xubuntu 12.10 Quantal Quetzal

    Re: Could something like this be done or did I watch too many movies?

    This is basically how filetypes work.
    Desktop
    Antec 300 Illusion, Athlon II X4 640, 8GB DDR3 1333, EarthWatts430W, Radeon HD 6670 1GB
    Laptop
    Lenovo Thinkpad Z61T, T7200@2.00GHz, 2GB DDR2, 100gb HD, TPM Encryption

  5. #25
    Join Date
    Jun 2007
    Location
    Porirua, New Zealand
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Could something like this be done or did I watch too many movies?

    Quote Originally Posted by koenn View Post
    The point of all compression is to look for repeating patterns, and reproduce them; and often to do so requires less information than the actual file.
    IMO it's not just repetition but redundancy can be used as well.

    Some of the compression algorithms I've briefly looked at go over my head, but one which is a bit easier to grasp is the case of a simple ASCII text file with no fancy formatting, UTF or anything like that. In such a scenario only 7 out of every 8 bits would typically be used, making it unnecessary to transmit the full 8 bits for every byte. Yes, I know, not a particularly massive saving in space and bandwidth in this example, but the idea of an algorithm optimised for a particular kind of data has been used, e.g. jpeg.
    Forum DOs and DON'Ts
    Never assume that information you find using a search engine is up-to-date.

  6. #26
    Join Date
    Apr 2008
    Location
    LOCATION=/dev/random
    Beans
    5,767
    Distro
    Ubuntu Development Release

    Re: Could something like this be done or did I watch too many movies?

    A similar process is used by the zsync file transfer program.

    I download the 13.04 testing images which are updated every day. By using zsync the latest version is checked against the version already on my machine, and only the pieces of the iso file that are different to the ones I already have are downloaded.

    In the example below you can see that zsync realised I already had 91.4% of the file on my machine, so I only had to download 70MB to get the new iso instead of the entire 800MB file.

    Code:
    rob@arch:~/ISOs/Linux/Ubuntu/Raring$ zsync http://cdimage.ubuntu.com/daily-live/current/raring-desktop-amd64.iso.zsync
    #################### 100.0% 323.8 kBps DONE     
    
    reading seed file raring-desktop-amd64.iso: ***************************************************************
    ***********************************************************************************************************
    ***********************************************************************************************************
    ***********************************************************************************************************
    ***********************************************************************************************************
    ***********************************************************************************************************
    ***********************************************************************************************************
    *************************************************************
    Read raring-desktop-amd64.iso. Target 91.4% complete.      
    downloading from http://cdimage.ubuntu.com/daily-live/current/raring-desktop-amd64.iso:
    #################### 100.0% 1645.9 kBps DONE     
    
    verifying download...checksum matches OK
    used 749846528 local, fetched 70852261
    rob@arch:~/ISOs/Linux/Ubuntu/Raring$
    Last edited by Cheesemill; January 14th, 2013 at 12:00 PM.
    Cheesemill

  7. #27
    Join Date
    Aug 2005
    Beans
    5,991

    Re: Could something like this be done or did I watch too many movies?

    I'm beginning to think people are missing what the OP proposes.

    He's not talking about compression, most files on the net you download are already heavily compressed.

    He seems to be talking about creating a file locally from nothing but something akin to a MD5 Hash sent by the far end.

    I'm no mathematician but this is not going to happen with normal algorithms unless we start delving into Quantum teleportation.

  8. #28
    Join Date
    Dec 2004
    Location
    Manchester
    Beans
    2,081
    Distro
    Ubuntu Studio 13.04 Raring Ringtail

    Re: Could something like this be done or did I watch too many movies?

    Quote Originally Posted by koenn View Post
    Well, I said "theoretically" - and even so, I don't know enough math to be conclusive. So yes, it's "lossy" in that the file you get is not necessarily identical to the file you requested.

    In any case, what you describe, although it's correct, is merely implementation. As you demonstrate, the collision probability will depend on the size of the hash in relation to the key space (how large are the files vs how large are the hashes). With large hashes and relatively small source files, you could get (near) zero collision.

    Whether that would at all be practical, I don't know. Is there a reasonable change that sufficiently large hashes would be so big that, overall, you don't gain anything ? Probably, yes. Do I think it's likely you can save on transfer cost with something like "fuzzy rsync" or similar techniques ? Again, yes.
    for there to be enough possible hashes for all the possible files you would need the hash to be as big as the file.

    most real files do have enough redundancy to allow compression. many transfer protocols (http, rsync, ssh) have compression options to enable lossless compression. some mobile web browsers can connect to a proxy server that does lossy compression.

  9. #29
    Join Date
    Jan 2010
    Location
    Hyperborea
    Beans
    1,403
    Distro
    Ubuntu

    Re: Could something like this be done or did I watch too many movies?

    So it's like this: don't send me a psychical file, send me a description of that file, and I will make it psychical here locally.
    You have already solved the problem! With your psychic files all you have to do is think of a file and it will immediately appear on your PC.
    Wow! I want a psychic computer too!

  10. #30
    Join Date
    Nov 2012
    Beans
    248
    Distro
    Ubuntu 13.04 Raring Ringtail

    Re: Could something like this be done or did I watch too many movies?

    I meant to write physical

    In any case, what I was asking originally is, can something be created out of nothing (or by borrowing parts of bits already on the computer), just by getting information of how that file looks like.

    Just like Cheesemill said, something similar to zsync, but unique in its own way. Zsync is the closest to this.

Page 3 of 5 FirstFirst 12345 LastLast

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •