hashes are not unique, it is just that hash collisions are rare (and for a good hash algorithm it is hard to deliberately find a collision).
for example suppose you take a 1MB file (8 million bits) and a 256 bit hash. there are 2^8000000 possible 1MB files (in normal notation this is a number with about 2.4 million digits, so it won't fit in this post). there are only 2^256 possible hash values (115792089237316195423570985008687907853269984665640564039457584007913129639936). so by the pigeonhole principle, for each possible hash value there must be many possible files.
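to make the collision point concrete, here is a small sketch (my own illustration, not from the original post) that truncates SHA-256 to just 24 bits. with only 2^24 possible values, the birthday paradox finds two different inputs with the same truncated hash after a few thousand tries:

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> bytes:
    # deliberately weak hash: keep only the first 3 bytes (24 bits) of SHA-256
    return hashlib.sha256(data).digest()[:3]

seen = {}
for i in count():
    data = str(i).encode()
    h = tiny_hash(data)
    if h in seen and seen[h] != data:
        # two different inputs, same 24-bit hash
        print(f"collision: {seen[h]!r} and {data!r} both hash to {h.hex()}")
        break
    seen[h] = data
```

the same argument applies to the full 256-bit hash, it just becomes computationally infeasible to stumble on a collision, even though collisions must exist.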
for roughly the same reason a lossless compression algorithm can't guarantee to compress every file: there are fewer short outputs than possible inputs, so some inputs must map to outputs at least as long.
if you take the run-length encoding
AAABB BBCCC CCDEE (15 "symbols") -> 3A 4B 5C 1D 2E (10 "symbols")
example from above, you can see that some inputs (actually most) can't be compressed:
ABDED CADEA DCAEC (15 "symbols") -> 1A 1B 1D 1E 1D 1C 1A 1D 1E 1A 1D 1C 1A 1E 1C (30 "symbols")
compression only works if there is redundancy in the input that can be squeezed out.
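the two examples above can be reproduced with a minimal run-length encoder (a sketch of my own, mirroring the count-then-symbol notation used above):

```python
def rle(s: str) -> str:
    # encode each run of identical characters as <count><symbol>
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append(f"{j - i}{s[i]}")
        i = j
    return "".join(out)

print(rle("AAABBBBCCCCCDEE"))   # runs present: 15 symbols shrink to 10
print(rle("ABDEDCADEADCAEC"))   # no runs at all: 15 symbols balloon to 30
```

the redundant input compresses, while the input with no repeated runs doubles in size, which is exactly the trade-off: any lossless scheme that shortens some inputs must lengthen others.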