Thread: Is this right? 4.4 GB folder compressed to 898 MB?

  1. #1
    Join Date
    Apr 2009
    Location
    Costa Rica
    Beans
    255
    Distro
    Ubuntu 10.04 Lucid Lynx

    Is this right? 4.4 GB folder compressed to 898 MB?

    I am trying to make a backup of a folder whose size is 4.4 GB. The folder also contains subfolders that I need to back up, so I figured I could tar the entire folder like this:

    Code:
    tar cvzf backup.tar.gz foldertobackup
    which yields an 898 MB file.

    Can compression really be that good? I suspect something isn't being backed up here. How can I find out?

    thanks for the help!
    Last edited by X1R1; September 7th, 2012 at 10:49 PM.
    Linux User#498977
    There are only 10 types of people in the world. Those who understand binary, and those who don't.
    My Blog about Linux and other stuff

  2. #2
    Join Date
    Nov 2009
    Location
    Mataro, Spain
    Beans
    13,779
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    I guess it depends on the format of files compressed. How about untarring it to another location and comparing them?

    Also opening the .tar file with Archive Manager will show you the content. Right-click, open with archive manager.

    PS. When I said right-click I forgot for a moment this is in the server section so you probably don't have a GUI.
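Without a GUI, the same check can be done from the shell with tar's "t" (list) mode. A minimal sketch, assuming GNU tar; backup.tar.gz and foldertobackup are the names from the first post, while the scratch directory and the sample files are made up for illustration:

```shell
# Build a tiny sample tree in a scratch directory (hypothetical contents).
work=$(mktemp -d)
cd "$work"
mkdir -p foldertobackup/sub
echo "hello" > foldertobackup/a.txt
echo "world" > foldertobackup/sub/b.txt

# Create the archive as in the first post (v omitted to keep output quiet).
tar czf backup.tar.gz foldertobackup

# t lists the archive's contents without extracting anything.
tar tzf backup.tar.gz

# Count the entries (files and directories) stored in the archive.
entries=$(tar tzf backup.tar.gz | wc -l)
echo "entries in archive: $entries"
```

The entry count includes directories, so it should be compared against a count of files plus directories in the original folder.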
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 14.04 LTS 64bit & Windows 10 Pro 64bit

  3. #3
    Join Date
    Jun 2007
    Beans
    175

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    The amount of compression you can get depends on a number of things, among them the file type you are compressing:
    JPG and PDF files will usually hardly compress at all. Things like bitmaps and older Word documents will compress considerably. Uncompressing your compressed folder and doing a file count is a quick and easy way to check.

    Dirdiff will help you compare files and folders for a more thorough check. Data loss through compression and recompression would be most surprising.
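On a server without Dirdiff, a recursive diff gives a similar byte-for-byte check. This is only a sketch, swapping in the standard diff tool for Dirdiff; the scratch directory, the sample files, and the "restored" directory name are assumptions made for illustration:

```shell
# Sample tree and archive in a scratch directory (hypothetical contents).
work=$(mktemp -d)
cd "$work"
mkdir -p foldertobackup/sub
echo "hello" > foldertobackup/a.txt
echo "world" > foldertobackup/sub/b.txt
tar czf backup.tar.gz foldertobackup

# Extract into a separate directory so the original stays untouched.
mkdir restored
tar xzf backup.tar.gz -C restored

# diff -r walks both trees; it prints nothing and exits 0 if they match.
if diff -r foldertobackup restored/foldertobackup; then
    result="identical"
else
    result="different"
fi
echo "trees are $result"
```

GNU tar can also compare an archive against the file system directly with its "d" (--compare) mode, for example `tar dzf backup.tar.gz`, which avoids extracting a second copy.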

  4. #4
    Join Date
    Dec 2007
    Location
    California
    Beans
    4,954
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    Do you have symlinks inside that folder? Tar doesn't follow them by default; it just archives them as symlinks. As others said, compression will vary widely with what you are compressing.
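To see whether symlinks are involved, and what tar does with them, something like the following may help. It is a sketch assuming GNU tar and find; the file names outside.txt and link.txt are invented for the example:

```shell
# Sample folder containing one symlink that points outside the folder.
work=$(mktemp -d)
cd "$work"
mkdir foldertobackup
echo "real data" > outside.txt
ln -s ../outside.txt foldertobackup/link.txt

# List any symlinks inside the folder before archiving.
find foldertobackup -type l

# Default: the link itself is stored, not the file it points to.
tar czf links.tar.gz foldertobackup
# With h (--dereference): the target's contents are stored instead.
tar czhf deref.tar.gz foldertobackup

# In a verbose listing the first character is l for a symlink, - for a file.
default_type=$(tar tvzf links.tar.gz | grep 'link.txt' | cut -c1)
deref_type=$(tar tvzf deref.tar.gz | grep 'link.txt' | cut -c1)
echo "default: $default_type, dereferenced: $deref_type"
```

If the find command prints nothing for the real folder, symlinks are not the reason the archive is small.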
    "You can't expect to hold supreme executive power just because some watery tart lobbed a sword at you"

    "Don't let your mind wander -- it's too little to be let out alone."

  5. #5
    Join Date
    Apr 2009
    Location
    Costa Rica
    Beans
    255
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    OK, I copied the file to a different server via scp, untarred it, then did a:

    Code:
    du -hs uncompressedfolder
    result: 4.4G

    As I was still skeptical, I did a du and counted the files:

    Code:
    du -h uncompressedfolder | wc -l
    result: 8321

    Did the same command on the original folder, and...result: 8321

    amazing
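One caveat on the count: without the -a option, du prints one line per directory, so `du -h ... | wc -l` counts directories rather than files. Matching directory counts and matching total size are still a good sign, but a per-file count needs something like find. A small sketch, with an invented three-file tree standing in for the real folder:

```shell
# Sample tree: two directories, three regular files (hypothetical contents).
work=$(mktemp -d)
cd "$work"
mkdir -p uncompressedfolder/sub
echo 1 > uncompressedfolder/a.txt
echo 2 > uncompressedfolder/b.txt
echo 3 > uncompressedfolder/sub/c.txt

# du without -a prints one line per directory, not per file.
dirs=$(du -h uncompressedfolder | wc -l)
# find -type f prints one line per regular file.
files=$(find uncompressedfolder -type f | wc -l)
echo "du lines (directories): $dirs, regular files: $files"
```

Running the find count on both the original and the extracted folder gives a like-for-like file count.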

  6. #6
    Join Date
    Nov 2008
    Location
    Metro Boston
    Beans
    12,922
    Distro
    Kubuntu 14.04 Trusty Tahr

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    Text files will often achieve 4:1 compression or better. It also depends on how much variability the files have. Many compression algorithms will replace a string of identical characters with just one placeholder and a counter. If the files have lots of spaces, they are very compressible.
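The effect of repetition is easy to see by compressing two files of the same size, one made only of spaces and one of random bytes. This is a rough sketch; gzip's DEFLATE is not literally a placeholder-and-counter scheme, but it benefits from repeated data in a similar way. The file names are made up for the example:

```shell
work=$(mktemp -d)
cd "$work"

# 1 MiB of spaces: extremely repetitive input.
head -c 1048576 /dev/zero | tr '\0' ' ' > spaces.txt
# 1 MiB of random bytes: no patterns for the compressor to exploit.
head -c 1048576 /dev/urandom > random.bin

# Compress both to new files, leaving the originals in place.
gzip -c spaces.txt > spaces.txt.gz
gzip -c random.bin > random.bin.gz

# Compare the compressed sizes in bytes.
spaces_gz=$(wc -c < spaces.txt.gz)
random_gz=$(wc -c < random.bin.gz)
echo "spaces: $spaces_gz bytes, random: $random_gz bytes"
```

The file of spaces shrinks to a tiny fraction of its size, while the random file does not shrink at all. Real text files sit somewhere in between.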

    Most things like graphics and video are already compressed and will show little improvement. Better compression than gzip can usually be achieved by using the bzip2 algorithm, selected in tar with the "j" switch (don't ask me why it is "j"; maybe they were just running out of letters):

    Code:
    tar cjpvf mydirectory.tar.bz2 mydirectory
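    Basically you just use "j" instead of "z" to get bzip2 instead of gzip. By the way, for archiving purposes, you should know about the "p" switch, which "preserves" all the permissions: tar records them in the archive anyway, and "p" makes extraction restore them exactly instead of applying your umask (this is the default when extracting as root). See "man tar" for details.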
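With GNU tar, the "p" switch takes effect when extracting: it restores the stored modes exactly instead of filtering them through the umask. The sketch below illustrates this, assuming GNU tar and GNU stat (the -c option); shared.txt and the "plain" and "preserved" directories are invented names, and gzip is used only to keep the example dependency-free:

```shell
work=$(mktemp -d)
cd "$work"
mkdir mydirectory
echo "data" > mydirectory/shared.txt
chmod 666 mydirectory/shared.txt   # world-writable, so a umask would alter it

# The permissions are recorded in the archive at creation time.
tar czf mydirectory.tar.gz mydirectory

umask 022
mkdir plain preserved
# Without p, a non-root extraction applies the umask to the stored mode.
tar xzf mydirectory.tar.gz -C plain
# With p, the stored mode is restored exactly.
tar xzpf mydirectory.tar.gz -C preserved

plain_mode=$(stat -c '%a' plain/mydirectory/shared.txt)
preserved_mode=$(stat -c '%a' preserved/mydirectory/shared.txt)
echo "without p: $plain_mode, with p: $preserved_mode"
```

When run as root, both extractions keep the original mode, because tar preserves permissions by default for the superuser.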
    Last edited by SeijiSensei; September 8th, 2012 at 12:58 AM.
    If you ask for help, do not abandon your request. Please have the courtesy to check for responses and thank the people who helped you.

    Blog · Linode System Administration Guides · Android Apps for Ubuntu Users

  7. #7
    Join Date
    Apr 2009
    Location
    Costa Rica
    Beans
    255
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: Is this right? 4.4 GB folder compressed to 898 MB?

    @SeijiSensei

    thanks for that informative post!

    Indeed there are a lot of text files, and there is also a PostgreSQL database in there, but I have no idea what the format for that is.

    I have used bzip2 in the past, but I find that it takes a lot longer to decompress the data (of course, with better compression, longer decompression).

    And thanks for that "p" switch, it looks really useful!

    cheers
