
Thread: ext4 overhead?

  1. #1
    Join Date
    Dec 2008
    Location
    Taxes
    Beans
    455
    Distro
    Ubuntu Studio 12.04 Precise Pangolin

    ext4 overhead?

    I'm setting up a new software RAID array under 12.04 LTS (64-bit) beta and noticed some things.

    I have an array of 8x750GB drives in RAID 5.
    Code:
    # cat /proc/mdstat 
    Personalities : [raid6] [raid5] [raid4] 
    md2 : active raid5 sdi1[8] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
          5128005120 blocks super 1.2 level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
          
    unused devices: <none>
    This produces a 5251.1GB md device
    Code:
    # fdisk -l /dev/md2
    
    Disk /dev/md2: 5251.1 GB, 5251077242880 bytes
    2 heads, 3 sectors/track, 1709335040 cylinders, total 10256010240 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 524288 bytes / 3670016 bytes
    Disk identifier: 0x8bef43f3
    When I create an ext4 filesystem, however
    Code:
    # mke2fs -j -L md2p1 -m 1 -T ext4 -U 2d67f5c3-83a4-41a9-a41f-5af082f039c7 /dev/md2p1 
    mke2fs 1.42 (29-Nov-2011)
    Filesystem label=md2p1
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=128 blocks, Stripe width=896 blocks
    134217728 inodes, 536870015 blocks
    5368700 blocks (1.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=4294967296
    16384 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks: 
    	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
    	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
    	102400000, 214990848, 512000000
    
    Allocating group tables: done                            
    Writing inode tables: done                            
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information: done
    I only get a 2TB filesystem.
    Code:
    # df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    :/media/md0p1/Movies       9.5T  6.8T  2.2T  76% /mnt
    /dev/md2p1                 2.0T   31G  2.0T   2% /media/md2p1
    My questions are these: 31GB of overhead seems a little steep on a 2TB filesystem, and note the NFS mount point - it was prepared in exactly the same way using 8x1.5TB disks, yet it shows a filesystem well beyond 2TB in size.

    Is this a restriction in the beta, and where is the 31GB of overhead coming from?
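    Something like this should show where the metadata is actually going (same device name as above; I haven't dug through the full output yet, so this is just how I'd poke at it):
    Code:
    # superblock summary: reserved blocks, inode count, inode size, journal size
    tune2fs -l /dev/md2p1
    # same summary plus the per-group layout (inode tables, bitmaps, backup superblocks)
    dumpe2fs /dev/md2p1 | less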

  2. #2
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,134
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: ext4 overhead?

    Why did you put a partition on top of your mdadm array? You should add the filesystem right on top of the array like this.

    Code:
    mkfs.ext4 -b 4096 -E stride=128,stripe-width=896 /dev/md2
    If you choose to use ext2/3/4, you should also be aware of reserved space. By default, ext2/3/4 will reserve 5% of the drive's space, which only root is able to write to. This is done so a user cannot fill the drive and prevent critical daemons from writing to it, but 5% of a large RAID array, which isn't going to be written to by critical daemons anyway, is a lot of wasted space. Set the reserved space to 0% using tune2fs:

    Code:
    tune2fs -m 0 /dev/md2
    That should get you the proper amount of space. You'd just mount /dev/md2 like this in /etc/fstab.
    Code:
    /dev/md2        /media/md2            ext4        defaults        0        0
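    If you want to double-check afterwards, something along these lines should do it (adjust the device and mount point to match your setup):
    Code:
    # confirm the reserved block count is now 0
    tune2fs -l /dev/md2 | grep -i 'Reserved block count'
    # mount everything in fstab and check the reported size
    mount -a
    df -h /media/md2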
    Last edited by rubylaser; April 20th, 2012 at 02:08 PM.

  3. #3
    Join Date
    Dec 2008
    Location
    Taxes
    Beans
    455
    Distro
    Ubuntu Studio 12.04 Precise Pangolin

    Re: ext4 overhead?

    Quote Originally Posted by rubylaser View Post
    Why did you put a partition on top of your mdadm array? You should add the filesystem right on top of the array like this.

    Code:
    mkfs.ext4 -b 4096 -E stride=128,stripe-width=896 /dev/md2
    If you choose to use ext2/3/4, you should also be aware of reserved space. By default, ext2/3/4 will reserve 5% of the drive's space, which only root is able to write to. This is done so a user cannot fill the drive and prevent critical daemons from writing to it, but 5% of a large RAID array, which isn't going to be written to by critical daemons anyway, is a lot of wasted space. Set the reserved space to 0% using tune2fs:

    Code:
    tune2fs -m 0 /dev/md2
    That should get you the proper amount of space. You'd just mount /dev/md2 like this in /etc/fstab.
    Code:
    /dev/md2        /media/md2            ext4        defaults        0        0
    The 31GB doesn't appear to be reserved space - I initially created the filesystem using the defaults for ext4, and that's 5% reserved. I reduced it to 1% in this last format (and that's why I specified the UUID to match the previous format, as it was already in fstab).


    To the question of why I'm partitioning the raid device ... Well, habit I guess. Is it really best practice to build a filesystem on a raw device and not a partition? Everything I have read seems to indicate that it's better to use partitions. A little silly maybe in this case where I want a single filesystem.

    I tried once using raw disk devices as the components in a software array. Worked great until I had to replace a failed drive. Matching the byte count of a physical device is more difficult than matching a software defined partition. I have just been carrying that over.
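    Cloning the partition layout from a surviving disk onto the replacement makes that part easier - roughly like this (sfdisk for MBR disks, sgdisk for GPT; the device names are only examples, so double-check them first):
    Code:
    # MBR disks: copy the partition table from /dev/sdb onto the replacement disk
    sfdisk -d /dev/sdb | sfdisk /dev/sdX
    # GPT disks: replicate /dev/sdb's table onto the replacement, then give it fresh GUIDs
    sgdisk -R /dev/sdX /dev/sdb
    sgdisk -G /dev/sdX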

    I think what you are saying is the limitation is in fdisk, though, correct?

  4. #4
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,134
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: ext4 overhead?

    Quote Originally Posted by MakOwner View Post
    The 31GB doesn't appear to be reserved space - I initially created the filesystem using the defaults for ext4, and that's 5% reserved. I reduced it to 1% in this last format (and that's why I specified the UUID to match the previous format, as it was already in fstab).


    To the question of why I'm partitioning the raid device ... Well, habit I guess. Is it really best practice to build a filesystem on a raw device and not a partition? Everything I have read seems to indicate that it's better to use partitions. A little silly maybe in this case where I want a single filesystem.

    I tried once using raw disk devices as the components in a software array. Worked great until I had to replace a failed drive. Matching the byte count of a physical device is more difficult than matching a software defined partition. I have just been carrying that over.

    I think what you are saying is the limitation is in fdisk, though, correct?
    Yes, you're running into the 2TB limit of an MBR partition table; to go beyond that you need GPT, which means using parted rather than fdisk to create a partition larger than 2TB. If you didn't put a partition on top of the array, this would not be an issue (unless your individual disks were larger than 2TB, in which case you'd still need parted to make GPT partitions on them).
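    For reference, if you ever did need a partition on something over 2TB, parted can lay down a GPT label and a single full-size partition roughly like this (example device; note that mklabel wipes any existing partition table):
    Code:
    # create a GPT partition table on the array
    parted /dev/md2 mklabel gpt
    # one aligned partition spanning the whole device
    parted -a optimal /dev/md2 mkpart primary 0% 100%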

    It is a best practice to build the filesystem on top of the raw mdadm device. I build my arrays out of disk partitions (this is a common practice). So, your array would be built out of disk partitions like it is currently, but there's no need for a partition on top of the array. I hope I explained that well enough so that you understand what I mean. Take a look at my mdadm directions if this was confusing.
    Last edited by rubylaser; April 20th, 2012 at 02:38 PM.

  5. #5
    Join Date
    Aug 2008
    Location
    WA
    Beans
    2,186
    Distro
    Ubuntu

    Re: ext4 overhead?

    It is a best practice to build the filesystem on top of the raw mdadm device. I build my arrays out of disk partitions (this is a common practice). So, your array would be built out of disk partitions like it is currently, but there's no need for a partition on top of the array.
    I learned something new.

  6. #6
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,134
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: ext4 overhead?

    Great! The only other thing that makes sense to put on top of an mdadm array, if you need it, is LVM for multiple partition support. But if you just want one big volume, put the filesystem right on top of the array.
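    If you did want to carve the array into multiple volumes later, the LVM route looks roughly like this (the volume group and logical volume names here are just examples):
    Code:
    pvcreate /dev/md2                    # mark the array as an LVM physical volume
    vgcreate vg_storage /dev/md2         # create a volume group on it
    lvcreate -n media -L 2T vg_storage   # carve out a 2TB logical volume
    mkfs.ext4 /dev/vg_storage/media      # put the filesystem on the logical volume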

  7. #7
    Join Date
    Dec 2008
    Location
    Taxes
    Beans
    455
    Distro
    Ubuntu Studio 12.04 Precise Pangolin

    Re: ext4 overhead?

    Quote Originally Posted by rubylaser View Post
    Yes, you're running into the 2TB limit of an MBR partition table; to go beyond that you need GPT, which means using parted rather than fdisk to create a partition larger than 2TB. If you didn't put a partition on top of the array, this would not be an issue (unless your individual disks were larger than 2TB, in which case you'd still need parted to make GPT partitions on them).

    It is a best practice to build the filesystem on top of the raw mdadm device. I build my arrays out of disk partitions (this is a common practice). So, your array would be built out of disk partitions like it is currently, but there's no need for a partition on top of the array. I hope I explained that well enough so that you understand what I mean. Take a look at my mdadm directions if this was confusing.

    Thanks. I was doing this from memory -- I have lost my procedure document for this. I remember now that I used parted instead of fdisk for this very reason. Just creating the ext4 filesystem on the raw device works with no partitioning. However, I now have 73GB consumed instead of 31GB ...

    Not that I'm complaining; I'm just curious as to why - is it because of block-size space loss? Is df smart enough to figure that out?
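    My back-of-the-envelope guess at where to look (not an answer, just what I plan to check):
    Code:
    # inode tables alone take "Inode count" x "Inode size" bytes
    tune2fs -l /dev/md2 | egrep -i 'inode count|inode size'
    # with the default of one inode per 16KiB and 256-byte inodes, a ~5.25TB
    # filesystem would carry roughly 320 million inodes, i.e. ~75GiB of tables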

  8. #8
    Join Date
    Dec 2008
    Location
    Taxes
    Beans
    455
    Distro
    Ubuntu Studio 12.04 Precise Pangolin

    Re: ext4 overhead?

    Quote Originally Posted by rubylaser View Post
    Great! The only other thing that makes sense to put on top of an mdadm array, if you need it, is LVM for multiple partition support. But if you just want one big volume, put the filesystem right on top of the array.
    TIL.

    I have other arrays set up on 10.04 LTS; this one is on 12.04 LTS beta 2.
    Is there any danger/issue in taking an array backwards to an earlier version of mdadm? Any idea if it will even work?




    And in an effort to add at least some minor information to the topic, here's a one-liner to add the new filesystem to your fstab:

    Code:
    echo UUID=`blkid /dev/md2 | cut -f3 -d= | cut -f2 -d\"` /mnt ext4 defaults 0 0 >> /etc/fstab
    Last edited by MakOwner; April 20th, 2012 at 05:16 PM. Reason: oops, reversed fields

  9. #9
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,134
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: ext4 overhead?

    Quote Originally Posted by MakOwner View Post
    TIL.

    I have other arrays set up on 10.04 LTS; this one is on 12.04 LTS beta 2.
    Is there any danger/issue in taking an array backwards to an earlier version of mdadm? Any idea if it will even work?




    And in an effort to add at least some minor information to the topic, here's a one-liner to add the new filesystem to your fstab:

    Code:
    echo UUID=`blkid /dev/md2 | cut -f3 -d= | cut -f2 -d\"` /mnt ext4 defaults 0 0 >> /etc/fstab
    No, there's no danger in going backwards, but you will want to be running the same version or a newer version of mdadm. It looks like 12.04 uses mdadm version 3.2.3 and 10.04 uses 2.6.8, so you'll want to upgrade to a version that at least supports your 1.2 metadata, like 3.1.4, although the same version would be better.
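    You can check both sides with something like this (example member device; the wording of the output differs a little between mdadm versions):
    Code:
    mdadm --version                               # mdadm release on each machine
    mdadm --detail /dev/md2 | grep -i version     # metadata version of the assembled array
    mdadm --examine /dev/sdb1 | grep -i version   # same info read from a member disk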

  10. #10
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,134
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: ext4 overhead?

    Quote Originally Posted by MakOwner View Post
    Thanks. I was doing this from memory -- I have lost my procedure document for this. I remember now that I used parted instead of fdisk for this very reason. Just creating the ext4 filesystem on the raw device works with no partitioning. However, I now have 73GB consumed instead of 31GB ...

    Not that I'm complaining; I'm just curious as to why - is it because of block-size space loss? Is df smart enough to figure that out?
    What's the output of these?
    Code:
    df -h
    du -h /mnt
