Page 1 of 3
Results 1 to 10 of 23

Thread: Script to tune mdadm raid5 (others too, maybe)

  1. #1
    Join Date
    Apr 2010
    Beans
    11
    Distro
    Ubuntu 9.10 Karmic Koala

    Post Script to tune mdadm raid5 (others too, maybe)

    edit: new version of the script, you now have to make fewer adjustments. Should be pretty safe to run on almost any system. Regards, Alex

    I just wrote a script to improve my mdadm-raid's performance and felt like sharing. Maybe someone else can use it or get some inspiration from it.

    Happy tuning,
    Alex

    PS: do NOT just run it, depending on your configuration it might seriously screw up your system if you don't alter it accordingly!

    Code:
    #!/bin/bash
    ###############################################################################
    #  simple script to set some parameters to increase performance on an mdadm
    # raid5 or raid6. Adjust the ## parameters ##-section to your system!
    #
    #  WARNING: depending on stripesize and the number of devices the array might
    # use QUITE a lot of memory after optimization!
    #
    #  27may2010 by Alexander Peganz
    ###############################################################################
    
    
    ## parameters ##
    MDDEV=md51              # e.g. md51 for /dev/md51
    CHUNKSIZE=1024          # in kb
    BLOCKSIZE=4             # of file system in kb
    NCQ=disable             # disable, enable. Anything else keeps current setting
    NCQDEPTH=31             # 31 should work for almost anyone
    FORCECHUNKSIZE=true     # force max sectors kb to chunk size > 512
    DOTUNEFS=false          # run tune2fs, ONLY SET TO true IF YOU USE EXT[34]
    RAIDLEVEL=raid5         # raid5, raid6
    
    
    ## code ##
    # test for privileges
    if [ "$(whoami)" != 'root' ]
    then
      echo $(date): Need to be root >> /data51/smbshare1/#tuneraid.log
      exit 1
    fi
    
    # set number of parity devices
    NUMPARITY=1
    if [[ $RAIDLEVEL == "raid6" ]]
    then
      NUMPARITY=2
    fi
    
    # get all devices
    DEVSTR="`grep \"^$MDDEV : \" /proc/mdstat` eol"
    while \
     [ -z "`expr match \"$DEVSTR\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\((S)\)\? \)'`" ]
    do
      DEVSTR="`echo $DEVSTR|cut -f 2- -d \ `"
    done
    
    # get active devices list and spares list
    DEVS=""
    SPAREDEVS=""
    while [ "$DEVSTR" != "eol" ]; do
      CURDEV="`echo $DEVSTR|cut -f -1 -d \ `"
      if [ -n "`expr match \"$CURDEV\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\((S)\)\)'`" ]
      then
        SPAREDEVS="$SPAREDEVS${CURDEV:2:1}"
      elif [ -n "`expr match \"$CURDEV\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\)'`" ]
      then
        DEVS="$DEVS${CURDEV:2:1}"
      fi
      DEVSTR="`echo $DEVSTR|cut -f 2- -d \ `"
    done
    NUMDEVS=${#DEVS}
    NUMSPAREDEVS=${#SPAREDEVS}
    
    # test if number of devices makes sense
    if [ ${#DEVS} -lt $[1+$NUMPARITY] ]
    then
      echo $(date): Need more devices >> /data51/smbshare1/#tuneraid.log
      exit 1
    fi
    
    # set read ahead
    RASIZE=$[$NUMDEVS*($NUMDEVS-$NUMPARITY)*2*$CHUNKSIZE]   # in 512b blocks
    echo read ahead size per device: $RASIZE blocks \($[$RASIZE/2]kb\)
    MDRASIZE=$[$RASIZE*$NUMDEVS]
    echo read ahead size of array: $MDRASIZE blocks \($[$MDRASIZE/2]kb\)
    blockdev --setra $RASIZE /dev/sd[$DEVS]
    blockdev --setra $RASIZE /dev/sd[$SPAREDEVS]
    blockdev --setra $MDRASIZE /dev/$MDDEV
    
    # set stripe cache size
    STRCACHESIZE=$[$RASIZE/8]                               # in pages per device
    echo stripe cache size of devices: $STRCACHESIZE pages \($[$STRCACHESIZE*4]kb\)
    echo $STRCACHESIZE > /sys/block/$MDDEV/md/stripe_cache_size
    
    # set max sectors kb
    DEVINDEX=0
    MINMAXHWSECKB=$(cat /sys/block/sd${DEVS:0:1}/queue/max_hw_sectors_kb)
    until [ $DEVINDEX -ge $NUMDEVS ]
    do
      DEVLETTER=${DEVS:$DEVINDEX:1}
      MAXHWSECKB=$(cat /sys/block/sd$DEVLETTER/queue/max_hw_sectors_kb)
      if [ $MAXHWSECKB -lt $MINMAXHWSECKB ]
      then
        MINMAXHWSECKB=$MAXHWSECKB
      fi
      DEVINDEX=$[$DEVINDEX+1]
    done
    if [ $CHUNKSIZE -le $MINMAXHWSECKB ] &&
      ( [ $CHUNKSIZE -le 512 ] || [[ $FORCECHUNKSIZE == "true" ]] )
    then
      echo setting max sectors kb to match chunk size
      DEVINDEX=0
      until [ $DEVINDEX -ge $NUMDEVS ]
      do
        DEVLETTER=${DEVS:$DEVINDEX:1}
        echo $CHUNKSIZE > /sys/block/sd$DEVLETTER/queue/max_sectors_kb
        DEVINDEX=$[$DEVINDEX+1]
      done
      DEVINDEX=0
      until [ $DEVINDEX -ge $NUMSPAREDEVS ]
      do
        DEVLETTER=${SPAREDEVS:$DEVINDEX:1}
        echo $CHUNKSIZE > /sys/block/sd$DEVLETTER/queue/max_sectors_kb
        DEVINDEX=$[$DEVINDEX+1]
      done
    fi
    
    # enable/disable NCQ
    DEVINDEX=0
    if [[ $NCQ == "enable" ]] || [[ $NCQ == "disable" ]]
    then
      if [[ $NCQ == "disable" ]]
      then
        NCQDEPTH=1
      fi
      echo setting NCQ queue depth to $NCQDEPTH
      until [ $DEVINDEX -ge $NUMDEVS ]
      do
        DEVLETTER=${DEVS:$DEVINDEX:1}
        echo $NCQDEPTH > /sys/block/sd$DEVLETTER/device/queue_depth
        DEVINDEX=$[$DEVINDEX+1]
      done
      DEVINDEX=0
      until [ $DEVINDEX -ge $NUMSPAREDEVS ]
      do
        DEVLETTER=${SPAREDEVS:$DEVINDEX:1}
        echo $NCQDEPTH > /sys/block/sd$DEVLETTER/device/queue_depth
        DEVINDEX=$[$DEVINDEX+1]
      done
    fi
    
    # tune2fs
    if [[ $DOTUNEFS == "true" ]]
    then
      STRIDE=$[$CHUNKSIZE/$BLOCKSIZE]
      STRWIDTH=$[$CHUNKSIZE/$BLOCKSIZE*($NUMDEVS-$NUMPARITY)]
      echo setting stride to $STRIDE blocks \(${CHUNKSIZE}kb\)
      echo setting stripe-width to $STRWIDTH blocks \($[$STRWIDTH*$BLOCKSIZE]kb\)
      tune2fs -E stride=$STRIDE,stripe-width=$STRWIDTH /dev/$MDDEV
    fi
    
    # exit
    echo $(date): Success >> /data51/smbshare1/#tuneraid.log
    exit 0
    Last edited by apeganz; June 15th, 2010 at 04:51 PM.

  2. #2
    Join Date
    Jan 2006
    Beans
    2

    Re: Script to tune mdadm raid5 (others too, maybe)

    Thanks much for sharing your script. I've run it on my new Arch Linux system and it works great.

    I edited out:


    #blockdev --setra $RASIZE /dev/sd[$SPAREDEVS]


    As I have no spare devices; though I don't think that really mattered in the end anyway.

  3. #3
    Join Date
    Oct 2004
    Beans
    161

    Re: Script to tune mdadm raid5 (others too, maybe)

    Any benchmarks?
    Archlinux / Ubuntu.

  4. #4
    Join Date
    Aug 2007
    Beans
    53

    Re: Script to tune mdadm raid5 (others too, maybe)

    This worked for me with my seagate 2tb drives in a raid5 array.

    I did have to modify some lines in the script: where it expects /dev/sd[a-z]1, in my case they needed to be /dev/sd[a-z]3, and then it ran.

    Before:
    Write speed 55 MB/s
    Read speed 115 MB/s

    After
    Write speed is 107 MB/s
    Read speed is 164 MB/s

    A significant improvement!

  5. #5
    Join Date
    May 2011
    Beans
    10

    Re: Script to tune mdadm raid5 (others too, maybe)

    I came across this post while searching for a way to change the stripe cache size and read ahead values in my RAID 5 array (10.4 desktop).

    A few questions:

    If I put this script in /etc/init.d and then run update-rc.d with the defaults option, should it change these parameters every time the system boots? I've tried to set them manually in rc.local, but that doesn't appear to work. I've read a few posts asking how to make these parameter changes permanent, and there don't seem to be any definitive solutions.
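One approach worth trying (just a sketch, and the install path is an assumption, adjust to taste): save the script somewhere like /usr/local/sbin/tuneraid.sh, chmod +x it, and call it from /etc/rc.local before the final exit 0 so it runs as root at the end of boot:

```shell
# excerpt of /etc/rc.local -- the script path below is hypothetical
/usr/local/sbin/tuneraid.sh

exit 0
```

Note that rc.local must itself be executable and keep exit 0 as its last line. If the array is assembled after rc.local runs, an init.d script registered with update-rc.d so it starts after mdadm may be more reliable.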

    How does one find out the chunk size and block size of one's array? Would the disk utility provide this info?
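For the chunk size, mdadm --detail /dev/mdX | grep 'Chunk Size' reports it (needs root), and tune2fs -l on the filesystem device reports the block size. The chunk size is also right there in /proc/mdstat; a quick sketch parsing a sample mdstat detail line (the array values here are just an example):

```shell
# /proc/mdstat prints a detail line like the one below for each array;
# the chunk size sits right before the word "chunk"
line="5860574208 blocks level 6, 256k chunk, algorithm 2 [8/8] [UUUUUUUU]"
chunk=$(echo "$line" | sed -n 's/.* \([0-9][0-9]*\)k chunk.*/\1/p')
echo "chunk size: ${chunk}k"   # -> chunk size: 256k
```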

    And finally, what is the algorithm for setting these parameters in this script based upon?

    Thanks in advance for any answers to my queries,

    Nick

    PS Also thanks to the OP for sharing this script!

  6. #6
    Join Date
    Jul 2011
    Beans
    23

    Re: Script to tune mdadm raid5 (others too, maybe)

    This is a nice script, but it is lacking some documentation.

    1. How did you get to the 'formulas' to calculate the read ahead size for the component drives and the entire array?

      Code:
      RASIZE=$[$NUMDEVS*($NUMDEVS-$NUMPARITY)*2*$CHUNKSIZE]   # in 512b blocks
      echo read ahead size per device: $RASIZE blocks \($[$RASIZE/2]kb\)
      MDRASIZE=$[$RASIZE*$NUMDEVS]
      echo read ahead size of array: $MDRASIZE blocks \($[$MDRASIZE/2]kb\)
      This formula works great for me on a 7-disk RAID-6 array, but I'd like to understand it.

      I guess you wrote 512-byte blocks, because you assume that the logical sector size for all components is 512 bytes?
    2. The stripe cache size formula reduces the read and write performance of my array. I've checked the kernel documentation for MD but it doesn't give much information. For sequential read/write throughput I'm better off setting it to 32768. Are there any obvious disadvantages to setting it to the maximum value?
    3. What is 'set max sectors kb' about and is there a reason to not increase it above 512 kB (since you have a specific FORCECHUNKSIZE variable for that)?
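Plugging a concrete array into those formulas may help: the sketch below is just the arithmetic (POSIX $((...)) instead of the script's $[...]) for a 7-disk RAID-6 with 64 KB chunks. The memory estimate for question 2 uses the commonly cited rule that stripe_cache_size is counted in 4 KB pages per member device.

```shell
# read-ahead and stripe cache formulas from the script, evaluated concretely
NUMDEVS=7; NUMPARITY=2; CHUNKSIZE=64      # 7-disk raid6, 64KB chunk

RASIZE=$((NUMDEVS*(NUMDEVS-NUMPARITY)*2*CHUNKSIZE))  # per-device read ahead, 512B blocks
MDRASIZE=$((RASIZE*NUMDEVS))                         # whole-array read ahead, 512B blocks
STRCACHESIZE=$((RASIZE/8))                           # stripe cache, pages per device

echo "RASIZE=$RASIZE MDRASIZE=$MDRASIZE STRCACHESIZE=$STRCACHESIZE"
# -> RASIZE=4480 MDRASIZE=31360 STRCACHESIZE=560

# approximate stripe cache memory: pages x 4KB x member devices
echo "stripe cache memory: $((STRCACHESIZE*4*NUMDEVS)) KB"   # 15680 KB
echo "at the 32768 maximum: $((32768*4*NUMDEVS)) KB"         # 917504 KB, ~896 MB
```

Those numbers match the output fackamato posted for his 7-drive array further down, and they suggest the main disadvantage of maxing out stripe_cache_size: on an array this wide, the 32768 maximum pins roughly 896 MB of RAM by the usual estimate.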

  7. #7
    Join Date
    Oct 2004
    Beans
    161

    Re: Script to tune mdadm raid5 (others too, maybe)

    I modified the script a little;

    * More verbose
    * "Fix" indent

    Tested on a 7 drive RAID6 array (ext4), output:

    Code:
     $ sudo ./tuneraid.sh
    Using 2 parity devices (raid6)
    Found 7 devices with 0 spares
    read ahead size per device: 4480 blocks (2240kb)
    read ahead size of array: 31360 blocks (15680kb)
    stripe cache size of devices: 560 pages (2240kb)
    Setting max sectors KB to match chunk size
    Set max sectors KB to 64 on h
    Set max sectors KB to 64 on g
    Set max sectors KB to 64 on f
    Set max sectors KB to 64 on e
    Set max sectors KB to 64 on d
    Set max sectors KB to 64 on b
    Set max sectors KB to 64 on c
    Setting NCQ queue depth to 31 on h
    Setting NCQ queue depth to 31 on g
    Setting NCQ queue depth to 31 on f
    Setting NCQ queue depth to 31 on e
    Setting NCQ queue depth to 31 on d
    Setting NCQ queue depth to 31 on b
    Setting NCQ queue depth to 31 on c
    setting stride to 16 blocks (64 KB)
    setting stripe-width to 80 blocks (320 KB)
    tune2fs -E stride=16,stripe-width=80 /dev/md
    The script:

    Code:
    #!/bin/bash
    ###############################################################################
    #  simple script to set some parameters to increase performance on an mdadm
    # raid5 or raid6. Adjust the ## parameters ##-section to your system!
    #
    #  WARNING: depending on stripesize and the number of devices the array might
    # use QUITE a lot of memory after optimization!
    #
    #  27may2010 by Alexander Peganz
    #  31jul2011 modified by Mathias B
    ###############################################################################
    
    
    ## parameters ##
    MDDEV=md0		# e.g. md51 for /dev/md51
    CHUNKSIZE=64		# in KB
    BLOCKSIZE=4		# of file system in KB
    NCQ=enable		# disable, enable. Anything else keeps current setting
    NCQDEPTH=31		# 31 should work for almost anyone
    FORCECHUNKSIZE=true	# force max sectors kb to chunk size > 512
    DOTUNEFS=true		# run tune2fs, ONLY SET TO true IF YOU USE EXT[34]
    RAIDLEVEL=raid6		# raid5, raid6
    
    
    ## code ##
    # test for privileges
    if [ "$(whoami)" != 'root' ]; then
    	echo $(date): Need to be root >> /#tuneraid.log
    	exit 1
    fi
    
    # set number of parity devices
    NUMPARITY=1
    [[ $RAIDLEVEL == "raid6" ]] && NUMPARITY=2
    echo "Using $NUMPARITY parity devices ($RAIDLEVEL)"
    # get all devices
    DEVSTR="`grep \"^$MDDEV : \" /proc/mdstat` eol"
    while \
     [ -z "`expr match \"$DEVSTR\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\((S)\)\? \)'`" ]
    do
    	DEVSTR="`echo $DEVSTR|cut -f 2- -d \ `"
    done
    
    # get active devices list and spares list
    DEVS=""
    SPAREDEVS=""
    while [ "$DEVSTR" != "eol" ]; do
    	CURDEV="`echo $DEVSTR|cut -f -1 -d \ `"
    	if [ -n "`expr match \"$CURDEV\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\((S)\)\)'`" ]; then
    		SPAREDEVS="$SPAREDEVS${CURDEV:2:1}"
    	elif [ -n "`expr match \"$CURDEV\" '\(\<sd[a-z]1\[[12]\?[0-9]\]\)'`" ]; then
    		DEVS="$DEVS${CURDEV:2:1}"
    	fi
    	DEVSTR="`echo $DEVSTR|cut -f 2- -d \ `"
    done
    
    NUMDEVS=${#DEVS}
    NUMSPAREDEVS=${#SPAREDEVS}
    
    # test if number of devices makes sense
    if [ ${#DEVS} -lt $[1+$NUMPARITY] ]; then
    	echo $(date): Need more devices >> /#tuneraid.log
    	exit 1
    fi
    
    echo "Found $NUMDEVS devices with $NUMSPAREDEVS spares"
    
    # set read ahead
    RASIZE=$[$NUMDEVS*($NUMDEVS-$NUMPARITY)*2*$CHUNKSIZE]   # in 512b blocks
    echo read ahead size per device: $RASIZE blocks \($[$RASIZE/2]kb\)
    MDRASIZE=$[$RASIZE*$NUMDEVS]
    echo read ahead size of array: $MDRASIZE blocks \($[$MDRASIZE/2]kb\)
    blockdev --setra $RASIZE /dev/sd[$DEVS]
    #blockdev --setra $RASIZE /dev/sd[$SPAREDEVS]
    blockdev --setra $MDRASIZE /dev/$MDDEV
    
    # set stripe cache size
    STRCACHESIZE=$[$RASIZE/8]                               # in pages per device
    echo stripe cache size of devices: $STRCACHESIZE pages \($[$STRCACHESIZE*4]kb\)
    echo $STRCACHESIZE > /sys/block/$MDDEV/md/stripe_cache_size
    
    # set max sectors kb
    DEVINDEX=0
    MINMAXHWSECKB=$(cat /sys/block/sd${DEVS:0:1}/queue/max_hw_sectors_kb)
    until [ $DEVINDEX -ge $NUMDEVS ]; do
    	DEVLETTER=${DEVS:$DEVINDEX:1}
    	MAXHWSECKB=$(cat /sys/block/sd$DEVLETTER/queue/max_hw_sectors_kb)
    	if [ $MAXHWSECKB -lt $MINMAXHWSECKB ]; then
    		MINMAXHWSECKB=$MAXHWSECKB
    	fi
    	DEVINDEX=$[$DEVINDEX+1]
    done
    if [ $CHUNKSIZE -le $MINMAXHWSECKB ] && ( [ $CHUNKSIZE -le 512 ] || [[ $FORCECHUNKSIZE == "true" ]] ); then
    	echo Setting max sectors KB to match chunk size
    	DEVINDEX=0
    	until [ $DEVINDEX -ge $NUMDEVS ]; do
    		DEVLETTER=${DEVS:$DEVINDEX:1}
    		echo "Set max sectors KB to $CHUNKSIZE on $DEVLETTER"
    		echo $CHUNKSIZE > /sys/block/sd$DEVLETTER/queue/max_sectors_kb
    		DEVINDEX=$[$DEVINDEX+1]
    	done
    	DEVINDEX=0
    	until [ $DEVINDEX -ge $NUMSPAREDEVS ]; do
    		DEVLETTER=${SPAREDEVS:$DEVINDEX:1}
    		echo "Set max sectors KB to $CHUNKSIZE on $DEVLETTER"
    		echo $CHUNKSIZE > /sys/block/sd$DEVLETTER/queue/max_sectors_kb
    		DEVINDEX=$[$DEVINDEX+1]
    	done
    fi
    
    # enable/disable NCQ
    DEVINDEX=0
    if [[ $NCQ == "enable" ]] || [[ $NCQ == "disable" ]]; then
    	if [[ $NCQ == "disable" ]]; then
    		NCQDEPTH=1
    	fi
    	until [ $DEVINDEX -ge $NUMDEVS ]; do
    		DEVLETTER=${DEVS:$DEVINDEX:1}
    		echo Setting NCQ queue depth to $NCQDEPTH on $DEVLETTER
    		echo $NCQDEPTH > /sys/block/sd$DEVLETTER/device/queue_depth
    		DEVINDEX=$[$DEVINDEX+1]
    	done
    	DEVINDEX=0
    	until [ $DEVINDEX -ge $NUMSPAREDEVS ]; do
    		DEVLETTER=${SPAREDEVS:$DEVINDEX:1}
    		echo Setting NCQ queue depth to $NCQDEPTH on $DEVLETTER
    		echo $NCQDEPTH > /sys/block/sd$DEVLETTER/device/queue_depth
    		DEVINDEX=$[$DEVINDEX+1]
    	done
    fi
    
    # tune2fs
    if [[ $DOTUNEFS == "true" ]]; then
    	STRIDE=$[$CHUNKSIZE/$BLOCKSIZE]
    	STRWIDTH=$[$CHUNKSIZE/$BLOCKSIZE*($NUMDEVS-$NUMPARITY)]
    	echo setting stride to $STRIDE blocks \($CHUNKSIZE KB\)
    	echo setting stripe-width to $STRWIDTH blocks \($[$STRWIDTH*$BLOCKSIZE] KB\)
    	echo tune2fs -E stride=$STRIDE,stripe-width=$STRWIDTH /dev/$MDDEV
    fi
    
    # exit
    echo $(date): Success >> /#tuneraid.log
    exit 0
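As a quick sanity check on the stride/stripe-width lines in the run above, this is the same arithmetic the script performs (nothing is written anywhere):

```shell
CHUNKSIZE=64; BLOCKSIZE=4; NUMDEVS=7; NUMPARITY=2   # values from the run above

STRIDE=$((CHUNKSIZE/BLOCKSIZE))                # filesystem blocks per chunk
STRWIDTH=$((STRIDE*(NUMDEVS-NUMPARITY)))       # blocks per full data stripe
echo "stride=$STRIDE stripe-width=$STRWIDTH"   # -> stride=16 stripe-width=80
```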
    Archlinux / Ubuntu.

  8. #8
    Join Date
    Jul 2011
    Beans
    23

    Re: Script to tune mdadm raid5 (others too, maybe)

    fackamato, i noticed you are using a chunk size of 64k. How did you pick the chunk size for your array?

  9. #9
    Join Date
    Oct 2004
    Beans
    161

    Re: Script to tune mdadm raid5 (others too, maybe)

    Quote Originally Posted by nipennem View Post
    fackamato, i noticed you are using a chunk size of 64k. How did you pick the chunk size for your array?
    I actually don't remember. This is running on an Atom board, so I can't check the max performance anyway. It only hosts large (>1GB) files; 3GB RAM.
    Archlinux / Ubuntu.

  10. #10
    Join Date
    Mar 2007
    Beans
    31

    Re: Script to tune mdadm raid5 (others too, maybe)

    Just wanted to revive this thread... I recently started using a software raid on my Ubuntu server and came across this thread to help tune an mdadm raid array.

    Anyway.... I've rewritten the tuning script to be a bit more automated. I've only tested on my own system, so there might be bugs for others for situations that I haven't accounted for.

    So... here are some differences:

    1. auto detect the mdadm configured raid arrays
    2. auto detect NCQ capable disks
    3. does not currently tune the filesystems, but does echo out the command for tuning the filesystem
    4. auto detect chunk size of array
    5. auto detect raid level
    6. Minor modification of stripe cache size calculation


    Here is the script (obviously, same caveats as before, run at your own risk, modify as required) -
    Code:
    #!/bin/sh
    
    # NEED FOLLOWING UTILS
    # -- hdparm
    # -- lvm
    
    
    # Add VM tuning stuff?
    #vm.swappiness = 1               # set low to limit swapping
    #vm.vfs_cache_pressure = 50      # set lower to cache more inodes / dir entries
    #vm.dirty_background_ratio = 5   # set low on systems with lots of memory
                                     # Too HIGH on systems with lots of memory 
                                     # means huge page flushes which will hurt IO performance
    #vm.dirty_ratio = 10             # set low on systems with lots of memory
    
    
    # DEFAULTS
    BLOCKSIZE=4		# of filesystem in KB (should I determine?)
    FORCECHUNKSIZE=true	# force max  sectors KB to chunk size > 512
    TUNEFS=true		# run tune2fs on filesystem if ext[3|4]
    SCHEDULER=deadline      # cfq / noop / anticipatory / deadline
    NR_REQUESTS=64          # NR REQUESTS
    NCQDEPTH=31             # NCQ DEPTH
    MDSPEEDLIMIT=200000     # Array speed_limit_max in KB/s
    
    
    
    # ----------------------------------------------------------------------
    # 
    # BODY
    # 
    # ----------------------------------------------------------------------
    
    # determine list of arrays
    mdadm -Es | grep ARRAY | while read x1 x2 x3 x4 x5
    do
        # INIT VARIABLES
        RAIDLEVEL=0
        NDEVICES=0
        CHUNKSIZE=0
        ARRAYSTATUS=0
        DISKS=""
        SPARES=""
        NUMDISKS=0
        NUMSPARES=0
        NUMPARITY=0
        NCQ=0
        NUMNCQDISKS=0
    
        RASIZE=0
        MDRASIZE=0
        STRIPECACHESIZE=0
        MINMAXHWSECKB=999999999
    
        STRIDE=0
        STRIPEWIDTH=0
    
    
        # GET DETAILS OF ARRAY
        ARRAY=`basename $x2`
        RAIDLEVEL=`echo $x3 | cut -d'=' -f2`
    
        case $RAIDLEVEL in
    	"raid6") NUMPARITY=2 ;;
    	"raid5") NUMPARITY=1 ;;
    	"raid4") NUMPARITY=1 ;;
    	"raid3") NUMPARITY=1 ;;
    	"raid1") NUMPARITY=1 ;;
    	"raid0") NUMPARITY=0 ;;
    	*) 
    	    echo "Unknown RAID level"
        esac
    
        echo ""
        echo "======================================================================"
        echo "FOUND ARRAY - $ARRAY / $RAIDLEVEL"
        CHUNKSIZE=`mdadm --detail /dev/$ARRAY | grep 'Chunk Size' | tr -d A-Za-z':'[:blank:]`
    
        echo "-- Chunk Size = $CHUNKSIZE KB"
    
        FOO1=`grep "$ARRAY : " /proc/mdstat`
    ARRAYSTATUS=`echo $FOO1 | cut -d' ' -f 3`
    
    
        # GET LIST OF DISKS IN ARRAY
        echo ""
        echo "Getting active devices and spares list"
        for DISK in `echo $FOO1 | cut -f 5- -d \ `
        do
    	LETTER=`echo $DISK | cut -c 1-3`
    	echo $DISK | grep '(S)' > /dev/null
    	RC=$?
    	if [ $RC -gt 0 ]
    	then
    	    echo "-- $DISK - Active"
    	    DISKS="$DISKS $LETTER"
    	    NUMDISKS=$((NUMDISKS+1))
    	else
    	    echo "-- $DISK - Spare"
    	    SPARES="$SPARES $LETTER"
    	    NUMSPARES=$((NUMSPARES+1))
    	fi
        done
        echo ""
        echo "Active Disks ($NUMDISKS) - $DISKS"
        echo "Spares Disks ($NUMSPARES) - $SPARES"
    
            
        # DETERMINE SETTINGS
        RASIZE=$(($NUMDISKS*($NUMDISKS-$NUMPARITY)*2*$CHUNKSIZE))  # Disk read ahead in 512byte blocks
        MDRASIZE=$(($RASIZE*$NUMDISKS))                            # Array read ahead in 512byte blocks
        STRIPECACHESIZE=$(($RASIZE*2/8))                           # in pages per device
    
    
        for DISK in $DISKS $SPARES
        do
    	# check max_hw_sectors_kb
    	FOO1=`cat /sys/block/$DISK/queue/max_hw_sectors_kb | awk '{print $1}'`
    	if [ $FOO1 -lt $MINMAXHWSECKB ]
    	then
    	    MINMAXHWSECKB=$FOO1
    	fi
    
    	# check NCQ
    	hdparm -I /dev/$DISK | grep NCQ >> /dev/null
    	if [ $? -eq 0 ]
    	then
    	    NUMNCQDISKS=$((NUMNCQDISKS+1))
    	fi
        done
    
        if [ $CHUNKSIZE -le $MINMAXHWSECKB ]
        then
    	MINMAXHWSECKB=$CHUNKSIZE
        fi
    
        if [ $NUMNCQDISKS -lt $NUMDISKS ]
        then
    	NCQDEPTH=1
    	echo "WARNING! ONLY $NUMNCQDISKS DISKS ARE NCQ CAPABLE!"
        fi
    
        echo ""
        echo "TUNED SETTINGS"
        echo "-- DISK READ AHEAD  = $RASIZE blocks"
        echo "-- ARRAY READ AHEAD = $MDRASIZE blocks"
        echo "-- STRIPE CACHE     = $STRIPECACHESIZE pages"
        echo "-- MAX SECTORS KB   = $MINMAXHWSECKB KB"
        echo "-- NCQ DEPTH        = $NCQDEPTH"
    
        # TUNE ARRAY
        echo ""
        echo "TUNING ARRAY"
        blockdev --setra $MDRASIZE /dev/$ARRAY
        echo "-- $ARRAY read ahead set to $MDRASIZE blocks"
    
        echo "$STRIPECACHESIZE" > /sys/block/$ARRAY/md/stripe_cache_size
        echo "-- $ARRAY stripe_cache_size set to $STRIPECACHESIZE pages"
    
        echo $MDSPEEDLIMIT > /proc/sys/dev/raid/speed_limit_max
        echo "-- $ARRAY speed_limit_max set to $MDSPEEDLIMIT"
    
        # TUNE DISKS
        echo ""
        echo "TUNING DISKS"
        echo "Settings : "
        echo "        read ahead = $RASIZE blocks"
        echo "    max_sectors_kb = $MINMAXHWSECKB KB"
        echo "         scheduler = $SCHEDULER"
        echo "       nr_requests = $NR_REQUESTS"
        echo "       queue_depth = $NCQDEPTH"
    
        
        for DISK in $DISKS $SPARES
        do
    	echo "-- Tuning $DISK"
    	blockdev --setra $RASIZE /dev/$DISK
    	echo $MINMAXHWSECKB > /sys/block/$DISK/queue/max_sectors_kb
    	echo $SCHEDULER > /sys/block/$DISK/queue/scheduler
    	echo $NR_REQUESTS > /sys/block/$DISK/queue/nr_requests
    	echo $NCQDEPTH > /sys/block/$DISK/device/queue_depth
        done
    
    # TUNE ext3/ext4 FILESYSTEMS
        STRIDE=$(($CHUNKSIZE/$BLOCKSIZE))
        STRIPEWIDTH=$(($CHUNKSIZE/$BLOCKSIZE*($NUMDISKS-$NUMPARITY)))
        echo ""
        echo "TUNING FILESYSTEMS"
        echo "For each filesystem on this array, run the following command:"
        echo "  tune2fs -E stride=$STRIDE,stripe-width=$STRIPEWIDTH <filesystem>"
        echo ""
    
    done
    Something to note... on systems with lots of memory, you'll want to set vm.dirty_background_ratio and vm.dirty_ratio to fairly low values, otherwise page flushes will kill your write IO performance. I had mine set fairly high and my write performance would hover around 100MB/s (using dd to write a 20GB file). Setting the values to 5 and 10 respectively on my server (6GB RAM), I now get 180 MB/s with the same command.
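Those two settings can be applied and persisted like this (a sketch; both are standard Linux VM sysctls, and the values are the ones from this post):

```shell
# apply immediately (run as root)
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10

# to persist across reboots, add to /etc/sysctl.conf and reload with: sysctl -p
#   vm.dirty_background_ratio = 5
#   vm.dirty_ratio = 10
```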

    My array -
    Code:
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md0 : active raid6 sdj[7] sdc[0] sde[2] sdf[3] sdd[1] sdg[4] sdi[6] sdh[5]
          5860574208 blocks level 6, 256k chunk, algorithm 2 [8/8] [UUUUUUUU]
    All disks are 1TB Samsung disks.

    Using dd to write a 20GB file -
    Code:
    5120+0 records in
    5120+0 records out
    21474836480 bytes (21 GB) copied, 117.421 s, 183 MB/s
    Using dd to read in the same 20GB file -
    Code:
    5120+0 records in
    5120+0 records out
    21474836480 bytes (21 GB) copied, 44.612 s, 481 MB/s

    Considering my numbers were ~ 90MB/s writes and 300MB/s reads when I started, I'm very pleased with the results.

    Many thanks to the OP for providing a starting point for me.

    Joo
    Last edited by jchung; September 12th, 2011 at 02:52 PM.
