
Thread: Seemingly sporadic slow ZFS IO since 22.04

  1. #11
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Seemingly sporadic slow ZFS IO since 22.04

    Quote Originally Posted by tkae-lp View Post
    I had wondered this, but from my limited knowledge of ZFS, performance shouldn't degrade by any meaningful amount until about 80% capacity? And why not on Bionic?

    There is an unused NVMe slot; are you thinking I should throw something in there for ARC and logs? (Edit: And, to be honest, along with the NVMe, probably time to drop some more memory in?)

    No, there are 10 onboard SATA ports; the board is a beast (or was at the time!) - I don't have an HBA card. 8 x 4TB SATA disks for the array, 1 x 500GB SATA for downloads (torrents/nzb, so it doesn't cane the array), 1 x SATA SSD for the OS... all on the main board.

    GPU is in slot 5, I think? IIRC this is because it touches the Noctua CPU heat sink if in slot 6.

    Here is the board: https://www.asrock.com/mb/Intel/X99%20WS/ - it's from the workstation line.

    The only thing I miss is IPMI, but we got it cheap and it supports ECC so that's what we ended up with.

    Thanks - watch this space.. I'll report back asap.

    Edit: If the ARC is the issue and more memory is required, what I don't get is how significantly different Jammy is from Bionic. I always start off with a minimal server install, and then, because I do like a GUI, do a bare-minimum Xfce install without using group packages wherever possible. Is Jammy really using that much more memory than Bionic?!
    RE: https://ubuntuforums.org/showthread.php?t=2486010

    That is a thread where 1fallen and I helped a user, and we dove deeply into ZFS RAIDZ performance with hardware SLOG and L2ARC caches. (It was a lot of fun!!!) That was also about the time I was upgrading my server and workstation. The results we found there drove what I did for them.

    As you can see, both 1fallen and I are not afraid to go outside the norms to test things. Sometimes what they tell us are assumptions. Many times I know I might break things by trying things we are told will not work. I'll try them to see whether they work or not... and if they don't, what actually happens when they blow up. That is what I found there in those tests with disk caches.

    I have been using ZFS since 2005 with OpenSolaris. I started testing ZFS on Linux with Ubuntu 12.04... then the builds from FreeBSD. I didn't see any performance hit from 18.04 through 22.04. Actually, the opposite.

    18.04 used 0.7.5; we are currently testing and verifying 2.2.1 from Noble proposed.

    These are from my notes from 20.04 to 22.04:
    Performance
    • Improved performance for interactive I/O. #11166 #11116
    • Optimized prefetch for parallel workloads. #11652
    • Improved scalability by reduced contention on locks and atomics. #11288 #12172 #12145 #11904
    • Reduced pool import time. #11470 #11502 #11469 #11467
    • Reduced fragmentation from ZIL blocks. #11389
    • Improved zfs receive performance with lightweight write. #11105
    • Improved memory management. #12152 #11429 #11574 #12150
    • Improved module load time. #11282
    I'm guilty of challenging 1fallen to try, learn and use ZFS... LOL.

    1fallen tells me that on his, he starts noticeably seeing performance changes at about 65% capacity. ZFS does use a lot of memory. Ubuntu says you need 4GB for Ubuntu itself; since 20.04, I see it pegging 4GB during an install of Server. ZFS needs at least 8GB to run, then 1GB for each TB of storage. Both my Server and Workstation have 132GB of gaming memory.

    But on the other end of that, my old Lenovo ThinkPad T520 laptop is maxed out at 16GB RAM and has 6TB of SSD storage. For that much ZFS storage it should have 18GB RAM. I tuned it by limiting the ARC to where it performs its best. It runs great.

    It doesn't cost me anything but time, to play with things, and make them run well.

    Yours is very much "a curiosity" to me, as playing movies is just a read. The stored media for my media server is just on HDD. I run mine on my workstation and notice nothing while my wife watches movies and I am running my tests, thrashing that machine doing other things. Media serving doesn't take much. My son runs his off of a Raspberry Pi 4, running Ubuntu Server with USB-attached storage.

    Yours is very strange in that it is intermittent. We can see your "best case" performance, but that does not explain what is going on when it is dragging down slow. <-- We have not found what is dragging that system down. We have not found the "cause". It could be something with ZFS, or something else that is dragging down ZFS performance with it. That is where my curiosity lies.

    The fio command I gave you was for writes, which will be slower than reads. Just change "write" to "read" and that will test your reads.
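
    For example, with the same parameters as before, the read version is just:

    Code:
    # Same fio job as before, with --rw switched from write to read
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting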

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637
    Sticky: Graphics Resolution | UbuntuForums 'system-info' Script | Posting Guidelines | Code Tags

  2. #12
    Join Date
    Nov 2023
    Beans
    76

    Re: Seemingly sporadic slow ZFS IO since 22.04

    That linked thread is very interesting - I'm all up for doing things outside the norms or recommended ways; if it works, I only care about results. I love ZFS so I will try anything and persevere. Haven't had a 'blip' in my FLAC collection since... well, using ZFS! <3

    Quote Originally Posted by MAFoElffen View Post
    That would free up about 8GB of memory...
    Wow... actually it freed up nearer 18GB
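
    (For anyone following along: the ARC size can be watched directly with something like the below - the "size" field is in bytes.)

    Code:
    # Current ARC size in bytes, straight from the kernel stats
    awk '/^size/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats
    # or, for a fuller report, if arc_summary is available:
    arc_summary | head -n 40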

    So after clearing ARC caches I get:

    Code:
    dd if=/dev/zero of=/mnt/Tank/testfile bs=1G count=6 oflag=dsync
    6+0 records in
    6+0 records out
    6442450944 bytes (6.4 GB, 6.0 GiB) copied, 226.487 s, 28.4 MB/s
    and:

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    TEST: Laying out IO file (1 file / 2048MiB)
    Jobs: 1 (f=1): [R(1)][100.0%][r=3094MiB/s][r=3094 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=1): err= 0: pid=1512586: Fri Dec  1 01:26:23 2023
      read: IOPS=2869, BW=2870MiB/s (3009MB/s)(10.0GiB/3568msec)
        slat (usec): min=178, max=1970, avg=345.04, stdev=110.20
        clat (usec): min=3, max=49773, avg=10664.54, stdev=3102.23
         lat (usec): min=192, max=50837, avg=11010.13, stdev=3203.26
        clat percentiles (usec):
         |  1.00th=[ 3851],  5.00th=[ 5932], 10.00th=[ 5997], 20.00th=[ 6325],
         | 30.00th=[11207], 40.00th=[11338], 50.00th=[11338], 60.00th=[11469],
         | 70.00th=[11600], 80.00th=[11600], 90.00th=[13435], 95.00th=[13566],
         | 99.00th=[16712], 99.50th=[19530], 99.90th=[44827], 99.95th=[48497],
         | 99.99th=[49546]
       bw (  MiB/s): min= 1968, max= 3102, per=98.28%, avg=2820.57, stdev=394.85, samples=7
       iops        : min= 1968, max= 3102, avg=2820.57, stdev=394.85, samples=7
      lat (usec)   : 4=0.05%, 250=0.05%, 500=0.05%, 750=0.05%, 1000=0.09%
      lat (msec)   : 2=0.25%, 4=0.49%, 10=20.02%, 20=78.52%, 50=0.44%
      cpu          : usr=1.01%, sys=98.68%, ctx=36, majf=0, minf=8206
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=2870MiB/s (3009MB/s), 2870MiB/s-2870MiB/s (3009MB/s-3009MB/s), io=10.0GiB (10.7GB), run=3568-3568msec
    Whilst doing the above read, I see:

    Code:
    Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
    loop0             0.00         0.00         0.00         0.00          0          0          0
    loop1             0.00         0.00         0.00         0.00          0          0          0
    loop10            0.00         0.00         0.00         0.00          0          0          0
    loop11            0.00         0.00         0.00         0.00          0          0          0
    loop12            0.00         0.00         0.00         0.00          0          0          0
    loop13            0.00         0.00         0.00         0.00          0          0          0
    loop14            0.00         0.00         0.00         0.00          0          0          0
    loop2             0.00         0.00         0.00         0.00          0          0          0
    loop3             0.00         0.00         0.00         0.00          0          0          0
    loop4             0.00         0.00         0.00         0.00          0          0          0
    loop5             0.00         0.00         0.00         0.00          0          0          0
    loop6             0.00         0.00         0.00         0.00          0          0          0
    loop7             0.00         0.00         0.00         0.00          0          0          0
    loop8             0.00         0.00         0.00         0.00          0          0          0
    loop9             0.00         0.00         0.00         0.00          0          0          0
    sda              16.50         0.00      1030.00         0.00          0       2060          0
    sdb              12.50         0.00      1030.00         0.00          0       2060          0
    sdc              19.00         0.00      1028.00         0.00          0       2056          0
    sdd              20.00         0.00      1026.00         0.00          0       2052          0
    sde               0.00         0.00         0.00         0.00          0          0          0
    sdf              19.00         0.00      1030.00         0.00          0       2060          0
    sdg               0.00         0.00         0.00         0.00          0          0          0
    sdh               6.00         0.00      1044.00         0.00          0       2088          0
    sdi              18.00         0.00      1032.00         0.00          0       2064          0
    sdj              18.50         0.00      1030.00         0.00          0       2060          0
    (presumably the above is whilst it writes the file first to read back, but it's SLOW... took 5 mins or so)

    And at the same time, a 1080p film froze and my whole XFCE environment got very sluggish

    Then after:

    Code:
    echo $RESET | sudo tee /sys/module/zfs/parameters/zfs_arc_shrinker_limit
    I see an increase in mem usage of about 2-3GB within a minute or so, and then:

    Code:
    dd if=/dev/zero of=/mnt/Tank/testfile bs=1G count=6 oflag=dsync
    6+0 records in
    6+0 records out
    6442450944 bytes (6.4 GB, 6.0 GiB) copied, 21.0576 s, 306 MB/s
    and

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    TEST: Laying out IO file (1 file / 2048MiB)
    Jobs: 1 (f=1): [R(1)][80.0%][r=2721MiB/s][r=2721 IOPS][eta 00m:01s]
    TEST: (groupid=0, jobs=1): err= 0: pid=1520710: Fri Dec  1 01:31:49 2023
      read: IOPS=2699, BW=2700MiB/s (2831MB/s)(10.0GiB/3793msec)
        slat (usec): min=303, max=885, avg=367.12, stdev=31.61
        clat (usec): min=3, max=25151, avg=11375.32, stdev=1025.73
         lat (usec): min=348, max=26038, avg=11742.93, stdev=1040.92
        clat percentiles (usec):
         |  1.00th=[ 7308],  5.00th=[11207], 10.00th=[11207], 20.00th=[11338],
         | 30.00th=[11338], 40.00th=[11338], 50.00th=[11338], 60.00th=[11338],
         | 70.00th=[11469], 80.00th=[11469], 90.00th=[11600], 95.00th=[11731],
         | 99.00th=[13566], 99.50th=[13566], 99.90th=[21627], 99.95th=[23462],
         | 99.99th=[24773]
       bw (  MiB/s): min= 2464, max= 2752, per=99.69%, avg=2691.43, stdev=100.89, samples=7
       iops        : min= 2464, max= 2752, avg=2691.43, stdev=100.89, samples=7
      lat (usec)   : 4=0.05%, 500=0.05%, 750=0.05%
      lat (msec)   : 2=0.15%, 4=0.26%, 10=0.81%, 20=98.50%, 50=0.14%
      cpu          : usr=1.56%, sys=98.36%, ctx=5, majf=0, minf=8204
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=2700MiB/s (2831MB/s), 2700MiB/s-2700MiB/s (2831MB/s-2831MB/s), io=10.0GiB (10.7GB), run=3793-3793msec
    The above completed quickly (less than 20 seconds?). Please excuse the rough time estimates. It's nearly 2.30am here and I need to turn in.

    Memory usage still climbing by another 3GB after several minutes.

    At this point, movie watching is nice and snappy again. You can skip forwards/backwards by an hour and it's instant. I know the dd command I am using isn't ideal, but it's still certainly slower than I would have hoped.

    So I guess the obvious next question is: where do I go from here? I'm quite happy to get an NVMe and more RAM. Right time of year to be grabbing stuff. I have a 512GB NVMe spare here, but I am happy to grab a 1TB or 2TB.

    More memory is always good but unfortunately the board is 128GB max, so I could grab 2 x 64GB modules but it's a waste of the existing 16s. I currently have 2 x Micron 36ASF2G72PZ-2G1A2 modules in there and Micron 64s seem to be about £154 each new... I know it's not ideal having mismatched brands and sizes but I could get 2 x 32GB SK-Hynix modules for £80 preowned, giving me 96GB for not a lot of money. This seems like the logical choice to me.

    A 1TB Crucial NVMe is ~£53, a 2TB is ~£87. I don't mind spending if needed but also have to bear in mind Christmas is round the corner. What would you guys do? At the end of the day I just want it to work, so if I have to spend £167 on it, then it is what it is.

    I said in my original post that there was plenty of free memory, but I have been watching it, and there's actually not a lot... I must have had a brain fart. I think at the very least memory is on the cards. (I still don't understand why I didn't encounter this in Bionic though)

    Edit: I could also look at having a little box-set prune to get pool capacity usage down, but again Bionic was fine and capacity has not changed at all. It literally happened as soon as Jammy was installed!

    Edit2: The cause couldn't be something to do with ksplice could it? I've been trying to think about any possible differences and that is the only one I can think of.... but:

    Code:
    uptrack-show                                                                          
    Installed updates:
    [r9hrmw28] Enablement update for live patching.
    [qeibbp29] Provide an interface to freeze tasks.
    [5r11iivt] Denial-of-service when checking if an address is a jump label.
    
    Effective kernel version is 5.15.0-89.99
    I don't see anything there that could affect this.

    (Cheers Oracle for the Ubuntu exclusive free live patching btw <3)

    Edit3: Actually suggesting the above is really stupid. If it was a kernel patch it wouldn't come and go... I need to go to bed! will pick this up again tomorrow!
    Last edited by tkae-lp; December 1st, 2023 at 03:47 AM.

  3. #13
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Seemingly sporadic slow ZFS IO since 22.04

    Yes, continued tomorrow. I spent a long day getting around a recent update bug with zfs-dkms. I think this last update was "evil" vs 6.2 kernels.

    Bringing up ksplice is something I didn't think about until you brought it up... It does the updates and applies them, but they all exist in memory until the next reboot, when they take effect on disk (and that memory is freed)...

    Dang, let me sleep on that.

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637
    Sticky: Graphics Resolution | UbuntuForums 'system-info' Script | Posting Guidelines | Code Tags

  4. #14
    Join Date
    Nov 2023
    Beans
    76

    Re: Seemingly sporadic slow ZFS IO since 22.04

    About ksplice, can't imagine that's it then, as the problem has been seen directly after a reboot.

    I may have hit a snag in my memory master plan.

    1. That preowned memory for £80 has been snapped up
    2. The manual for the board here:
    https://download.asrock.com/Manual/X99%20WS.pdf on page 16 says modules have to be identical for quad channel, but doesn't say if they have to be for triple, presumably not.
    3. According to the system info script, my memory is Micron 36ASF2G72PZ-2G1A2, which Micron's site gives info about here:
    https://www.micron.com/products/dram-modules/rdimm/part-catalog/mta36asf2g72pz-2g1 but all the modules for sale appear to be different: https://uk.crucial.com/compatible-upgrade-for/asrock/x99-ws#memory?module-type(-)RDIMM
    4. There is a list of compatible board modules here:
    https://www.asrock.com/mb/Intel/X99%20WS/#Memory but there's one 32GB listed and I am not even sure if that is an RDIMM.
    5. The manual doesn't specify module size limit per memory type (for example, I know some mobos that can take RDIMMs and LRDIMMs can take larger LRDIMM sizes per slot - I bet that Crucial 32GB is a UDIMM)

    Looks like I could very well be limited to using the more expensive option of 2 x 16GB modules and sacrificing such a jump in memory and quad channel. Bummer

    Edit: Actually all might not be lost - the manual says: "For quad channel configuration, you always need to install identical (the same brand, speed, size and chip-type) DDR4 DIMM pairs."

    So maybe two of these would do it:
    https://www.ebay.co.uk/itm/402574671972 and I can look at another 2/4 down the line. But 64GB of memory will give the array more breathing room.

    Edit2: Wow.. this is a learning curve. Just read you can't mix rank configs. So the ebay listing is no good as they are 2Rx8. Looks like mine are:

    DDR4 Module Org. (Package Ranks, Device Width): 08

    Presumably this means 1Rx8, so I would need these:
    https://uk.crucial.com/memory/server-ddr4/mta9asf2g72pz-3g2r/

    Edit 3: Right, couldn't find a strike-through button, so ignore everything in grey above - it was easier just to shut the machine down and take a stick out. The exact PN is: MTA36ASF2G72PZ-2G1A2IG

    Note the IG at the end. I find it frustrating that Micron do not list these variant part numbers... BUT the fantastic news is I found an ebay seller with two sticks for £31, so next week I'll have 64GB of RAM. I can also find another seller that has TONS of MTA36ASF2G72PZ-2G1A2IJ (J at the end).

    So I have emailed them to ask what the difference is between that and IG, because it turns out that this stuff is actually dirt cheap. They're selling 8 x 16GB modules for £144 or best offer! Have just ordered a 2TB NVMe for ARC and logs too. Now I just need to fix this slow-down issue. In the middle of having a little box-set prune as we speak.
    Last edited by tkae-lp; December 1st, 2023 at 06:40 PM.

  5. #15
    #&thj^% - I Ubuntu, Therefore, I Am
    Join Date
    Aug 2016
    Beans
    Hidden!

    Re: Seemingly sporadic slow ZFS IO since 22.04

    My NVMe drives all have differences, i.e.:
    Code:
    /dev/nvme0n1 vendor: Western Digital model: WD Blue SN570 500GB
    It's a cheaper WD NVMe drive, and my write speeds are much slower.
    Code:
    Run status group 0 (all jobs):
      WRITE: bw=642MiB/s (674MB/s), 642MiB/s-642MiB/s (674MB/s-674MB/s), io=10.0GiB (10.7GB), run=15942-15942msec
    The read speed is decent, not great:
    Code:
    Run status group 0 (all jobs):
       READ: bw=2985MiB/s (3130MB/s), 2985MiB/s-2985MiB/s (3130MB/s-3130MB/s), io=10.0GiB (10.7GB), run=3431-3431msec
    MAFoElffen will remember when I added an SSD drive for additional swap, but it was still too slow for my liking.
    I'm waiting for an HBA card before I go crazy on tweaking my performance. (Note I'm on Noble 24.04 Testing.)
    I'm now looking at your link for "compatible board modules".
    And my memory is "DDR4 Crucial 2666 8GB" x2:
    Code:
    Memory:
      System RAM: total: 16 GiB available: 15.46 GiB used: 7.17 GiB (46.4%)
      Array-1: capacity: 32 GiB slots: 2 modules: 2 EC: None
      Device-1: DIMM 0 type: DDR4 size: 8 GiB speed: 3200 MT/s
      Device-2: DIMM 0 type: DDR4 size: 8 GiB speed: 3200 MT/s
    Last edited by #&thj^%; December 1st, 2023 at 06:41 PM.

  6. #16
    Join Date
    Nov 2023
    Beans
    76

    Re: Seemingly sporadic slow ZFS IO since 22.04

    I went against what I said yesterday and didn't totally cheap out on the NVMe. I went for the old reliable Sammy 980 Pro because I have it in my ThinkPad and it's a great unit. Should be able to max out the M.2 Ultra port (in theory). I'll post what it benches at when I get it. IIRC it disables lane 3, but there's nothing in it.

    When I got the 980 Pro for my ThinkPad it was....... a LOT more... I'm embarrassed to say how much I paid for it when it was released.

    That ASRock page was a PITA... why go to all that effort to list that info and not list some part numbers or DIMM types?!
    Last edited by tkae-lp; December 1st, 2023 at 07:39 PM.

  7. #17
    #&thj^% - I Ubuntu, Therefore, I Am
    Join Date
    Aug 2016
    Beans
    Hidden!

    Re: Seemingly sporadic slow ZFS IO since 22.04

    Complete information regarding hardware will always be lacking.
    You chose wisely.
    Please keep us updated; it helps others like myself when considering new ZFS hardware.

  8. #18
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Seemingly sporadic slow ZFS IO since 22.04

    Strike-through BBCode is like this: [s]Text[/s]... It will show up with the text struck through.

    Yes. I remember when TheFu got his motherboard (new), he shopped around and did his homework. He finally ordered a matched set of two RAM sticks (at first), then months later made sure that he ordered "the same memory" from "the same vendor"...

    He was so P'ed off when his system RAM slowed down about 20-30% just by adding those additional two chips in. Sometimes that just doesn't make sense.
    Last edited by MAFoElffen; December 1st, 2023 at 11:17 PM.

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637
    Sticky: Graphics Resolution | UbuntuForums 'system-info' Script | Posting Guidelines | Code Tags

  9. #19
    Join Date
    Nov 2023
    Beans
    76

    Post Re: Seemingly sporadic slow ZFS IO since 22.04

    Quote Originally Posted by 1fallen View Post
    You chose wisely.


    Sorry, couldn't resist

    Quote Originally Posted by 1fallen View Post
    Please keep us updated; it helps others like myself when considering new ZFS hardware.
    Thanks, will do.

    Quote Originally Posted by MAFoElffen View Post
    Strike-through BBCode is like this: [s]Text[/s]... It will show up with the text struck through.
    Ah, ok!

    Quote Originally Posted by MAFoElffen View Post
    Yes. I remember when TheFu got his motherboard (new), he shopped around and did his homework. He finally ordered a matched set of two RAM sticks (at first), then months later made sure that he ordered "the same memory" from "the same vendor"...

    He was so P'ed off when his system RAM slowed down about 20-30% just by adding those additional two chips in. Sometimes that just doesn't make sense.
    Well... I'm no expert, but based on what I've recently learned, that usually boils down to (in no particular order):

    1. The mobo is a poor design
    2. The position of the memory was incorrect (not optimal)
    3. The memory was different (different PN)
    4. The memory was a different type

    A friend of mine used to work for a small business that refurbished tape drives. The dodgy guy who owned it would print his own PN labels..... yup, you read that right. He said it used to frustrate the boss that a unit was 'identical', but because the part number was one digit different (like a slightly older hardware rev.) they would not be able to shift certain things, so he'd remove the PN label and put their own on. Dodgy dodgy ******* business...... I never saw any of these labels, but they were doing it for years before the business folded (for an entirely different, but equally dodgy, reason), so they must have been at least half convincing. I think they were quite well known as a business too; they did a lot of trade.

    I'm not saying this was the case with TheFu, but it can and does happen. For example, I'm sure that IJ memory will work just fine in my board if it's all the IJ stuff, as will all the ones listed on Crucial if it's all that PN. But the ones on Crucial, for example, will definitely slow my board down if I mix them, even though they work. It could even be that the label only showed the main part of the PN: dmidecode only showed 36ASF2G72PZ-2G1A2, but there seem to be two variants of this (IG and IJ), as well as a 36ASF2G72PZ-2G1B2.

    So at the moment:

    Code:
    free -m
                   total        used        free      shared  buff/cache   available
    Mem:           32010        9999       13893          61        8117       21480
    Swap:          12286           0       12286
    and I get:

    Code:
    rm /mnt/Tank/testfile && dd if=/dev/zero of=/mnt/Tank/testfile bs=1G count=6 oflag=dsync
    6+0 records in
    6+0 records out
    6442450944 bytes (6.4 GB, 6.0 GiB) copied, 10.3578 s, 622 MB/s
    and:

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=write --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    Jobs: 1 (f=1): [W(1)][53.8%][w=484MiB/s][w=484 IOPS][eta 00m:06s]
    Jobs: 1 (f=1): [W(1)][86.7%][w=525MiB/s][w=525 IOPS][eta 00m:02s] 
    Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]                        
    Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s] 
    TEST: (groupid=0, jobs=1): err= 0: pid=446642: Sat Dec  2 00:04:32 2023
      write: IOPS=473, BW=473MiB/s (496MB/s)(10.0GiB/21639msec); 0 zone resets
        slat (usec): min=233, max=3623, avg=1521.72, stdev=671.83
        clat (usec): min=3, max=6053.1k, avg=65226.74, stdev=329496.49
         lat (usec): min=329, max=6054.9k, avg=66749.10, stdev=329553.81
        clat percentiles (msec):
         |  1.00th=[   11],  5.00th=[   12], 10.00th=[   13], 20.00th=[   16],
         | 30.00th=[   50], 40.00th=[   51], 50.00th=[   55], 60.00th=[   57],
         | 70.00th=[   59], 80.00th=[   63], 90.00th=[   67], 95.00th=[   77],
         | 99.00th=[   88], 99.50th=[   94], 99.90th=[ 6007], 99.95th=[ 6074],
         | 99.99th=[ 6074]
       bw (  KiB/s): min=51200, max=2265088, per=100.00%, avg=638016.00, stdev=400354.97, samples=32
       iops        : min=   50, max= 2212, avg=623.06, stdev=390.97, samples=32
      lat (usec)   : 4=0.01%, 10=0.04%, 500=0.02%, 750=0.02%, 1000=0.01%
      lat (msec)   : 2=0.08%, 4=0.15%, 10=0.65%, 20=21.60%, 50=16.66%
      lat (msec)   : 100=60.46%, >=2000=0.30%
      fsync/fdatasync/sync_file_range:
        sync (nsec): min=1353, max=1353, avg=1353.00, stdev= 0.00
        sync percentiles (nsec):
         |  1.00th=[ 1352],  5.00th=[ 1352], 10.00th=[ 1352], 20.00th=[ 1352],
         | 30.00th=[ 1352], 40.00th=[ 1352], 50.00th=[ 1352], 60.00th=[ 1352],
         | 70.00th=[ 1352], 80.00th=[ 1352], 90.00th=[ 1352], 95.00th=[ 1352],
         | 99.00th=[ 1352], 99.50th=[ 1352], 99.90th=[ 1352], 99.95th=[ 1352],
         | 99.99th=[ 1352]
      cpu          : usr=3.15%, sys=19.69%, ctx=75103, majf=0, minf=15
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=0,10240,0,1 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
      WRITE: bw=473MiB/s (496MB/s), 473MiB/s-473MiB/s (496MB/s-496MB/s), io=10.0GiB (10.7GB), run=21639-21639msec
    and:

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    Jobs: 1 (f=1): [R(1)][-.-%][r=3020MiB/s][r=3020 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=1): err= 0: pid=515253: Sat Dec  2 00:05:26 2023
      read: IOPS=2920, BW=2921MiB/s (3063MB/s)(10.0GiB/3506msec)
        slat (usec): min=273, max=1151, avg=339.36, stdev=47.28
        clat (usec): min=3, max=31257, avg=10501.61, stdev=1305.05
         lat (usec): min=324, max=32409, avg=10841.35, stdev=1340.08
        clat percentiles (usec):
         |  1.00th=[ 6718],  5.00th=[10159], 10.00th=[10159], 20.00th=[10159],
         | 30.00th=[10159], 40.00th=[10290], 50.00th=[10290], 60.00th=[10290],
         | 70.00th=[10421], 80.00th=[10552], 90.00th=[12125], 95.00th=[12125],
         | 99.00th=[13698], 99.50th=[15270], 99.90th=[26346], 99.95th=[28705],
         | 99.99th=[30802]
       bw (  MiB/s): min= 2344, max= 3040, per=99.76%, avg=2913.71, stdev=252.00, samples=7
       iops        : min= 2344, max= 3040, avg=2913.71, stdev=252.00, samples=7
      lat (usec)   : 4=0.05%, 500=0.05%, 750=0.05%, 1000=0.02%
      lat (msec)   : 2=0.13%, 4=0.31%, 10=0.96%, 20=98.21%, 50=0.22%
      cpu          : usr=0.54%, sys=99.43%, ctx=12, majf=0, minf=8205
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=2921MiB/s (3063MB/s), 2921MiB/s-2921MiB/s (3063MB/s-3063MB/s), io=10.0GiB (10.7GB), run=3506-3506msec
    During each job, memory usage climbs by a few gigs then drops back down again. Looks like I'm in a reasonable patch at the moment.

    But again 5 mins later:

    Code:
    rm /mnt/Tank/testfile && dd if=/dev/zero of=/mnt/Tank/testfile bs=1G count=6 oflag=dsync
    6+0 records in
    6+0 records out
    6442450944 bytes (6.4 GB, 6.0 GiB) copied, 72.1945 s, 89.2 MB/s
    and:

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=write --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    Jobs: 1 (f=1): [W(1)][50.0%][w=593MiB/s][w=593 IOPS][eta 00m:07s]
    Jobs: 1 (f=1): [W(1)][72.2%][w=123MiB/s][w=123 IOPS][eta 00m:05s] 
    Jobs: 1 (f=1): [W(1)][76.0%][w=31.0MiB/s][w=31 IOPS][eta 00m:06s] 
    Jobs: 1 (f=1): [W(1)][75.8%][w=10.0MiB/s][w=10 IOPS][eta 00m:08s] 
    Jobs: 1 (f=1): [W(1)][75.6%][w=2050KiB/s][w=2 IOPS][eta 00m:10s] 
    Jobs: 1 (f=1): [W(1)][80.4%][w=133MiB/s][w=133 IOPS][eta 00m:09s]
    Jobs: 1 (f=1): [W(1)][97.7%][w=148MiB/s][w=148 IOPS][eta 00m:01s] 
    Jobs: 1 (f=1): [W(1)][98.0%][eta 00m:01s] 
    Jobs: 1 (f=1): [W(1)][98.1%][eta 00m:01s] 
    TEST: (groupid=0, jobs=1): err= 0: pid=520493: Sat Dec  2 00:11:57 2023
      write: IOPS=199, BW=199MiB/s (209MB/s)(10.0GiB/51363msec); 0 zone resets
        slat (usec): min=295, max=1187.2k, avg=4112.57, stdev=17844.91
        clat (usec): min=3, max=9229.9k, avg=154951.33, stdev=640495.91
         lat (usec): min=348, max=9231.6k, avg=159064.75, stdev=647743.56
        clat percentiles (msec):
         |  1.00th=[   11],  5.00th=[   23], 10.00th=[   27], 20.00th=[   29],
         | 30.00th=[   49], 40.00th=[   53], 50.00th=[   54], 60.00th=[   56],
         | 70.00th=[   56], 80.00th=[   78], 90.00th=[  201], 95.00th=[  426],
         | 99.00th=[ 2265], 99.50th=[ 5000], 99.90th=[ 9194], 99.95th=[ 9194],
         | 99.99th=[ 9194]
       bw (  KiB/s): min= 2048, max=1263616, per=100.00%, avg=245982.07, stdev=305800.34, samples=83
       iops        : min=    2, max= 1234, avg=240.22, stdev=298.63, samples=83
      lat (usec)   : 4=0.02%, 10=0.02%, 20=0.01%, 500=0.01%, 750=0.01%
      lat (usec)   : 1000=0.01%
      lat (msec)   : 2=0.06%, 4=0.09%, 10=0.63%, 20=2.89%, 50=34.20%
      lat (msec)   : 100=43.91%, 250=10.29%, 500=3.43%, 750=1.62%, 1000=0.98%
      lat (msec)   : 2000=0.74%, >=2000=1.08%
      fsync/fdatasync/sync_file_range:
        sync (nsec): min=1345, max=1345, avg=1345.00, stdev= 0.00
        sync percentiles (nsec):
         |  1.00th=[ 1352],  5.00th=[ 1352], 10.00th=[ 1352], 20.00th=[ 1352],
         | 30.00th=[ 1352], 40.00th=[ 1352], 50.00th=[ 1352], 60.00th=[ 1352],
         | 70.00th=[ 1352], 80.00th=[ 1352], 90.00th=[ 1352], 95.00th=[ 1352],
         | 99.00th=[ 1352], 99.50th=[ 1352], 99.90th=[ 1352], 99.95th=[ 1352],
         | 99.99th=[ 1352]
      cpu          : usr=1.32%, sys=10.72%, ctx=70195, majf=0, minf=14
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=0,10240,0,1 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
      WRITE: bw=199MiB/s (209MB/s), 199MiB/s-199MiB/s (209MB/s-209MB/s), io=10.0GiB (10.7GB), run=51363-51363msec
    and:

    Code:
    fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.28
    Starting 1 process
    Jobs: 1 (f=1): [R(1)][-.-%][r=2833MiB/s][r=2833 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=1): err= 0: pid=559190: Sat Dec  2 00:13:26 2023
      read: IOPS=2754, BW=2755MiB/s (2889MB/s)(10.0GiB/3717msec)
        slat (usec): min=290, max=1156, avg=359.97, stdev=47.84
        clat (usec): min=2, max=32686, avg=11134.73, stdev=1355.69
         lat (usec): min=337, max=33843, avg=11495.11, stdev=1391.41
        clat percentiles (usec):
         |  1.00th=[ 7111],  5.00th=[10814], 10.00th=[10814], 20.00th=[10814],
         | 30.00th=[10945], 40.00th=[10945], 50.00th=[10945], 60.00th=[10945],
         | 70.00th=[10945], 80.00th=[11076], 90.00th=[12911], 95.00th=[12911],
         | 99.00th=[14091], 99.50th=[16057], 99.90th=[27919], 99.95th=[30278],
         | 99.99th=[32375]
       bw (  MiB/s): min= 2192, max= 2846, per=99.54%, avg=2742.29, stdev=242.91, samples=7
       iops        : min= 2192, max= 2846, avg=2742.29, stdev=242.91, samples=7
      lat (usec)   : 4=0.05%, 500=0.05%, 750=0.05%
      lat (msec)   : 2=0.15%, 4=0.29%, 10=0.83%, 20=98.33%, 50=0.25%
      cpu          : usr=1.26%, sys=98.68%, ctx=5, majf=0, minf=8203
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=2755MiB/s (2889MB/s), 2755MiB/s-2755MiB/s (2889MB/s-2889MB/s), io=10.0GiB (10.7GB), run=3717-3717msec
    Memory usage seemed heavier this time. It used more like 5-6GB and only dropped back down to:

    Code:
    free -m
                   total        used        free      shared  buff/cache   available
    Mem:           32010       10180       13712          60        8117       21301
    Swap:          12286           0       12286
    But this does seem to solve part of the riddle... it's nothing to do with running out of memory.
    Last edited by tkae-lp; December 2nd, 2023 at 01:26 AM. Reason: Add pic. Fix code tags. Typo. Additional info.

  10. #20
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Seemingly sporadic slow ZFS IO since 22.04

    Ready to try a test on the cheap, without spending any money yet?

    These are the entries I want you to change, noted from my /etc/modprobe.d/zfs.conf file:
    Code:
    mafoelffen@Mikes-B460M:~$ grep zfs_arc_ /etc/modprobe.d/zfs.conf
    # This is the default for mine, which is half the total memory (64GiB here). Yours will be about 16GB...
    options zfs zfs_arc_max=68719476736
    # The default for minimum is about 1GB
    options zfs zfs_arc_min=1073741824
    Try editing yours and set zfs_arc_max to 4GB. It is in bytes, so 4 x 1024 x 1024 x 1024 = 4294967296...
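
    So with that 4GB cap, your edited lines in /etc/modprobe.d/zfs.conf would look something like this:

    Code:
    # Cap the ARC at 4GiB (4 x 1024 x 1024 x 1024 bytes)
    options zfs zfs_arc_max=4294967296
    # Leave the minimum at the ~1GB default
    options zfs zfs_arc_min=1073741824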

    Then update the initramfs image to pick up that change:
    Code:
    sudo update-initramfs -c -k all
    You said Samsung 980 Pro, right... What about 1TB? Pop that in. Give it a GPT partition table and 2 partitions of 512GB... type BF07.
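
    If it helps, something like this with sgdisk would do it (just a sketch - I'm assuming the new drive shows up as /dev/nvme0n1, so check the device name first):

    Code:
    # CAUTION: wipes the target disk -- make sure /dev/nvme0n1 really is the new, empty NVMe
    sudo sgdisk --zap-all /dev/nvme0n1
    # Two 512GB partitions, GPT type code BF07 as suggested above
    sudo sgdisk -n 1:0:+512G -t 1:BF07 /dev/nvme0n1
    sudo sgdisk -n 2:0:+512G -t 2:BF07 /dev/nvme0n1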

    Do
    Code:
    ls -l /dev/disk/by-id/
    To get the unique_diskid... then
    Code:
    # Set a variable for the NVMe disk's Unique_DiskID
    DISK=/dev/disk/by-id/<unique_diskid>
    ## Adding L2ARC
    # tank
    sudo zpool add tank cache $DISK-part1
    ## Add single SLOG
    sudo zpool add -f tank log $DISK-part2
    After you add it in, test.
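
    To confirm the new cache and log devices are actually getting hit during your tests, something like this works (substitute your pool name):

    Code:
    # Per-vdev I/O every 5 seconds; the cache and log devices get their own rows
    zpool iostat -v tank 5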

    Since these are disk caches for HDD vdevs, even using an SSD would help. But yes, I use NVMe for mine also:
    Code:
    mafoelffen@Mikes-B460M:~$ sudo zpool status -v datapool
      pool: datapool
     state: ONLINE
      scan: scrub repaired 0B in 00:18:49 with 0 errors on Tue Nov 28 12:13:28 2023
    config:
    
        NAME                                                   STATE     READ WRITE CKSUM
        datapool                                               ONLINE       0     0     0
          raidz2-0                                             ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_2TB_S6PNNM0TA09560A-part1  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_2TB_S6PNNM0TA11601H-part1  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_2TB_S6PNNM0TA47393M-part1  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_2TB_S6PNNS0W330507J-part1  ONLINE       0     0     0
            ata-Samsung_SSD_870_EVO_2TB_S6PNNM0TB08933B-part1  ONLINE       0     0     0
        logs    
          nvme-Samsung_SSD_970_EVO_2TB_S464NB0KB10521K-part2   ONLINE       0     0     0
        cache
          nvme-Samsung_SSD_970_EVO_2TB_S464NB0KB10521K-part1   ONLINE       0     0     0
    
    errors: No known data errors

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637
    Sticky: Graphics Resolution | UbuntuForums 'system-info' Script | Posting Guidelines | Code Tags

