Page 1 of 3
Results 1 to 10 of 21

Thread: RocketRAID 622 rr62x:hpt_reset

  1. #1
    Join Date
    Nov 2009
    Beans
    12

    RocketRAID 622 rr62x:hpt_reset

    Well this is getting frustrating! I've had Linux software RAID (mdadm) working for over 3 years. I needed to expand and slowly replace the old hard drives, as they will be dying soon.

    I bought the Sans Digital TR5M-BP kit:

    http://www.sansdigital.com/towerraid-plus/tr5mbp.html

    It's a VERY slick piece of hardware. The only negative is that it comes with the Highpoint RocketRAID 622 PCI-e card.

    I'm running Ubuntu 10.04.1 LTS Desktop on an Asus P5Q-E and have successfully migrated the drives out of my case and into this TR5M case, plus another TR5M case with all new drives. I compiled the driver myself, since Highpoint is STILL on 9.10 for precompiled drivers. I built the rr62x.ko driver and I am able to load it, see all of my drives and access them. So I'm using the RocketRAID in a pass-through or JBOD setting, NOT fake-raid.

    Code:
    [   63.201986] rr62x:RocketRAID 62x SATA controller driver v1.1 (Oct  7 2010 19:38:39)
    [   63.202012] pci 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
    [   63.202017] pci 0000:02:00.0: setting latency timer to 64
    [   63.202034] rr62x:adapter at PCI 2:0:0, IRQ 16
    [   63.205002] rr62x:[0 0  ] start port.
    [   63.205002] rr62x:[0 0  ] start port hard reset (probe 1).
    [   63.205002] rr62x:[0 1  ] start port.
    [   63.205002] rr62x:[0 1  ] start port hard reset (probe 1).
    [   66.380095] rr62x:[0 0  ] start port soft reset (probe 1).
    [   66.384091] rr62x:[0 1  ] start port soft reset (probe 1).
    [   67.675183] rr62x:[0 0  ] pmp attached: vendor 1095 device 3726.
    [   67.678166] rr62x:[0 1  ] pmp attached: vendor 1095 device 3726.
    [   72.961072] rr62x:[0 0 0] start device soft reset.
    [   73.645074] rr62x:[0 1 0] start device soft reset.
    [   74.296111] rr62x:[0 0 1] start device soft reset.
    [   74.950130] rr62x:[0 1 1] start device soft reset.
    [   74.950130] rr62x:[0 0 2] start device soft reset.
    [   76.263120] rr62x:[0 1 2] start device soft reset.
    [   76.263120] rr62x:[0 0 3] start device soft reset.
    [   77.575113] rr62x:[0 1 3] start device soft reset.
    [   77.575113] rr62x:[0 0 4] start device soft reset.
    [   78.887102] rr62x:[0 1 4] start device soft reset.
    [   78.887102] rr62x:[0 0  ] port started successfully.
    [   78.887102] rr62x:[0 0 0] device probed successfully.
    [   78.887102] rr62x:[0 0 1] device probed successfully.
    [   78.887102] rr62x:[0 0 2] device probed successfully.
    [   78.887102] rr62x:[0 0 3] device probed successfully.
    [   78.887102] rr62x:[0 0 4] device probed successfully.
    [   79.870613] rr62x:[0 1  ] port started successfully.
    [   79.870613] rr62x:[0 1 0] device probed successfully.
    [   79.870613] rr62x:[0 1 1] device probed successfully.
    [   79.870613] rr62x:[0 1 2] device probed successfully.
    [   79.870613] rr62x:[0 1 3] device probed successfully.
    [   79.870613] rr62x:[0 1 4] device probed successfully.
    [   80.269774] scsi8 : rr62x
    [   80.270012] scsi 8:0:0:0: Direct-Access     ATA      WDC WD10EACS-00D 01.0 PQ: 0 ANSI: 5
    [   80.270078] scsi 8:0:1:0: Direct-Access     ATA      WDC WD10EACS-00D 01.0 PQ: 0 ANSI: 5
    [   80.270144] scsi 8:0:2:0: Direct-Access     ATA      WDC WD10EACS-00D 01.0 PQ: 0 ANSI: 5
    [   80.270207] scsi 8:0:3:0: Direct-Access     ATA      WDC WD10EACS-00Z 01.0 PQ: 0 ANSI: 5
    [   80.270270] scsi 8:0:4:0: Direct-Access     ATA      WDC WD10EACS-00D 01.0 PQ: 0 ANSI: 5
    [   80.270331] scsi 8:0:5:0: Direct-Access     ATA      WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
    [   80.270392] scsi 8:0:6:0: Direct-Access     ATA      WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
    [   80.270454] scsi 8:0:7:0: Direct-Access     ATA      WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
    [   80.270515] scsi 8:0:8:0: Direct-Access     ATA      WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
    [   80.270576] scsi 8:0:9:0: Direct-Access     ATA      WDC WD20EARS-00J 80.0 PQ: 0 ANSI: 5
    See, everything works great so far!! Happy, Happy!

    Now for the kick in the head....

    On long writes (let's say I'm copying directories over from one raid to another, or large 4GB+ files), the damn driver resets on me!

    Code:
    [  517.988113] rr62x:hpt_reset(8/0/5)
    [  517.992099] rr62x:[0 1  ] failed to disable comm status change bits
    [  517.992099] rr62x:[0 1  ] start port.
    [  517.992099] rr62x:[0 1  ] start port hard reset (probe 1).
    [  521.167458] rr62x:hpt_reset(8/0/6)
    [  521.167463] rr62x:hpt_reset(8/0/7)
    [  521.167466] rr62x:hpt_reset(8/0/8)
    [  521.167469] rr62x:hpt_reset(8/0/9)
    [  524.169022] rr62x:[0 1  ] start port soft reset (probe 1).
    [  529.061091] rr62x:[0 1 0] start device soft reset.
    [  529.712159] rr62x:[0 1  ] port started successfully.
    [  529.712159] rr62x:[0 1 0] device done to reset (reset 1)
    [  594.988549] rr62x:hpt_reset(8/0/5)
    [  594.992535] rr62x:[0 1  ] failed to disable comm status change bits
    [  594.992535] rr62x:[0 1  ] start port.
    [  594.992535] rr62x:[0 1  ] start port hard reset (probe 1).
    [  601.104084] rr62x:[0 1  ] start port soft reset (probe 1).
    [  606.001084] rr62x:[0 1 0] start device soft reset.
    [  606.652602] rr62x:[0 1 2] start device soft reset.
    [  607.309411] rr62x:[0 1  ] port started successfully.
    [  607.309411] rr62x:[0 1 0] device done to reset (reset 1)
    [  607.309411] rr62x:[0 1 2] device done to reset (reset 1)
    [  607.789615] rr62x:hpt_reset(8/0/6)
    [  607.789619] rr62x:hpt_reset(8/0/7)
    [  607.789623] rr62x:hpt_reset(8/0/8)
    [  607.789626] rr62x:hpt_reset(8/0/9)
    This happens constantly and randomly!! So in the middle of a file copy, the port that the receiving set of raid drives is on decides to reset!! This causes a file-system freeze for over a minute every time!

    In the above example you can see that port 2 (0 1 0) decides to reset, causing ALL 5 drives to be restarted. Port 1 is still ok, since I was reading from that array.
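    To see how often each device is hit, the reset lines can be tallied per SCSI address. This is only a rough sketch: the function name is made up for illustration, and it assumes the reset lines keep the "rr62x:hpt_reset(host/channel/id)" shape shown in the log above.

```shell
#!/bin/sh
# Tally rr62x resets per SCSI address from kernel log text on stdin.
# Assumption: reset lines look like "rr62x:hpt_reset(8/0/5)", as in the
# log quoted above. count_hpt_resets is a hypothetical helper name.
count_hpt_resets() {
    grep -o 'rr62x:hpt_reset([0-9/]*)' |   # keep only the reset markers
        sed 's/.*(\(.*\))/\1/' |           # strip down to "8/0/5"
        sort | uniq -c |                   # count occurrences per address
        awk '{ print $2, $1 }'             # print "address count"
}

# Usage against the live kernel log:
#   dmesg | count_hpt_resets
```

    Each output line is an address followed by the number of resets seen for it, which makes it easier to tell whether one drive or a whole port is affected.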

    Has anyone else seen this problem??? This is rendering my setup useless..

    Thanks.

  2. #2
    Join Date
    May 2008
    Location
    Cowtown
    Beans
    573
    Distro
    Ubuntu 8.04 Hardy Heron

    Re: RocketRAID 622 rr62x:hpt_reset

    Try disabling the read ahead and write caches on each drive. I had the same problem but after disabling the caches I have not had any more disconnects or kernel panics.
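    The post does not say which tool was used. One common way to do this is hdparm, whose -A0 and -W0 options turn off read-lookahead and write caching, so the sketch below assumes that. The function name and the DRY_RUN switch are invented for illustration, and hdparm may not reach drives that the rr62x driver presents as generic SCSI devices.

```shell
#!/bin/sh
# Print (or run) hdparm commands that turn off read-lookahead (-A0) and
# write caching (-W0) on each listed drive.
# Assumptions: hdparm is the tool (the poster did not say), and the device
# names are placeholders. DRY_RUN is a hypothetical safety switch; while it
# is 1 (the default) the commands are only printed.
disable_drive_caches() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "hdparm -A0 -W0 $dev"
        else
            hdparm -A0 -W0 "$dev"
        fi
    done
}

# Usage (as root, after checking the printed commands):
#   DRY_RUN=0
#   disable_drive_caches /dev/sdb /dev/sdc
```

    These settings may not survive a power cycle, so they would likely need to be reapplied at boot.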

  3. #3
    Join Date
    Apr 2007
    Beans
    5

    Re: RocketRAID 622 rr62x:hpt_reset

    Out of curiosity, are you using RE drives, or standard desktop drives?

    I'm thinking about purchasing a TR5M-BP and am curious if with Software raid it would make a difference or not.

    Also, has anyone looked at fake raid performance vs. Linux software raid on this device? Just curious mostly....would probably stick with software raid anyhow.

    -Steve

  4. #4
    Join Date
    May 2008
    Location
    Cowtown
    Beans
    573
    Distro
    Ubuntu 8.04 Hardy Heron

    Re: RocketRAID 622 rr62x:hpt_reset

    Quote: Originally Posted by smeuse
    Out of curiosity, are you using RE drives, or standard desktop drives?
    -Steve
    I am using Samsung HD204UI drives; regular desktop drives.

  5. #5
    Join Date
    Dec 2010
    Beans
    2
    Distro
    Ubuntu

    Re: RocketRAID 622 rr62x:hpt_reset

    Is the sata_mv module loaded? I'm having a similar problem with the rr64x module loaded. rr and mv don't seem to play nice together.

    I've removed the sata_mv module and I'm going to see if it clears up that ugly hpt_reset issue on large file transfers.
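    For anyone wanting to run the same check, a small helper can report whether a module shows up in lsmod output. This is a sketch; module_loaded is a made-up name, and it only looks at the first column of whatever text it is given.

```shell
#!/bin/sh
# Succeed if the named module appears in lsmod-style output on stdin.
# module_loaded is a hypothetical helper name; it compares the first
# column of each line with the requested module name.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage:
#   lsmod | module_loaded sata_mv && echo "sata_mv is loaded"
# To unload it for a test run (as root):
#   modprobe -r sata_mv
```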

    My setup is:
    SuperMicro T6016T SuperServer (Intel E5500 w/ 8GB installed)
    Highpoint RocketRAID 644
    15 bay port-multiplier enclosure
    15x WD Caviar Black 2TB
    OpenFiler

    I realize that this machine is running rPath, not Ubuntu, but the problem seems to exist across distributions.

    Anyone running *BSD with Highpoint RocketRAID cards? (just for comparison)

  6. #6
    Join Date
    May 2008
    Location
    Cowtown
    Beans
    573
    Distro
    Ubuntu 8.04 Hardy Heron

    Re: RocketRAID 622 rr62x:hpt_reset

    I have moved to Gentoo for the system with the rr622 in it. The relevant config bits from the kernel:
    Code:
    # CONFIG_SATA_MV is not set
    # CONFIG_SATA_NV is not set
    I am not using sata_mv. What stopped the kernel panics for me was disabling the drive caches noted above.

  7. #7
    Join Date
    Dec 2010
    Beans
    2
    Distro
    Ubuntu

    Re: RocketRAID 622 rr62x:hpt_reset (add rr64x)

    I confirm that my problem was not with sata_mv.

    I rethought my strategy (ease-of-management(==pure laziness)) and went from OpenFiler (kernel was older) back to a generic Ubuntu server (AMD64) installation. The problem persisted, however.

    I duly filed a support request with Highpoint about a week ago, but didn't receive any reply. So, in frustration, I went out and purchased a low-end Silicon Image 3132 based card. (I spent my own money on this so that my boss would know that I was serious about it.)

    I rebuilt my linux software RAID, scanned and mounted the LVM volumes and didn't get a single error on the simple (but large-ish) file transfer tests. I haven't given it a full battery yet, but initial results indicate that it will work better (for my situation, at least).

    (Linux software RAID behind a decent UPS system is solid in my book).

    I neglected to consider (or report earlier) that the port multipliers in our external enclosure are using the Silicon Image 3726 PM chip. I had read about BackBlaze's field research into using PMs and SATA controllers (see footnote), but wasn't really ready to give up on the (enclosure-vendor-supplied) rr644 card right away; Highpoint's current slogan states that it is "Port Multiplier ready". So much for stubbornness (wish in one hand and shift in the other, see which fills up faster).

    I've also made sure that our enclosure vendor (Norco) is aware of the problem; their hardware is pretty cheap, but it does seem as though it's designed to do what it says on the tin. Hopefully they'll be interested in making sure that they supply interface hardware that actually works with their enclosure systems. i.e. something other than the "RocketRAID" junk.

    The only problem is that the low-end Silicon Image card only has two external ports and I need three ports to use all the disks in our enclosure. The results so far, however, have given me enough encouragement to order a 4 port Silicon Image based controller.

    The SMART data, etc. are available to the OS when using the SI3132 card. With the RocketRAID, the drives were only accessible as abstracted RocketRAID generic SCSI drives. I'll dig into it a bit more as time is available, but it seems like the SI card gives direct access to the drives.

    And there hasn't been a glut of bus resets either. It's all gravy.

    I'll report back when I get a >2 port SI-based card. Hopefully it'll all be golden.

    If anyone ever hears anything useful from Highpoint tech support, I'd like to hear it. They haven't given me any help, and there's almost no useful information to be gleaned by searching the web.

    Footnote (from BackBlaze's blog):
    A note about SATA chipsets: Each of the port multiplier backplanes has a Silicon Image SiI3726 chip so that five drives can be attached to one SATA port. Each of the SYBA two-port PCIe SATA cards has a Silicon Image SiI3132, and the four-port PCI Addonics card has a Silicon Image SiI3124 chip. We use only three of the four available ports on the Addonics card because we have only nine backplanes. We don’t use the SATA ports on the motherboard because, despite Intel’s claims of port multiplier support in their ICH10 south bridge, we noticed strange results in our performance tests. Silicon Image pioneered port multiplier technology, and their chips work best together.

    I.e. check your PM chipset; fakeraid JBOD on RocketRAID doesn't seem to be a pass-through.
    Last edited by rod.batten; December 27th, 2010 at 06:04 PM. Reason: Added link to BackBlaze blog notes

  8. #8
    Join Date
    Aug 2006
    Beans
    11

    Re: RocketRAID 622 rr62x:hpt_reset

    I've had a rough time with my RR 622 and the Sans Digital TR8MP. I was getting kernel panics as previously stated. Disabling the read ahead and write caches helped, but I'm only getting ~10MB/s writes in RAID10 using 8 WD1001FALS drives.

    I contacted tech support detailing the issues and received the following email:
    == Begin Response ==
    Thank you for purchasing our product.
    RR622 should be compatible with Ubuntu. Please try download and install the driver from HighPoint website.
    http://www.highpoint-tech.com/USA_new/series_rr600.htm
    Please click on "Download" and choose the correct version according to your OS under the "RocketRAID 622".
    == End Response ==

    Not really helpful.

    Rod, Interested to hear your feedback on the 3132 card.

  9. #9
    Join Date
    Aug 2006
    Beans
    11

    Re: RocketRAID 622 rr62x:hpt_reset

    Just wanted to follow up in case anyone else runs into this issue.

    Purchased a SYBA PCIe card to replace the RR622. Was able to switch cards no problem. Card recognized without issue (Ubuntu 10.10 server), mdadm raid came back up after adjusting mdadm.conf to new device mappings.

    Attempted a large write and a drive immediately fell out. A few searches indicated that NCQ needed to be disabled on the drives so I added the following for each drive in rc.local:
    Code:
    echo 1 > /sys/block/sdX/device/queue_depth
    I was then able to read and write successfully, but it was still very slow and still generating errors:
    Code:
    Jan  8 14:29:54 i5server kernel: [    7.937687] ata4.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
    Jan  8 14:29:54 i5server kernel: [    7.938211] ata4.00: hard resetting link
    Jan  8 14:29:54 i5server kernel: [    8.287772] ata4.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
    Jan  8 14:29:54 i5server kernel: [    8.287847] ata4.01: hard resetting link
    Jan  8 14:29:54 i5server kernel: [    8.637741] ata4.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Jan  8 14:29:54 i5server kernel: [    8.637819] ata4.02: hard resetting link
    Jan  8 14:29:54 i5server kernel: [    8.987704] ata4.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Jan  8 14:29:54 i5server kernel: [    8.987782] ata4.03: hard resetting link
    Jan  8 14:29:54 i5server kernel: [    9.337684] ata4.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    Jan  8 14:29:54 i5server kernel: [    9.337759] ata4.04: hard resetting link
    Jan  8 14:29:54 i5server kernel: [    9.687631] ata4.04: SATA link down (SStatus 0 SControl 320)
    Jan  8 14:29:54 i5server kernel: [    9.687777] ata4.05: hard resetting link
    Jan  8 14:29:54 i5server kernel: [   10.037617] ata4.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
    Doing some searching revealed that the eSATA cables may be suspect. I picked up 2 startech.com shielded eSATA cables and replaced them. Now everything is working great. 120MB/s writes and 1.5GB/s reads on raid10. The cables may have been the only problem initially, but it's hard to say given the behavior of the RR622 card.
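    The per-drive queue_depth line mentioned earlier in this post can be wrapped in a loop so that rc.local only needs one call. This is a sketch under assumptions: disable_ncq and SYSFS_ROOT are names invented here, and the drive names in the usage line are placeholders.

```shell
#!/bin/sh
# Write 1 to queue_depth for each named disk, which turns NCQ off.
# SYSFS_ROOT is a hypothetical override (not a kernel feature) so the
# function can be tried against a scratch directory; it defaults to /sys.
disable_ncq() {
    root="${SYSFS_ROOT:-/sys}"
    for disk in "$@"; do
        f="$root/block/$disk/device/queue_depth"
        if [ -w "$f" ]; then
            echo 1 > "$f"
        else
            echo "skipping $disk: $f not writable" >&2
        fi
    done
}

# Usage in rc.local (drive names are placeholders):
#   disable_ncq sdb sdc sdd sde sdf
```

    Disks whose queue_depth file is missing or not writable are skipped with a message on stderr rather than aborting the loop.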

    Justin

  10. #10
    Join Date
    Dec 2007
    Beans
    18
    Distro
    Kubuntu 10.10 Maverick Meerkat

    Re: RocketRAID 622 rr62x:hpt_reset

    Another FYI, in case someone comes along with the same problem. I started out with a vanilla install of Lucid (Kubuntu, but that's irrelevant) and had this exact issue. Finally got it sorted out last night / today after two weeks on-and-off banging my head on it. I have the TR8M running off the included RocketRAID card. The array has been syncing overnight and has generated zero hpt_reset errors. Long story short, I think the solution is this:

    1. IF RUNNING A VERSION OF UBUNTU LOWER THAN 10.10, update to a 2.6.35 or higher kernel (note: 10.10 has a 2.6.35 kernel by default; you can skip this step by installing or upgrading to that version).
    #sudo apt-get update; sudo apt-cache search linux-image
    Find a suitable kernel version number (x) and build type (y - generic, server, whatever) on the resulting list.
    #sudo apt-get install linux-image-x.x.xx-xx-yyyy linux-headers-x.x.xx-xx-yyyy
    Reboot.

    2. Read and do: https://help.ubuntu.com/community/RocketRaid

    3. Insert the driver.
    #sudo modprobe rr62x
    Edit your /etc/modules and add rr62x on a new line at the end of the file.
    Reboot

    4. Confirm the driver loads on boot.
    #lsmod | grep rr62x

    5. Set up the drives in the card's BIOS (either reboot and hotkey into the card's BIOS directly during boot, or install and use the Highpoint web-GUI software from http://www.highpoint-tech.com/USA_new/series_rr600.htm with the system running). Note that setting up the actual RAID array directly on the card's BIOS limits arrays to five drives per - that's just a limitation of the card's BIOS. I prefer to set each drive up in the card's BIOS as its own JBOD array (a direct pass-through, basically), then use mdadm to set up the RAID array; that gets around the five-drive limit since the actual array control is in mdadm and not BIOS, and I personally prefer mdadm to on-chip fakeRAID anyway.

    6. Wait. For scale, syncing a RAID-5 array of 8 2TB 5900RPM drives (Seagate Barracuda LPs) on an Atom D525 rig (it's just a home media server - needed small footprint and low heat production more than performance) takes about 24 hours. You can track the status via proc:
    #watch cat /proc/mdstat
    When the sync finishes, your array is done; set up the filesystem, fstab mounts, whatever, and rock on.
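    Before starting on the steps above, it may help to check whether the running kernel already meets the 2.6.35 minimum from step 1. The helper below is a sketch with a made-up name; it compares only the leading major.minor.patch numbers of a release string.

```shell
#!/bin/sh
# Succeed if a kernel release string is 2.6.35 or newer.
# kernel_at_least_2_6_35 is a hypothetical helper name. Only the first
# three numeric fields are compared; suffixes such as "-22-generic"
# are ignored.
kernel_at_least_2_6_35() {
    echo "$1" | awk -F'[.-]' '{
        v = $1 * 1000000 + $2 * 1000 + $3
        exit !(v >= 2006035)
    }'
}

# Usage:
#   kernel_at_least_2_6_35 "$(uname -r)" || echo "kernel is older than 2.6.35"
```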

    On previous attempts I'd have a huge list of hpt_reset errors by now, accompanied by the lights on the front of the enclosure flashing in sequence as the driver re-initializes and tries again, wash-rinse-repeat every few minutes. YMMV, but I've got an array as described in #6 at 85% sync after 16-some hours of syncing without a single error or a single observed enclosure reset.
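    For logging the sync percentage instead of watching the screen, the progress figure can be pulled out of /proc/mdstat. This is a sketch that assumes the usual progress line of the form "resync = 85.0% (...)"; mdstat_progress is a name invented here.

```shell
#!/bin/sh
# Print "<operation> <percent>" for each progress line in mdstat text
# read from stdin.
# Assumption: progress lines contain e.g. "resync = 85.0%" or
# "recovery = 12.3%". mdstat_progress is a hypothetical helper name.
mdstat_progress() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^(resync|recovery|reshape|check)$/ && $(i + 1) == "=")
                print $i, $(i + 2)
    }'
}

# Usage:
#   mdstat_progress < /proc/mdstat
```

    Run from cron alongside a tally of hpt_reset lines in dmesg, this would show whether resets line up with particular stages of the sync.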
    Last edited by Dr. Strange; April 17th, 2011 at 08:27 PM. Reason: Additional information for different versions of Ubuntu
