Sunday, October 4, 2015

HP P410i - P420i Self-test failure lockup

Recently I was messing around with a DL360 G8 from HP. It all looked well until my RAID controller, a P420i, quit while I was installing a virtual machine in ESX 6.0. When I rebooted, I got this nice message: 1783-slot 0 Drive Array Controller Failure. [Self-test failure (cmd=0h, err=00h, lockup=013:0h)]
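If you want to compare your own message against mine (for example when pasting it into a comment), it helps to split it into its fields. Below is a minimal Python sketch that does this. The pattern and the `parse_post_error` helper are my own assumptions, modeled only on the single message quoted above; they are not an official HP format description, and other firmware versions may word the message differently.

```python
import re

# Hypothetical pattern, modeled only on the one POST message quoted in this post.
POST_ERROR = re.compile(
    r"(?P<code>\d+)-slot (?P<slot>\d+) (?P<summary>[^.]+)\. "
    r"\[(?P<detail>[^(]+?) "
    r"\(cmd=(?P<cmd>\w+), err=(?P<err>\w+), lockup=(?P<lockup>[\w:]+)\)\]",
    re.IGNORECASE,
)


def parse_post_error(line):
    """Return the fields of a 1783-style POST message as a dict, or None."""
    match = POST_ERROR.search(line)
    return match.groupdict() if match else None


message = (
    "1783-slot 0 Drive Array Controller Failure. "
    "[Self-test failure (cmd=0h, err=00h, lockup=013:0h)]"
)
fields = parse_post_error(message)
# Print the POST code, the controller slot, and the lockup value.
print(fields["code"], fields["slot"], fields["lockup"])
```

The lockup value is the part that seems to differ between reports, so that is the field worth mentioning when you compare notes.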



At first I was kind of clueless. What to do? Was it a failure in my RAID controller, or maybe in the cache module? After some Googling I found that most people with a similar problem had it because they did not have OEM disks (neither did I; I had just replaced them with some aftermarket Intel SSD DC S3500 SSDs). But after removing them and trying to boot without disks, the error persisted, so it must be something else. I tried booting without the cache module, since many people had issues with this as well. Unfortunately, this didn't help either. Then I figured I would let the server sit without power for a while to drain any leftover charge from batteries or capacitors. When this didn't work, I disabled the P420i in the BIOS/UEFI; I figured that if I reinitialised the card this way, it might work... Wishful thinking, it didn't.

Then I got the idea to disconnect the cache module and, more importantly, the SAS cables on the motherboard, so there wouldn't be any signal to the RAID controller at all. Hallelujah and praise the Flying Spaghetti Monster! It worked! In all my enthusiasm I powered down the server to re-attach the cache module and the SAS cables. But hey, when I rebooted the server, the error message was back... Bummer! I figured it must have been the SAS cables, because removing the cache module alone didn't fix it. So I powered the machine down again and disconnected the SAS cables. I rebooted the machine and kept my fingers crossed (with a G8, it takes a while to boot...). Yes! Again no error message. So I hit F8 to enter the adapter's setup. Whilst in setup I reconnected the SAS cables to the motherboard, and one by one my SSDs reinitialised. When I clicked on the option to create a RAID array, there were my disks, and I could successfully create a new array! After rebooting there was no error message anymore, and I was able to boot safely into ESX and continue with my work!


I hope you guys find this useful, and as always... If you find a controller that is helped by the same fix, leave a comment, and you might help someone else with it too!