Oh glorious RAID


MidniteArrow
May 24th, 2005, 11:30 AM
Well, my investment just provided a return, although could anything else go wrong? This is just a thread to vent...

About a week ago, I rebooted my server on my lunch break. When it was starting up, it started making a very pocket-emptying noise. On boot-up, my RAID card told me my RAID array had issues. It is a RAID 5 with four 200 GB Maxtor SATA drives. RAID 5 will allow 1 drive to die without data loss.
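For anyone wondering why RAID 5 survives one dead drive: each stripe stores a parity block that is the XOR of the data blocks, so any single missing block can be recomputed from the survivors. Here is a minimal Python sketch of that idea. The block contents, the `xor_blocks` and `raid5_usable_gb` helper names, and the fixed parity position are all made up for illustration (real RAID 5 rotates parity across the drives), and this is not how the Highpoint card is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_usable_gb(drives, size_gb):
    """Usable capacity: one drive's worth of space goes to parity."""
    return (drives - 1) * size_gb


# One stripe on a 4-drive array: three data blocks plus one parity block.
# (Hypothetical contents; parity is kept in the last slot for simplicity.)
data = [b"fami", b"ly p", b"ics!"]
parity = xor_blocks(data)
stripe = data + [parity]

# Simulate losing the drive that held block 1, then rebuild it
# by XOR-ing everything that survived.
lost = 1
survivors = [block for i, block in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)

assert rebuilt == stripe[lost]
print(rebuilt)                   # b'ly p'
print(raid5_usable_gb(4, 200))   # 600
```

The same XOR only works if exactly one block per stripe is missing or bad, which is why a second flaky drive during a rebuild is such a problem.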

So the debugging begins. I identified that one of the drives in the array had gone belly up. I pulled it and submitted an RMA to Maxtor.

Not wanting to wait for the drive, I went out and bought a new 200 GB Maxtor SATA drive at Best Buy - I felt so dirty, but I wanted my data back ASAP. I plugged it in and booted up. Now, at this point, the RAID array should have noticed that it could rebuild itself and started re-building. But I didn't get that option; I only had the option to ignore the error (and not have access to the drive) or delete the array.

I really, really, really did not want to delete the array, so I called up Highpoint, the manufacturer of my RAID card. Together we identified that one of the other 3 drives, which still appeared to be working, had an issue. They sent me a hacking tool they use internally. I was able to trace the issue to the RAID header on the drive. It was supposed to be drive #2 in the array, but the header showed it as drive #1. No biggie - their tool will re-write the headers. This is when I became a Highpoint tester. While their card, and thereby the software that they wrote running on their card, recognizes the new drive I bought on boot-up, their hacking tool did not. So I've now got a useless 200 GB drive sitting on my desk.

I then remembered that I've got a 200 GB WD drive sitting in the corner gathering dust. I mean, come on, parallel ATA - who uses those? But I've got a converter unit. I hooked it up, and what do I get for my trouble? Nothing. Nada. No recognition at all. My guess is that the ATA->SATA converter doesn't work for UDMA drives. So, again, at a standstill.

Eventually, the replacement drive from Maxtor arrives. I hook it up and wahoo. We're off. Well, not so much. First I have to re-write the RAID headers. So I boot up their hacking tool. It recognizes all the drives. I start the RAID header re-build and it fails, of course. It seems there's something wrong with drive #3 now. After some phone time with Highpoint, and some with Maxtor, I download a Maxtor tool called PowerMax. It finds a problem with the data on that drive and fixes it. The Highpoint tool finally works.

So now it's time to re-build the array. That starts up as expected, but slow is an understatement. About 4% an hour. I let it run overnight. This morning I get up and it hit a data error at 48%. WinXP checks the filesystem and finds 4 or 5 errors, but for the most part, it worked (drive was about 48% full). I haven't found what was lost yet, so a pseudo happy ending...
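To put the rebuild speed in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes the rate stays constant at the roughly 4% an hour mentioned above, which is only an approximation, and `hours_to_reach` is just a throwaway helper name.

```python
def hours_to_reach(percent, rate_percent_per_hour):
    """Hours needed to reach a given completion percentage at a constant rate."""
    return percent / rate_percent_per_hour


RATE = 4  # percent per hour, the rough figure quoted above

full_rebuild = hours_to_reach(100, RATE)  # whole array
until_error = hours_to_reach(48, RATE)    # where the data error showed up

print(full_rebuild)  # 25.0
print(until_error)   # 12.0
```

So a clean rebuild would take a bit over a day at that pace, and the 48% mark lines up with about 12 hours of running overnight.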

What is it with me? Do I emit some sort of anti-technology aura?

Casper
May 24th, 2005, 01:35 PM
I've always had luck with PowerMax - it's a great tool. But damn, that's a whole lot of work - hope the data was worth the extreme hassle.

Btw, are you in NY? I swore I saw White Plains or something from you - didn't think you were that close.

MidniteArrow
May 24th, 2005, 06:01 PM
Yeah, it was so worth it. I run a website for our family, an ftp server for my gaming friends, and we have all of my family's pictures for the last 4-5 years on it. Which is why I invested in the RAID hardware.

I'm not in NY, I'm in Huntsville, Alabama. Rokit citee. I would have looked you up by now if I were in NY. Haven't been there since just after 9/11.

MidniteArrow
May 25th, 2005, 01:10 AM
Well now the RAID array noticed a problem with Disk #1 that was verified and theoretically fixed by PowerMax. It's rebuilding AGAIN. Here we go again. On the up-side - it does rebuild in the background in Windows while the array is available for use.

MidniteArrow
May 25th, 2005, 02:38 AM
Ok - this is getting fun. That time it got to 99.5% complete when rebuilding before it failed - drive #4 this time. I'm starting a full-read test with PowerMax. I'm sure it will find and fix the same problem and I'll get to start re-building again. But I'm done for tonight. At least I'm out of drives for PowerMax to "fix".

Casper
May 25th, 2005, 06:02 AM
You have another card to try out? It sounds like that might be what's causing your problems.

MidniteArrow
June 1st, 2005, 07:00 PM
I guess I never dropped by and closed this loop. It took 5 rebuilds, but I got all the data back, at least enough so I haven't noticed anything missing. woot. Time will tell if I lost anything I guess.