Jan. 12, 2012, 11:34 a.m.
IT

Replacing a RAID drive using FakeRaid (dmraid) on Linux

I recently had a client who needed to replace a hard drive in a Linux FakeRaid RAID5 array managed by dmraid. The process is very unintuitive:

  1. Identify the failed drive by checking /var/log/messages, dmesg or dmraid -r. Then run sudo smartctl -i /dev/sda (change /dev/sda to the correct disk), note the Serial Number, and match it to the physical disk.
  2. Replace the failed drive with a new drive matching or exceeding its capacity.
  3. In the BIOS for the FakeRaid controller, add the new drive to the array and confirm that the controller is rebuilding the volume.
  4. Boot into the OS.
  5. As root, run:

    dmraid -a y

  6. This activates the RAID set and should start the rebuild process. In my case the activation did not persist after a reboot, so /dev/mapper did not show the partitions. The solution was to rebuild the initrd image:

    uname -a
    mkinitrd /boot/initrd-2.6.18-128.1.6.el5.img.NEW 2.6.18-128.1.6.el5

  7. The uname -a output identifies the kernel version to pass to mkinitrd. Update grub to use the new image, then reboot; otherwise the partitions will not appear on the mapper device and cannot be mounted.
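
Two of the steps above are easy to get wrong by hand: matching serial numbers in step 1 and typing the kernel version in step 6. Here is a rough sketch of helpers for both. The function names (extract_serial, list_disk_serials, initrd_command) are my own and hypothetical. The sketch assumes smartctl prints a "Serial Number:" line, that the disks show up as /dev/sd?, and that mkinitrd takes the image path followed by the kernel version, as in the command above. Treat it as a starting point, not a tested procedure.

```shell
#!/bin/sh
# Hypothetical helpers for the drive-replacement steps; names are illustrative.

# Read `smartctl -i` output on stdin and print only the serial number.
# Assumes a line of the form "Serial Number:    XXXX".
extract_serial() {
    awk -F: '/^Serial Number:/ { sub(/^[ \t]+/, "", $2); print $2; exit }'
}

# Print each whole disk with its serial number, one per line.
# Needs root and smartmontools; assumes disks appear as /dev/sd?.
list_disk_serials() {
    for disk in /dev/sd?; do
        [ -b "$disk" ] || continue
        printf '%s\t%s\n' "$disk" "$(smartctl -i "$disk" | extract_serial)"
    done
}

# Print (do not run) the mkinitrd command for a kernel release.
# Defaults to the running kernel, taken from `uname -r`.
initrd_command() {
    release=${1:-$(uname -r)}
    printf 'mkinitrd /boot/initrd-%s.img.NEW %s\n' "$release" "$release"
}
```

Running list_disk_serials as root gives a list to compare against the labels on the physical drives. initrd_command only prints the command, so you can check it before running it yourself.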

For some reason the whole ext3 filesystem ended up corrupted. I had to recreate the filesystem and rebuild the data from scratch. I am not sure whether it was FakeRaid's fault or whether I made a mistake, but it certainly did not boost my confidence in software-based RAID.