Lab #4: RAID
Date: February 4, 2008
In this lab, we're going to create RAID volumes using Linux Software RAID. We will use the mdadm command to create, modify, and destroy RAID volumes. We will create both a RAID 1 mirror set and a RAID 0 stripe set, then we will cause each of them to fail and observe the failure modes of both types of RAID volume.
Since we only have one disk, we're going to use multiple partitions on a single disk to create our RAID set. Never do this in production: to gain either the performance or reliability advantages of RAID, each member of the RAID set must be an independent storage device.
RAID 1 Mirror
- Create four 1 GB partitions to use in this lab. After you save the partition table, reboot so that CentOS recognizes the new partitions.
fdisk /dev/sda
- Create the RAID device with the mdadm command. The keyword missing takes the place of a device, allowing us to create the RAID set with only one disk partition. Normally you would create the RAID mirror with both devices from the start; we leave one out so that you can watch the mirror rebuild later in this lab. In this and later commands, /dev/sdaM, /dev/sdaN, /dev/sdaP, and /dev/sdaQ are placeholders: replace them with the names of the four partitions you created.
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdaM
- Check that the new RAID device exists.
cat /proc/mdstat
- The output should look similar to the output below. The [2/1] [_U] indicates that only one disk of the RAID 1 mirror is currently available.
Personalities : [raid1]
md0 : active raid1 sda8[1]
104320 blocks [2/1] [_U]
unused devices: <none>
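The same status field can be checked from a script instead of by eye. The following is a minimal sketch, assuming the [total/active] format shown in the sample output above. The function name md_degraded is hypothetical, and it reads mdstat-style text on standard input so that you can try it on saved output as well as on the live file.

```shell
# Hypothetical helper: print the names of arrays whose [total/active]
# counts differ, i.e. arrays that are running with a member missing.
# Reads /proc/mdstat-style text on standard input.
md_degraded() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        match($0, /\[[0-9]+\/[0-9]+\]/) {      # find the [total/active] field
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[1] != n[2]) print name
        }
    '
}
```

To run it against the live system, use md_degraded < /proc/mdstat. At this point in the lab it should print md0.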
- Create a filesystem on the RAID set.
mke2fs -v -j /dev/md0
- Mount the volume and copy some files to it.
mount /dev/md0 /mnt
cp -a /boot/* /mnt
df -h /mnt
- Add the second partition to the RAID set, which will trigger a rebuild.
mdadm /dev/md0 --add /dev/sdaN
- Watch the rebuild process using the watch command to repeatedly view the contents of /proc/mdstat once per second.
watch -n 1 cat /proc/mdstat
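If you would rather have the shell tell you when the rebuild is done, a polling loop can stand in for watch. This is only a sketch: the function name wait_for_rebuild is hypothetical, and it assumes that an ongoing rebuild shows up in /proc/mdstat as a line containing the word recovery or resync.

```shell
# Hypothetical helper: block until the given mdstat file no longer shows
# a recovery or resync in progress. Defaults to /proc/mdstat.
wait_for_rebuild() {
    mdstat=${1:-/proc/mdstat}
    while grep -Eq 'recovery|resync' "$mdstat"; do
        grep -E 'recovery|resync' "$mdstat"   # show the progress line
        sleep 1
    done
    echo "rebuild finished"
}
```

Call wait_for_rebuild with no arguments to poll the live file once per second, just as the watch command above does.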
- Since we can't wait around for the disk to fail on its own, cause a RAID failure manually using the mdadm command. View the status afterwards.
mdadm /dev/md0 --fail /dev/sdaN
cat /proc/mdstat
- Remove the failed device from the array.
mdadm /dev/md0 --remove /dev/sdaN
cat /proc/mdstat
- Verify that you can still access the data on the array, even though one device has failed.
ls -l /mnt
cat /mnt/grub/menu.lst
- Add the device back and watch the RAID set rebuild. This should look similar to the rebuild you triggered at the beginning of the lab.
mdadm /dev/md0 --add /dev/sdaN
watch -n 1 cat /proc/mdstat
- Unmount the filesystem then stop the RAID set, breaking the mirror.
umount /mnt
mdadm --stop /dev/md0
- One of the great features of a mirror is that you can use any of the disks from the mirror individually, since all of them contain the same data. Now that the mirror is broken, let's mount one of the disks on /mnt.
mount /dev/sdaN /mnt
- Verify that you can still access the data on the remaining partition just as you could through the mirror.
ls -l /mnt
cat /mnt/grub/menu.lst
umount /mnt
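- Add your RAID 1 mirror configuration to your /etc/mdadm.conf file so that your RAID set will be assembled on reboot. The array must be running before you can record its configuration, so re-create it from both partitions first. mdadm will warn that the partitions already appear to belong to an array and ask you to confirm.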
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdaM /dev/sdaN
mdadm --detail --scan >>/etc/mdadm.conf
- Add your RAID device to /etc/fstab so that it will be mounted on boot.
mkdir /raid1
vim /etc/fstab
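The line to add in the editor is not spelled out above, so here is one possible entry, wrapped in a small script that appends it only if it is not already there. Treat it as a sketch: the function name add_raid_fstab_entry is hypothetical, and the field values are assumptions (ext3 because mke2fs -j creates a journaled ext3 filesystem, the default mount options, and 0 2 for the dump and fsck-order fields).

```shell
# Hypothetical helper: append a mount entry for the mirror to an fstab file
# unless an entry for that device is already present.
# Assumed values: ext3, default mount options, and 0 2 for the dump and
# fsck-order fields. Adjust them to match your system.
add_raid_fstab_entry() {
    fstab=${1:-/etc/fstab}
    if ! grep -q '^/dev/md0[[:space:]]' "$fstab"; then
        printf '%s\n' '/dev/md0  /raid1  ext3  defaults  0 2' >> "$fstab"
    fi
}
```

Once the entry is in place, mount /raid1 will mount the mirror using the new entry, which lets you check it without rebooting.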
RAID 0 Stripe
- Remove the RAID 1 set so that we can reuse the /dev/md0 device.
umount /raid1
mdadm --stop /dev/md0
- Create a RAID 0 stripe with the mdadm command.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdaP /dev/sdaQ
- Check that the new RAID device exists.
cat /proc/mdstat
- Create a filesystem on the RAID stripe.
mke2fs -v -j /dev/md0
- Mount the volume and verify that it's empty apart from the lost+found directory that mke2fs creates.
mount /dev/md0 /mnt
df -h /mnt
ls -l /mnt
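The size that df reports for the stripe should be roughly twice what it reported for the mirror, because a stripe's capacity is the sum of its members while a mirror's is that of a single member. The arithmetic can be sketched as follows. The function name raid_capacity is hypothetical, and the result ignores RAID metadata and filesystem overhead, so expect df to show somewhat less.

```shell
# Hypothetical helper: approximate usable capacity of an array built from
# equal-sized members, ignoring RAID metadata and filesystem overhead.
# Usage: raid_capacity LEVEL MEMBERS MEMBER_SIZE
raid_capacity() {
    level=$1 members=$2 size=$3
    case $level in
        0) echo $(( members * size )) ;;   # stripe: sum of all members
        1) echo "$size" ;;                 # mirror: one member's worth
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid_capacity 0 2 1024   # two 1024 MB partitions striped: prints 2048
raid_capacity 1 2 1024   # two 1024 MB partitions mirrored: prints 1024
```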
- Copy a lot of files to the RAID stripe.
cp -a /boot/* /mnt
for dir in a b c d e f
do
    mkdir "/mnt/$dir"
    cp -a /boot/* "/mnt/$dir"
done
df -h /mnt
ls -l /mnt
- Since we can't wait around for the disk to fail on its own, we need to cause a RAID failure. Linux will not allow you to remove a drive from a RAID 0 stripe, so we have to zero out one of the partitions.
dd if=/dev/zero of=/dev/sdaP bs=1M count=1024
- Check the contents of the disk. Some directory entries may be corrupt.
ls -l /mnt
- If nothing appears wrong at the top level directory, look deeper.
ls -lR /mnt | less
- Another way to look deeper is to use the find command.
find /mnt -name menu.lst
- If you find multiple copies, check them to see if they're different. The following command is an example. You should use the path names that match the ones you found above.
diff /mnt/a/grub/menu.lst /mnt/b/grub/menu.lst
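Comparing the copies two at a time gets tedious with seven of them. As a rough sketch, you can checksum every copy and count how many distinct versions exist. The function name count_distinct_copies is hypothetical, and the approach assumes that damaged copies are still readable enough for cksum to process.

```shell
# Hypothetical helper: checksum every file with the given name under a
# directory tree and print how many distinct contents were found.
# Usage: count_distinct_copies DIRECTORY FILENAME
count_distinct_copies() {
    dir=$1
    name=$2
    find "$dir" -name "$name" -type f -exec cksum {} + |
        awk '{ print $1, $2 }' |   # keep checksum and size, drop the path
        sort -u |
        awk 'END { print NR }'     # number of distinct checksum/size pairs
}
```

For example, count_distinct_copies /mnt menu.lst prints 1 if every copy is still identical and a larger number if some copies have been corrupted.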
©2008 James Walden, Ph.D.