Lab #4: RAID
Date: February 4, 2008

In this lab, we're going to create RAID volumes using Linux Software RAID. We will use the mdadm command to create, modify, and destroy RAID volumes. We will create both a RAID 1 mirror set and a RAID 0 stripe set, then we will cause each of them to fail and observe the failure modes of both types of RAID volume.

Since we only have one disk, we're going to use multiple partitions on that disk as the members of our RAID sets. Never do this in production: to gain either the performance or the reliability advantages of RAID, each member of the RAID set must be an independent storage device.
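
As a rough guide to what df should report later in the lab, here is a minimal sketch of the usable capacity of each RAID level. It assumes equal-sized members and ignores RAID metadata and filesystem overhead, so the real numbers will be slightly smaller. The raid_capacity function is a hypothetical helper written for this illustration; it is not part of mdadm.

```shell
#!/bin/sh
# Usable capacity sketch for equal-sized members (sizes in MB).
# Usage: raid_capacity LEVEL MEMBERS MEMBER_SIZE_MB
# (hypothetical helper for illustration only)
raid_capacity() {
    level=$1; members=$2; size_mb=$3
    case "$level" in
        0) echo $((members * size_mb)) ;;   # stripe: sum of all members
        1) echo "$size_mb" ;;               # mirror: one member's worth
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid_capacity 1 2 1024   # two 1G partitions mirrored: prints 1024
raid_capacity 0 2 1024   # two 1G partitions striped:  prints 2048
```

The mirror trades half of the raw space for redundancy, while the stripe keeps all of it and has no redundancy at all, which is why the two arrays fail so differently in this lab.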

RAID 1 Mirror

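  1. Create four 1G partitions to use in this lab. You'll have to reboot after you save your partition table for CentOS to be able to recognize the new partitions. In the commands that follow, /dev/sdaM, /dev/sdaN, /dev/sdaP, and /dev/sdaQ stand for the four partitions you created; substitute your actual partition numbers.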
    fdisk /dev/sda
    
  2. Create the RAID device with the mdadm command. The keyword missing takes the place of the second device, allowing us to create the mirror with only one disk partition. Normally you would create a RAID mirror with both devices from the start; we leave one out so that you can watch the mirror rebuild later in this lab.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdaM
    
  3. Check that the new RAID device exists.
    cat /proc/mdstat 
    
  4. The output should look similar to the output below. The [2/1] [_U] indicates that only one disk of the RAID 1 mirror is currently available.
    Personalities : [raid1] 
    md0 : active raid1 sda8[1]
          104320 blocks [2/1] [_U]
          
    unused devices: <none>
    
  5. Create a filesystem on the RAID set.
    mke2fs -v -j /dev/md0
    
  6. Mount the volume and copy some files to it.
    mount /dev/md0 /mnt
    cp -a /boot/* /mnt
    df -h /mnt
    
  7. Add the second partition to the RAID set, which will trigger a rebuild.
    mdadm /dev/md0 --add /dev/sdaN
    
  8. Watch the rebuild process using the watch command to repeatedly view the contents of /proc/mdstat once per second.
    watch -n 1 cat /proc/mdstat
    
  9. Since we can't wait around for the disk to fail on its own, cause a RAID failure manually using the mdadm command. View the status afterwards.
    mdadm /dev/md0 --fail /dev/sdaN
    cat /proc/mdstat
    
  10. Remove the failed device from the array.
    mdadm /dev/md0 --remove /dev/sdaN
    cat /proc/mdstat
    
  11. Verify that you can still access the data on the array, even though one device has failed.
    ls -l /mnt
    cat /mnt/grub/menu.lst
    
  12. Add the device back and watch the RAID set rebuild. This should look similar to the rebuild you triggered at the beginning of the lab.
    mdadm /dev/md0 --add /dev/sdaN
    watch -n 1 cat /proc/mdstat
    
  13. Unmount the filesystem then stop the RAID set, breaking the mirror.
    umount /mnt
    mdadm --stop /dev/md0
    
  14. One of the great features of a mirror is that you can use any of its members individually, since all of them contain the same data. Now that the mirror is broken, let's mount one of the partitions on /mnt.
    mount /dev/sdaN /mnt
    
  15. Verify that you can still access the data on the remaining partition just as you could through the mirror.
    ls -l /mnt
    cat /mnt/grub/menu.lst
    umount /mnt
    
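  16. Add your RAID 1 mirror configuration to your /etc/mdadm.conf file so that your RAID set will be assembled on reboot. You'll need to bring the array back up first; the command below re-creates it from both partitions, and mdadm may ask you to confirm because they already contain RAID metadata.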
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdaM /dev/sdaN
    mdadm --detail --scan >>/etc/mdadm.conf
    
  17. Create a mount point named /raid1, then add your RAID device to /etc/fstab so that it will be mounted there on boot.
    mkdir /raid1
    vim /etc/fstab
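
The handout does not show the line to add in step 17, so here is one plausible entry. The filesystem type ext3 follows from the mke2fs -j command used earlier; the mount options, dump field, and fsck pass shown here are assumptions, not values given in this lab.

```
# /etc/fstab entry for the mirror (assumed values; adjust as needed)
# device    mount point  type  options   dump  fsck
/dev/md0    /raid1       ext3  defaults  0     0
```

After saving the file, running mount /raid1 is a quick way to check the entry, since mount looks up the device and filesystem type in /etc/fstab when given only a mount point.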
    
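The [2/1] [_U] status from step 4 can also be checked from a script instead of by eye. The following is a minimal sketch, assuming the mdstat layout shown in this lab (an "mdN :" line followed by a line carrying the status map). The md_health function is a hypothetical helper, and the demo runs against a scratch copy of the sample output rather than the live /proc/mdstat.

```shell
#!/bin/sh
# Report whether each md array in mdstat-style text is degraded.
# Reads the file given as $1, defaulting to /proc/mdstat.
# (md_health is a hypothetical helper name for this sketch.)
md_health() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the current array
        name != "" && match($0, /\[[U_]+\]/) {      # status map such as [_U] or [UU]
            map = substr($0, RSTART + 1, RLENGTH - 2)
            state = (index(map, "_") > 0) ? "degraded" : "ok"
            print name, map, state
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Try it on the sample output from step 4, saved to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1] 
md0 : active raid1 sda8[1]
      104320 blocks [2/1] [_U]
      
unused devices: <none>
EOF
md_health "$sample"    # prints: md0 _U degraded
rm -f "$sample"
```

An underscore in the map marks a missing member and a U marks a working one, so a healthy two-way mirror reports UU. Calling md_health with no argument reads the real /proc/mdstat on your lab machine.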

RAID 0 Stripe

  1. Remove the RAID 1 set so that we can reuse the /dev/md0 device.
    umount /raid1
    mdadm --stop /dev/md0
    
  2. Create a RAID 0 stripe with the mdadm command.
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdaP /dev/sdaQ
    
  3. Check that the new RAID device exists.
    cat /proc/mdstat 
    
  4. Create a filesystem on the RAID stripe.
    mke2fs -v -j /dev/md0
    
  5. Mount the volume and verify that it's empty.
    mount /dev/md0 /mnt
    df -h /mnt
    ls -l /mnt
    
  6. Copy a lot of files to the RAID stripe.
    cp -a /boot/* /mnt
    for dir in a b c d e f
    do
        mkdir /mnt/$dir
        cp -a /boot/* /mnt/$dir
    done
    df -h /mnt
    ls -l /mnt
    
  7. Since we can't wait around for the disk to fail on its own, we need to cause a RAID failure. Linux will not allow you to remove a drive from a RAID 0 stripe, so instead we overwrite one of the member partitions with zeros to simulate losing that device.
    dd if=/dev/zero of=/dev/sdaP bs=1M count=1024
    
  8. Check the contents of the disk. Some directory entries may be corrupt.
    ls -l /mnt
    
  9. If nothing appears wrong in the top-level directory, look deeper.
    ls -lR /mnt | less
    
  10. Another way to look deeper is to use the find command.
    find /mnt -name menu.lst
    
  11. If you find multiple copies, check them to see if they're different. The following command is an example. You should use the path names that match the ones you found above.
    diff /mnt/a/menu.lst /mnt/b/menu.lst
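
Comparing copies two at a time with diff gets tedious when there are seven of them. As a minimal sketch of steps 10 and 11 combined, the script below checksums every copy of a file and counts how many distinct versions exist. The function names compare_copies and distinct_versions are hypothetical helpers, and the demo builds its own scratch tree rather than touching the damaged /mnt.

```shell
#!/bin/sh
# Print a checksum for every copy of a file name under a directory tree,
# sorted so that identical copies line up and odd ones stand out.
compare_copies() {
    dir=$1; name=$2
    find "$dir" -type f -name "$name" -exec cksum {} + | sort
}

# Count how many distinct versions exist (1 means every copy matches).
distinct_versions() {
    compare_copies "$1" "$2" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Demo on a scratch tree with one deliberately altered copy.
demo=$(mktemp -d)
for d in a b c; do
    mkdir "$demo/$d"
    echo "default=0" > "$demo/$d/menu.lst"
done
echo "garbage" > "$demo/b/menu.lst"     # simulate one corrupted copy
compare_copies "$demo" menu.lst         # one line per copy: checksum, size, path
distinct_versions "$demo" menu.lst      # prints 2
rm -rf "$demo"
```

On the lab machine you would run compare_copies /mnt menu.lst instead. Since every copy came from the same /boot/grub/menu.lst, any result other than a single distinct version points to corruption, and read errors from cksum itself are also a sign of damage.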
    
 

©2008 James Walden, Ph.D.