Debian Soft Raid
How to switch a Debian system on software RAID 1 (mirroring)
Intro
These are quick notes updating this page, since everything can now be done fairly easily with mdadm alone.
A very interesting page: http://www200.pair.com/mecham/raid/raid1-degraded-etch.html
It helps to get familiar with the concepts.
Creating degraded array
Here we're missing /dev/hda3 so we start with only /dev/hdd3:
mdadm --create /dev/md0 --raid-devices=2 --level=raid1 missing /dev/hdd3
And to get it properly set after reboot, we can create mdadm.conf:
echo DEVICE partitions > /etc/mdadm/mdadm.conf
mdadm --examine --scan >> /etc/mdadm/mdadm.conf
Edit and check the file manually...
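The manual check can be backed by a small comparison. The following is only a sketch: missing_arrays is a hypothetical helper name, and it assumes the output of mdadm --examine --scan was saved to a file and that each ARRAY line carries a UUID= field.

```shell
# missing_arrays CONF SCAN  (hypothetical helper, not part of mdadm)
# CONF: path to an mdadm.conf
# SCAN: file holding saved `mdadm --examine --scan` output
# Prints every ARRAY line of SCAN whose UUID does not appear in CONF.
missing_arrays() {
    conf=$1
    scan=$2
    grep '^ARRAY' "$scan" | while read -r line; do
        # Extract the value of the UUID= field from the ARRAY line.
        uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID=\([^ ]*\).*/\1/p')
        [ -n "$uuid" ] || continue
        # Report the array when its UUID is absent from the config file.
        grep -Fq "UUID=$uuid" "$conf" || printf '%s\n' "$line"
    done
}

# Typical use (as root):
#   mdadm --examine --scan > /tmp/scan.txt
#   missing_arrays /etc/mdadm/mdadm.conf /tmp/scan.txt
```

An empty result suggests that every array found on disk is declared in the file.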
Then prepare for reboot
Preparing for reboot
During the configuration, every time we want to reboot, we have to make sure to:
- have the intended partition layout (fdisk)
- have the intended partitions mounted on the intended mountpoints (mount)
- have /etc/fstab reflecting the current mounts
- have /etc/mdadm/mdadm.conf reflecting the current Raid arrays
- have an initrd reflecting the current situation:
dpkg-reconfigure linux-image-...
This also applies after adding the second partition to a RAID 1 array.
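Two items of this checklist can be verified mechanically. The sketch below is an illustration only: fstab_arrays_missing and initrd_is_stale are hypothetical helper names, and the modification-time comparison is just a heuristic for "the initrd was built before the last change to mdadm.conf".

```shell
# fstab_arrays_missing FSTAB CONF  (hypothetical helper)
# Prints every /dev/mdN used as a device in FSTAB that has no ARRAY line in CONF.
fstab_arrays_missing() {
    awk '$1 ~ /^\/dev\/md[0-9]+$/ { print $1 }' "$1" | sort -u |
    while read -r dev; do
        grep -Eq "^ARRAY[[:space:]]+$dev([[:space:]]|\$)" "$2" || printf '%s\n' "$dev"
    done
}

# initrd_is_stale CONF INITRD  (hypothetical helper)
# Succeeds (exit status 0) when CONF was modified after INITRD was built.
initrd_is_stale() {
    [ -n "$(find "$1" -newer "$2" 2>/dev/null)" ]
}

# Typical use:
#   fstab_arrays_missing /etc/fstab /etc/mdadm/mdadm.conf
#   initrd_is_stale /etc/mdadm/mdadm.conf "/boot/initrd.img-$(uname -r)" \
#       && echo "initrd is older than mdadm.conf, regenerate it"
```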
Diagnostic
Some useful commands to inspect the raid situation:
# From what's currently assembled:
cat /proc/mdstat
mdadm --detail --scan
mdadm --detail /dev/md1
# From what's available as raid partitions
mdadm --examine --scan
mdadm --examine /dev/hda5
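To spot a degraded array quickly, the member status in /proc/mdstat can be parsed. This is a minimal sketch: degraded_arrays is a hypothetical helper name, and it assumes the usual mdstat layout in which each array's size line ends with a status field such as [UU] or [_U], where "_" marks a missing member.

```shell
# degraded_arrays [FILE]  (hypothetical helper)
# Reads mdstat-formatted text from FILE (or stdin) and prints the name of
# every array whose member status field contains "_".
degraded_arrays() {
    awk '
        # Remember which array the following lines describe.
        /^md[0-9]+ *:/ { name = $1 }
        # The last field of the size line looks like [UU] or [_U].
        $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }
    ' "${1:--}"
}

# Typical use:
#   degraded_arrays /proc/mdstat
```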
Repairing a degraded array
Later, once /dev/hda3 can be integrated, we add it to the array:
mdadm /dev/md0 --add /dev/hda3
Then prepare for reboot
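After the --add, the kernel rebuilds the mirror, and it is safer to let this finish before rebooting. The sketch below extracts the progress from /proc/mdstat; resync_progress is a hypothetical helper name, and the parsing assumes the usual progress line of the form "[==>....]  recovery = 12.6% (...)".

```shell
# resync_progress [FILE]  (hypothetical helper)
# Reads mdstat-formatted text from FILE (or stdin) and prints one line
# "<array> <activity> <percent>" per array that is currently rebuilding.
resync_progress() {
    awk '
        # Remember which array the following lines describe.
        /^md[0-9]+ *:/ { name = $1 }
        # Progress line: field 2 is the activity, field 4 the percentage.
        $2 ~ /^(recovery|resync)$/ && $3 == "=" { print name, $2, $4 }
    ' "${1:--}"
}

# Typical use: poll until nothing is rebuilding any more.
#   while [ -n "$(resync_progress /proc/mdstat)" ]; do sleep 60; done
```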
/boot
Here is one example:
Initially /boot was not on RAID 1, but since GRUB now makes this possible, I moved it.
I had /boot=/dev/hda1 and /boot-img=/dev/hdd1 and I did something like:
umount /dev/hdd1
mdadm --create /dev/md1 --raid-devices=2 --level=raid1 missing /dev/hdd1
mount /dev/md1 /boot-img
cp -a /boot/* /boot-img
umount /boot
umount /boot-img
mdadm /dev/md1 --add /dev/hda1
vi /etc/fstab   # /dev/md1 /boot ... and delete the /boot-img entry
mount /boot     # remount /boot from the array so that grub-install finds /boot/grub
grub-install "(hd1)"
grub-install "(hd0)"
mdadm --examine --scan | grep md1 >> /etc/mdadm/mdadm.conf
dpkg-reconfigure linux-image-...
reboot
Changing super-minor
During the process I wanted to change the number associated to an array (/dev/mdX):
Suppose /dev/md3 = /dev/hda5+/dev/hdd5
And we want /dev/md2 = /dev/hda5+/dev/hdd5
mdadm --stop /dev/md3
mdadm --assemble /dev/md2 /dev/hda5 /dev/hdd5
Then we have to update /etc/mdadm/mdadm.conf again.
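One way to refresh the file without losing its other settings is to keep every non-ARRAY line and replace the ARRAY lines with freshly scanned ones. This is only a sketch: merge_conf is a hypothetical helper name, and it assumes the output of mdadm --detail --scan (which reflects the arrays as currently assembled) was saved to a file.

```shell
# merge_conf OLD_CONF SCAN  (hypothetical helper)
# Prints a new configuration on stdout: all non-ARRAY lines of OLD_CONF,
# followed by the ARRAY lines found in SCAN.
merge_conf() {
    grep -v '^ARRAY' "$1"
    grep '^ARRAY' "$2"
}

# Typical use (as root); review the result before installing it:
#   mdadm --detail --scan > /tmp/scan.txt
#   merge_conf /etc/mdadm/mdadm.conf /tmp/scan.txt > /tmp/mdadm.conf.new
```

Writing to a temporary file first leaves the current mdadm.conf untouched until the result has been checked by hand.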
Grub
I moved from lilo to grub and installed the first stage on both drives:
grub-install "(hd1)"
grub-install "(hd0)"
I edited /etc/kernel-img.conf to have the hooks for Debian kernel automatic installation:
postinst_hook = /usr/sbin/update-grub
postrm_hook = /usr/sbin/update-grub
do_bootloader = no
I edited /boot/grub/menu.lst and added a fallback directive:
default 0
fallback 1
When executing update-grub it creates the following entry:
title Debian GNU/Linux, kernel 2.6.21-2-vserver-k7
root (hd0,0)
kernel /vmlinuz-2.6.21-2-vserver-k7 root=/dev/md0 ro
initrd /initrd.img-2.6.21-2-vserver-k7
savedefault
And I manually added the following one:
title Debian GNU/Linux, kernel 2.6.21-2-vserver-k7 (hd1)
root (hd1,0)
kernel /vmlinuz-2.6.21-2-vserver-k7 root=/dev/md0 ro
initrd /initrd.img-2.6.21-2-vserver-k7
savedefault
I don't know how to make update-grub generate this second entry automatically. In any case, if the first hard drive fails I will probably have to reboot manually, and GRUB is flexible enough to allow reconfiguration on the fly.
That's the major reason why I moved away from lilo.
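The second entry can at least be derived mechanically from the first one. The following is a sketch of such a transformation, not an update-grub feature: mirror_entry is a hypothetical helper name, and it assumes the entry uses (hd0,N) for its root line as in the example above.

```shell
# mirror_entry  (hypothetical helper, not part of GRUB or update-grub)
# Reads one menu.lst entry on stdin and prints a copy that boots from the
# second drive: " (hd1)" is appended to the title and the root line is
# switched from (hd0,N) to (hd1,N). All other lines pass through unchanged.
mirror_entry() {
    sed -e '/^title/ s/$/ (hd1)/' \
        -e 's/^root\([[:space:]]*\)(hd0,/root\1(hd1,/'
}

# Typical use: paste the generated entry into a file, then
#   mirror_entry < entry.txt
```

The kernel line is left alone on purpose: its root=/dev/md0 argument names the RAID device, not a GRUB disk.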
How to switch a Debian system on software RAID 1 (mirroring) (OLD)
Here is how to switch your root (/) filesystem on RAID 1:
Have two identical disks, let's say hda and hdc (yes, put them on different IDE controllers!)
Create a specific small partition for /boot at the very beginning of the first disk (hda) because some (most?) bootloaders don't understand RAID.
Mine is hda1->/boot hda2->swap hda3->/
Install Debian as usual on hda
Partition hdc with the same layout as hda; the partition matching / will be the RAID mirror of /
My second disk (same vendor, same size) didn't have the same geometry (C/H/S) but after dd if=/dev/zero of=/dev/hdc bs=512 count=1 fdisk used the same geometry...
Now it is the same as hda: hdc1->/boot-img hdc2->swap hdc3->/
apt-get install initrd-tools raidtools2 mdadm (decline offer to start RAID at boot time)
Create /etc/raidtab:
raiddev /dev/md0
    raid-level 1
    nr-raid-disks 2
    nr-spare-disks 0
    persistent-superblock 1
    device /dev/hdc3
    raid-disk 0
    device /dev/hda3
    failed-disk 1
So the current / partition (hda3) is declared as "broken" for the RAID
Create the RAID:
mkraid /dev/md0    # it will say disk1: failed
mkfs.ext3 /dev/md0
mount -v /dev/md0 /mnt/root
Copy the / content:
cd /
find . -xdev | cpio -pm /mnt/root
Prepare to reboot on the RAID:
Edit /etc/mkinitrd/mkinitrd.conf:
ROOT=probe -> ROOT=/dev/md0
mkinitrd -o /boot/initrd.img-raid
Edit /mnt/root/etc/fstab:
/dev/md0 / ext3 defaults,errors=remount-ro 0 1
Edit /etc/lilo.conf:
image=/boot/vmlinuz...
    label=LinuxRAID
    root=/dev/md0
    read-only
    initrd=/boot/initrd.img-raid
umount /dev/md0
raidstop /dev/md0
lilo
reboot
Restore the "broken" RAID:
cat /proc/mdstat    # we see only one disk
raidhotadd /dev/md0 /dev/hda3
Now the system is synchronizing the "new" RAID partition
watch cat /proc/mdstat
Prepare for next reboot:
Edit /etc/raidtab:
failed-disk -> raid-disk
mkinitrd -o /boot/initrd.img-raid
lilo
reboot
Automatic watching
dpkg-reconfigure mdadm -> accept the mdadm monitoring daemon and give the user who should receive alert emails
Simulating RAID 0 (striping) for the swap:
Simply give the same priority to both swap partitions:
/dev/hda2 swap swap defaults,pri=1 0 0
/dev/hdc2 swap swap defaults,pri=1 0 0
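Whether the priorities really match can be read back from the fstab. A minimal sketch follows; swap_priorities is a hypothetical helper name, and it assumes the standard fstab field order (device, mount point, type, options).

```shell
# swap_priorities FSTAB  (hypothetical helper)
# Prints "<device> <priority>" for every swap entry in FSTAB, or
# "<device> none" when the entry has no pri= option.
swap_priorities() {
    awk '$1 !~ /^#/ && $3 == "swap" {
        n = split($4, opts, ",")
        pri = "none"
        # Look for a pri=N option among the comma-separated mount options.
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^pri=/) pri = substr(opts[i], 5)
        print $1, pri
    }' "$1"
}

# Typical use:
#   swap_priorities /etc/fstab
```

For the striping effect, both swap lines should report the same number.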
Troubleshooting
Don't simply dd the MBR from hda to hdc, otherwise lilo will complain about a timestamp error: both disks then carry the same ID number.
You can mount the initrd image to check that it contains the RAID instructions:
mount /boot/initrd.img-raid /mnt/disk -o loop,ro
The last line of /mnt/disk/script should be an mdadm command.
Check the end of that line: the first time only /dev/hdc3 is mentioned; the second time /dev/hda3 should also be present (otherwise the system will come up in degraded mode again).
If reboot fails:
Boot from a Knoppix
modprobe raid1
mdadm --assemble /dev/md0 /dev/hdd3 /dev/hda3
mount /dev/md0 /mnt/xxx
chroot /mnt/xxx
# also mount /proc, /boot, etc.
mkinitrd -o /boot/initrd.img-raid <kernel version>
Useful commands
- Diagnostics
mdadm -D /dev/mdXX
mdadm -E /dev/hdXX
cat /proc/mdstat
- Add
mdadm /dev/mdXX -a /dev/hdXX