Debian Soft Raid
How to switch a Debian system to software RAID 1 (mirroring)
Here are just some quick notes to update this page as now everything can be done relatively easily with just mdadm.
A very interesting page: http://www200.pair.com/mecham/raid/raid1-degraded-etch.html
It helps getting familiar with the concepts.
This is no longer a step-by-step guide. The first time, I used the old method described later. I then used the new mdadm-only way to replace a broken drive, to build new RAID 1 arrays as the new drive was larger (and the old one smaller than the surviving one), to put /boot on RAID 1, and to move from lilo to grub.
Creating degraded array
Here we're missing /dev/hda3 so we start with only /dev/hdd3:
mdadm --create /dev/md0 --raid-devices=2 --level=raid1 missing /dev/hdd3
And to get it properly set after reboot, we can create mdadm.conf:
echo DEVICE partitions > /etc/mdadm/mdadm.conf
mdadm --examine --scan >> /etc/mdadm/mdadm.conf
Edit and check the file manually...
Then prepare for reboot
Preparing for reboot
During the configuration, every time we want to reboot, we have to make sure to:
- have the intended partition layout (fdisk)
- have the intended partitions mounted on the intended mountpoints (mount)
- have /etc/fstab reflecting the current mounts
- have /etc/mdadm/mdadm.conf reflecting the current Raid arrays
- have an initrd reflecting the current situation:
This is also true after adding the second partition to a RAID 1 array.
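One item of the checklist above that is easy to forget is mdadm.conf. Here is a minimal sketch of a check that every /dev/mdN mentioned in fstab has an ARRAY line in mdadm.conf. The function name `unlisted_arrays` is made up for this example, and it assumes ARRAY lines of the form `ARRAY /dev/md0 ...`; other naming styles would need a different pattern.

```shell
# Print every /dev/mdN used in an fstab that has no ARRAY line in mdadm.conf.
# Both paths are parameters so the check can be tried on copies first.
unlisted_arrays() {
    fstab=${1:-/etc/fstab}
    conf=${2:-/etc/mdadm/mdadm.conf}
    # First field of each fstab line, keeping only md devices
    # (commented-out lines start with "#" and are skipped by the pattern).
    awk '$1 ~ /^\/dev\/md[0-9]+$/ { print $1 }' "$fstab" | sort -u |
    while read -r dev; do
        # An ARRAY line names the device as its second word.
        grep -Eq "^ARRAY[[:space:]]+$dev([[:space:]]|\$)" "$conf" || echo "$dev"
    done
}
```

Running `unlisted_arrays` without arguments checks the live files; no output means that every array in fstab is declared.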
Some useful commands to inspect the raid situation:
# From what's currently assembled:
cat /proc/mdstat
mdadm --detail --scan
mdadm --detail /dev/md1
# From what's available as raid partitions
mdadm --examine --scan
mdadm --examine /dev/hda5
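To spot a degraded array quickly in the /proc/mdstat output, something like the following sketch can help. The function name `degraded_arrays` is made up for this example, and it assumes the usual status line ending in a pattern such as `[_U]`, where `_` marks a missing member.

```shell
# List md arrays that are running with a missing member.
# Reads /proc/mdstat by default; a file path may be given instead.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the array being described
        /\[[U_]+\]/ {                      # status line, e.g. "... [2/1] [_U]"
            if ($NF ~ /_/) print name      # "_" marks a missing or failed member
        }
    ' "${1:-/proc/mdstat}"
}
```

An empty output means that no array is degraded.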
Repairing a degraded array
Later, when we are able to integrate /dev/hda3, we do:
mdadm /dev/md0 --add /dev/hda3
Then prepare for reboot
Here is one example:
Initially /boot was not on RAID 1, but as it is now possible with grub, I did so.
I had /boot=/dev/hda1 and /boot-img=/dev/hdd1 and I did something like:
umount /dev/hdd1
mdadm --create /dev/md1 --raid-devices=2 --level=raid1 missing /dev/hdd1
mount /dev/md1 /boot-img
cp -a /boot/* /boot-img
umount /boot
umount /boot-img
mdadm /dev/md1 --add /dev/hda1
vi /etc/fstab   # /dev/md1 /boot ... and delete /boot-img entry
grub-install "(hd1)"
grub-install "(hd0)"
dpkg-reconfigure linux-image-...
mdadm --examine --scan | grep md1 >> /etc/mdadm/mdadm.conf
reboot
During the process I wanted to change the number associated with an array (/dev/mdX):
Suppose /dev/md3 = /dev/hda5+/dev/hdd5
And we want /dev/md2 = /dev/hda5+/dev/hdd5
mdadm --stop /dev/md3
mdadm --assemble /dev/md2 /dev/hda5 /dev/hdd5
Then prepare for reboot
There are several tools to rebuild the initrd, but in the end I used yaird, which allowed me to preload my IDE driver and get UDMA modes working; this was essential to get something like 20x faster data transfers!
Just before the MOUNTDIR keyword, which takes care of inserting the needed generic IDE drivers, I inserted the amd74xx driver I needed for my nVidia chipset:
/etc/yaird/Default.cfg:
MODULE amd74xx
MOUNTDIR "/" "/mnt"
I also had some difficulties when I broke my initrd and had to reboot on a 2.6.14 kernel, because apparently kernels older than 2.6.18 cannot properly generate initrd images.
Fortunately I had a backup of the initrd. Otherwise, try booting a live CD and chrooting, or build a new kernel from source without initrd, then boot that one to prepare the initrd.
You can always inspect the initrd yourself to check things like modules, RAID assembly, etc.: the file is a gzip-compressed cpio archive.
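As a rough sketch of such an inspection, the archive can be listed without unpacking it. The function name `initrd_list` is made up for this example, and it only applies to images that really are gzip-compressed cpio archives.

```shell
# List the files stored in a gzip-compressed cpio initrd without unpacking it.
initrd_list() {
    # cpio -i -t reads an archive on stdin and prints the names it contains;
    # the block count it writes to stderr is discarded.
    gzip -dc "$1" | cpio -it 2>/dev/null
}

# Example: check whether the raid1 module made it into the image.
# initrd_list /boot/initrd.img-2.6.21-2-vserver-k7 | grep raid1
```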
I moved from lilo to grub and installed the first stage on both drives:
grub-install "(hd1)"
grub-install "(hd0)"
I edited /etc/kernel-img.conf to have the hooks for Debian kernel automatic installation:
postinst_hook = /usr/sbin/update-grub
postrm_hook = /usr/sbin/update-grub
do_bootloader = no
I edited /boot/grub/menu.lst and added a fallback directive:
default 0
fallback 1
When executing update-grub it creates the following entry:
title Debian GNU/Linux, kernel 2.6.21-2-vserver-k7
root (hd0,0)
kernel /vmlinuz-2.6.21-2-vserver-k7 root=/dev/md0 ro
initrd /initrd.img-2.6.21-2-vserver-k7
savedefault
And I added manually the following one:
title Debian GNU/Linux, kernel 2.6.21-2-vserver-k7 (hd1)
root (hd1,0)
kernel /vmlinuz-2.6.21-2-vserver-k7 root=/dev/md0 ro
initrd /initrd.img-2.6.21-2-vserver-k7
savedefault
But I don't know how to make this happen automatically via update-grub. Anyway, if the first hard drive fails I'll probably have to reboot manually, and Grub is rich enough to allow reconfiguration on the fly.
That's the major reason why I moved away from lilo.
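This is not an update-grub integration, but the hand-made (hd1) entry can at least be derived mechanically from the (hd0) one. Here is a small sketch with sed. The function name `mirror_entry` is made up for this example, and it assumes a stanza with a single title line and a `root (hd0,N)` line like the entries above.

```shell
# Read one menu.lst stanza on stdin and print a copy that boots from the second disk.
mirror_entry() {
    # First expression: append " (hd1)" to the title line.
    # Second expression: point the grub root at hd1; the kernel line is left
    # alone because "root=/dev/md0" is not followed by "(hd0,".
    sed -e 's/^\([[:space:]]*title[[:space:]].*\)$/\1 (hd1)/' \
        -e 's/^\([[:space:]]*root[[:space:]]*\)(hd0,/\1(hd1,/'
}
```

The output would still have to be pasted into menu.lst by hand after each kernel change.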
How to switch a Debian system to software RAID 1 (mirroring) (OLD)
Here is how to switch your root (/) filesystem on RAID 1:
Have two identical disks, let's say hda and hdc (yep, put them on different IDE controllers!)
Create a specific small partition for /boot at the very beginning of the first disk (hda) because some (most?) bootloaders don't understand RAID.
Mine is hda1->/boot hda2->swap hda3->/
Install Debian as usual on hda
Partition hdc with the same partition as the / on hda; it will be the RAID mirror of /
My second disk (same vendor, same size) didn't have the same geometry (C/H/S), but after dd if=/dev/zero of=/dev/hdc bs=512 count=1, fdisk used the same geometry...
Now it is the same as hda: hdc1->/boot-img hdc2->swap hdc3->/
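The dd command used above zeroes the first 512-byte sector, which holds the MBR and the partition table, so anything partitioned on that disk becomes unreachable. A small sketch to rehearse it on a scratch image file rather than on a real disk; the function name `wipe_first_sector` is made up for this example.

```shell
# Zero the first 512-byte sector (MBR and partition table) of a target.
# Meant for a scratch image file: conv=notrunc keeps the rest of a regular
# file in place instead of truncating it to one sector.
wipe_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}
```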
apt-get install initrd-tools raidtools2 mdadm (decline offer to start RAID at boot time)
Create /etc/raidtab:
raiddev /dev/md0
    raid-level 1
    nr-raid-disks 2
    nr-spare-disks 0
    persistent-superblock 1
    device /dev/hdc3
    raid-disk 0
    device /dev/hda3
    failed-disk 1
So the actual / partition is declared as "broken" for the RAID
Create the RAID:
mkraid /dev/md0 (it will say disk1: failed)
mkfs.ext3 /dev/md0
mount -v /dev/md0 /mnt/root
Copy the / content:
cd /
find . -xdev | cpio -pm /mnt/root
Prepare to reboot on the RAID:
In /etc/mkinitrd/mkinitrd.conf: ROOT=probe -> ROOT=/dev/md0
mkinitrd -o /boot/initrd.img-raid
In /etc/fstab:
/dev/md0 / ext3 defaults,errors=remount-ro 0 1
In /etc/lilo.conf:
image=/boot/vmlinuz...
    label=LinuxRAID
    root=/dev/md0
    read-only
    initrd=/boot/initrd.img-raid
umount /dev/md0
raidstop /dev/md0
lilo
reboot
Restore the "broken" RAID:
cat /proc/mdstat (we see only one disk)
raidhotadd /dev/md0 /dev/hda3
Now the system is synchronizing the "new" RAID partition
watch cat /proc/mdstat
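Instead of watching the whole file, the rebuild percentage can be pulled out of it. A minimal sketch; the function name `resync_progress` is made up for this example, and it assumes the usual progress line of the form `recovery = 12.6% (...)` or `resync = ...`.

```shell
# Print "<array> <percent>" for every array that is currently rebuilding.
# Reads /proc/mdstat by default; a file path may be given instead.
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the array being described
        /(recovery|resync) *=/ {               # progress line of that array
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i  # the field ending in "%" is the progress
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means that nothing is rebuilding at the moment.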
Prepare for next reboot:
In /etc/raidtab: failed-disk -> raid-disk
mkinitrd -o /boot/initrd.img-raid
lilo
reboot
dpkg-reconfigure mdadm -> accept the mdadm monitoring daemon and give the user who should receive the alert emails
Simulating RAID 0 (striping) for the swap:
Simply give the same priority to both swap partitions:
/dev/hda2 swap swap defaults,pri=1 0 0
/dev/hdc2 swap swap defaults,pri=1 0 0
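The striping effect comes from the kernel allocating pages round-robin between swap areas of the same priority, so it is worth checking that the priorities in fstab really match. A small sketch; the function name `swap_priorities` is made up for this example.

```shell
# Print "<device> <priority>" for each swap line of an fstab
# ("none" when the line has no pri= option).
swap_priorities() {
    awk '$1 !~ /^#/ && $3 == "swap" {
        pri = "none"
        n = split($4, opts, ",")               # the 4th field holds the mount options
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^pri=/) pri = substr(opts[i], 5)
        print $1, pri
    }' "${1:-/etc/fstab}"
}
```

On a running system, `cat /proc/swaps` shows the priorities actually in use.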
Fresh install debian
Don't simply dd the MBR from hda to hdc, otherwise lilo will complain about a timestamp error; that is actually because both disks then have the same ID number.
You can mount the initrd image to check that it contains the RAID instructions:
mount /boot/initrd.img-raid /mnt/disk -o loop,ro
/mnt/disk/script should contain a last line with mdadm.
Check the end of that line: the first time only /dev/hdc3 is mentioned; the second time /dev/hda3 should also be present (or the system will be mounted again in degraded mode).
If reboot fails: Boot on a Knoppix
modprobe raid1
mdadm --assemble /dev/md0 /dev/hdd3 /dev/hda3
mount /dev/md0 /mnt/xxx
chroot /mnt/xxx
(also mount /proc, /boot, etc.)
mkinitrd -o /boot/initrd.img-raid <kernel version>
mdadm -D /dev/mdXX
mdadm -E /dev/hdXX
cat /proc/mdstat
mdadm /dev/mdXX -a /dev/hdXX