Monday, 25 May 2015

grub> Software RAID devices assemble after grub prompt - CentOS 7

The short story: I made a few changes to GRUB without taking a backup, and it cost me several hours to recover once the machine came up with only a GRUB prompt. I ran grub2-install, and after the reboot I was back at the same grub prompt. I then removed /boot/grub2/grub.cfg and re-created it, which did not help either.

I had installed my server on software RAID1, and I refer to the arrays by their descriptive names (md0, md1 and md2). I worked out that my boot partition is the second partition of the first hard disk, and tried loading the kernel and initramfs as below -

grub> set prefix=(hd0,msdos2)/grub2
grub> set root=(hd0,msdos2)
grub> linux16 /vmlinuz-3.10.0-123.el7.x86_64
grub> initrd16 /initramfs-3.10.0-123.el7.x86_64.img
grub> boot

The system would start to boot, but then drop into the initramfs rescue shell with the following information in 'journalctl'.

# journalctl
Not switching root: /sysroot does not seem to be an OS tree.  <<<=============
/etc/os-release is missing.
initrd-switch-root.service: main process exited, code=exited, status=1/FAILURE
Failed to start Switch Root.  <<<<=====================
. . . . .
Triggering OnFailure= dependencies of initrd-switch-root.service.
Starting Emergency Shell. . .
Failed to issue method call: Invalid argument

From this log I began to suspect that the root file system had not been mounted: either the 'md' devices had not been assembled and started, or their names had changed in the device map, or the initramfs did not know about the file system or the layers under it.
So I decided to assemble the RAID volumes, mount the root volume and fix the GRUB device map to match what 'mdadm --detail' reports.
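
The failing check can be repeated by hand from the emergency shell. This is a minimal sketch, assuming the check only looks for etc/os-release under the candidate root, which is what the log above reports. The function name looks_like_os_tree is made up for illustration.

```shell
# looks_like_os_tree [DIR]: succeed only if DIR contains etc/os-release,
# the file the log above reports as missing under /sysroot.
# (Hypothetical helper name; DIR defaults to /sysroot.)
looks_like_os_tree() {
    [ -e "${1:-/sysroot}/etc/os-release" ]
}

if looks_like_os_tree /sysroot; then
    echo "/sysroot looks like an OS tree"
else
    echo "/sysroot is empty or is not the root file system"
fi
```

If the second message appears, nothing usable is mounted on /sysroot, which points at the arrays rather than at GRUB itself.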

I booted into the rescue system on the CentOS server install DVD and chose to run a shell from the installer environment without mounting a root file system.

# cat /proc/mdstat

- This shows the md devices (md125, md126, md127). Stop them:
# mdadm -S /dev/md125
# mdadm -S /dev/md126
# mdadm -S /dev/md127
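
The auto-assigned numbers (md125 and so on) can differ from one boot to the next, so the list can be read from /proc/mdstat instead of typed by hand. A rough sketch, assuming every array line starts with "mdN :"; the helper name list_md_devices is made up, and the loop only prints the commands.

```shell
# list_md_devices [FILE]: print the md array names found in an
# mdstat-style file (defaults to /proc/mdstat). Hypothetical helper.
list_md_devices() {
    awk '/^md[0-9]+ :/ { print $1 }' "${1:-/proc/mdstat}"
}

# Dry run: print the stop commands instead of running them.
# Remove the "echo" to actually stop the arrays.
for md in $(list_md_devices 2>/dev/null); do
    echo mdadm -S "/dev/$md"
done
```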

- Assemble the volumes 
# mdadm -Av --run /dev/md0 /dev/sda1 /dev/sdb1
# mdadm -Av --run /dev/md1 /dev/sda2 /dev/sdb2
# mdadm -Av --run /dev/md2 /dev/sda3 /dev/sdb3

- Check the outcome
# cat /proc/mdstat
# mdadm --detail /dev/md0
# mdadm --detail /dev/md1
# mdadm --detail /dev/md2
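
In /proc/mdstat a healthy two-disk RAID1 shows "[2/2] [UU]", and a missing member shows up as an underscore, for example "[2/1] [U_]". The sketch below picks out such arrays. It assumes the member-status field is the last field on the line that follows the array name, and the helper name degraded_arrays is made up.

```shell
# degraded_arrays [FILE]: print arrays whose member-status field
# (e.g. [UU] or [U_]) contains "_", meaning a member is missing.
# Hypothetical helper; FILE defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /\[[U_]+\]/ && name != "" {
            if ($NF ~ /_/) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

bad=$(degraded_arrays 2>/dev/null) || bad=""
if [ -z "$bad" ]; then
    echo "no degraded arrays reported"
else
    echo "degraded arrays: $bad"
fi
```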

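- Create the directory 'sysroot' and mount the root file system on it, then bind-mount the system directories so that the device map can be fixed from inside it. Replace mdN with the array that holds your root file system; /boot is on the second partitions, which is md1 here.
# mkdir /sysroot
# mount /dev/mdN /sysroot
# mount /dev/md1 /sysroot/boot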
# mount -o bind /dev/ /sysroot/dev
# mount -o bind /proc /sysroot/proc
# mount -o bind /sys /sysroot/sys
# mount -o bind /dev/pts /sysroot/dev/pts 

- chroot into it
# chroot /sysroot /bin/bash

- Correct the device names so that GRUB reads the current names of the 'md' devices and their UUIDs properly.
# cp -p /boot/grub2/device.map /boot/grub2/device.map.old
# for i in /dev/disk/by-id/md-uuid-*; do DEV=$(readlink "$i"); echo "(${DEV##*/}) $i"; done | sort | tee /boot/grub2/device.map
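
To see what that loop would write before overwriting device.map, the same logic can be wrapped in a function that only prints. This is a sketch: the function name gen_device_map and its optional directory argument are made up for illustration.

```shell
# gen_device_map [DIR]: for every md-uuid-* symlink in DIR, print
# "(<basename of link target>) <link path>", sorted.
# Hypothetical helper; DIR defaults to /dev/disk/by-id.
gen_device_map() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/md-uuid-*; do
        [ -L "$link" ] || continue      # skip the unexpanded glob
        target=$(readlink "$link")
        printf '(%s) %s\n' "${target##*/}" "$link"
    done | sort
}

# Preview only; nothing is written to /boot/grub2/device.map here.
gen_device_map
```

Each output line pairs an array name such as (md0) with its md-uuid-* path, which is the form the loop above stores in device.map.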

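- Recreate your initramfs. The kernel version is given explicitly because the rescue environment may be running a different kernel.
# dracut -vf /boot/initramfs-3.10.0-123.el7.x86_64.img 3.10.0-123.el7.x86_64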
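
A quick way to confirm that every installed kernel has an initramfs next to it is sketched below. It assumes the vmlinuz-<version> and initramfs-<version>.img naming seen in the commands above, and the helper name kernels_missing_initramfs is made up.

```shell
# kernels_missing_initramfs [BOOTDIR]: print kernel versions that have
# a vmlinuz-<version> file but no initramfs-<version>.img beside it.
# Hypothetical helper; BOOTDIR defaults to /boot.
kernels_missing_initramfs() {
    boot=${1:-/boot}
    for k in "$boot"/vmlinuz-*; do
        [ -e "$k" ] || continue         # skip the unexpanded glob
        ver=${k##*/vmlinuz-}
        [ -e "$boot/initramfs-$ver.img" ] || echo "$ver"
    done
}

kernels_missing_initramfs /boot
```

No output means each kernel found in /boot has a matching image.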

- Install GRUB2 on both drives, since the system may boot from either of them
# grub2-install /dev/sda
# grub2-install /dev/sdb

- Regenerate the GRUB configuration so that the latest settings take effect.
# grub2-mkconfig -o /boot/grub2/grub.cfg
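
Before rebooting, it is worth checking that the regenerated grub.cfg actually contains boot entries. A minimal sketch, assuming top-level entries start at column 0 with the word "menuentry"; the helper name count_menuentries is made up.

```shell
# count_menuentries [FILE]: print the number of lines that start with
# "menuentry " (top-level boot entries). Hypothetical helper;
# FILE defaults to /boot/grub2/grub.cfg.
count_menuentries() {
    grep -c '^menuentry ' "${1:-/boot/grub2/grub.cfg}"
}

n=$(count_menuentries 2>/dev/null) || n=${n:-0}
if [ "${n:-0}" -gt 0 ]; then
    echo "grub.cfg has $n boot entries"
else
    echo "grub.cfg is missing or has no boot entries"
fi
```

A count of zero here would most likely mean landing at the bare grub> prompt again.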

- You can now leave the chroot environment and reboot.