I recently had an issue with a server hosted at OVH.com. One Saturday, it completely failed to boot and there was no SSH access. I pleaded with OVH to fix the server (since I had no access) and was flatly told “No” several times. This was a dedicated server and, as such, I was responsible for all aspects of its software. “But your physical disks are failing to mount!” No help. Grrr, OK, I can solve this.
Here’s the only problem: I don’t even know what the error is on boot! I can’t see any of the boot screen, so I don’t know where it’s failing.
OVH has a convenient feature known as Rescue Mode, which allows you to boot into an alternative OS so you can mount the primary drive and correct any issues on it. Using this feature, I got access to the disks and the RAID array. Everything seemed fine, but I ran through all the checks to be sure.
- Hard Disks – No errors
- RAID Array – Needed to be resynced, but does not fix boot
- FileSystem – OK
- Boot Logs – No help
- Partitions – Disks are out of order, but does not fix boot
At this point, I’m out of ideas, so I call OVH one more time and ask them to look at the boot screen and tell me where it’s stuck. They agree and tell me it boots to a GRUB> prompt and stops. OK, this is good information.
I log back into rescue mode and scour the internet for a way to fix this. The answer turns up in an obscure Ubuntu forum post, which perfectly describes how to reinstall the GRUB boot loader on each disk in the array from a rescue environment.
$ sudo fdisk -l (From this you need to find the device name of the physical drive that won't boot, something like “/dev/sdxy”, where x is the drive and y is the root partition. Since I was using a software RAID, root (/) was on md1)
$ sudo mount /dev/sdxy /mnt (Mount the root partition; with my software RAID this was /dev/md1)
$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /proc /mnt/proc
$ sudo mount --bind /sys /mnt/sys
$ sudo chroot /mnt (This changes the root directory to the drive that won't boot, so the commands that follow run against it)
$ grub-mkconfig -o /boot/grub/grub.cfg (ensure that there are NO error messages)
$ grub-install /dev/sdx (NOTE that this is the drive and not the partition. Try grub-install --recheck /dev/sdx if it fails. With a RAID array, repeat this for each disk in the array)
Ctrl+D (to exit out of chroot)
$ sudo umount /mnt/dev
$ sudo umount /mnt/proc
$ sudo umount /mnt/sys
$ sudo umount /mnt
Reboot!
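The steps above can be sketched as a small shell function. This is only a sketch under assumptions, not the exact procedure from the forum post: the names reinstall_grub, run, DRY_RUN and MNT are made up for illustration, and /dev/md1, /dev/sda and /dev/sdb are example devices that you would replace with whatever fdisk -l shows on your machine. It runs the two GRUB commands through chroot non-interactively instead of opening a chroot shell, and by default it only prints the commands so they can be reviewed first. It is meant to be run as root from rescue mode, and it leaves the reboot to you.

```shell
#!/bin/sh
# Sketch: print (or run) the GRUB reinstall sequence from a rescue environment.
# All names here (run, reinstall_grub, DRY_RUN, MNT) are illustrative only.

run() {
    # With DRY_RUN=1 (the default) just print the command; otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_grub() {
    # Usage: reinstall_grub ROOT_DEVICE DISK [DISK...]
    if [ "$#" -lt 2 ]; then
        echo "usage: reinstall_grub ROOT_DEVICE DISK [DISK...]" >&2
        return 2
    fi
    root_dev=$1                 # root partition, e.g. /dev/md1 on software RAID
    shift                       # remaining arguments are whole disks, not partitions
    mnt=${MNT:-/mnt}

    # Mount the broken system's root, then expose /dev, /proc and /sys inside it.
    run mount "$root_dev" "$mnt" || return 1
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$mnt/$fs" || return 1
    done

    # Regenerate the GRUB config, then install GRUB on every disk given.
    run chroot "$mnt" grub-mkconfig -o /boot/grub/grub.cfg
    for disk in "$@"; do
        run chroot "$mnt" grub-install "$disk"
    done

    # Unmount in the same order as the manual steps.
    for fs in dev proc sys; do
        run umount "$mnt/$fs"
    done
    run umount "$mnt"
}
```

Calling reinstall_grub /dev/md1 /dev/sda /dev/sdb prints the command sequence; setting DRY_RUN=0 in front of the same call executes it. Keeping the dry run as the default is a deliberate choice, since a wrong device name here is the kind of mistake you want to catch before anything is written to disk.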
Hopefully this will save someone some agony in the future and give them a few hours of their life back.