
Recovering an ESX host from GRUB prompt (1007908)

Symptoms

  • The ESX host stops booting at the GRUB prompt. The screen displays grub> and appears to be waiting for commands
  • The ESX host does not proceed past the GRUB prompt
  • The /boot/grub/grub.conf file has incorrect entries

Resolution

On a working ESX host, running the df -h command lists the mounted filesystems, similar to:

Filesystem         Size   Used   Avail   Use%   Mounted on
/dev/sda2          4.9G   1.3G    3.4G    27%   /
/dev/sda1           97M    26M     67M    29%   /boot
none               132M      0    132M     0%   /dev/shm
/dev/sda6          2.0G    33M    1.8G     2%   /var/log

In this example, /boot is on the first partition of the first disk (/dev/sda1) and / is on the second partition (/dev/sda2). The partition number for /var/log can vary from 5 to 7 depending on configuration.
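
The GRUB commands in the options below refer to disks and partitions by zero-based index, so /dev/sda1 is written as (hd0,0). As a rough aid, the following shell sketch converts a /dev/sdXN name to that notation. The function name dev_to_grub is hypothetical. The sketch assumes the BIOS drive order matches the sdX lettering, which is not guaranteed on every system, and it does not handle other naming schemes such as /dev/cciss.

```shell
# Convert a Linux device name such as /dev/sda1 to GRUB (legacy) notation.
# Assumption: BIOS drive order matches the sdX lettering (sda = hd0, sdb = hd1).
dev_to_grub() {
    dev=${1#/dev/sd}            # strip the "/dev/sd" prefix, e.g. "a1"
    letter=${dev%%[0-9]*}       # drive letter, e.g. "a"
    part=${dev#"$letter"}       # partition number, e.g. "1"
    # Turn the drive letter into a zero-based index (a=0, b=1, ...).
    disk=$(( $(printf '%d' "'$letter") - 97 ))
    # GRUB counts partitions from 0, Linux counts them from 1.
    echo "(hd${disk},$((part - 1)))"
}

dev_to_grub /dev/sda1   # prints (hd0,0)
dev_to_grub /dev/sda2   # prints (hd0,1)
```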

Note: There are three options for recovering your ESX host from this point. Each option is progressively more complex. Perform each option in the order provided, until your ESX host has recovered.

Option 1 – Running Commands in the GRUB Prompt

With this option, you run commands at the GRUB prompt to boot the host manually.
 
This option is simple: you run only a few commands, the ESX host boots normally, and you can edit your grub.conf file later. However, you need to know the UUID of your / partition, which can be hard to find in many cases.
 
At the GRUB prompt, you must specify four things for GRUB to continue booting:
 
Note: The kernel and initrd names in this example may differ depending on your ESX version.
  1. Type the location of the /boot directory and press Enter:

    root (hd0,0)

  2. Type the kernel name and the UUID and press Enter:

    For example:


    kernel /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=UUID=932dad41-f43a-4a60-9257-198f026da80e


    Note: The UUID in this command is only an example. You must obtain the UUID of your own system before you issue this command. You can sometimes run cat /grub/grub.conf at the GRUB prompt to find the UUID.

  3. Type the initrd value and press Enter:

    For example:

    initrd /initrd-2.4.21-47.0.1.ELvmnix.img

  4. Run the boot command:

    boot

If you cannot remember the names of the kernel and initrd, press Tab after typing /. GRUB then lists the possible completions. You can also use this to check whether the filesystem is valid.
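
If the grub.conf file is still readable from a running Linux environment (for example, the Live CD described in Option 2), the root UUID it records can be extracted with standard tools. This is a minimal sketch, assuming the file contains kernel lines with root=UUID=... as in the example above. The function name root_uuid_from_grub_conf is hypothetical.

```shell
# Print the distinct root=UUID=... values found on kernel lines of a grub.conf file.
# Assumption: kernel lines use the root=UUID=<value> form shown in this article.
root_uuid_from_grub_conf() {
    sed -n 's/^[[:space:]]*kernel .*root=UUID=\([0-9a-fA-F-]*\).*/\1/p' "$1" | sort -u
}

# Example usage (the path depends on where the /boot partition is mounted):
#   root_uuid_from_grub_conf /mnt/esx/boot/grub/grub.conf
```

If the function prints nothing, the file has no kernel line in that form, for example because it refers to the root device by name (root=/dev/...) instead.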

Run these commands after your ESX host is booted:
  • esxcfg-boot -p
  • esxcfg-boot -b
  • esxcfg-boot -r

Ensure that your /boot/grub/grub.conf contains an entry similar to:

title VMware ESX Server
root (hd0,0)
kernel --no-mem-option /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=UUID=44fc4a1c-d5ac-4ce1-a9cb-74acab0e61e8 mem=272M
initrd /initrd-2.4.21-47.0.1.ELvmnix.img
 

Option 2 – Using your Live CD to Boot your ESX Host

Use a Live CD to boot your ESX host and fix the host from a chroot environment.
 
Unlike Option 1, you do not need to know the UUID of your / partition to recover. You can find the UUID as part of this option and then continue with Option 1. However, more Linux commands are involved and you need to have the Live CD.

Note: Booting from a Live CD also lets you take advantage of the esxcfg-boot command.

Examples of Live Linux CDs:
  • Gentoo Live CD
  • Redhat rescue CD/DVD
  • Knoppix Live CD
  • PClinuxOS Live CD
  • Ubuntu Live CD   
Run these commands after you boot your ESX host from the Live CD:
  • fdisk -l (lists the device names of your filesystems)

Note: Use this command to discover the device names of your filesystems, as they may not be named /dev/sda. The device names in the following commands are examples only.

  • mkdir /mnt/esx
  • mount /dev/sda2 /mnt/esx
  • mount /dev/sda1 /mnt/esx/boot
  • mount /dev/sda6 /mnt/esx/var/log (you may need to run fdisk /dev/sda to find which partition holds /var/log)
  • chroot /mnt/esx
  • bash
  • touch /boot/grub/grub.conf (only if there is no grub.conf file)
  • esxcfg-boot -gr
  • vi /boot/grub/grub.conf (correct any problems in the file if needed)
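
Because the device names differ between hosts, it can help to print the mount and chroot sequence for your own devices before running anything. The following is only a sketch. The function name print_chroot_plan is hypothetical, and it prints the commands rather than executing them.

```shell
# Print (do not run) the mount/chroot sequence for a given set of devices.
# Arguments: root device, /boot device, optional /var/log device.
print_chroot_plan() {
    root_dev=$1
    boot_dev=$2
    varlog_dev=${3:-}
    echo "mkdir /mnt/esx"
    echo "mount $root_dev /mnt/esx"
    echo "mount $boot_dev /mnt/esx/boot"
    # /var/log is only mounted when a third device is supplied.
    if [ -n "$varlog_dev" ]; then
        echo "mount $varlog_dev /mnt/esx/var/log"
    fi
    echo "chroot /mnt/esx"
}

# Example with the device names used in this article:
print_chroot_plan /dev/sda2 /dev/sda1 /dev/sda6
```

Review the printed commands against the output of fdisk -l before typing them at the Live CD prompt.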
In the Live CD chroot environment, esxcfg-boot does not know that /boot and / are two separate partitions, so it generates entries as if /boot were a directory on the / partition.

As a result, your /boot/grub/grub.conf contains an entry similar to:
 
title VMware ESX Server
        root (hd0,1)
        uppermem 277504
        kernel --no-mem-option /boot/vmlinuz-2.4.21-47.0.1.ELvmnix ro root=UUID=44fc4a1c-d5ac-4ce1-a9cb-74acab0e61e8 mem=272M
        initrd /boot/initrd-2.4.21-47.0.1.ELvmnix.img 
 
You need to change root to the correct partition, and remove /boot from the kernel and initrd paths:
title VMware ESX Server
        root (hd0,0)
        uppermem 277504
        kernel --no-mem-option /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=UUID=44fc4a1c-d5ac-4ce1-a9cb-74acab0e61e8 mem=272M
        initrd /initrd-2.4.21-47.0.1.ELvmn
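
The same correction can be sketched with sed instead of editing by hand. This assumes the exact layout of the example above: /boot on (hd0,0), a generated root of (hd0,1), and kernel and initrd paths prefixed with /boot/. The function name fix_chroot_grub_conf is hypothetical. It writes the corrected text to standard output and leaves the original file untouched.

```shell
# Rewrite a grub.conf generated inside the chroot so that it matches a
# separate /boot partition. Assumption: /boot is (hd0,0), as in this article.
fix_chroot_grub_conf() {
    sed -e 's/^\([[:space:]]*root \)(hd0,1)/\1(hd0,0)/' \
        -e '/^[[:space:]]*kernel /s# /boot/# /#' \
        -e '/^[[:space:]]*initrd /s# /boot/# /#' "$1"
}

# Example usage: review the result before replacing the original file.
#   fix_chroot_grub_conf /boot/grub/grub.conf > /tmp/grub.conf.fixed
```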
 

Option 3 – Using a Live CD to boot your ESX Host (Advanced)

Use a Live CD to boot your ESX host and fix the host from a chroot environment.
 
This option goes into more detail than Option 2 and covers repairing damaged stage files and editing device.map, all without using esxcfg commands. However, you must be comfortable working at the Linux command line to perform this option. You also need to have the Live CD.
 
Caution: This process uses more advanced Linux commands than Option 1 and Option 2. If you are not comfortable with Option 3 and your ESX host is still not recovering, file a support request with VMware Support and note this KB Article ID in the problem description. For more information, see How to Submit a Support Request.
  1. Boot from the Rescue CD.
     
  2. List the device names of your filesystems by using the  fdisk -l or df -h command.

  3. Check device names for / and /boot filesystems.

    For example, on a host with internal RAID, /boot can be /dev/cciss/c0d0p1 and / can be /dev/cciss/c0d0p7.

  4. Run these commands to mount the / filesystem and chroot to it:
    • mkdir /mnt/root
    • mount /dev/cciss/c0d0p7 /mnt/root
    • chroot /mnt/root

  5. Run this command to mount the /boot filesystem on the /boot mount point:

    mount /dev/cciss/c0d0p1 /boot

  6. Ensure that /boot contains the kernel, the initrd, and a grub/ subdirectory with the stage* files, grub.conf, and menu.lst, which is a symlink to grub.conf.

  7. Replace anything from step 6 that is missing. Run this command if any of the stage files are missing:

    cp /usr/share/grub/i386-redhat/* /boot/grub/

    This copies all the files from /usr/share/grub/i386-redhat/ to /boot/grub/.

    If grub.conf is missing, you have to create a new one or take a copy from another server. 

    An example of /boot/grub/grub.conf is:

    vmware:configversion 1
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE: You have a /boot partition. This means that
    # all kernel and initrd paths are relative to /boot/, eg.
    # root (hd0,0)
    # kernel /vmlinuz-version ro root=/dev/sdc2
    # initrd /initrd-version.img
    #boot=/dev/sdc
    timeout=10
    default=0
    title VMware ESX Server
    #vmware:autogenerated esx
    root (hd0,0)
    uppermem 277504
    kernel --no-mem-option /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=/dev/cciss/c0d0p7 mem=272M
    initrd /initrd-2.4.21-47.0.1.ELvmnix.img
    title VMware ESX Server (debug mode)
    #vmware:autogenerated esx
    root (hd0,0)
    uppermem 277504
    kernel --no-mem-option /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=/dev/cciss/c0d0p7 mem=272M console=ttyS0,115200 console=tty0 debug
    initrd /initrd-2.4.21-47.0.1.ELvmnix.img-dbg
    title Service Console only (troubleshooting mode)
    #vmware:autogenerated esx
    root (hd0,0)
    uppermem 277504
    kernel --no-mem-option /vmlinuz-2.4.21-47.0.1.ELvmnix ro root=/dev/cciss/c0d0p7 mem=272M tblsht
    initrd /initrd-2.4.21-47.0.1.ELvmnix.img-sc

  8. If the server has multiple drives, LUNs, etc., it may be useful to create/edit a /boot/grub/device.map file with the following content:

    (hd0) /dev/cciss/c0d0p1


    Here, the device name under /dev/ is the boot partition device. Using a device.map file significantly speeds up the process, because GRUB does not have to autodetect devices.

  9. Run the /sbin/grub command. If you are using a device map file, specify it:

    /sbin/grub --device-map=/boot/grub/device.map

  10. Run this command in the GRUB shell:

    root (hd0,0)


  11. Run this command in the GRUB shell:

    setup --stage2=stage2 --prefix=/grub (hd0)


    Note: This is for a setup where /boot is on (hd0). If this does not work, try:

    setup (hd0)

  12. Run the quit command to exit the GRUB shell.
  13. Run this command:

    sync

  14. Reboot the server and remove the Rescue CD.
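
The checks from step 6 can also be scripted and run inside the chroot before you reboot. This is a minimal sketch under stated assumptions: the kernel and initrd are named vmlinuz-* and initrd-* as in the examples in this article, and the stage files of interest are stage1 and stage2. The function name check_boot_dir is hypothetical.

```shell
# Report which expected boot files are missing from a /boot directory.
# Returns 0 when everything is present and 1 otherwise.
check_boot_dir() {
    boot=${1:-/boot}
    missing=0
    # At least one kernel and one initrd image must exist.
    ls "$boot"/vmlinuz-* >/dev/null 2>&1 || { echo "missing: kernel (vmlinuz-*)"; missing=1; }
    ls "$boot"/initrd-* >/dev/null 2>&1 || { echo "missing: initrd (initrd-*)"; missing=1; }
    # GRUB stage files and configuration.
    for f in grub/stage1 grub/stage2 grub/grub.conf; do
        [ -e "$boot/$f" ] || { echo "missing: $f"; missing=1; }
    done
    # menu.lst is expected to be a symlink to grub.conf.
    [ -L "$boot/grub/menu.lst" ] || { echo "missing: grub/menu.lst symlink"; missing=1; }
    return $missing
}

# Example usage inside the chroot:
#   check_boot_dir /boot && echo "/boot looks complete"
```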

Additional Information

If booting does not move past the GRUB screen and you cannot use the grub shell, there may be an issue with the MBR not being properly written by GRUB.
 
To resolve this, from within the chroot environment, run:
 
/sbin/grub-install /dev/sda

Update History

02/07/2011 - Added information on how to get the UUID.
10/29/2010 - Added additional information about fixing the GRUB prompt.
