Cannot remount a datastore after an unplanned permanent device loss (PDL)

Products

VMware vSphere ESXi

Issue/Introduction

This article provides steps to resolve the issue when you are unable to remount a datastore after an unplanned permanent device loss.

Symptoms:

After a storage device has been unexpectedly unpresented from the storage array, you are unable to mount it again.
This issue occurs when a virtual machine was running on the storage device at the time the device went offline.
An ESXi host cannot mount the storage after the LUN is online again.
In the vmkernel.log file, you see entries similar to:

cpu36:5590)Vol3: 1665: Error refreshing FD resMeta: Device is permanently unavailable
cpu34:5590)VC: 1449: Device rescan time 165 msec (total number of devices 75)
cpu34:5590)VC: 1452: Filesystem probe time 504 msec (devices probed 48 of 75)
cpu38:5590)ScsiDevice: 4592: naa.################################## device :Open count > 0, cannot be brought online
cpu34:5590)Vol3: 647: Couldn't read volume header from control: Invalid handle
cpu34:5590)FSS: 4333: No FS driver claimed device 'control': Not supported
cpu38:5590)ScsiDeviceIO: 2316: Cmd(0x4124c0ea2e80) 0x28, CmdSN 0x70509 to dev "naa.##################################" failed H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
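
As a quick way to check a log for these signatures, the following sketch filters a log stream for two of the messages shown above. The function name pdl_log_hits is hypothetical and not part of ESXi. The message strings are copied from the example entries and may differ between builds.

```shell
# Hypothetical helper: reads log lines on stdin and prints only those
# that match two of the PDL-related messages from the examples above.
pdl_log_hits() {
    grep -E 'Device is permanently unavailable|Open count > 0, cannot be brought online'
}
```

Assuming the default log location, it could be used as: pdl_log_hits < /var/log/vmkernel.log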

Environment

VMware vSphere ESXi 7.0
VMware vSphere 7.0.x
VMware vSphere ESXi 8.0

Resolution

To resolve this issue:

Run this command to see the world that has the device open for the LUN:

#esxcli storage core device world list -d naa_id

For example:

#esxcli storage core device world list -d naa.##################################

You see output similar to:

Device World ID Open Count World Name
------------------------------------ -------- ---------- ----------
naa.################################## 2060 1 idle0

If a VMFS volume is using the device indirectly, the world name includes the string idle0. If a virtual machine uses the device as an RDM, the virtual machine World ID is displayed. If any other process is using the raw device, the corresponding information is displayed.
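
The rule above can be sketched as a small filter over the command output. This is a minimal sketch, assuming the four-column layout shown in the example output (Device, World ID, Open Count, World Name) and a world name without spaces. The function name classify_worlds and the two labels it prints are hypothetical.

```shell
# Hypothetical helper: reads the output of
#   esxcli storage core device world list -d naa_id
# on stdin and prints "WorldID OpenCount WorldName label" for each data row.
# Header and separator rows are skipped because their second field is not numeric.
classify_worlds() {
    awk '$2 ~ /^[0-9]+$/ {
        # idle0 indicates indirect use through a VMFS volume (per the text above);
        # anything else is a virtual machine (RDM) or another process.
        kind = ($4 ~ /idle0/) ? "vmfs-indirect" : "vm-or-other-process"
        print $2, $3, $4, kind
    }'
}
```

For example: esxcli storage core device world list -d naa_id | classify_worlds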

Notes:
- If the host is not responding, run the esxcfg-scsidevs -m | grep naa_id command to get the corresponding datastore name.
- Ensure that no virtual machine registered on the volume in a PDL state requires any further steps. If a virtual machine is in that state, attempting to Retry or Cancel an operation does not return the virtual machine World ID. Click Cancel, because the Retry operation cannot succeed unless the volume is remounted.

Run this command to list all virtual machines running on the ESXi host and identify the virtual machine registered on that LUN:

#esxcli vm process list

To kill the virtual machine process, run this command with its World ID:

#esxcli vm process kill --type=force --world-id=World ID

For example:

#esxcli vm process kill --type=force --world-id=12131
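
To narrow the process list down to the virtual machines on the affected datastore, a filter along the following lines may help. This is a sketch under assumptions: it expects each entry in the esxcli vm process list output to contain a "World ID:" line followed later by a "Config File:" line, and it matches on the path prefix you pass in. The function name worlds_on_datastore is hypothetical. It only prints candidates and does not kill anything.

```shell
# Hypothetical helper: reads `esxcli vm process list` output on stdin.
# $1 is the datastore path prefix, for example /vmfs/volumes/<datastore>.
# Prints "WorldID ConfigFilePath" for each VM whose config file is under that prefix.
worlds_on_datastore() {
    awk -v prefix="$1" '
        /^[[:space:]]*World ID:/ { world = $NF }   # remember the most recent World ID
        /^[[:space:]]*Config File:/ {
            path = $0
            sub(/^[[:space:]]*Config File:[[:space:]]*/, "", path)
            if (index(path, prefix "/") == 1) print world, path
        }'
}
```

For example: esxcli vm process list | worlds_on_datastore /vmfs/volumes/datastore_name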
To clean up other processes that might be using this datastore:

- Remove any templates from vCenter that are using this datastore.
- Restart the management service (see Restarting the Management agents in ESXi):

#/etc/init.d/hostd restart

- Clean up the SIOC service (see KB 2011220).

Rescan the storage using this command:

#esxcfg-rescan -u vmhba#

Run this command to see the device state:

#esxcli storage core device list -d naa.##################################
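
The device state check can be scripted roughly as follows. This sketch assumes the device list output contains a "Status:" line and that a healthy device reports the value "on". Both are assumptions about the output format rather than something this article states. The function name device_is_online is hypothetical.

```shell
# Hypothetical helper: reads `esxcli storage core device list -d naa_id`
# output on stdin and succeeds (exit status 0) only if the first
# "Status:" line has the value "on" (assumed healthy value).
device_is_online() {
    status=$(awk -F': *' '/^[[:space:]]*Status:/ { print $2; exit }')
    [ "$status" = "on" ]
}
```

For example: esxcli storage core device list -d naa_id | device_is_online && echo "device is online"
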
If the issue persists, reboot the ESXi host where the virtual machine was registered.

Additional Information

How to unmount a LUN or detach a datastore device from ESXi hosts
Permanent Device Loss (PDL) and All-Paths-Down (APD) in vSphere 5.x and 6.x


Article ID: 323145

Updated On: 03-20-2025
