We have had another incident with a host hanging. An IOmega px-450r storage array is connected over iSCSI through the ESXi software initiator. It carries a VMFS LUN that holds the backup VM, and that VM is always powered on. Periodically the array goes down: about once a month it loses its disks, only a reboot fixes it, and support has not been able to help so far. During such an outage ESXi frantically tries to restore the connection to the datastore, while the VM on the failed datastore stays powered on. Attempts to power it off, including a hard power-off, have no effect. Eventually the host management agents hang completely, restarting the management services does not help, and even the F2 console freezes. The other VMs on the hung host keep running, but the host becomes unmanageable and they cannot be migrated to another host, so I have to shut down all the VMs and reboot the host.
After studying the vmkernel logs (attached), I found an article that describes this situation: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684 However, I did not see a solution to the problem in it.
vmkernel log listing:
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.5000144f649871d7" - issuing command 0x412401bea940
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.5000144f649871d7" - issuing command 0x412401bea940
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.5000144f649871d7" - failed to issue command due to Not found (APD), try again...
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.5000144f649871d7" - failed to issue command due to Not found (APD), try again...
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:708:Logical device "naa.5000144f649871d7": awaiting fast path state update...
<181>2013-06-24T07:33:51.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:708:Logical device "naa.5000144f649871d7": awaiting fast path state update...
<181>2013-06-24T07:33:51.569Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:51.819Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:52.069Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:52.319Z esx3.ms.ru vmkernel: cpu3:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:52.319Z esx3.ms.ru vmkwarning: cpu7:1388698)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:52.319Z esx3.ms.ru vmkernel: cpu7:1388698)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:52.319Z esx3.ms.ru vmkwarning: cpu9:1387144)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:52.319Z esx3.ms.ru vmkernel: cpu9:1387144)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:52.569Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:52.819Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:53.009Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:53.069Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:53.319Z esx3.ms.ru vmkwarning: cpu3:1396611)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:53.319Z esx3.ms.ru vmkernel: cpu3:1396611)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:53.319Z esx3.ms.ru vmkwarning: cpu10:4276)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:53.319Z esx3.ms.ru vmkernel: cpu10:4276)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:53.319Z esx3.ms.ru vmkernel: cpu1:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:53.569Z esx3.ms.ru vmkernel: cpu1:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 3 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 3 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 4 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 4 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 5 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 5 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 6 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 6 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 7 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 7 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 8 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 8 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 9 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 9 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 10 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 10 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 11 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 11 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 12 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 12 not found
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaMoVm::CheckMoVm] did not find a VM with ID 13 in the vmList
<166>2013-06-24T07:33:53.625Z esx3.ms.ru Vpxa: [FFA10780 verbose 'Default' opID=SWI-26802da] [VpxaAlarm] VM with vmid = 13 not found
<181>2013-06-24T07:33:53.819Z esx3.ms.ru vmkernel: cpu14:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:54.069Z esx3.ms.ru vmkernel: cpu14:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:54.319Z esx3.ms.ru vmkernel: cpu14:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:54.319Z esx3.ms.ru vmkwarning: cpu9:1398322)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:54.319Z esx3.ms.ru vmkernel: cpu9:1398322)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:54.319Z esx3.ms.ru vmkwarning: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:54.319Z esx3.ms.ru vmkernel: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:54.569Z esx3.ms.ru vmkernel: cpu14:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:54.819Z esx3.ms.ru vmkernel: cpu12:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:55.011Z esx3.ms.ru vmkernel: cpu12:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:55.069Z esx3.ms.ru vmkernel: cpu12:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<166>2013-06-24T07:33:55.292Z esx3.ms.ru Hostd: [47A02B90 verbose 'SoapAdapter'] Responded to service state request
<181>2013-06-24T07:33:55.319Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:55.319Z esx3.ms.ru vmkwarning: cpu15:1359158)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:55.319Z esx3.ms.ru vmkernel: cpu15:1359158)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:55.319Z esx3.ms.ru vmkwarning: cpu15:1359158)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:55.319Z esx3.ms.ru vmkernel: cpu15:1359158)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:55.569Z esx3.ms.ru vmkernel: cpu21:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:55.819Z esx3.ms.ru vmkernel: cpu2:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:56.069Z esx3.ms.ru vmkernel: cpu2:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu2:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkwarning: cpu9:1387144)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu9:1387144)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkwarning: cpu7:1388698)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu7:1388698)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.5000144f00286753" - issuing command 0x41240038cb40
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.5000144f00286753" - issuing command 0x41240038cb40
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.5000144f00286753" - failed to issue command due to Not found (APD), try again...
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.5000144f00286753" - failed to issue command due to Not found (APD), try again...
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkwarning: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:708:Logical device "naa.5000144f00286753": awaiting fast path state update...
<181>2013-06-24T07:33:56.319Z esx3.ms.ru vmkernel: cpu5:4786)WARNING: NMP: nmpDeviceAttemptFailover:708:Logical device "naa.5000144f00286753": awaiting fast path state update...
<166>2013-06-24T07:33:56.473Z esx3.ms.ru Vpxa: [FFF9DB90 verbose 'SoapAdapter.HTTPService'] User agent is 'VMware-client/5.0.0'
<166>2013-06-24T07:33:56.474Z esx3.ms.ru Vpxa: [FFF9DB90 verbose 'SoapAdapter.HTTPService'] HTTP Response: Auto-completing at 129/129 bytes
<166>2013-06-24T07:33:56.474Z esx3.ms.ru Vpxa: [FFF9DB90 verbose 'SoapAdapter'] Responded to service state request
<181>2013-06-24T07:33:56.569Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:56.819Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:57.013Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:57.069Z esx3.ms.ru vmkernel: cpu0:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:57.319Z esx3.ms.ru vmkernel: cpu7:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:57.319Z esx3.ms.ru vmkwarning: cpu3:1396611)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:57.319Z esx3.ms.ru vmkernel: cpu3:1396611)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:57.319Z esx3.ms.ru vmkwarning: cpu5:4276)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:57.319Z esx3.ms.ru vmkernel: cpu5:4276)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:57.569Z esx3.ms.ru vmkernel: cpu7:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:57.819Z esx3.ms.ru vmkernel: cpu7:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:58.069Z esx3.ms.ru vmkernel: cpu7:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:58.319Z esx3.ms.ru vmkwarning: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:58.319Z esx3.ms.ru vmkernel: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T3:L0 is being removed
<181>2013-06-24T07:33:58.319Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:58.319Z esx3.ms.ru vmkwarning: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:58.319Z esx3.ms.ru vmkernel: cpu10:1380211)WARNING: ScsiPath: 3577: Path vmhba32:C0:T2:L0 is being removed
<181>2013-06-24T07:33:58.569Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:58.819Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:59.013Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:59.069Z esx3.ms.ru vmkernel: cpu10:4287)ScsiDevice: 6094: device naa.5000144f00286392 refCount is 12; waiting for 1.
<181>2013-06-24T07:33:59.314Z esx3.ms.ru vmkernel: cpu7:4291)ScsiDeviceIO: 2324: Cmd(0x412401bea940) 0x16, CmdSN 0x6552a from world 0 to dev "naa.5000144f649871d7" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x25 0x0.
<181>2013-06-24T07:33:59.314Z esx3.ms.ru vmkwarning: cpu7:4291)WARNING: NMP: nmp_DeviceStartLoop:721:NMP Device "naa.5000144f649871d7" is blocked. Not starting I/O from device.
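For anyone who wants to check quickly which devices and paths are involved in a listing like the one above, here is a rough Python sketch that tallies the APD retries, the "is blocked" notices and the path removals. It is only a reading aid and does nothing about the APD condition itself. The function name `summarize` and the regular expressions are my own assumptions, based solely on the line format visible in this listing. Each warning appears twice in the log (once as vmkwarning, once as vmkernel), so the sketch counts each timestamp and message pair only once.

```python
import re
from collections import Counter

# Assumed line shape, taken from the listing above:
# <pri>TIMESTAMP HOST vmkwarning|vmkernel: MESSAGE
LINE_RE = re.compile(r'^<\d+>(\S+) \S+ (?:vmkwarning|vmkernel): (.*)$')
NAA_RE = re.compile(r'naa\.[0-9a-f]+')
PATH_RE = re.compile(r'Path (vmhba\S+) is being removed')


def summarize(lines):
    """Tally APD retries, blocked devices and removed paths in vmkernel lines."""
    seen = set()  # (timestamp, message) pairs already counted
    counts = {"apd": Counter(), "blocked": Counter(), "removed": Counter()}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. Vpxa or Hostd lines
        key = match.groups()
        if key in seen:
            continue  # same message logged under the other facility
        seen.add(key)
        message = key[1]
        device = NAA_RE.search(message)
        if "(APD)" in message and device:
            counts["apd"][device.group(0)] += 1
        if "is blocked" in message and device:
            counts["blocked"][device.group(0)] += 1
        path = PATH_RE.search(message)
        if path:
            counts["removed"][path.group(1)] += 1
    return counts
```

It could be run as `summarize(open("vmkernel.log"))`, where the file name is just a placeholder for wherever the log was saved.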