ONTAP Hardware
Let me start off by saying I only have a basic understanding of NetApp, so please bear with me. Unfortunately our NetApp admin has been signed off work sick for a few weeks.
We have ESXi hosts: 2 for management (mgt) and 6 for development (dev). All are ProLiant BL460c Gen8, 2 x Intel Xeon CPU E5-2670 0 @ 2.60GHz, 262 GB RAM.
Both dev and mgt connect to the same NetApp. We upgraded our hosts from 5.5 to 6.5, and since then we've been having latency issues with the dev cluster, which eventually bring everything to a halt.
We have 2 NetApp FAS2240-4
Type: HA Pair
Version: 8.2.3P6 7-Mode
netapp1 has 1 aggregate:
hybrid0
RAID Type: mixed_raid_type, hybrid
netapp2 has 2 aggregates:
aggr0
sata0
RAID Type: RAID-DP
The 2 ESXi mgt hosts only use datastores from volumes on netapp2.
The 6 ESXi dev hosts connect to both netapp1 and netapp2.
We are using 2 x HP FlexFabric 10Gb 2-port 554FLB adapters (Emulex Corporation) for iSCSI on each host.
Sometimes we see latency of up to 0.5 seconds for some VMs. Every now and then a host fails and its VMs start to be moved around, which causes heavier disk usage, and one by one the other hosts fail.
I see this in the vmkernel logs:
suing command 0x439d40b94cc0
/var/run/log/vmkernel.log:2018-09-04T16:39:52.688Z cpu27:66195)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.60a9800042394542503f49426b6f724a" - issuing command 0x439d5238a4c0
/var/run/log/vmkernel.log:2018-09-04T16:39:52.689Z cpu22:66371)NMP: nmpCompleteRetryForPath:327: Retry world recovered device "naa.60a9800042394542503f49426b6f7242"
/var/run/log/vmkernel.log:2018-09-04T16:39:53.335Z cpu12:66370)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x8a (0x43954c8049c0, 76476) to dev "naa.60a9800042394542503f49426b6f7246" on path "vmhba64:C0:T7:L5" Failed: H:0x1 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:FAILOVER
/var/run/log/vmkernel.log:2018-09-04T16:39:53.335Z cpu12:66370)WARNING: NMP: nmp_DeviceRetryCommand:133: Device "naa.60a9800042394542503f49426b6f7246": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
/var/run/log/vmkernel.log:2018-09-04T16:39:53.693Z cpu22:66195)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.60a9800042394542503f49426b6f7246" - issuing command 0x43954c8049c0
/var/run/log/vmkernel.log:2018-09-04T16:39:53.694Z cpu12:66370)NMP: nmpCompleteRetryForPath:327: Retry world recovered device "naa.60a9800042394542503f49426b6f7246"
/var/run/log/vmkernel.log:2018-09-04T16:39:53.819Z cpu22:66371)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x439d40bab540, 65575) to dev "naa.60a9800042394542503f49426b6f7236" on path "vmhba64:C0:T7:L0" Failed: H:0x1 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:FAILOVER
/var/run/log/vmkernel.log:2018-09-04T16:39:53.819Z cpu22:66371)WARNING: NMP: nmp_DeviceRetryCommand:133: Device "naa.60a9800042394542503f49426b6f7236": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
/var/run/log/vmkernel.log:2018-09-04T16:39:54.687Z cpu3:112734)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.60a9800042394542503f49426b6f7236" - issuing command 0x439d40bab540
or this:
/var/run/log/vmkernel.log:2018-09-04T17:06:56.949Z cpu16:72849)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60a9800042394542503f49426b6f724a" state in doubt; requested fast path state update...
/var/run/log/vmkernel.log:2018-09-04T17:06:56.949Z cpu16:72849)ScsiDeviceIO: 2968: Cmd(0x439d40bf0ac0) 0xfe, CmdSN 0x88e from world 65575 to dev "naa.60a9800042394542503f49426b6f724a" failed H:0x5 D:0x40 P:0x0 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:07:03.078Z cpu18:66394)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x439d4528e0c0, 67172) to dev "naa.60a98000424155764b5d493836386659" on path "vmhba1:C0:T2:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
/var/run/log/vmkernel.log:2018-09-04T17:07:03.078Z cpu18:66394)ScsiDeviceIO: 2933: Cmd(0x439d4521a8c0) 0xfe, CmdSN 0xb54 from world 67172 to dev "naa.60a98000424155764b5d493836386659" failed H:0x0 D:0x2 P:0x5 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:07:20.614Z cpu27:72852)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x439d4cff09c0, 65575) to dev "naa.60a9800042394542503f49426b6f7244" on path "vmhba1:C0:T5:L4" Failed: H:0x5 D:0x40 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL
/var/run/log/vmkernel.log:2018-09-04T17:07:20.614Z cpu27:72852)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60a9800042394542503f49426b6f7244" state in doubt; requested fast path state update...
/var/run/log/vmkernel.log:2018-09-04T17:07:20.614Z cpu27:72852)ScsiDeviceIO: 2968: Cmd(0x439d40bc7fc0) 0xfe, CmdSN 0x65e from world 65575 to dev "naa.60a9800042394542503f49426b6f7244" failed H:0x5 D:0x40 P:0x0 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:07:20.620Z cpu15:66393)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x43954115bac0, 72866) to dev "naa.60a9800042394542503f49426b6f7246" on path "vmhba1:C0:T5:L5" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
/var/run/log/vmkernel.log:2018-09-04T17:07:20.620Z cpu15:66393)ScsiDeviceIO: 2933: Cmd(0x43954cff4900) 0xfe, CmdSN 0xa9e from world 72866 to dev "naa.60a9800042394542503f49426b6f7246" failed H:0x0 D:0x2 P:0x5 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:07:30.676Z cpu27:65935)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x439d45294640, 65575) to dev "naa.60a9800042394542503f49426b6f7248" on path "vmhba1:C0:T6:L6" Failed: H:0x5 D:0x40 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL
/var/run/log/vmkernel.log:2018-09-04T17:07:30.676Z cpu27:65935)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60a9800042394542503f49426b6f7248" state in doubt; requested fast path state update...
/var/run/log/vmkernel.log:2018-09-04T17:07:30.676Z cpu27:65935)ScsiDeviceIO: 2968: Cmd(0x439d40bb7240) 0xfe, CmdSN 0x9be from world 65575 to dev "naa.60a9800042394542503f49426b6f7248" failed H:0x5 D:0x40 P:0x0 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:08:05.677Z cpu8:66393)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x439541001640, 73158) to dev "naa.60a9800042394542503f49426b6f724a" on path "vmhba1:C0:T7:L7" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
/var/run/log/vmkernel.log:2018-09-04T17:08:05.677Z cpu8:66393)ScsiDeviceIO: 2933: Cmd(0x43954101a540) 0xfe, CmdSN 0x921 from world 73158 to dev "naa.60a9800042394542503f49426b6f724a" failed H:0x0 D:0x2 P:0x5 Invalid sense data: 0x80 0x41 0x0.
/var/run/log/vmkernel.log:2018-09-04T17:14:08.592Z cpu0:66393)NMP: nmp_ThrottleLogForDevice:3647: Cmd 0x89 (0x43954ce46800, 73833) to dev "naa.60a9800042394542503f49426b6f7248" on path "vmhba1:C0:T5:L6" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
/var/run/log/vmkernel.log:2018-09-04T17:14:08.592Z cpu0:66393)ScsiDeviceIO: 2933: Cmd(0x439d40a954c0) 0xfe, CmdSN 0xc1a from world 73833 to dev "naa.60a9800042394542503f49426b6f7248" failed H:0x0 D:0x2 P:0x5 Invalid sense data: 0x80 0x41 0x0.
It only happens to the dev cluster, which uses the SSD hybrid volumes.
We reverted to 5.5 once before and everything was OK. We then updated all device firmware, upgraded to 6.5, and installed the drivers according to the VMware compatibility lists.
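For anyone triaging logs like the ones above, here is a minimal sketch of a Python script that pulls the device, path and H:/D:/P: status codes out of vmkernel lines and counts how often each combination occurs per device. The function names (parse_line, summarize) are hypothetical, and the lookup tables are deliberately partial: they cover only the codes that appear in these excerpts, using the names from the commonly published SCSI status tables (host 0x1 = NO_CONNECT, host 0x5 = ABORT, device 0x2 = CHECK CONDITION, device 0x40 = TASK ABORTED). Treat it as a starting point, not a diagnostic tool.

```python
import re
from collections import Counter

# Partial lookup tables (assumption: only codes seen in the excerpts above).
HOST_STATUS = {0x0: "OK", 0x1: "NO_CONNECT", 0x5: "ABORT"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION", 0x40: "TASK ABORTED"}

# Matches both NMP lines (with 'on path "..."') and ScsiDeviceIO lines (without).
STATUS_RE = re.compile(
    r'dev "(?P<dev>naa\.[0-9a-f]+)"'
    r'(?: on path "(?P<path>[^"]+)")?'
    r'.*?H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+)'
)


def parse_line(line):
    """Return a dict of status fields, or None if the line carries no H:/D:/P: codes."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "device": m.group("dev"),
        "path": m.group("path"),  # None for lines that do not name a path
        "host": int(m.group("h"), 16),
        "dev_status": int(m.group("d"), 16),
        "plugin": int(m.group("p"), 16),
    }


def summarize(lines):
    """Count (device, host status, device status) combinations across log lines."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        host = HOST_STATUS.get(rec["host"], hex(rec["host"]))
        dev = DEVICE_STATUS.get(rec["dev_status"], hex(rec["dev_status"]))
        counts[(rec["device"], host, dev)] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python vmk_status.py < vmkernel.log
    for key, n in summarize(sys.stdin).most_common():
        print(n, *key)
```

Grouping by device makes it easier to see whether the NO_CONNECT and ABORT host statuses cluster on the LUNs backed by the hybrid aggregate or are spread across all paths.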
Hi there,
Can you please verify that you have followed the June 2018 HPE FlexFabric Cookbook - http://vibsdepot.hpe.com/hpq/recipes/HPE-VMware-Recipe.pdf ? Also, what does your FlexFabric connect to?
Yes, the drivers are the same as in that cookbook.
It's connecting to a Cisco C3750X-48T-S.
Did you also verify the firmware versions? These are just as important as the drivers.
Bgrds,
Finnur
Hi,
Did you find a fix for this problem?
We are seeing the same error messages and we are running 6.5.
Thanks