hi,
we have been facing a problem for one month: VMs get disconnected from the storage, and in the VMware vmkernel log we can see this:
2019-03-24T22:38:15.173Z cpu4:33383)ScsiDeviceIO: 2613: Cmd(0x439e4bb84280) 0x8a, CmdSN 0x80000018 from world 36469 to dev "naa.600a098038303841635d4a2d2d575a63" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-03-24T22:39:42.333Z cpu10:33383)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x8a (0x43a640b73680, 36213) to dev "naa.600a098038303772735d4a37734f4844" on path "vmhba2:C0:T0:L0" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-03-24T22:39:42.333Z cpu10:33383)ScsiDeviceIO: 2613: Cmd(0x43a640b73680) 0x8a, CmdSN 0x800e0038 from world 36213 to dev "naa.600a098038303772735d4a37734f4844" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-03-24T22:43:57.540Z cpu14:33383)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x8a (0x43a64b6f6e40, 35971) to dev "naa.600a098038303841635d4a2d2d575a59" on path "vmhba2:C0:T1:L3" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-03-24T22:43:57.540Z cpu14:33383)ScsiDeviceIO: 2613: Cmd(0x43a64b6f6e40) 0x8a, CmdSN 0x8000007b from world 35971 to dev "naa.600a098038303841635d4a2d2d575a59" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
we need urgent help, please!
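For context on these entries: H:0x2 is the ESXi host-side (transport) status BUS_BUSY, and the all-zero device status and sense data (D:0x0, 0x0 0x0 0x0) mean the array itself never reported an error; Cmd 0x8a is SCSI WRITE(16). In other words, writes are being failed at the transport layer before the array answers. A quick way to gauge how widespread this is on one host (assuming the default ESXi log path):

# Count H:0x2 transport failures per LUN in the live vmkernel log
grep "failed H:0x2" /var/log/vmkernel.log | grep -o 'naa\.[0-9a-f]*' | sort | uniq -c | sort -rn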
Are all of the best-practice settings applied on the VMware side? (A quick check is sketched below.)
Any changes to the environment in the last month?
Are all components (server, storage, switch, hypervisor) compatible in the IMT?
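A quick ESXi-side sanity check, assuming an ALUA-capable NetApp array (where the usual recommendation is VMW_SATP_ALUA with Round Robin path selection):

# Show the SATP and path-selection policy claimed for each LUN
esxcli storage nmp device list

# Confirm every expected path to each device is active
esxcli storage core path list | grep -E "Device:|State:"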
hi,
we did an upgrade.
Previously we had one stack; during the upgrade we added a second stack to the controllers.
Are you there? We really need help.
I would open a P1 support case with NetApp if you need help ASAP; that will give you the fastest response time possible.
https://www.netapp.com/us/contact-us/support.aspx
As far as the upgrade goes... do you mean you did an ONTAP upgrade, or did you add disks?
We added 9 new shelves to the newly created stack, and after that heavy I/O latency started on all VMs and on both stacks.
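If this is clustered ONTAP, the built-in QoS counters can show which volumes carry the latency while a case is opened (run from the cluster shell; syntax as in ONTAP 8.2+/9):

qos statistics volume latency show -iterations 10   # per-volume latency, broken down by component
storage shelf show                                  # all shelves, including the 9 new ones, should show as online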
I agree with the other poster. It sounds like there is more to this than just the virtualization software. I would open a P1 case ASAP; they can do a comprehensive look.
---Karl
Honestly, there could be a lot going on here. Open a support case and have a perfstat run on the cluster.
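While the perfstat is being arranged, the built-in counters give a rough first look at where the time is going (clustered ONTAP cluster shell assumed; <nodename> is a placeholder):

statistics show-periodic                              # cluster-wide CPU, latency, and throughput per interval
node run -node <nodename> -command sysstat -x 1       # classic per-node view: disk, CP, and HBA utilization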
this is the response from VMware:
2019-03-24T23:04:41.380Z cpu22:33383)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x2a (0x439e4b6ccd00, 36451) to dev "naa.600a098038303841635d4a2d2d575a63" on path "vmhba2:C0:T1:L6" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-03-24T23:04:41.380Z cpu22:33383)ScsiDeviceIO: 2613: Cmd(0x439e4b6ccd00) 0x2a, CmdSN 0x8000004b from world 36451 to dev "naa.600a098038303841635d4a2d2d575a63" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-03-24T23:05:01.627Z cpu14:33383)ScsiDeviceIO: 2613: Cmd(0x439e4b7a09c0) 0x2a, CmdSN 0x80000073 from world 36451 to dev "naa.600a098038303841635d4a2d2d575a63" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-03-24T23:16:25.753Z cpu10:33383)ScsiDeviceIO: 2613: Cmd(0x439e40b95680) 0x2a, CmdSN 0x800e0017 from world 35697 to dev "naa.600a098038303772735d4a37734f484c" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
-- We tried to bring production back up by registering the orphaned VMs on host .170, and they powered on successfully
-- In the meantime, host .171, which houses the vCenter Server and also had VMs on it, went into a not-responding state
-- We rebooted the VM and it came back online; once it was online, host .170 went into a not-responding state
-- The VM LMS has 3 disks with 2 levels of snapshots
[root@localhost:/vmfs/volumes/583f0917-bc1e434e-c61b-000af74ef094/LMS] ls -ltrh | grep -i delta
-rw------- 1 root root 948.7G Dec 18 16:46 LMS_1-000001-delta.vmdk
-rw------- 1 root root 908.4G Mar 24 16:47 LMS_1-000002-delta.vmdk
-rw------- 1 root root 215.2G Mar 24 16:47 LMS-000001-delta.vmdk
[root@localhost:/vmfs/volumes/583f0917-bc1e434e-c61b-000af74ef094/LMS] ls -ltrh /vmfs/volumes/59a78fc2-1606aed3-675e-000af74ef094/LMS | grep -i delta
-rw------- 1 root root 46.7G Dec 15 23:56 LMS-000002-delta.vmdk
-rw------- 1 root root 46.5G Mar 24 16:47 LMS-000003-delta.vmdk
-- Even if we try to power it on, it gets stuck at 33%; we told the customer we will have to wait until it completes, or we need to do a consolidation (see the sketch after this list)
-- Since reads and writes are failing on all the datastores, we told the customer to engage the storage vendor to check further
-- We checked the driver and firmware on the HBA and they are compatible
-- With storage commands failing with the above errors, we see hosts .170 and .171 going into a not-responding state one after the other, and it is difficult for us to troubleshoot until the storage issue is fixed.
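On the consolidation itself: once the storage path errors stop (and only then; the ~950 GB delta files mean hours of heavy I/O), the standard host-side approach is:

vim-cmd vmsvc/getallvms                    # find the Vmid of the LMS VM
vim-cmd vmsvc/snapshot.removeall <Vmid>    # delete all snapshots, consolidating the deltas

Here <Vmid> is a placeholder for the numeric ID returned by the first command.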
the second stack was added 6 months after the first one. The first stack is connected to an IBM blade chassis and was working fine all those months; the problem started when we added the second stack, which is connected to an HP blade (LUNs on the second stack are accessed by the HP blade).
we have a replication site where the configuration is the same, and there everything is fine.
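Since the problem follows the new stack and its HP-blade fabric, a quick health pass on the ONTAP side may also be worth doing (cluster shell, clustered ONTAP assumed):

storage disk show -broken           # any failed disks introduced with the new shelves
event log show -severity ERROR      # recent error-level events around the latency windows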
I also ran into this problem; I hope it has been fixed by now.