I'm seeing a lot of iScsiPrt errors in Event Viewer and am not sure how to begin resolving them. The event IDs are 1, 5, 7, 9, 20, 39, and 49. All connections are to E-Series.
Details for a few of the errors:
"Initiator failed to connect to the target. Target IP address and TCP Port number are given in dump data."
"Connection to the target was lost. The initiator will attempt to retry the connection."
"Target failed to respond in time to a Task Management request."
"Target did not respond in time for a SCSI request. The CDB is given in the dump data."
I initially went to my net admin, but he said there aren't any problems on the network. I'm not really a storage admin, so help getting started is appreciated!
Although these are for ONTAP, maybe they can give you a hint:
Did those errors start out of nowhere? Are you seeing any performance degradation on the storage? Any application issues?
If those hosts are not Hyper-V hosts (where you might share NICs for different purposes), I would not use NIC teaming. Just rely on MPIO.
So, from the client (server) perspective, what have you found regarding the iSCSI/network connections?
Can you ping the target IP addresses?
iSCSI uses TCP port 3260. Can you connect to that port on the target from the server (using telnet, for example)?
You are probably using the iSCSI initiator software. What's the status there?
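If telnet isn't handy on the server, the port check above can be sketched in a few lines of Python. This is only a minimal sketch: the function names `can_connect` and `check_portals` are my own, and it only tells you whether a TCP connection to port 3260 opens, not whether an iSCSI login would succeed.

```python
import socket

ISCSI_PORT = 3260  # TCP port mentioned above for iSCSI


def can_connect(host, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_portals(portals, port=ISCSI_PORT, timeout=3.0):
    """Map each portal address (hypothetical input list) to its reachability."""
    return {ip: can_connect(ip, port, timeout) for ip in portals}
```

For example, `check_portals(["192.0.2.10", "192.0.2.11"])` returns a dictionary of address to True/False. Those addresses are placeholders, so substitute the real iSCSI interface IPs of the storage.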
I should add that I have six hosts accessing five SANs, and all of them are having these problems. The problems don't point to a single host or SAN, so I'm thinking there's a network issue or a common configuration issue among all the hosts. I appreciate you helping me through this. Testing will be performed on a single host, then replicated to the others as progress is made.
I'm able to ping each interface on each SAN (four each on five SANs).
I tried telnet to a few of the interfaces and the connection timed out without giving a login prompt. I don't see in SANtricity if there's an option to enable telnet anywhere.
Discovered targets in iSCSI Initiator all show connected. All attached LUNs are accessible on the hosts. It just seems that there are network blips causing this issue.
All hosts are connected to the SANs via NIC teaming. Are there NIC settings to be considered?
Because the issue is not isolated to a single host or SAN, I agree that the problem sounds like a network or general settings issue.
You mentioned that all the hosts are using NIC teaming, which is not recommended. See the snippet below from "Configuring the switches - iSCSI, Windows":
"Port channels/LACP is not supported on the controller's switch ports. Host-side LACP is not recommended; multipathing provides the same, and in some cases better, benefits."
Another thing worth checking is whether the hosts, switches, and SANs are all set up to use the same MTU size. Jumbo frames are not required, but if the host and SAN do not agree on MTU size, fragmentation occurs, causing slower performance and sometimes momentary blips on the connection.
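One common way to test the MTU end to end is a ping with the Don't Fragment flag set and a payload sized to fill the frame exactly. The sketch below only does the arithmetic and builds the Windows `ping` command line (`-f` sets Don't Fragment, `-l` sets the payload size, `-n` sets the count). The function names are my own, and it assumes IPv4 with a 20-byte IP header and no options.

```python
IPV4_HEADER_BYTES = 20  # assumes IPv4 with no IP options
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU must be larger than {overhead} bytes")
    return mtu - overhead


def windows_df_ping_command(target_ip, mtu, count=4):
    """Build a Windows ping command that sends full-size packets with DF set."""
    return ["ping", "-f", "-l", str(max_ping_payload(mtu)), "-n", str(count), target_ip]
```

With a 9000-byte MTU the payload works out to 8972 bytes, and with the standard 1500-byte MTU it is 1472. If that ping fails while a smaller payload succeeds, some device in the path is likely using a smaller MTU than the host.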
Also, make sure that the E-Series specific DSM is installed on each host.
Configure the multipath software
Let me know if this helps.
Thank you for pointing out that teaming is not recommended. I may unteam one of the hosts and monitor for a few days to see if the errors go away.
I had the MPIO Device Specific Module (DSM) installed before, and yesterday I installed SANtricity Storage Manager (with the Host option). I noticed the DSM driver was updated during installation.
I'm still getting iScsiPrt event IDs 7 and 20 since breaking the NIC team.
The errors are generic. "Connection to the target was lost. The initiator will attempt to retry the connection." Is there a way to identify which target was lost?
You can check the current status of the iSCSI targets by opening Server Manager > Tools > iSCSI Initiator.
Additionally, it is a good time to open a technical support case so the storage logs can be investigated for more clues.
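On the question of which target was lost: some of the event messages quoted earlier say the target address is given in the dump data, which Event Viewer shows as hex bytes on the event's Details tab. The sketch below pulls readable ASCII strings out of such a hex dump. It assumes you paste only the hex byte pairs (no offsets) and that the data contains ASCII text, which this thread does not confirm for event ID 20. The function name `printable_strings` is my own.

```python
import re


def printable_strings(hex_dump, min_len=4):
    """Return runs of printable ASCII found in a whitespace-separated hex dump.

    Assumes hex_dump contains only hex byte pairs and whitespace; anything
    else makes bytes.fromhex raise ValueError.
    """
    data = bytes.fromhex(hex_dump)
    # Match runs of at least min_len printable ASCII characters (space to ~).
    pattern = re.compile(b"[ -~]{" + str(min_len).encode("ascii") + b",}")
    return [match.decode("ascii") for match in pattern.findall(data)]
```

If nothing useful comes out, the data may be stored in another encoding or as raw binary fields, in which case the storage logs from a support case are the better source.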