
iSCSI issues in ESX 4 with NetApp 270

jonathanengstrom

Alright, I've got an interesting issue here. I had an environment up and running for over a month on ESX 3.5 U4, with iSCSI connected to several LUNs on a NetApp 270 running ONTAP 7.3.1.1. I have since reloaded those same servers with ESX 4, and I am having absolutely no luck getting iSCSI working. My problem is similar to this issue, but I don't even have paths listed out:

http://communities.vmware.com/thread/211369?start=0&tstart=0

As a sanity check, I loaded 3.5 on one of the servers again an hour ago, and it connected right up to the iSCSI storage. Here are some log entries; 10.1.1.40 is my NetApp.

vmkiscsid.log

2009-08-26-16:32:05: iscsid: Login I/O error, failed to receive a PDU

2009-08-26-16:32:05: iscsid: retrying discovery login to 10.1.1.40

2009-08-26-16:32:06: iscsid: cannot make connection to 10.1.1.40:3260 (111)

2009-08-26-16:32:06: iscsid: connection to discovery address 10.1.1.40 failed

2009-08-26-16:32:06: iscsid: connection login retries (reopen_max) 5 exceeded

vmkwarning

Aug 26 16:17:09 esx1 vmkernel: 0:01:02:22.503 cpu0:4229)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba36:CH:0 T:0 CN:0: iSCSI connection is being marked "OFFLINE"

Aug 26 16:17:09 esx1 vmkernel: 0:01:02:22.503 cpu0:4229)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID:  TARGET: (null) TPGT: 0 TSIH: 0 ]

Aug 26 16:17:09 esx1 vmkernel: 0:01:02:22.503 cpu0:4229)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 10.1.1.212:62577 R: 10.1.1.40:3260]

I can have them in the same igroup on the filer, but no matter what, the vSphere server will not connect. This happened on both vSphere installs (until I reloaded one of them with 3.5 an hour ago; now it's fine).

Any ideas?


eric_barlier

Hi Jonathan,

Is the firewall in ESX 4 open? I did see this:

2009-08-26-16:32:06: iscsid: cannot make connection to 10.1.1.40:3260 (111)

Port 3260 needs to be open; maybe it's something as minor as this? Other than that, have you tried ping and traceroute and all that from ESX 4?
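
For what it's worth, (111) is the standard Linux "connection refused" errno, which suggests the connection is being actively rejected rather than silently dropped. A rough check from the ESX 4 service console might look something like this (a sketch only; the swISCSIClient service name is assumed to have carried over from 3.5):

# Show the overall firewall status and named services
esxcfg-firewall -q

# Query the software iSCSI client rule and enable it if it is blocked
esxcfg-firewall -q swISCSIClient
esxcfg-firewall -e swISCSIClient

# Basic reachability from the service console and from the vmkernel
ping 10.1.1.40
vmkping 10.1.1.40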

Cheers,

Eric


jonathanengstrom

I can ping and vmkping 10.1.1.40 from the ESX host, and I can ping both the console IP and the vmkernel IP from the filer. Firewall exceptions were added, of course, and I even allowed all incoming and outgoing traffic.
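
For reference, opening the service console firewall completely can be done along these lines (a sketch, assuming the esxcfg-firewall flags are unchanged from 3.5):

# Set the default policy to allow everything, just to rule the firewall out
esxcfg-firewall --allowIncoming --allowOutgoing

# Confirm the policy took effect
esxcfg-firewall -q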

I have had 2 other people look at it as well for several hours today, so I am quite sure the basics were covered.

eric_barlier

Hi Jonathan,

OK, you've done the basics; that's good to hear. I tried to Google your error message and ended up back at this thread!

I assume there are no more error messages that would help us here, short of reading the iSCSI protocol spec from beginning to end to understand what this means:

iscsid: failed to receive a PDU

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002785

That probably won't help. I did find an explanation of what a PDU is: http://docs.hp.com/en/T1452-90011/ch01s02.html

However, it does seem to indicate that some information is going missing on the way to the controller. What do you think?
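
If you want to see whether the discovery request is even reaching the controller, a packet trace on the filer side might show it, something along these lines (a sketch only; e0a is a placeholder for whichever interface serves iSCSI, and pktt is the ONTAP 7.x trace tool):

# On the filer console: start capturing on the iSCSI-facing interface
pktt start e0a -d /etc/crash

# ...retry the discovery login from the ESX 4 host...

# Stop the capture; the .trc file in /etc/crash can be opened in Wireshark
pktt stop e0a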

Eric

jonathanengstrom

I am convinced it is an issue with ESX 4. I have reloaded both machines from scratch several times today, and each time I can configure ESX 3.5 with iSCSI and NFS with no problem, but it won't work no matter what I do with ESX 4. I don't have an explanation; it seems like ESX 4 just hates the FAS270, even though the FAS270 is still on the hardware compatibility list for ESX 4.
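
One thing that might be worth comparing between the working 3.5 build and the 4.0 build is the software iSCSI configuration itself. On ESX 4 a quick survey might look something like this (a sketch only; vmhba36 is the adapter name from the vmkernel log above, and the esxcli swiscsi syntax is assumed from the 4.0 tools):

# List storage adapters and confirm the software iSCSI vmhba is present and enabled
esxcfg-scsidevs -a

# Show the dynamic discovery (send targets) addresses configured on the adapter
vmkiscsi-tool -D -l vmhba36

# Show any vmkernel NICs bound to the software iSCSI adapter (new port-binding model in 4.0)
esxcli swiscsi nic list -d vmhba36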

amiller_1

Hmm... would it be possible to just plug the ESX server directly into the NetApp with a crossover cable temporarily?

That would rule out anything network-wise and narrow it down purely to ESX/NetApp issues.

ian_iball

Hi, I'm getting this exact same problem.

I implemented an ESX 4 and EMC storage solution with no problems. I have now come to do the exact same project, but on a FAS2020. I have read a lot of the posts regarding this and wondered whether it has been escalated within NetApp for a resolution.

Cheers.
