RH Linux/Snapdrive - discovering new lun(s) ...*failed*

bowhuntr09

Hi all,

I have been trying to get this to work for several days and can't seem to find what I am overlooking. I already have 2 Linux systems with almost the same setup working fine. I can't get this new one to create and mount a lun.

Environment:

RHEL 5.6

sanlun version 5.2.103.1379

snapdrive Version 4.1.1

iscsi-initiator-utils-6.2.0.872-6.el

sg3_utils-1.25-5.el5

multipathd off

4x1G Ethernet interfaces running as a bonded interface with 2 VLANs.

I can reach the filers via ping, and snapdrive appears to connect and create the LUNs. It just fails in the discovery step.

I have the new volumes created on the filers and when I run the snapdrive storage create command it creates the LUN, maps the LUN, and then fails at discovering the LUN. The specific error is:

0001-476 Admin error: Unable to discover the device associated with netapp2san:/vol/uhsd430_u01/uhsd430_u01_SdLun.  If multipathing in use, there may be a possible multipathing configuration error.

Please verify the configuration and then retry.

I had similar issues on the other two servers when I set them up a year or so ago but my notes indicate it was due to not having the filer names in /etc/hosts. I have checked the new server's configuration against the working one and I just don't see what is causing this. Any help would be greatly appreciated!!!

bowhuntr09

I now have the LUNs connected. I had to issue a login command before they would work (iscsiadm -m node -L automatic). Once I did that, snapdrive was able to create, map, and discover the LUNs.
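
For anyone hitting the same thing, the login step can be turned into a small pre-check before running snapdrive. This is only a sketch: the function name session_count is made up for illustration, and it assumes `iscsiadm -m session` prints one line per session in the usual open-iscsi form (transport, session id in brackets, portal, target).

```shell
#!/bin/sh
# session_count (hypothetical helper): count iSCSI sessions in the
# output of `iscsiadm -m session`, read from stdin.
# Assumes one line per session, for example:
#   tcp: [1] 172.16.100.30:3260,1000 iqn.1992-08.com.netapp:sn.101190664
session_count() {
    awk '$1 ~ /^[a-z0-9]+:$/ && $2 ~ /^\[[0-9]+\]$/ { n++ } END { print n + 0 }'
}

# Intended use on the host (not executed here):
#   if [ "$(iscsiadm -m session 2>/dev/null | session_count)" -eq 0 ]; then
#       iscsiadm -m node -L automatic   # log in to all nodes marked automatic
#   fi
```

If the count is zero, snapdrive has no session to rescan, which would be consistent with the discovery failure described above.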

The current problem is that the LUNs are being seen over both our storage VLAN and our regular network VLAN. Is there a way I can limit iSCSI to a specific interface?

aborzenkov

Use “iscsi interface disable” on the filer to disable the iSCSI protocol on specific interface(s).

bowhuntr09

Thanks for the reply. I considered doing that, but I am not sure whether other servers in our environment are using iSCSI on the regular VLAN instead of the storage VLAN. I have solved my problem by using iscsiadm commands to delete the unwanted path and then using snapdrive to add the storage back to the server. The LUNs all show only one path now, and it's on the right network.
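
A sketch of how the unwanted path might be picked out before removing it. The helper name portals_outside is invented, and the storage subnet prefix is whatever your storage VLAN uses (172.16.100. below is only an example taken from another post in this thread). It assumes the usual `iscsiadm -m session` line format of transport, session id, portal with a ",tpgt" suffix, and target.

```shell
#!/bin/sh
# portals_outside (hypothetical helper): read `iscsiadm -m session`
# output on stdin and print "portal target" for every session whose
# portal address does NOT start with the given prefix.
portals_outside() {
    awk -v prefix="$1" '
    $2 ~ /^\[[0-9]+\]$/ {
        portal = $3
        sub(/,.*/, "", portal)          # drop the ",tpgt" suffix
        if (index(portal, prefix) != 1) # not on the storage subnet
            print portal, $4
    }'
}

# Intended use on the host (not executed here):
#   iscsiadm -m session | portals_outside 172.16.100. |
#   while read portal target; do
#       iscsiadm -m node -T "$target" -p "$portal" -u          # log out
#       iscsiadm -m node -T "$target" -p "$portal" -o delete   # drop record
#   done
```

Review the printed list by hand before logging anything out, since the prefix match is purely textual.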

If anyone knows of a command I can run on my filers to see which hosts are using iSCSI over which interface, then I could confirm (or change) those others to not use the regular VLAN and shut iSCSI off at the filer on that interface once and for all.
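
One partial answer may already be visible in `igroup show`: in the output posted further down this thread, each logged-in initiator is followed by "(logged in on: <interface>)". A rough sketch that condenses that output into one line per initiator follows. The function name igroup_logins is invented, and it assumes the output layout matches that sample.

```shell
#!/bin/sh
# igroup_logins (hypothetical helper): read filer `igroup show` output
# on stdin and print "igroup initiator interface" for each initiator
# that is currently logged in.
igroup_logins() {
    awk '
    /\(ostype:/ { group = $1; next }          # igroup header line
    /logged in on:/ {
        iface = $0
        sub(/.*logged in on: */, "", iface)   # keep text after the label
        sub(/\).*/, "", iface)                # drop the closing paren
        print group, $1, iface
    }'
}

# Intended use from an admin host (not executed here; assumes ssh access
# to the filer is already configured):
#   ssh netapp2san igroup show | igroup_logins
```

Initiators that are not logged in at the time produce no line, so this only shows current usage, not every host that might connect later.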

PRAKASH_BHANDARI

My issue is not exactly like yours, but it's similar. I wonder if anyone else is seeing this. I was hoping installing sg3_utils would fix the issue, but it does not. Any guidance is appreciated.

I am trying to create a LUN via SnapDrive for UNIX and am getting the error below. It looks like the error is pretty common, but the recommended fixes haven’t resolved my issue.

OS: Oracle Enterprise Linux 5 Update 1

Snapdrive: snapdrive Version 4.1.1

This may be pertinent to the issue:

  1. Installed sg3_utils and sg3_utils-libs
  2. This does not seem to resolve the issue.
  3. Can’t find sdconfchecker under /opt/NetApp/snapdrive/bin/. The documentation seems to suggest there should be a binary there.

This is the ERROR I get when I try to create a new LUN:

[root@ora511 log]# snapdrive storage create -lun 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun -lunsize 10G -igroup prakash_smo_igroup

LUN 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun ... created

mapping new lun(s) ... done

discovering new lun(s) ... *failed*

Cleaning up ...

- LUN 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun ... deleted

0001-476 Admin error: Unable to discover the device associated with 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun.  If multipathing in use, there may be a possible multipathing configuration error.

Please verify the configuration and then retry.

aborzenkov

Leave a comment on the sdconfchecker page that you cannot use this tool. As long as nobody complains, it will remain this way.

The error you see is far too generic to say anything without detailed log analysis. Have you verified that your configuration matches the IMT? For a start, you could mention whether you are using iSCSI or FCP.

PRAKASH_BHANDARI

Thanks for the reply. Yes, I checked IMT before selecting the information. I will update the sdconfchecker tool page as well.

Here is some more information:

1. I am running iscsi.

2. I ran snapdrive.dc. Here are the contents of some of the files:

sd-audit.log

19914: Begin uid=0 gid=0 11:48:04 12/16/11 snapdrive storage show -all

19914: FAILED Status=206006 11:48:04 12/16/11

20204: Begin uid=0 gid=0 13:01:03 12/16/11 snapdrive storage show -devices

20204: FAILED Status=206006 13:01:03 12/16/11

20366: Begin uid=0 gid=0 13:08:31 12/16/11 snapdrive storage create -lun 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun -lunsize 10G -igroup prakash_smo_igroup

20366: FAILED Status=6 13:08:40 12/16/11

20727: Begin uid=0 gid=0 13:39:12 12/16/11 snapdrive storage create -lun 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun -lunsize 10G -igroup prakash_smo_igroup

20727: FAILED Status=6 13:39:21 12/16/11

3712: Begin uid=0 gid=0 13:52:25 12/16/11 snapdrive storage create -lun 172.16.100.30:/vol/prakash_redo_vol/prakash_redo_vol_1.lun -lunsize 10G -igroup prakash_smo_igroup

3712: FAILED Status=6 13:52:34 12/16/11

3. sd-daemon-trace.log

14:16:17 12/16/11 [f7fb36c0]i,10,3,Local host name: ora511

14:16:17 12/16/11 [f7fb36c0]i,10,3,Host OS name: Linux

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:server_addr_init(): daemon server init started

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:server_addr_init(): daemon server = http://localhost:4094

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:main(): make soap copy

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:main(): init thread

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:main(): create thread

14:16:17 12/16/11 [f7fb36c0]v,10,1,snapdrived:main(): pthread_detach done with status: 0

14:16:17 12/16/11 [f7f91b90]v,10,1,snapdrived:process_request(): started

14:16:17 12/16/11 [f7f91b90]v,10,0,snapdrived:__SDUCLI__SDUDaemonStatus: rcved daemon status request

14:16:17 12/16/11 [f7f91b90]v,10,1,snapdrived :authenticate : start

14:16:17 12/16/11 [f7f91b90]F,10,1,snapdrived :authenticate: authentication done for root

14:16:17 12/16/11 [f7f91b90]v,10,1,snapdrived :authenticate : exit ret = 0

14:16:17 12/16/11 [f7f91b90]v,10,0,Job[snapdrived]::status_all: status_all started

14:16:17 12/16/11 [f7f91b90]v,10,0,Job[snapdrived]::status_all: job queue lock begin

14:16:17 12/16/11 [f7f91b90]v,10,0,Job[snapdrived]::status_all: job queue unlock success

14:16:17 12/16/11 [f7f91b90]v,10,0,Job[snapdrived]::status_all: output Snapdrive Daemon Version    : 4.1.1  (Change 942392 Built Fri Jul 17 04:56:45 PDT 2009)

Snapdrive Daemon start time : Fri Dec 16 13:51:01 2011

Total Commands Executed     : 1

Job Status:

        No command in execution

14:16:17 12/16/11 [f7f91b90]v,10,0,Job[snapdrived]::status_all: status Snapdrive Daemon Version    : 4.1.1  (Change 942392 Built Fri Jul 17 04:56:45 PDT 2009)

Snapdrive Daemon start time : Fri Dec 16 13:51:01 2011

Total Commands Executed     : 1

Job Status:

        No command in execution

14:16:17 12/16/11 [f7f91b90]v,10,0,snapdrived:__SDUCLI__SDUDaemonStatus: 0 Snapdrive Daemon Version    : 4.1.1  (Change 942392 Built Fri Jul 17 04:56:45 PDT 2009)

Snapdrive Daemon start time : Fri Dec 16 13:51:01 2011

Total Commands Executed     : 1

Job Status:

        No command in execution

14:16:17 12/16/11 [f7f91b90]v,10,1,snapdrived:process_request(): exit

[root@ora511 ntap_snapdrive_info]# cat snapdrive_version

snapdrive Version 4.1.1

4. What else can I provide, or where should I look to get a better idea of what's going on?

Thanks much!

aborzenkov

Does the igroup exist on the filer, and is your host IQN included in it?

Does the host physically see the new LUN when it is created and mapped? You can verify by looking at, e.g., dmesg output immediately after running snapdrive for messages about a new sdXX device.

If you manually create a LUN on the filer and map it to this igroup, does the host see this LUN?

PRAKASH_BHANDARI

Here is the list of steps demonstrating that the iSCSI side is okay. I do the following:

a. add a LUN on the filer

b. map the LUN to the initiator group

c. rescan the iSCSI session

d. find the device.

Things appear to work from that perspective. There must be something wrong in what I am doing with snapdrive. Please let me know if you have additional questions.

Thanks!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

1. Does the igroup exist?

yes it does... here you go

3050A> igroup show

    prakash_smo_igroup (iSCSI) (ostype: linux):

        iqn.1994-05.com.redhat:8d526d72fdc9 (logged in on: vif1-100)

[root@ora511 iscsi]# cat initiatorname.iscsi

InitiatorName=iqn.1994-05.com.redhat:8d526d72fdc9

[root@ora511 iscsi]#

2.

[root@ora511 iscsi]# ls -ltr /dev/sd*

brw-r----- 1 root disk 8, 2 Dec 16 13:43 /dev/sda2

brw-r----- 1 root disk 8, 0 Dec 16 13:43 /dev/sda

brw-r----- 1 root disk 8, 1 Dec 16 13:44 /dev/sda1

3050A> lun create -s 20g -t linux -o noreserve /vol/prakash_voting_vol/prakash_voting_vol_1.lun

ls -ltr /dev/sd*

3050A> lun map /vol/prakash_voting_vol/prakash_voting_vol_1.lun  prakash_smo_igroup 4

/vol/prakash_voting_vol/prakash_voting_vol_1.lun 20g (21474836480)   (r/w, online, mapped)

 

3. Scan devices

[root@ora511 dev]# ls -ltr sd*

brw-r----- 1 root disk 8, 2 Dec 16 13:43 sda2

brw-r----- 1 root disk 8, 0 Dec 16 13:43 sda

brw-r----- 1 root disk 8, 1 Dec 16 13:44 sda1

<new device does not exist yet>

[root@ora511 dev]# iscsiadm -m session --rescan

Rescanning session [sid: 1, target: iqn.1992-08.com.netapp:sn.101190664, portal: 172.16.100.30,3260]

[root@ora511 dev]#

[root@ora511 dev]#

[root@ora511 dev]#

[root@ora511 dev]# ls -ltr sd*

brw-r----- 1 root disk 8,  2 Dec 16 13:43 sda2

brw-r----- 1 root disk 8,  0 Dec 16 13:43 sda

brw-r----- 1 root disk 8,  1 Dec 16 13:44 sda1

brw-r----- 1 root disk 8, 16 Dec 17 12:13 sdb

<new device found>
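
The before/after comparison above can be scripted so the new device does not have to be spotted by eye. This is a minimal sketch; new_devices is a made-up helper name and the /tmp file names in the usage comment are arbitrary.

```shell
#!/bin/sh
# new_devices (hypothetical helper): print the lines that appear in the
# "after" listing ($2) but not in the "before" listing ($1).
new_devices() {
    grep -Fxv -f "$1" "$2" || true   # grep exits 1 when nothing is new
}

# Intended use on the host (not executed here):
#   ls /dev/sd* > /tmp/before
#   iscsiadm -m session --rescan
#   ls /dev/sd* > /tmp/after
#   new_devices /tmp/before /tmp/after
```

With the listings shown above, the only line in the second listing that is missing from the first is the sdb entry.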

aborzenkov

Can you see this manually added LUN with /opt/netapp/santools/sanlun lun show all?

What are the values of the default-transport and multipathing-type SnapDrive configuration options?

PRAKASH_BHANDARI

That LUN is visible with sanlun lun show all:

[root@ora511 santools]# sanlun lun show all

controller:                    lun-pathname                    device filename  adapter  protocol          lun size         lun state

      3050A:  /vol/prakash_voting_vol/prakash_voting_vol_1.lun  /dev/sdb         host1    iSCSI         20g (21474836480)    GOOD

Here are the entries for the parameters that you asked about. I had changed multipathing-type to "none", although it was already the default.

#default-transport="iscsi"

multipathing-type="none"
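
A small sketch for pulling these two options out of the configuration file and showing whether each is explicitly set or still commented out. The helper name conf_value is invented, and the file location in the usage comment (/opt/NetApp/snapdrive/snapdrive.conf) is an assumption, so adjust it to wherever your snapdrive.conf lives. A commented-out line is presumably just documenting the built-in default, as with default-transport above.

```shell
#!/bin/sh
# conf_value (hypothetical helper): print "set VALUE" or
# "commented VALUE" for option $1 in the config file $2.
conf_value() {
    awk -v k="$1" '
    {
        line = $0
        state = "set"
        if (line ~ /^[ \t]*#/) {               # option is commented out
            sub(/^[ \t]*#[ \t]*/, "", line)
            state = "commented"
        }
        if (index(line, k "=") == 1) {
            val = substr(line, length(k) + 2)  # text after "key="
            gsub(/"/, "", val)                 # strip the quotes
            sub(/[ \t]+#.*$/, "", val)         # strip a trailing comment
            print state, val
        }
    }' "$2"
}

# Intended use on the host (not executed here; path is an assumption):
#   conf_value default-transport /opt/NetApp/snapdrive/snapdrive.conf
#   conf_value multipathing-type /opt/NetApp/snapdrive/snapdrive.conf
```

For the two lines quoted above this would report default-transport as commented (iscsi) and multipathing-type as set (none).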

aborzenkov

Do you have any FC HBA in the server or any FC driver loaded (like Emulex lpfc)? If yes, make sure the FC drivers are unloaded.

Otherwise you could enable trace and see if there is some hint there: https://kb.netapp.com/support/index?page=content&id=2012677

PRAKASH_BHANDARI

This is a portion of sd-trace.log. I don't know if this is even meaningful, but everything looks normal except for:

22:13:12 12/18/11 [f7590b90]E,2,2,swzl_command: storage create FAILED 6

==> I was adding this LUN:

/vol/prakash_voting_vol/prakash_voting_vol_3.lun  

It clearly works up to a point, as /dev/sdc is created; then something happens and the operation fails.

Any thoughts?

Thanks!

++++++++++++++++++++++++++++++++++++++++++++++++++++======

172.16.100.30:/vol/PRAKASH_SMO_BIN

                        172.16.100.30:/vol/lun_test

                        172.16.100.30:/vol/ex2010mb01_dganger_save

                        172.16.100.30:/vol/sharepoint2010

                        172.16.100.30:/vol/sdw_exchange2003_cl_9dcc798ac69f49cfadd57e460444b66d_ss_0

                        172.16.100.30:/vol/exchange2007

                        172.16.100.30:/vol/prakash_redo_vol

                        172.16.100.30:/vol/snapmirr

                        172.16.100.30:/vol/rvbdex

                        172.16.100.30:/vol/sv_test

        2       FilerVolume :: 172.16.100.30:/vol/prakash_voting_vol

                FilerVolume: 172.16.100.30:/vol/prakash_voting_vol

                        172.16.100.30:/vol/prakash_voting_vol/prakash_voting_vol_1.lun

        3       Igroup :: 172.16.100.30:prakash_smo_igroup

                Igroup: 172.16.100.30:prakash_smo_igroup

                Filer: 172.16.100.30, Igroup: prakash_smo_igroup

                Transport Type : iSCSI OS Type: linux

                Ports:

                        iqn.1994-05.com.redhat:8d526d72fdc9

        4       PhysicalDevice :: /dev/sda

                PhysicalDevice: /dev/sda, /dev/sda => :

        5       PhysicalDevice :: /dev/sdb

                PhysicalDevice: /dev/sdb, /dev/sdb => 3050A:/vol/prakash_voting_vol/prakash_voting_vol_1.lun

        6       MetaDevice :: 172.16.100.30:/vol/prakash_voting_vol/prakash_voting_vol_3.lun

                MetaDevice: 172.16.100.30:/vol/prakash_voting_vol/prakash_voting_vol_3.lun

                        PhysicalDevices:

        7       PhysicalDevice :: /dev/sdc

                PhysicalDevice: /dev/sdc, /dev/sdc => 3050A:/vol/prakash_voting_vol/prakash_voting_vol_3.lun

22:13:12 12/18/11 [f7590b90]E,2,2,swzl_command: storage create FAILED 6

22:13:12 12/18/11 [f7f91b90]d,2,34,ScaleableExecutionPort::initScaleableExecutionPort: successful

22:13:12 12/18/11 [f7f91b90]d,2,34,ScaleableExecutionPort::startScaleableExecution: successful

22:13:12 12/18/11 [f7f91b90]d,2,34,ScaleableExecutionPort::initScaleableExecutionPort: successful

22:13:12 12/18/11 [f7f91b90]d,2,34,ScaleableExecutionPort::startScaleableExecution: successful

aborzenkov

It looks like it cannot find the physical device corresponding to the LUN. What is suspicious is that it uses the host name in one place and the IP address in another.

MetaDevice :: 172.16.100.30:/vol/prakash_voting_vol/prakash_voting_vol_3.lun

PhysicalDevice: /dev/sdc, /dev/sdc => 3050A:/vol/prakash_voting_vol/prakash_voting_vol_3.lun

Make sure you consistently use either the filer host name or the filer IP address everywhere.
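
The mismatch can also be pulled out of the trace mechanically. The sketch below lists every distinct filer identifier that appears in front of a given LUN path; more than one line of output means the name and the IP address are being mixed. The helper name filer_ids is invented, and the sd-trace.log location is left to the reader since it is not given in this thread.

```shell
#!/bin/sh
# filer_ids (hypothetical helper): read sd-trace.log text on stdin and
# print each distinct filer identifier used as "FILER:<lun path>" for
# the LUN path given in $1.
filer_ids() {
    awk -v p="$1" '
    {
        for (i = 1; i <= NF; i++) {
            n = index($i, ":" p)
            # accept only fields that end exactly with ":<lun path>"
            if (n > 1 && substr($i, n + 1) == p)
                seen[substr($i, 1, n - 1)] = 1
        }
    }
    END { for (k in seen) print k }' | sort
}

# Intended use on the host (not executed here):
#   filer_ids /vol/prakash_voting_vol/prakash_voting_vol_3.lun < sd-trace.log
```

For the two trace lines quoted above this prints both 172.16.100.30 and 3050A, which is exactly the inconsistency in question.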

PRAKASH_BHANDARI

That was it. An entry in /etc/hosts for the filer fixed the issue. Thanks for your help.
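
For reference, a sketch of checking that such an entry is in place before running snapdrive. It assumes the name sanlun reports (3050A in this thread) should map to the address passed to snapdrive (172.16.100.30). The helper name hosts_maps is invented, and the comparison is deliberately case-insensitive.

```shell
#!/bin/sh
# hosts_maps (hypothetical helper): succeed if the hosts file ($3,
# default /etc/hosts) has a line mapping address $1 to name $2.
hosts_maps() {
    awk -v ip="$1" -v name="$2" '
    {
        sub(/#.*/, "")                      # ignore comments
        if ($1 != ip) next
        for (i = 2; i <= NF; i++)
            if (tolower($i) == tolower(name)) found = 1
    }
    END { exit (found ? 0 : 1) }' "${3:-/etc/hosts}"
}

# Intended use on the host (not executed here):
#   hosts_maps 172.16.100.30 3050A || echo "filer missing from /etc/hosts"
```

This only inspects the hosts file; if the filer name is resolved through DNS instead, the check will report a miss even though resolution works.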
