ONTAP Discussions

SnapDrive process failure in Linux

tribadmin

Hi all,

I'm deploying a new filer and am having trouble with SnapDrive 4.0 for Linux, specifically on CentOS 5.1 x86_64 (fully patched).

snapdrived starts up OK and I can interact with it to the extent of setting the root password for the filer. When I try to perform a filer operation, however, it fails. To start:

[root@db2 log]# snapdrive storage list -all

Status call to SDU daemon failed

[root@db2 log]# ps -ef | grep snapdri
root 7587 1 0 Jul24 ? 00:00:00 snapdrived start
root 11283 7587 0 13:40 ? 00:00:00 [snapdrived] <defunct>

Each rerun of a snapdrive storage command spawns a new defunct process. Commands such as "snapdrive config show" run fine.

And in sd-trace.log:

13:43:06 07/25/08 [f7f7cb90]?,2,2,Job tag: bEogRP90xw
13:43:06 07/25/08 [f7f7cb90]?,2,2,snapdrive storage list -all
13:43:06 07/25/08 [f7f7cb90]v,2,6,FileSpecOperation::FileSpecOperation: 12
13:43:06 07/25/08 [f7f7cb90]v,2,6,StorageOperation::StorageOperation: 12
13:43:06 07/25/08 [f7f7cb90]i,2,2,Job tag bEogRP90xw
13:43:06 07/25/08 [f7f7cb90]i,2,6,Operation::setUserCred user id from soap context: root
13:43:06 07/25/08 [f7f7cb90]i,2,6,Operation::setUserCred uid:0 gid:0 userName:root
13:43:06 07/25/08 [f7f7cb90]F,0,0,Fatal error: Assertion detected in production code: ../sbl/StorageOperation.cpp:182: Test 'osAssistants.size() == 1' failed

When I strace the snapdrive process I see things conclude with:

connect(3, {sa_family=AF_INET, sin_port=htons(4094), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
send(3, "POST / HTTP/1.1\r\nHost: localhost"..., 1555, 0) = 1555
recv(3, "HTTP/1.1 200 OK\r\nServer: gSOAP/2"..., 65536, 0) = 1722
shutdown(3, 2 /* send and receive */) = -1 ENOTCONN (Transport endpoint is not connected)
close(3) = 0
write(2, "Status call to SDU daemon failed"..., 33) = 33
munmap(0xf7f7d000, 135168) = 0
exit_group(104) = ?

This matches what I see in the packet capture, where the snapdrived port sends RSTs (no doubt after the child process has gone defunct) after a very limited exchange:

POST / HTTP/1.1
Host: localhoHTTP/1.1 200 OK
Server: gSOAP

Any input appreciated.

Thanks in advance.

44 REPLIES

sam_wozniak

Finally, a fix that worked for me too! Thanks, man!

I'm on a RHEL 5.7 system with SDU 4.2P1 and HUK v6.0, using Emulex LPe1150 HBAs.

For me, "snapdrive storage show -devices" worked, but not "-all", until I tried your fix.

# export LVM_SUPPRESS_FD_WARNINGS=1

# snapdrived restart

I'll be coding that into the init script as well.

sam_wozniak

Keith,

Were you able to successfully test the init script modification? I'm not having much luck over here. If I run "/etc/init.d/snapdrived restart" while logged in, it properly exports the variable and I can run 'snapdrive storage show -all', but when the init script runs from rc the variable doesn't seem to persist. I could just add the variable to my profile or bash_profile, but I'm looking for a universal fix. I guess this is more of a Linux question that I'll have to research, but I think I understand the issue better now. My main concern is that something in SMO will break because it may rely on output from 'snapdrive storage show -all'.

aborzenkov

Adding the environment variable to a profile is the universal fix. The environment is a property of the current process and is inherited by its child processes. Your login session has no parent/child relationship to the rc scripts, so it has no way to inherit the environment from them.
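
To illustrate the inheritance rule, here is a minimal sketch that can be run in any POSIX shell. The `sh -c` call only stands in for an rc/init script running as a separate process; it is not the actual snapdrived init script, and the helper names `after_child` and `in_child` are made up for the demonstration.

```shell
#!/bin/sh
# Start from a clean state so the demonstration is repeatable.
unset LVM_SUPPRESS_FD_WARNINGS

# Stand-in for an rc/init script: a separate process exports the variable
# and exits. Its environment disappears with it.
sh -c 'export LVM_SUPPRESS_FD_WARNINGS=1'
after_child="${LVM_SUPPRESS_FD_WARNINGS:-unset}"
echo "current shell after the child exited: $after_child"

# Exporting in the current shell makes the variable visible to every
# process started from it afterwards.
export LVM_SUPPRESS_FD_WARNINGS=1
in_child=$(sh -c 'echo "${LVM_SUPPRESS_FD_WARNINGS:-unset}"')
echo "value seen by a child process: $in_child"
```

For a system-wide setting on RHEL/CentOS, one option is to put the single `export LVM_SUPPRESS_FD_WARNINGS=1` line in a file under /etc/profile.d/ (for example a hypothetical /etc/profile.d/snapdrive.sh), which login shells source through /etc/profile. This is a suggestion under that assumption, not something confirmed in this thread.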
