2009-06-25 08:40 AM
I am posting this to help my fellow colleagues that may run into this. Just trying to share.
I installed SDU 4.1 on a Linux RHEL 5 server. This was all NFS, no SAN. I never had this issue until now, and neither the cause nor the fix is all that obvious.
Here is what happened. I went to test "snapdrive snap list -filer filer01" and got the error "Cannot contact SDU daemon". Rather uninformative. So I went to the sd-trace.log file, and this is what I found:
10:05:21 06/23/09 [f7ec6b90]i,2,2,Job tag BCtUAVUAeB
10:05:21 06/23/09 [f7ec6b90]v,2,6,SnapOperation::init: starting snap show (1)
10:05:21 06/23/09 [f7ec6b90]v,2,6,Operation::getUserCred uid:0 gid:0 userName:root
10:05:21 06/23/09 [f7ec6b90]v,2,6,Operation::isNonRootAllowed Exit ret:1
10:05:21 06/23/09 [f7ec6b90]v,2,6,FileSpecOperation::init: starting
10:05:21 06/23/09 [f7ec6b90]i,2,6,Operation::initParallelOps: started
10:05:21 06/23/09 [f7ec6b90]i,2,6,Operation::initParallelOps: succeeded
10:05:21 06/23/09 [f7ec6b90]v,2,6,StorageStack::StorageStack
10:05:21 06/23/09 [f7ec6b90]i,2,6,Operation::addErrorReport: (1) Operation:??? 6 1877:HBA assistant not found. Commands involving LUNs should fail.
10:05:21 06/23/09 [f7ec6b90]v,2,6,StorageStack::~StorageStack
10:05:21 06/23/09 [f7ec6b90]v,2,6,Operation::saveErrorReportList: saved 1 ErrorReports
10:05:21 06/23/09 [f7ec6b90]v,2,6,ErrorReport::cleanErrorReportList: 1 Error Report objects
10:05:21 06/23/09 [f7ec6b90]F,0,0,Fatal error: Assertion detected in production code: ../sbl/FileSpecOperation.cpp:227: Test 'osAssistants.size() == 1' failed
10:05:21 06/23/09 [f72ffb90]d,2,34,ScaleableExecutionPort::initScaleableExecutionPort: successful
10:05:21 06/23/09 [f72ffb90]d,2,34,ScaleableExecutionPort::startScaleableExecution: successful
OK, now what? I found a bug report that said /etc/issue might be the problem, but it gave no reason why. We copied the original back and restarted the daemon, all to no avail. I spent a lot of time talking to support and other Unix guys in the field, and no one else could figure it out either. So I punted. I removed the snapdrive package, cleaned up the /opt/NetApp/snapdrive directory, rebooted and reinstalled.
Voilà! I could now test and see the filer's snapshots. Hmm, now I needed an answer to why it worked this time and not the last.
It was the /etc/issue file. The default one contains the version of Linux that the system is running. The customer had built the Linux system and customized it, overwriting the default file. When SDU installs, it looks there to determine the version so it knows which libraries to install. If it finds nothing there, it just defaults to a generic library, which BTW does not always work. When /etc/issue is right, it can install the right libraries.
Also, you can change /etc/issue, but you have to keep the contents of the original intact; otherwise it will cause issues on a restart of SDU.
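A quick way to check for this before installing or restarting is sketched below. It is only a rough check, not something SDU provides: the function name issue_has_release_line is made up for illustration, and it assumes that the "original contents" of /etc/issue include the first line of the distribution's release file. The release file path defaults to /etc/release as named elsewhere in this thread; pass a different path if your distribution keeps it somewhere else.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the issue file still contains the
# first non-empty line of the release file.
# Assumption: that line is the version string the default /etc/issue carries.
issue_has_release_line() {
    issue_file=${1:-/etc/issue}
    release_file=${2:-/etc/release}

    # First non-empty line of the release file; return 2 if it cannot be read.
    release_line=$(grep -m 1 . "$release_file") || return 2

    # Fixed-string match, so characters like ( ) . are not treated as regex.
    grep -qF -- "$release_line" "$issue_file"
}
```

Running "issue_has_release_line || echo '/etc/issue looks customized'" would flag a banner that no longer carries the version line.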
Hope this helps.
2009-07-08 03:52 AM
First of all, I am not sure why things did not work when you restarted the snapdrive daemon after restoring the original /etc/issue file.
I mean, there should not be any need to uninstall and then reinstall SDU in order to make it work.
I suspect the "snapdrived restart" operation was not successful in the first case (when the /etc/issue file was restored to its original content).
However, it is true that there is a bug related to the /etc/issue file that prevents snapdrived from loading properly.
The workaround for this issue is to ensure that the content of the /etc/release file is the same as that of the /etc/issue file during "snapdrived start" or "snapdrived restart".
That is, one may perform the following steps if this issue occurs:
(1) cp /etc/issue /etc/issue.backup
(2) cp /etc/release /etc/issue
(3) snapdrived restart
(4) cp /etc/issue.backup /etc/issue
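The four steps above can be wrapped in a small function so that /etc/issue is restored even when the restart fails. This is only a sketch: the function name and the ISSUE_FILE, RELEASE_FILE and RESTART_CMD variables are made up for illustration, and their defaults simply mirror the paths and command in the steps.

```shell
#!/bin/sh
# Defaults mirror the manual steps; override them for other layouts.
ISSUE_FILE=${ISSUE_FILE:-/etc/issue}
RELEASE_FILE=${RELEASE_FILE:-/etc/release}
RESTART_CMD=${RESTART_CMD:-snapdrived restart}

# Hypothetical wrapper around steps (1) to (4).
restart_with_release_issue() {
    backup="${ISSUE_FILE}.backup"

    # (1) keep a copy of the current issue file
    cp -p "$ISSUE_FILE" "$backup" || return 1

    # (2) make the issue file match the release file
    cp "$RELEASE_FILE" "$ISSUE_FILE" || return 1

    # (3) restart the daemon and remember how it went
    $RESTART_CMD
    status=$?

    # (4) put the original issue file back regardless of the restart result
    cp -p "$backup" "$ISSUE_FILE" || return 1

    return $status
}
```

Call restart_with_release_issue as root. A non-zero return means either a copy failed or the restart command itself reported an error.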