Data Backup and Recovery

Storage discovery fails with error "Cannot retrieve storage connection setting" on SnapCenter

storageadd
5,336 Views

Hi All

 

I was hoping to get some help from anyone regarding this issue, which has me pretty stumped.

 

I'm in the midst of setting up SnapCenter for SQL Server on Windows across domains.

 

  • When I try to configure the log directory on the hosts, I get the error "Unable to find any Netapp Filesystems on the host, ensure that the SVM are configured with Snapcenter." This is the error that I see in the logs:

2020-06-23T05:56:13.5923927-04:00 Error SDW PID=[2996] TID=[12312] Cannot retrieve storage connection setting from SMS server.

 

Here's what's been done so far:

 

  • SnapCenter authenticates using a domain account. I have added the SVM to SnapCenter with no issues, as both the SnapCenter Server and the storage are located in Domain A. The SnapCenter plug-in hosts are in Domain B.
  • I've managed to add the plug-in hosts and installed the SnapCenter Plug-ins for Windows and SQL successfully.
  • The Run As account that was used to add the hosts to SnapCenter in Domain A is also being used in Domain B.
  • I'm able to see the SQL instances and the resources within them in SnapCenter.
  • The LUNs are visible on the SQL hosts, and both the hosts and the SVM are on the same subnet.
  • But in SnapCenter under Hosts > Disks, for some reason the disks are not enumerated under the cluster name, and this in turn causes storage discovery to fail with the error "Cannot retrieve storage connection setting".

I've attached the logs below.

 

I would really appreciate it if anyone could point out what else I'm missing.

Logs:

 

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] --HostDiscoveryHelper::GetStorageGraph

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] ++StorageSystemManager::GetStorageSystemId

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] SVM: archive_svm

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] ++StorageSystemManager::GetStorageSystemIdFromCache

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] SVM: archive_svm

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] UserOperationcontext: Name: admin.xxx, GroupName:, Domain:a1, RoleId:1, Role Shared:True, UserGroup ObjectType:None, IsAdmin: True, GroupId:

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] GroupId is null in ThreadLocalStorage.GroupId.

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] The SVM 'archive_svm' with User 'a1\admin.xxx' , UserGroupObjectType 'None' and Role 1 RoleObjectsShared True is not in the storage system cache.

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] The SVM is not in the storage systemId cache:

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] --StorageSystemManager::GetStorageSystemIdFromCache

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] archive_svm is missing from storage cache. Starting update from server. Thread ID: 30

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] ++StorageSystemManager::UpdateStorageSystemIDFromServer

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] UserOperationContext: admin.xxx

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] UserOperationContext Domain: a1

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] UserOperationContext RoleId: 1

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] UserGroupObjectType: None

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] Making a rest call to https://localhost:8145//SMCoreCacheService//GetStorageConnection

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] Call remote Rest API  https://localhost:8145//SMCoreCacheService//GetStorageConnection

2020-06-23T05:56:13.5923927-04:00 Error SDW PID=[2996] TID=[12312] Cannot retrieve storage connection setting from SMS server.
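
For anyone triaging a similar SDW log, here is a minimal Python sketch for pulling out only the Error-level entries. The line layout (timestamp, level, component, PID, TID, message) is assumed from the excerpt above, and the function names `parse_sdw_line` and `errors_only` are hypothetical helpers, not part of SnapCenter.

```python
import re

# Layout assumed from the log excerpt in this thread:
# <timestamp> <level> <component> PID=[n] TID=[n] <message>
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>\w+)\s+(?P<component>\w+)\s+"
    r"PID=\[(?P<pid>\d+)\]\s+TID=\[(?P<tid>\d+)\]\s+(?P<message>.*)$"
)


def parse_sdw_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def errors_only(lines):
    """Return the parsed entries whose level is 'Error'."""
    parsed = (parse_sdw_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "Error"]


# Two lines copied from the excerpt above, plus a blank line as in the paste.
sample = [
    "2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] SVM: archive_svm",
    "",
    "2020-06-23T05:56:13.5923927-04:00 Error SDW PID=[2996] TID=[12312] "
    "Cannot retrieve storage connection setting from SMS server.",
]

for entry in errors_only(sample):
    print(entry["timestamp"], entry["message"])
```

Comparing the timestamp of the Error entry with the preceding "Call remote Rest API" line shows a gap of roughly 21 seconds, which can hint at a timeout rather than an immediate refusal.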


Ontapforrum
5,310 Views

Some pointers:

 

After upgrade of SnapCenter to 4.1.1P3 the storage connection cannot be retrieved:
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Protection_and_Security/SnapCenter/After_upgrade_of_SnapCenter_to_4.1.1P3_the_storage_connection...

 

Note: Please check that the storage system <SVM> is resolvable, or add an entry to the etc/hosts file on the SnapCenter server.


Another pointer: ensure the SVM management LIF (role=data, data-protocol=none) is added as the storage connection.
http://docs.netapp.com/ocsc-40/index.jsp?topic=%2Fcom.netapp.doc.ocsc-ag%2FGUID-5B2FB84B-91E9-4307-92DF-9B5B1E98A000.html

mrahul
5,287 Views

Even though SQL resources are discovered by SnapCenter, the underlying filesystems (LUNs/disks) also have to be discovered.

Under Resources, please select the 'Filesystem' plug-in and the respective host, then trigger a manual 'Refresh'. This operation should discover the storage footprint for all the disks/LUNs mapped to the host.

 

Once this is done, you can recheck the 'log directory' configuration. It should list the disks/LUNs for you to configure.

 

storageadd
5,264 Views

Hi mrahul

 

Yes, I agree. I've done the manual refresh, with no luck.

 

Just to add more on the troubleshooting:

 

I've ensured that the SVM name is resolvable in the A2 domain.

By the way, this SVM is used purely for FC connections. No CIFS services or data LIFs are configured.

 

 

This SVM has not been added to Active Directory in the A2 domain. Does it need to be added to AD?

 

Ontapforrum
5,261 Views

On your statement that this SVM is used purely for FC, with no CIFS services or data LIFs configured:
For the iSCSI and FC protocols, a dedicated SVM management LIF is required because data and management protocols cannot share the same LIF. I just wanted to make sure you are using a separate management LIF.

What's the output of:
::>network interface show -vserver <vserver> -data-protocol none

storageadd
5,258 Views

Hi OntapForum

Thank you for getting back to me...

 

I've added the Run As account that I used to add the hosts in SnapCenter under Settings > User Access for both the host and the storage connection. Still no luck: the cluster (WSFC) is still not discovering the disks in SnapCenter. The SVM is resolvable in the A2 domain.

 

The net int output:


Vserver Name: archive_svm
Logical Interface Name: archive_svm_vlan213_lif1
Role: data
Data Protocol: none

 

The weird thing is that we have an existing configuration in SnapCenter with an FC connection that works flawlessly. The existing setup and the one I'm trying to configure are identical: the physical servers, Windows, and patches are all on the same version. The only difference is that in the working setup, SnapCenter and the SVM are in the same domain as the hosts.

Ontapforrum
5,247 Views

It can be a little frustrating, especially when it's working fine in another environment. I'm just wondering whether it could be due to a port/firewall issue between the plug-in and the storage. We are not snapping SQL with SC yet, but we will soon, so this thread is very useful.

There is another thread here that I believe could be related.
https://community.netapp.com/t5/Data-Backup-and-Recovery-Discussions/SnapCenter-4-0-No-storage-connection-is-set-for-StorageSystem-lt-Fully-Qualified/...

 

Quick Start Guide For SnapCenter Plug-in for Microsoft SQL Server
https://library.netapp.com/ecm/ecm_download_file/ECMLP2861721

storageadd
5,197 Views

Hi OntapForum

 

Since the SC plug-in hosts are part of the A2 domain, the first thing I did was add firewall rule exceptions for the ports.

 

I made sure that all the ports listed below are open bidirectionally:

 

80
8145
8146
3306
135
445
49152-65535
443

 

I also configured a two-way domain trust and added the SnapCenter Run As account to the A2 domain with administrator rights. The SVM also resolves to the correct FQDN in the A2 domain. It fails with this error whenever I try to configure the log directory, and under Hosts the disks fail to be discovered.

[Attached screenshot: storageadd_0-1593355419900.png]

 

I did see that thread as well, and I noticed that the SVM is not joined to the A2 domain in AD.

 

Could that be a cause for the disks not being discovered on the hosts in SnapCenter?

 

Secondly, I'm using my own SnapCenter account to discover the disks. I could try using the local admin account instead, which is the Run As account I used to add the plug-in hosts.

 

These are the two things left to do. It might take a couple of days and require some approvals before I can get to them.

 

 

 

mrahul
5,166 Views

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] The SVM 'archive_svm' with User 'a1\admin.xxx' , UserGroupObjectType 'None' and Role 1 RoleObjectsShared True is not in the storage system cache.

2020-06-23T05:55:52.5446075-04:00 Verbose SDW PID=[2996] TID=[12312] The SVM is not in the storage systemId cache:

2020-06-23T05:56:13.5923927-04:00 Error SDW PID=[2996] TID=[12312] Cannot retrieve storage connection setting from SMS server.

The issue here is that, for some reason, the SVM in question is not present in your plug-in host cache. The cache is built when you add the host and push the plug-in packages. During a 'discovery' operation, SC looks for the storage in the plug-in host cache, and if it doesn't find the respective SVM, it tries to fetch it from the SC server host itself. From your logs, the fetch operation is failing too. This is most likely because storage name resolution is not working from your SQL nodes, hence the SVM is missing from their cache.

 

Please make sure the SVM name 'archive_svm' resolves and pings by name from both SQL cluster nodes, then retry.

Ex: ping -a <SVM IP> should return the name 'archive_svm' from the SC server as well as from all plug-in host nodes.

 

To summarize: whether the SVM is added with a short name or an FQDN, that name has to be resolvable from both the server and the plug-in hosts.
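
As a scripted variant of the ping check above, here is a minimal Python sketch that tests both forward (name to IP) and reverse (IP to name) resolution. It is only an illustration: `check_svm_resolution` is a hypothetical helper, the IP you pass in is whatever your SVM management LIF actually uses, and the short-name comparison is an assumption about how the SVM was registered.

```python
import socket


def check_svm_resolution(name, ip,
                         forward=socket.gethostbyname,
                         reverse=socket.gethostbyaddr):
    """Check that `name` resolves to `ip` and that `ip` resolves back to `name`.

    The resolver functions are parameters so the check can be exercised
    without touching real DNS.
    """
    result = {"forward_ok": False, "reverse_ok": False,
              "resolved_ip": None, "reverse_name": None}
    try:
        result["resolved_ip"] = forward(name)
        result["forward_ok"] = result["resolved_ip"] == ip
    except OSError:
        pass  # the name did not resolve at all
    try:
        reverse_name = reverse(ip)[0]
        result["reverse_name"] = reverse_name
        # Compare only the host part so a short name and an FQDN both match.
        result["reverse_ok"] = (
            reverse_name.lower().split(".")[0] == name.lower().split(".")[0]
        )
    except OSError:
        pass  # no reverse (PTR) record was found
    return result

# Example call (replace the placeholder with the real SVM management IP):
# print(check_svm_resolution("archive_svm", "<SVM IP>"))
```

Running it on the SC server and on each plug-in node makes it easy to see which host, if any, disagrees about the SVM name.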

 

Once you make the change, you can either restart the SMCore and Plug-in for Windows services on the plug-in hosts or, if there are no restrictions, remove and re-add the host.

 

 

storageadd
5,135 Views

Hi mrahul

 

Thank you for getting back to me.

 

I ran into the storage resolution issue, and it's indeed not resolving to the proper FQDN. I've already added the DNS record, but unfortunately it's not working...

 

However,

 

Upon further troubleshooting with my team on this matter, we found that these ports (80, 443, 8146, 3306, 65535) are not open between the plug-in host and the SVM. Telnet fails on these ports.

 

These ports (8145, 135, 445, 49152), meanwhile, are open. The telnet test is OK.
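
For repeating the telnet test across several ports at once, here is a minimal Python sketch. `probe_ports` is a hypothetical helper, the host placeholder must be replaced with your own SVM or server address, and the port list is simply the one from this thread. It only reports whether a TCP connection can be opened, the same thing telnet tells you.

```python
import socket


def probe_ports(host, ports, timeout=3.0):
    """Try a TCP connection to each port; return {port: True if it connected}."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results

# Example call (replace the placeholder with the real address):
# ports = [80, 443, 8145, 8146, 3306, 135, 445, 49152, 65535]
# for port, is_open in probe_ports("<SVM IP>", ports).items():
#     print(port, "open" if is_open else "blocked or closed")
```

A failed connection does not by itself prove a firewall block, since nothing may be listening on that port at the target. Likewise, testing 49152 and 65535 only samples two ports from the 49152-65535 range.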

 

I did put in a request for all the ports listed above to be opened, based on the guide, but for some reason it was either missed or something else happened; I'm not sure.

 

I'm working on getting these ports opened between the plug-in host and the SVM first, before focusing on other troubleshooting steps.

 

I'll update the thread once this is done.

 

Thank you so much.

 

 

 
