Talk with fellow users about the multiple protocols supported by NetApp unified storage including SAN, NAS, CIFS/SMB, NFS, iSCSI, S3 Object, Fibre-Channel, NVMe, and FPolicy.
Hey folks, I have a strange issue and I can't get rid of it.

Environment:
- OKD cluster (4.20.0-okd-scos.12)
- NetApp OTS (NetApp Release 9.17.1P2) with a dedicated SVM for the OKD cluster
- Trident, installed via Operator. Version: 25.6.2
- ActiveMQ Artemis cluster, installed via Operator (Red Hat Integration - AMQ Broker for RHEL 9. Version: 7.13.2-opr-1+0.1761129569.p), using a Trident PVC for data

ActiveMQ starts normally and operates as expected, but "LockCount" and "OwnerCount" rise steadily while all other counters stay low.

otscl::*> nfs storepool show -vserver mysvm
Node: otscl1
Vserver: mysvm
Data-Ip: 192.168.1.66
Client-Ip Protocol IsTrunked OwnerCount OpenCount DelegCount LockCount
-------------- --------- --------- ---------- ---------- ---------- ---------
192.168.1.67 nfs4.2 false 0 0 0 0
192.168.1.68 nfs4.2 false 26099 23 0 26099

When the Lock/OwnerCount hits ~131k, the following error appears:

otscl1 EMERGENCY Nblade.nfsV4PoolExhaust: NFS Store Pool for Owner exhausted. Associated object type is CLUSTER_NODE with UUID: XXXXXXXXXXXXXXXX.

From that point on, no NFSv4 shares on the OTS cluster (all SVMs) can be accessed anymore until we restart ActiveMQ, which resets the counters. I also checked the locks in detail. See:

locks show -vserver mysvm
(vserver locks show)
Notice: Using this command can impact system performance. It is recommended
that you specify both the vserver and the volume when issuing this command to
minimize the scope of the command's operation. To abort the command, press Ctrl-C.
Vserver: mysvm
Volume Object Path LIF Protocol Lock Type Client
-------- ------------------------- ----------- --------- ----------- ----------
trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/server.lock
mysvm_lif
nfsv4.1 share-level 192.168.1.66
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/serverlock.1
mysvm_lif
nfsv4.1 share-level 192.168.1.66
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/serverlock.2
mysvm_lif
nfsv4.1 share-level 192.168.1.66
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/bindings/activemq-bindings-1.bindings
mysvm_lif
nfsv4.1 delegation 192.168.1.66
Delegation Type: write
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/bindings/activemq-bindings-2.bindings
mysvm_lif
nfsv4.1 delegation 192.168.1.66
Delegation Type: write
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/activemq-data-1.amq
mysvm_lif
nfsv4.1 share-level 192.168.1.66
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/activemq-data-2.amq
mysvm_lif
nfsv4.1 share-level 192.168.1.66
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/server.lock
mysvm_lif
nfsv4.1 share-level 192.168.1.67
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/serverlock.1
mysvm_lif
nfsv4.1 byte-range 192.168.1.66
Bytelock Offset(Length): 0 (18446744073709551615)
share-level 192.168.1.67
Sharelock Mode: read_write-deny_none
/trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89/journal/serverlock.2
mysvm_lif
nfsv4.1 share-level 192.168.1.67
Sharelock Mode: read_write-deny_none
trident_pvc_52530de0_c35d_4a2b_a133_d84b9ef2b9b7
/trident_pvc_52530de0_c35d_4a2b_a133_d84b9ef2b9b7/.healthcheck
mysvm_lif
nfsv4.1 delegation 192.168.1.67
Delegation Type: write
15 entries were displayed.

Running the AMQ Operator and Trident on OpenShift (instead of OKD), the counters stay low, so I thought this could be a kernel or OS issue.

OKD: 6.12.0-142.el10.x86_64 #1 SMP PREEMPT_DYNAMIC
OpenShift: 5.14.0-427.97.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC

I also installed CentOS Stream 10 (kernel 6.12.0-170), which is the OS for OKD, set the same kernel parameters as in the cluster, mounted the Trident share (copying the mount options from an OKD node) and deployed AMQ Artemis using the same config. There, too, the counters stay low.

During my research I stumbled across the following comments:

By Chris at 2025-03-04 17:05:36: A not so long while ago I managed to crash a NetApp filer by upgrading a Linux host to an early 6.x kernel and connecting with new NFSv4 features. Seems like its early for all implementors 🙂

By Benjamin Coddington at 2025-10-08 14:42:11: Just ran across this post and thought it worth mentioning that as of v6.17 there have been over 1k patches to the in-kernel linux NFS client since v5.15 and 2021.

https://utcc.utoronto.ca/~cks/space/blog/linux/NFSv4KernelStateNotImpressed?showcomments

Do you have any hints, tweaks or ideas how I could further investigate this problem? I already set up an NFSv4 server on Linux which runs without any problems, so my devs are already thinking about replacing our OTS cluster 😄 Thank you.
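A minimal sketch of commands that might help correlate the two sides, using the SVM and volume names from the output above (the storepool command is diagnostic-privilege, as in the prompt shown); the idea is to compare the ONTAP-side owner/lock growth with what the OKD node's NFS client reports for the same mount:

# ONTAP side: watch owner/lock growth and check delegation settings for the SVM
set -privilege diagnostic
nfs storepool show -vserver mysvm
vserver locks show -vserver mysvm -volume trident_pvc_1e201be0_e6d6_4ab2_8270_579061df7f89
vserver nfs show -vserver mysvm -fields v4.1-read-delegation,v4.1-write-delegation
event log show -severity EMERGENCY

# OKD node side: the client's view of open/lock state for the same mount
nfsstat -c
cat /proc/fs/nfsfs/servers
grep -E "OPEN:|CLOSE:|LOCK:|LOCKU:|FREE_STATEID:" /proc/self/mountstats

If the client-side OPEN/LOCK counters keep climbing while the corresponding CLOSE/LOCKU/FREE_STATEID counters do not, that would point at the OKD kernel's NFS client not releasing state, which would fit the difference you see against the OpenShift kernel.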
I am wondering if anyone has successfully been able to configure "things" so that NFSv4 clients survive ONTAP upgrades through firewalls without having to be remounted. Firewalls are not my wheelhouse, so this is more of a theoretical question than anything. 95% of our clients have a firewall between them and the NAS. Many of the clients survive just fine, but there always seems to be a subset that will log timeouts endlessly until someone intervenes. Now, I am a proponent of default options (hard mounts being part of that, I'm sure), but this is a political football: app owners prefer their machines not hang until reboot (which is sometimes the case with hard mounts), so they insist on soft mounting. I understand there are grace timers and such, but we don't seem to have enough evidence to know whether those need adjusting from the ONTAP defaults or not. Thanks for your thoughts 🤔
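Not a full answer, but a minimal sketch of where the relevant timers live, in case it helps frame the evidence-gathering (svm1 is a placeholder SVM name; the v4 lease/grace options are advanced-privilege settings):

# ONTAP side: current NFSv4 lease and grace periods for the SVM (svm1 is a placeholder)
set -privilege advanced
vserver nfs show -vserver svm1 -fields v4-lease-seconds,v4-grace-seconds

# Client side: what the mounts are actually using (hard/soft, timeo, retrans, vers)
nfsstat -m
findmnt -t nfs4

A common pattern (not necessarily yours) is the firewall dropping or not re-validating the established TCP session when a LIF moves during takeover/giveback, after which soft-mounted clients log timeouts indefinitely; comparing the firewall's session and idle timeouts against the ONTAP lease and grace values above is probably the first piece of evidence worth collecting.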
We're running ONTAP 9.16.1P7 and have cluster_mgmt and node_mgmt LIFs on different VLANs.

Setup:
- node_mgmt network: 192.168.10.x
- cluster_mgmt LIF: 192.168.20.210 (home node: node10)
- NTP server: 192.168.20.15 (same VLAN as cluster_mgmt)

Issue:
All node_mgmt LIFs (nodes 07–10) can ping 192.168.10.254 (gateway). However, when cluster_mgmt is active on a node (e.g., node10), that node's node_mgmt LIF can no longer ping 192.168.20.1 or 192.168.20.15 (NTP). If cluster_mgmt is migrated to another node (e.g., node09), the problem follows: node09's node_mgmt LIF then loses connectivity to the NTP network. All other nodes continue to work fine.

It seems that the node hosting cluster_mgmt loses network reachability from its node_mgmt LIF toward the 192.168.20.x network (where NTP resides). If the NTP server were located on a different VLAN (not the same as cluster_mgmt), this would likely not be an issue, so it seems related to VLAN separation and how ONTAP handles management traffic between these subnets.

Has anyone else seen this behavior? Could it be related to routing, ARP isolation, or VLAN handling of management LIFs in ONTAP? Thanks for any insights.
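A minimal sketch of commands that might show where the reply traffic is going (node10 and the addresses are taken from the description above):

# Which LIFs are where, and which routes/gateways each LIF is actually using
network interface show -role node-mgmt,cluster-mgmt -fields address,netmask,curr-node,curr-port
network route show
network route show-lifs

# Ping the NTP server from the node currently hosting cluster_mgmt
network ping -node node10 -destination 192.168.20.15

One pattern worth ruling out is an asymmetric return path: if the node hosting cluster_mgmt answers traffic addressed to its node_mgmt LIF via the 192.168.20.x gateway instead of the 192.168.10.x one, the far side may drop the replies, which would match the symptom following the cluster_mgmt LIF from node to node.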
Hi all, first of all, I am new to ONTAP. I have set up a single-node cluster ASA-C250. Ports e0c and e0d are used for the cluster interconnect network. Ports e1a and e1b are connected to a Juniper switch. Now I want to configure an SVM with iSCSI as the access protocol, but for some reason I cannot use the ports e1a and e1b. I have the same issue when I want to add these ports to the Default broadcast domain:

dacs-cluster01::> network port broadcast-domain add-ports -ipspace Default -broadcast-domain Default -ports dacs-cluster01-01:e1a,dacs-cluster01-01:e1b,dacs-cluster01-02:e1a,dacs-cluster01-02:e1b

Error: command failed: can't find port

Extra info:

dacs-cluster01::> version
NetApp Release 9.15.1P3: Wed Sep 25 22:33:44 UTC 2024
dacs-cluster01::> network device-discovery show
Node/ Local Discovered
Protocol Port Device (LLDP: ChassisID) Interface Platform
----------- ------ ------------------------- ---------------- ----------------
dacs-cluster01-01/cdp
e0c dacs-cluster01-02 e0c ASA-C250
e0d dacs-cluster01-02 e0d ASA-C250
e1a dacs-cluster01-02 e1a ASA-C250
e1b dacs-cluster01-02 e1b ASA-C250
dacs-cluster01-01/lldp
e0M dacsswt001 (a4:0e:75:7e:96:cb)
40 -
e0c dacs-cluster01-02 (d0:39:ea:c2:e8:f1)
e0c -
e0d dacs-cluster01-02 (d0:39:ea:c2:e8:f1)
e0d -
e1a dacsswt003 (b4:16:78:a4:c4:40)
et-0/0/48 -
e1b dacsswt004 (b4:16:78:a4:f6:40)
et-0/0/48 -
dacs-cluster01-02/cdp
e0c dacs-cluster01-01 e0c ASA-C250
e0d dacs-cluster01-01 e0d ASA-C250
e1a dacs-cluster01-01 e1a ASA-C250
e1b dacs-cluster01-01 e1b ASA-C250
dacs-cluster01-02/lldp
e0M dacsswt001 (a4:0e:75:7e:96:cb)
90 -
e0c dacs-cluster01-01 (d0:39:ea:c2:eb:29)
e0c -
e0d dacs-cluster01-01 (d0:39:ea:c2:eb:29)
e0d -
e1a dacsswt003 (b4:16:78:a4:c4:40)
et-0/0/49 -
e1b dacsswt004 (b4:16:78:a4:f6:40)
et-0/0/49 -
18 entries were displayed.

dacs-cluster01::> system node run -node dacs-cluster01-01 sysconfig -a
<snip>
slot 1: Dual 100G Ethernet Controller CX6-Mezz
e1a MAC Address: d0:39:ea:5c:63:6a (auto-100g_sr4-fd-up)
QSFP Vendor: SOLID-OPTICS
QSFP Part Number: QSFP-100G-LR4-LC
QSFP Serial Number: SOZ131Lp3095
e1b MAC Address: d0:39:ea:5c:63:6b (auto-100g_sr4-fd-up)
QSFP Vendor: SOLID-OPTICS
QSFP Part Number: QSFP-100G-LR4-LC
QSFP Serial Number: SOZ131Lp3097
Device Type: CX6 PSID(NAP0000000013)
Firmware Version: 20.30.1004
Part Number: 111-04588
Hardware Revision: B0
Serial Number: 032422000999

Thanks in advance.
Carlo
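A minimal sketch of checks that might show why add-ports rejects them (node and port names are taken from the output above; the remove-ports line is only a placeholder example for the case where the ports already belong to another broadcast domain):

# Does ONTAP list e1a/e1b as network ports, and which IPspace/broadcast domain do they sit in?
network port show -node dacs-cluster01-01 -port e1a,e1b
network port show -fields ipspace,broadcast-domain

# All broadcast domains and their member ports
network port broadcast-domain show

# If the ports are already members of another broadcast domain, they would need to be
# removed there before adding them to Default (IPspace and domain names are placeholders)
network port broadcast-domain remove-ports -ipspace Cluster -broadcast-domain Cluster -ports dacs-cluster01-01:e1a,dacs-cluster01-01:e1b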
I have a test lab with the ONTAP simulator at version 9.13.1, and Microsoft published Windows Server 2025 last month, so I want to test ONTAP with the new OS.

1. Windows Server 2025 is installed and the Active Directory service is installed.
2. During the cifs create procedure, it shows an error: Machine account creation procedure failed.
3. From a Wireshark capture, it shows that kpasswd replied with an error.
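A minimal sketch of one avenue to check, assuming the SVM is called svm_cifs (placeholder): newer Windows Server domain controllers are stricter about Kerberos encryption types, so it may be worth confirming that AES is enabled for the SVM's Kerberos traffic before retrying the create (whether that is actually what the kpasswd error in the capture refers to would still need to be confirmed):

# Check whether AES encryption is enabled for Kerberos on the SVM (svm_cifs is a placeholder)
vserver cifs security show -vserver svm_cifs -fields is-aes-encryption-enabled

# Enable AES-128/AES-256 for Kerberos-based communication with the domain controller
vserver cifs security modify -vserver svm_cifs -is-aes-encryption-enabled true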