NFS has been around for decades as the premier networked, clustered filesystem. If you're a Unix or Linux user and you're storing a lot of files, you're probably using NFS right now, especially if you need multiple hosts accessing the same data.
If you're looking for high-performance NFS, NetApp's implementation is the best in the business. A lot of NetApp's market share was built on ONTAP's unique ability to deliver fast, easy-to-manage NFS storage for Oracle database workloads. It's an especially nice solution for Oracle RAC because it's an inherently clustered filesystem. The connected hosts are just reading and writing files. The actual filesystem management lives on the storage system itself. All the NFS clients on the hosts see the same logical data.
The NFSv3 specification was published in 1995, and that's still the version almost everyone is using today. You can store a huge number of files, it's easy to configure, and it's super-fast. There really wasn't much to improve, and as a result v3 has been the dominant version for decades.
Note: I originally wrote this post for Oracle database customers moving from NFSv3 to NFSv4, but it morphed into a more general explanation of the practical difference between managing NFSv3 storage and managing NFSv4 storage. Any sysadmin using NFS should understand the differences in protocol behavior.
So, why is everyone increasingly looking at NFSv4?
Sometimes it's just perception. NFSv4 is newer, and 'newer' is often seen as 'better'. Most customers I see who are either migrating to NFSv4 or choosing NFSv4 for a new project honestly could have used either v3 or v4 and wouldn't notice a difference between the two. There are exceptions, though. There are subtle improvements in NFSv4 that sometimes make it a much better option, especially in cloud deployments.
This post is about the key practical differences between NFSv3 and NFSv4. I'll cover security improvements, changes in networking behavior, and changes in the locking model. It's especially critical you understand the section NFSv4.1 Locks and Leases. NFSv4 is significantly different from NFSv3. If you're running an application like an Oracle database over NFSv4, you need to change your management practices if you want to avoid accidentally crashing your database.
This is not a re-hash of the ONTAP NFS best practices. You can find that information here: https://www.netapp.com/media/10720-tr-4067.pdf.
If someone says “NFSv4” they're usually referring to NFSv4.1. That’s almost certainly the version you’ll be using.
The first release of NFSv4, which was version 4.0, worked fine, but the NFSv4 protocol was designed to expand and evolve. The primary version you'll see today is NFSv4.1. For the most part, you don't have to think about the improvements in NFSv4.1. It just works better than NFSv4.0 in terms of performance and resiliency.
For purposes of this post, when I write NFSv4, just assume that I'm talking about NFSv4.1. It's the most widely adopted and supported version. (NetApp has support for NFSv4.2, but the primary difference is we added support for labeled NFS, which is a security feature that most customers haven't implemented.)
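If you want to be sure which version you're getting, you can request it explicitly in the mount options and then check what was actually negotiated. A minimal sketch from a generic Linux host; the server name, export path, and mount point are placeholders:

# Explicitly request NFSv4.1 (server, export, and mount point are placeholders)
mount -t nfs -o vers=4.1 ontap-svm:/jfs0_oradata0 /oradata0
# Confirm the version and mount options the client actually negotiated
nfsstat -m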
The most confusing part about the NFSv4 specification is the existence of optional features. The NFSv3 spec was quite rigid. A given client or server either supported NFSv3 or did not support NFSv3. In contrast, the NFSv4 spec is loaded with optional features.
Most of these optional NFSv4 features are disabled by default in ONTAP because they're not commonly used by sysadmins. You probably don't need to think about them, but there are some applications on the market that specifically require certain capabilities for optimum performance. If you have one of these applications, there should be a section in the documentation covering NFS that will explain what you need from your storage system and which options should be enabled.
If you plan to enable one of the options (delegations is the most commonly used optional feature), test it first and make sure your OS's NFS client fully supports the option and it's compatible with the application you're using. Some of the advanced features can be revolutionary, but only if the OS and application make use of those features. For more information on optional features, refer to the TR referenced above.
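If you're not sure what's currently enabled on the storage side, the same nfs server show command used later in this post can display the optional-feature settings for an SVM. A sketch, assuming field names such as v4.1-pnfs and v4.0-read-delegation (verify the exact field names against your ONTAP release):

# Display a few of the optional NFSv4 feature settings for an SVM
# (field names are assumptions; check the available fields on your ONTAP release)
nfs server show -vserver jfsCloud4 -fields v4.1-pnfs,v4.0-read-delegation,v4.0-write-delegation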
Again, it's rare you'll run into any issues. For most users, NFSv4 is NFSv4. It just works.
NFSv4.1 introduced a feature called parallel NFS (pNFS) which is a significant feature with broad appeal for a lot of customers. It separates the metadata path from the data path, which can simplify management and improve performance in very large scale environments.
For example, let's say you have a 20-node cluster. You could enable the pNFS feature, configure a data interface on all 20 nodes, and then mount your NFSv4.1 filesystems from one IP in the cluster. That IP becomes the control path. The OS will then retrieve the data path information and choose the optimal network interface for data traffic. The result is you can distribute your data all over the entire 20-node cluster and the OS will automatically figure out the correct IP address and network interface to use for data access. The pNFS feature is also supported by Oracle's direct NFS client.
pNFS is not enabled by default. NetApp has supported it for a long time, but at the time of the release some OS's had a few bugs. We didn't want customers to accidentally use a feature that might expose them to OS bugs. In addition, pNFS can silently change the network paths in use to move data around, which could also cause confusion for customers. It was safer to leave pNFS disabled so customers know for sure whether it's being used within their storage network.
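If you do decide to use it, enabling pNFS is a storage-side change plus an NFSv4.1 mount on the client. A sketch, assuming the ONTAP option is named v4.1-pnfs (verify against your release) and using placeholder server and mount point names:

# On the ONTAP cluster: enable pNFS for the SVM (option name is an assumption; verify for your release)
nfs server modify -vserver jfsCloud4 -v4.1-pnfs enabled
# On the client: mount with NFSv4.1, then look for pNFS layout operations
mount -t nfs -o vers=4.1 ontap-svm:/jfs0_oradata0 /oradata0
grep LAYOUTGET /proc/self/mountstats    # non-zero LAYOUTGET counts indicate pNFS layouts are in use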
Don't think of this as an upgrade. NFSv4 isn't better than NFSv3; it's merely different. Whether you get any benefits from those differences depends on the application.
For example - locking. NFSv3 has some basic locking capabilities, but it's essentially an honor system lock. NFSv3 locks aren't enforced by the server. NFSv3 clients can ignore locks. In contrast, NFSv4 servers, including ONTAP, must honor and enforce locks.
That opens up new opportunities for applications. For example, IBM WebSphere and Tibco offer clusterable applications where locking is important. There's nothing stopping those vendors from writing application-level logic that tracks and controls which parts of the application are using which files, but that requires work. NFSv4 can do that work too, natively, right on the storage system itself. NFSv4 servers track the state of open and locked files, which means you can build clustered applications where individual files can be exclusively locked for use by a specific process. When that process is done with the file, it can release the lock and other processes can acquire the lock. The storage system enforces the locking.
That's a cool feature, but do you need any of that? If you have an Oracle database, it's mostly just doing reads and writes of various sizes, and that's all. Oracle databases already manage locking and file access synchronization internally. NetApp does a lot of performance testing with real Oracle databases, and we're not seeing any significant performance difference between NFSv3 and NFSv4. Oracle simply hasn't coded their software to make use of the advanced NFSv4 features.
While the choice of NFS version rarely matters to the applications you're running, it does affect your network infrastructure. In particular, it's much easier to run NFSv4 across a firewall.
With NFSv4, you have a single target port (2049), and NFSv4 clients are required to renew leases on files and filesystems on a regular basis (more on leases below). This activity keeps the TCP session active. You can normally just open port 2049 through the firewall and NFSv4 will work reliably.
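For example, on a Linux host or gateway running firewalld, that can be as simple as the following (a sketch; your firewall platform will have its own syntax):

# Permanently allow the single NFSv4 target port, then reload the firewall rules
firewall-cmd --permanent --add-port=2049/tcp
firewall-cmd --reload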
In contrast, NFSv3 is often impossible to run through a firewall. Among the problems experienced by customers trying to make it work are NFSv3 filesystems hanging for 30 minutes or more. The problem is that firewalls are almost universally configured to drop a network packet that isn't part of a known TCP session. If you have a lot of NFSv3 filesystems, one of them will probably have quiet periods where the TCP session has little activity. If the TCP session timeout limit on the firewall is set to 15 minutes, and an NFSv3 filesystem is quiet for 15 minutes, the firewall will consider the TCP session stale and cease passing packets.
Even worse, it will probably drop them.
If the firewall rejected the packets, that would prompt the client to open a new session, but that's not how firewalls normally work. They'll silently drop the packets. You don't usually want a firewall rejecting a packet because that tells an intruder that the destination exists. Silently dropping an invalid packet is safer because it doesn't reveal anything about the other side of the firewall.
The result of silent packet drops with NFSv3 is that the client will hang while it tries to retransmit packets over and over and over. Eventually it gives up and opens a fresh TCP session. The firewall will register the new TCP session and traffic will resume, but in the interim your OS might have been stalled out for 5, 10, 20 minutes or more. Most firewalls can't be configured to avoid this situation. You can increase the allowable timeout for an inactive TCP session, but there has to be some kind of timeout with a fixed number of seconds.
We've had a few customers write scripts that did a repeated "stat" on an NFSv3 mountpoint in order to ensure there's enough network activity on the wire to prevent the firewall from closing the session. This is okay as a one-off hack, but it's not something I'd want to rely on for anything mission-critical and it doesn't scale well.
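For the curious, a minimal version of that kind of keepalive hack looks something like this (the mount point is a placeholder, and again, I wouldn't rely on it for anything mission-critical):

#!/bin/bash
# Poll an NFSv3 mount point every 5 minutes so the firewall keeps seeing traffic on the TCP session
while true; do
    stat /oradata0 > /dev/null
    sleep 300
done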
Even if you could increase the timeouts for NFSv3, how do you know which ports to open and ensure they're correctly configured on the firewall? You've got 2049 for NFS, 111 for portmap, 635 for mountd, 4045 for NLM, 4046 for NSM, 4049 for rquota…
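If you have to try, you can at least enumerate the ports the server has actually registered by querying the portmapper. A sketch with a placeholder server name:

# List the RPC programs and the ports they registered on the NFS server
rpcinfo -p ontap-svm
# Expect entries for portmapper, nfs, mountd, nlockmgr, status, and rquotad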
NFSv4 works better because there's just a single target port, and the "heartbeat" of lease renewal keeps the TCP session alive.
NFSv4 is inherently more secure than NFSv3. For example, NFSv4 security is normally based on usernames, not user IDs. The result is that it's more difficult for an intruder to spoof credentials to gain access to data on an NFSv4 server. You can also easily tell which clients are actively using an NFSv4 filesystem. That's often impossible to know for sure with NFSv3. You might know a certain client mounted a filesystem at some point in the past, but are they still using the files? Is the filesystem still mounted now? You can't know for sure with NFSv3.
NFSv4 also includes options to make it even more secure. The primary security feature is Kerberos, and you have three options: krb5, krb5i, and krb5p.
In a nutshell, basic krb5 security means better, more secure authentication for NFS access. It's not encryption per se, but it uses an encrypted process to ensure that whoever is accessing an NFS resource is who they claim to be. Think of it as a secure login process where the NFS client authenticates to the NFS server.
If you use krb5i, you add a validation layer to the payload of the NFS conversation. If a malicious middleman gained access to the network layer and tried to modify the data in transit, krb5i would detect it and stop it. The intruder might be able to read data from the conversation, but they won't be able to tamper with it undetected.
If you're concerned about an intruder being able to read network packets on the wire, you can go all the way to krb5p. The letter p in krb5p means privacy. It delivers complete encryption.
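On the client, the flavor is chosen at mount time with the sec option, assuming Kerberos is already configured on both the host and the ONTAP SVM. A sketch with placeholder server and mount point names:

# Authentication only
mount -t nfs -o vers=4.1,sec=krb5 ontap-svm:/jfs0_oradata0 /oradata0
# Authentication plus payload integrity checking
mount -t nfs -o vers=4.1,sec=krb5i ontap-svm:/jfs0_oradata0 /oradata0
# Authentication, integrity, and full encryption of the NFS traffic
mount -t nfs -o vers=4.1,sec=krb5p ontap-svm:/jfs0_oradata0 /oradata0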
In the field, few administrators use these options for a simple reason: what are the odds a malicious intruder is going to gain access to a data center and start snooping on IP packets on the wire? If someone were able to do that, they'd probably be able to get actual login credentials to the database server itself. They'd then be able to freely access data as an actual user.
With increased interest in cloud, some customers are demanding that all data on the wire be encrypted, no exceptions, ever, and they're demanding krb5p. They don't necessarily use it across all NFS filesystems, but they want the option to turn it on. This is also an example of how NFSv4 security is superior to NFSv3. While some of NFSv3 could be krb5p encrypted, not all NFSv3 functions could be "kerberized". NFSv4, however, can be 100% encrypted.
NFSv4 with krb5p is still not generally used because the encryption/decryption work has overhead. Latency will increase and maximum throughput will drop. Most databases would not be affected to the point users would notice a difference, but it depends on the IO load and latency sensitivity. Users of a very active database would probably experience a noticeable performance hit with full krb5p encryption. That's a lot of CPU work for both the OS and the storage system. CPU cycles are not free.
If you're genuinely concerned about network traffic being intercepted and decoded in-transit, I would recommend looking at all available options. Yes, you could turn on krb5p, but you could also isolate certain NFS traffic to a dedicated switch. Many switches support private VLANs where individual network ports can communicate with the storage system, but all other port-to-port traffic is blocked. An outside intruder wouldn't be able to intercept network traffic because there would be no other ports on the logical network. It's just the client and the server. This option mitigates the risk of an intruder intercepting traffic without imposing a performance overhead.
In addition, you may want to consider IPsec. Any network administrator should know IPsec already, and it's been part of operating systems for years. It's a lot like the VPN client you have on your PC, except it's used by server OSs and network devices.
As an ONTAP example, you can configure an IPsec endpoint on a Linux OS and an IPsec endpoint on ONTAP, and subsequently all IP traffic will use that IPsec tunnel for communication. The protocol doesn't really matter (although I wouldn't recommend using krb5p over IPsec; you don't need to re-encrypt already encrypted traffic). NFS should perform about the same under IPsec as it would with krb5p, and in some environments IPsec is easier to configure than krb5p.
Note: You can also use IPsec with NFSv3 if you need to secure an NFS connection and NFSv4 is not yet an option for you.
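As a rough sketch of what the ONTAP side of that looks like (ONTAP 9.8 or later; the command and parameter names here may differ by release, and the addresses are placeholders, so treat this as illustrative and check the IPsec documentation; the Linux host needs a matching policy configured separately):

# Enable IPsec on the cluster
security ipsec config modify -is-enabled true
# Create a policy covering traffic between the NFS LIF and the database host
security ipsec policy create -vserver jfsCloud4 -name nfs-ipsec -local-ip-subnets 10.63.147.150/32 -remote-ip-subnets 10.63.147.50/32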
Applications can encrypt data too.
For example, if you're an Oracle database user, consider encryption at the database layer. That also delivers encryption of data on the wire, plus one additional benefit: the backups are encrypted. A lot of the data leaks you read about are the result of someone leaving an unprotected backup in an insecure location. Oracle's Transparent Data Encryption (TDE) encrypts the tablespaces themselves, which means a breach of the backup location yields only data that is still encrypted. As long as the Oracle Wallet data, which contains the decryption keys, is not stored with the backups themselves, that backup data is still secured.
Additionally, TDE scales better. The encryption/decryption work is distributed across all your database servers, which means more CPUs sharing in the work. In addition, and unlike krb5p encryption, TDE incurs zero overhead on the storage system itself.
In my opinion, this is the most important section of this post. If you don't understand this topic, you're likely to accidentally crash your database.
NFSv3 is stateless. That effectively means that the NFS server (ONTAP) doesn't keep track of which filesystems are mounted, by whom, or which locks are truly in place. ONTAP does have some features that will record mount attempts so you have an idea which clients may be accessing data, and there may be advisory locks present, but that information isn't guaranteed to be 100% complete. It can't be complete, because tracking NFS client state is not part of the NFSv3 standard.
In contrast, NFSv4 is stateful. The NFSv4 server tracks which clients are using which filesystems, which files exist, which files and/or regions of files are locked, etc. This means there needs to be regular communication between an NFSv4 client and server to keep the state data current.
The most important states being managed by the NFS server are NFSv4 Locks and NFSv4 Leases, and they are very much intertwined. You need to understand how each works by itself, and how they relate to one another.
With NFSv3, locks are advisory. An NFS client can still modify or delete a "locked" file. An NFSv3 lock doesn't expire by itself, it must be removed. This creates problems. For example, if you have a clustered application that creates NFSv3 locks, and one of the nodes fails, what do you do? You can code the application on the surviving nodes to remove the locks, but how do you know that's safe? Maybe the "failed" node is operational, but isn't communicating with the rest of the cluster?
With NFSv4, locks have a limited duration. As long as the client holding the locks continues to check in with the NFSv4 server, no other client is permitted to acquire those locks. If a client fails to check in with the NFSv4 server, the locks eventually get revoked by the server and other clients will be able to request and obtain locks.
Now we have to add a layer - leases. NFSv4 locks are associated with an NFSv4 lease.
When an NFSv4 client establishes a connection with an NFSv4 server, it gets a lease. If the client obtains a lock (there are many types of locks) then the lock is associated with the lease.
This lease has a defined timeout. By default, ONTAP will set the timeout value to 30 seconds:
EcoSystems-A200-B::*> nfs server show -vserver jfsCloud4 -fields v4-lease-seconds
vserver v4-lease-seconds
--------- ----------------
jfsCloud4 30
This means that an NFSv4 client needs to check in with the NFSv4 server every 30 seconds to renew its leases.
The lease is automatically renewed by any activity, so if the client is doing work there's no need to perform additional operations. If an application becomes quiet and is not doing real work, it will need to perform a sort of keep-alive operation (called a SEQUENCE) instead. It's essentially just saying, "I'm still here, please refresh my leases."
Question: What happens if you lose network connectivity for 31 seconds?
NFSv3 is stateless. It's not expecting communication from the clients. NFSv4 is stateful, and once that lease period elapses, the lease expires, locks are revoked, and the locked files are made available to other clients.
With NFSv3, you could move network cables around, reboot network switches, make configuration changes, and be fairly sure that nothing bad would happen. Applications would normally just wait patiently for the network connection to work again. Many applications would wait until the end of time, but even an application like Oracle RAC allowed for a 200 second loss of storage connectivity by default. I've personally powered down and physically relocated NetApp storage systems that were serving NFSv3 shares to various applications, knowing that everything would just freeze until I completed my work and work would resume when I put the system back on the network.
With NFSv4, you have 30 seconds (unless you've increased the value of that parameter within ONTAP) to complete your work. If you exceed that, your leases time out. Normally this results in application crashes.
If you have an Oracle database, and you experience a loss of network connectivity (sometimes called a "network partition") that exceeds the lease timeout, you will crash the database.
Here's an example of what happens in the Oracle alert log if this happens:
2022-10-11T15:52:55.206231-04:00
Errors in file /orabin/diag/rdbms/ntap/NTAP/trace/NTAP_ckpt_25444.trc:
ORA-00202: control file: '/redo0/NTAP/ctrl/control01.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 1
Additional information: 4294967295
2022-10-11T15:52:59.842508-04:00
Errors in file /orabin/diag/rdbms/ntap/NTAP/trace/NTAP_ckpt_25444.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/redo1/NTAP/ctrl/control02.ctl'
ORA-27061: waiting for async I/Os failed
If you look at the syslogs, you should see several of these errors:
Oct 11 15:52:55 jfs0 kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Oct 11 15:52:55 jfs0 kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Oct 11 15:52:55 jfs0 kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
The log messages are usually the first sign of a problem, other than the application freeze. Typically, you see nothing at all during the network outage because processes and the OS itself are blocked attempting to access the NFS filesystem.
The errors appear after the network is operational again. In the example above, once connectivity was reestablished, the OS attempted to reacquire the locks, but it was too late. The lease had expired and the locks were removed. That results in an error that propagates up to the Oracle layer and causes the message in the alert log. You might see variations on these patterns depending on the version and configuration of the database.
There's nothing stopping vendors from writing software that detects the loss of locks and reacquires the file handles, but I'm not aware of any vendor who has done that.
In summary, NFSv3 tolerates network interruption, but NFSv4 is more sensitive and imposes a defined lease period.
Now, what if a 30 second timeout isn't acceptable? What if you manage a dynamically changing network where switches are rebooted or cables are relocated and the result is the occasional network interruption? You could choose to extend the lease period, but whether you want to do that requires an explanation of NFSv4 grace periods.
Remember how I said that NFSv3 is stateless, while NFSv4 is stateful? That affects storage failover operations as well as network interruptions.
If an NFSv3 server is rebooted, it's ready to serve IO almost instantly. It was not maintaining any sort of state about clients. The result is that an ONTAP takeover operation often appears to be close to instantaneous. The moment a controller is ready to start serving data it will send an ARP to the network that signals the change in topology. Clients normally detect this almost instantly and data resumes flowing.
NFSv4, however, will produce a brief pause. Neither NetApp nor OS vendors can do anything about it - it's just part of how NFSv4 works.
Remember how NFSv4 servers need to track the leases, locks, and who's using what? What happens if an NFS server panics and reboots, or loses power for a moment, or is restarted during maintenance activity? The lease/lock and other client information is lost. The server needs to figure out which client is using what data before resuming operation. This is where the grace period comes in.
Let's say you suddenly power cycle your NFSv4 server. When it comes back up, clients that attempt to resume IO will get a response that essentially says, "Hi there, I have lost lease/lock information. Would you like to re-register your locks?"
That's the start of the grace period. It defaults to 45 seconds on ONTAP:
EcoSystems-A200-B::> nfs server show -vserver jfsCloud4 -fields v4-grace-seconds
vserver v4-grace-seconds
--------- ----------------
jfsCloud4 45
The result is that, after a restart, a controller will pause IO while all the clients reclaim their leases and locks. Once the grace period ends, the server will resume IO operations.
The grace period and the lease period are connected. As mentioned above, the default lease timeout is 30 seconds, which means NFSv4 clients must check in with the server at least every 30 seconds or they lose their leases and, in turn, their locks. The grace period exists to allow an NFS server to rebuild lease/lock data, and it defaults to 45 seconds. ONTAP requires the grace period to be at least 15 seconds longer than the lease period. A grace period of 45 seconds ensures that all the clients that expect to renew their leases at least every 30 seconds have the opportunity to do so after a restart.
As I mentioned above:
Now, what if a 30 second timeout isn't acceptable? What if you manage a dynamically changing network where switches are rebooted or cables are relocated and the result is the occasional network interruption? You could choose to extend the lease period, but whether you want to do that requires an explanation of NFSv4 grace periods.
If you want to increase the lease timeout to 60 seconds in order to withstand a 60-second network outage, you're going to have to increase the grace period to at least 75 seconds, because ONTAP requires it to be 15 seconds longer than the lease period. That means you're going to experience longer IO pauses during controller failovers.
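Both values are NFS server settings on the SVM, using the same fields shown in the earlier output. A sketch using my lab SVM name:

# Raise the lease period to 60 seconds and the grace period to lease + 15 seconds
nfs server modify -vserver jfsCloud4 -v4-lease-seconds 60 -v4-grace-seconds 75
# Verify the new values
nfs server show -vserver jfsCloud4 -fields v4-lease-seconds,v4-grace-seconds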
This shouldn't normally be a problem. Typical users only update ONTAP controllers once or twice per year, and unplanned failovers due to hardware failures are extremely rare. Also, let's be realistic: if you had a network where a 60-second outage was a real possibility, and you needed to increase the lease timeout to 60 seconds, then you probably wouldn't object to rare storage system failovers resulting in a 75-second pause either. You've already acknowledged you have a network that pauses for 60+ seconds rather frequently.
You do, however, need to be aware that the NFSv4 grace period exists. I was initially confused when I noted IO pauses on Oracle databases running in the lab, and I thought I had a network problem that was delaying failover, or maybe storage failover was slow. NFSv3 failover was virtually instantaneous, so why isn't NFSv4 just as quick? That's how I learned about the real-world impact of NFSv4 lease periods and NFSv4 grace periods.
If you really, really want to see what's going on with leases and locks, ONTAP can tell you.
The commands and output can be confusing because there are two ways to look at NFSv4 locks:
The end result is the networking part of ONTAP needs to maintain a list of NFSv4 clients and which NFSv4 locks they hold. Meanwhile, the data part of ONTAP also needs to maintain a list of open NFSv4 files and which NFSv4 locks exist on those files. In other words, NFSv4 locks are indexed by the client that holds them, and NFSv4 locks are indexed by the file they apply to.
Note: I've simplified the screen shots below a little so they're not 200 characters wide and 1000 lines long. Your ONTAP output will have extra columns and lines.
If you want to get NFSv4 locking data from the NFSv4 client point of view, you use the vserver locks show command. It accepts various arguments and filters.
Here's an example of what's locked on one of the datafile volumes on one of my Oracle lab systems:
EcoSystems-A200-A::vserver locks*> vserver locks show -volume jfs0_oradata0 -fields volume,path,lockid,client-id
volume path lockid client-id
------------- -------------------------------- ------------------------------------ ----------------
jfs0_oradata0 /jfs0_oradata0/NTAP/system01.dbf 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/system01.dbf 721e4ce8-e6e3-4011-b8cc-7cea6e53661b 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/sysaux01.dbf bb7afcdb-6f8c-4fea-b47d-4a161cd45ceb 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/sysaux01.dbf 2eacf804-7209-4678-ada5-0b9cdefceee0 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/users01.dbf 693d3bb8-aed5-4abd-939b-2fdb8af54ae6 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/users01.dbf a7d24881-b502-40b6-b264-7414df8a98f5 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS000.dbf 1a33008c-573b-4ab7-ae87-33e9b5891e6a 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS000.dbf b6ef3873-217a-46e3-bdc7-5703fb6c82f4 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS004.dbf fef3204b-406c-4f44-a02b-d14adaba807c 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS004.dbf 9f9f737b-52de-4d7a-b169-3ba15df8bcc5 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS008.dbf b322f896-1989-43ab-9d83-eaa2850f916a 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS008.dbf cd33d350-ff79-4e29-8e13-f64ed994bc4e 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS012.dbf e4a54f25-5290-4da3-9a93-28c4ea389480 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS012.dbf f3faed7f-3232-46f4-a125-4d2ad8059bc4 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS016.dbf be7ad0d4-bb70-45a8-85b5-45edcb626487 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS016.dbf ce26918c-8a44-4d02-8c41-fafb7e5d2954 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS020.dbf 47517938-b944-4a0b-a9e8-960b721602f4 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS020.dbf 2808307d-46c9-4afa-af2a-bb13f0908ea3 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS024.dbf f21b6f26-0726-4405-9bac-d9e680baa4df 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS024.dbf 0a95f55b-3dfa-45db-8713-c5ad717441ae 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS028.dbf a0196191-4012-4615-b2fd-dda0ce2d7c3f 0100000028aa6c80
jfs0_oradata0 /jfs0_oradata0/NTAP/IOPS028.dbf fc769b9d-0fff-4e74-944a-068b82702fd1 0100000028aa6c80
The first time I used this command, I immediately asked, "Hey, where's the lease data? How many seconds are left on the lease for those locks?" That information is held elsewhere. Since an NFSv4 file might be the target of multiple locks with different lease periods, and the NFSv4 server needs to enforce those locks, the server has to track the detailed locking data down at the file level. You get that data with vserver locks nfsv4 show. Yes, it's almost the same command.
In other words, the vserver locks show command tells you which locks exist. The vserver locks nfsv4 show command tells you the details about a lock.
Let's take the first line in the above output:
EcoSystems-A200-A::vserver locks*> vserver locks show -volume jfs0_oradata0 -fields volume,path,lockid,client-id
volume path lockid client-id
------------- -------------------------------- ------------------------------------ ----------------
jfs0_oradata0 /jfs0_oradata0/NTAP/system01.dbf 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 0100000028aa6c80
If I want to know how many seconds are left on that lock, I can run this command:
EcoSystems-A200-A::*> vserver locks nfsv4 show -vserver jfsCloud3 -lock-uuid 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2
There are no entries matching your query.
Wait, why didn't that work?
The reason is that I'm using a 2-node cluster. The NFSv4 client-centric command (vserver locks show) shows me locking information up at the network layer. The NFSv4 server spans all ONTAP controllers in the cluster, so this command will look the same on all controllers. Individual file management is based on the controller that owns the drives. That means the low-level locking information is available only on a particular controller.
Here are the individual controllers in my HA pair:
EcoSystems-A200-A::*> network int show
Logical Status Network Current Current Is
Vserver Interface Admin/Oper Address/Mask Node Port Home
----------- ---------- ---------- ------------------ ------------- ------- ----
EcoSystems-A200-A
A200-01_mgmt1
up/up 10.63.147.141/24 EcoSystems-A200-01
e0M true
A200-01_mgmt2
up/up 10.63.147.142/24 EcoSystems-A200-02
e0M true
If I ssh into the cluster, and the management IP is currently hosted on EcoSystems-A200-01, then the command vserver locks nfsv4 show will only look at NFSv4 locks that exist on the files that are owned by that controller.
If I open an ssh connection to 10.63.147.142 then I'll be able to view the NFSv4 locks for files owned by EcoSystems-A200-02:
EcoSystems-A200-A::*> vserver locks nfsv4 show -lock-uuid 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2
Logical
Interface Lock UUID Lock Type
----------- --------------------------------- ------------
jfs3_nfs2 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 share-level
This is where I can see the lease data:
EcoSystems-A200-A::*> vserver locks nfsv4 show -lock-uuid 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 -fields lease-remaining
lif lock-uuid lease-remaining
--------- ------------------------------------ ---------------
jfs3_nfs1 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 9
This particular system is set to a lease-seconds value of 10. There's an active Oracle database, which means it's constantly performing IO, which in turn means it's constantly renewing the lease. If I cut the power on that host, you'd see the lease-remaining field count down to 0 and then disappear as the leases and associated locks expire.
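If you want to watch that countdown for yourself, something like this from a workstation polls the lock once per second (a sketch; the node management IP and lock UUID are the ones from my lab examples above):

# Poll the lease-remaining value for a specific lock every second, via ssh to the node that owns the volume
watch -n 1 "ssh admin@10.63.147.142 'vserver locks nfsv4 show -lock-uuid 72ae1cb3-7a7c-48c6-aaa5-5b3ba5b78ae2 -fields lease-remaining'"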
The chance of anyone needing to go into these diag-level details is close to zero, but I was troubleshooting an Oracle dNFS bug related to leases and got to know all these commands. I thought it was worth writing up in case someone else ended up working on a really obscure problem.
So, that's the story on NFSv4. The top takeaways are:
- NFSv4 is stateful. The server tracks leases and locks, and it enforces them.
- A network interruption that outlasts the lease period will revoke locks, and applications like Oracle databases will usually crash as a result.
- After a storage failover or restart, the grace period imposes a short IO pause that NFSv3 didn't have.
- NFSv4 is much easier to run through firewalls, and it offers stronger security options, including Kerberos.
NFSv4 Bonus Tip:
If you're playing with NFSv4, don't forget the domain. This is also documented in the big NFS TR linked above, but I missed it the first time through, and I've seen customers miss this as well. It's confusing because if you forget it, there's a good chance that NFSv4 will MOSTLY work, but you'll have some strange behavior with permissions.
Here's how my ONTAP systems are configured in my lab:
EcoSystems-A200-A::> nfs server show -vserver jfsCloud3 -fields v4-id-domain
vserver v4-id-domain
--------- ------------
jfsCloud3 jfs.lab
and this is on my hosts:
[root@jfs0 ~]# more /etc/idmapd.conf | grep Domain
Domain = jfs.lab
They match. If I'd forgotten to update the default /etc/idmapd.conf, weird things would have happened.
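If you do find a mismatch, the fix has to be applied on both sides. A sketch using my lab SVM and domain; the cache-flush step may vary by Linux distribution:

# On ONTAP: set the NFSv4 ID domain for the SVM
nfs server modify -vserver jfsCloud3 -v4-id-domain jfs.lab
# On the Linux host: set Domain = jfs.lab in the [General] section of /etc/idmapd.conf,
# then clear the client's ID mapping cache
nfsidmap -c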