Given some recent discussion with a moderator (not sure if I should name names?), I wanted to highlight that post because I've found it pretty useful: useful enough that I wrote up my own summary to have handy whenever I get into NFS discussions with customers. Here's my summary (full credit goes to the post above, but I thought I'd put it here as well in case it's helpful for anyone).
Ranking these in order of importance....
Deduplication - possible to use deduplicated space savings with LUNs but MUCH more complicated (have to mess with fractional reserve, LUN thin provisioning, etc. -- possible to get caught overprovisioning and have real issues)
VMware Datastore sizing -- easy datastore growth (possible with VMFS) and shrinking (not possible with VMFS)
Larger datastores - no need to keep datastores smaller like with VMFS - up to 16 TB
Snapshots - can retrieve individual vmdk's from snapshots and/or mount vmdk's from snapshots for single file restore
SMVI - main benefit is ability to do faster VM restores (uses SnapRestore rather than LUN clone so can instantly restore a single VM to any previous snapshot)
VMDK Thin Provisioning
Ease of addition - somewhat easier than LUNs/VMFS
VMFS/RDMs - no need to deal with them
Single-file FlexClone (future feature) - can clone a vmdk instantly for fast provisioning
No single disk I/O queue as with iSCSI/FC so performance limitations are purely governed by pipe size and disk array size.
Faster failover to SnapMirror remote copies (fewer steps plus faster steps) - no need to do LUN resignaturing
ESX server I/O is small block and extremely random meaning that bandwidth is less important (i.e. GigE works well).
Can dump individual VM's via NDMP
No FC zoning, switch cost, HBA's, compatibility matrices, or LUN IDs
The File Level FlexClone feature you mentioned is available as of Data ONTAP 7.3.1 (which is GA - Generally Available). The Rapid Cloning Utility version 2.0 (RCU 2.0) leverages this feature and provides this functionality directly from the VI Client (there are other features as well). Not only is File Level FlexClone extremely fast, but the resulting files don't take up additional space. The RCU 2.0 is a free tool (due in April) that will require FlexClone, NFS, and Data ONTAP 7.3.1P2.
When I think about virtualization on NFS I jump right on the storage efficiency and density bandwagon. Don't get me wrong, NetApp's blocks story with virtualization is awesome too. However, when someone asks me (Technical Evangelist for NAS <-- 🙂) how many VMs can be stored on a single NFS mount point, the answer is astonishing -- how many does your hypervisor vendor support?
At this point it's about 256 <--- look familiar? 2^8; good ole 8-bit there. So now you have 256 virtual machines running off the same export. Now imagine if those are virtual desktops; that's quite a bit of density vs blocks. SCSI limits you to about 256 LUNs, which is on par; but why do you need separate LUNs? You need separate LUNs if you want to move a VM from one physical server to another. In that case the LUN must move, so having a lot of VMs per LUN can constrict your virtual data center management.
Beyond density I get into the futures of NFS, which include NFSv4. The big deal with NFSv4 is delegations and lock management. Not that NFSv3 has any issues, but NFSv4 read/write file delegations would give hypervisor servers more local control over the data, and file locking is more resolute. Further, NFSv4 supports the notion of referrals, which would allow an NFS server node to redirect an NFS client to a less burdened NFS node in a clustered storage environment, such as Data ONTAP GX. All for naught, as the primary NFS version is version 3, most hypervisor servers don't support NFSv4, and Data ONTAP GX doesn't support NFSv4 today 😞
However, in the not too distant future hypervisor vendors may choose to support NFSv4. Data ONTAP 7.3+ supports it and Data ONTAP 8 cluster mode storage will support NFSv4 - combining it with hypervisor support could be truly beneficial. Then, just over the horizon is parallel NFS (NFSv4.1), which gives rise to more predictable performance at the NFS client (hypervisor server). The predictability arises out of the support for parallel data servers. Since those data servers can be clustered, you can create NFS volumes across at least two nodes that are also clustered - giving you four systems to support parallel reads and writes. If one node needs to be upgraded, at least three nodes would still be online servicing requests (for that volume). Finally, during all this parallel data management, if the workload is primarily read then you can introduce FlexCache into the mix. It supports data center scale-out and remote-office/branch-office accessibility by NFSv3 clients today. In the world of ONTAP 8 - FlexCache will get even better - shhh 🙂
So if you start using NFS for all the reasons above; you'll be ready to take advantage of the system enhancements coming with NFSv4 and NFSv4.1.
Perhaps you can encourage your hypervisor vendor to do the same 🙂
No worries; what's even more awesome about NFSv4 is the pseudo-filesystem.
Today you can get a 6080 rack and stack 1TB SATA drives on it to the tune of 1PB+, drop in some PAM/Flash and turn on NFSv4. Mount the filer at / and get access to the entire 1PB of data through a single mount point - badda bing badda booom 🙂
The downside is that it's available at volume mount points of 16TB each :-(. So 100TB is available as 7 directories off the root of a filer
Then you can write across all 7 volumes/directories up to 112TB (less file system format + dedupe savings + snapshot savings) 🙂
Downside is you need 70 directories for 1PB of data :-(, in a future version of Data ONTAP you could get that with 10 directories, perhaps - maybe 🙂 yay
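For anyone who wants to redo the directory math with their own numbers, here's a quick back-of-the-envelope sketch. It assumes the 16TB per-volume limit quoted above and treats 1PB as 1000TB; the function names are just made up for illustration. Note that the 70 figure is the 7-per-100TB number scaled by ten, while packing every volume full would get you there with a few less.

```python
import math

MAX_VOLUME_TB = 16  # per-volume limit quoted in the post


def volumes_needed(total_tb, max_volume_tb=MAX_VOLUME_TB):
    """Smallest number of volumes of at most max_volume_tb that can hold total_tb."""
    return math.ceil(total_tb / max_volume_tb)


def raw_capacity_tb(volume_count, max_volume_tb=MAX_VOLUME_TB):
    """Total raw capacity if every one of volume_count volumes is at the limit."""
    return volume_count * max_volume_tb


for total in (100, 1000):  # 100TB and 1PB (assumed to be 1000TB)
    n = volumes_needed(total)
    print(f"{total} TB -> {n} volumes/directories, {raw_capacity_tb(n)} TB raw")
```

Calling `volumes_needed(1000, max_volume_tb=100)` gives the "10 directories" case, which would only hold if a future release allowed volumes of around 100TB (that size is my assumption, not something stated above).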
I have been using NFS for my VMware datastore for about 6 months. It has worked out great, and many of the advantages you cite are the reasons we switched from using LUNs to NFS.
The only part we have been struggling with is backup. We use SnapMirror to replicate our primary NFS datastore to another filer across campus. We then use Commvault to do an NDMP dump to our disk based backup solution. Commvault sees the vmdk files as monolithic files so my fulls and incrementals are the same size.
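To show why the fulls and incrementals come out the same size, here's a simplified model of a file-level incremental. This is not how Commvault or NDMP work internally; it just assumes the backup picks whole files whose modification time is newer than the last backup. The `FileEntry` type, paths, sizes, and timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    path: str
    size_bytes: int
    mtime: float  # last-modified time, seconds since epoch


def full_bytes(files):
    """Bytes copied by a full backup: every file, whole."""
    return sum(f.size_bytes for f in files)


def incremental_bytes(files, last_backup_time):
    """Bytes copied by a file-level incremental: whole files modified since last backup."""
    return sum(f.size_bytes for f in files if f.mtime > last_backup_time)


GIB = 1024 ** 3
# Hypothetical datastore: a running VM writes to its vmdk constantly,
# so every vmdk looks "modified" no matter how little actually changed.
datastore = [
    FileEntry("vm1/vm1-flat.vmdk", 40 * GIB, mtime=2000.0),
    FileEntry("vm2/vm2-flat.vmdk", 60 * GIB, mtime=2050.0),
]
last_backup = 1000.0

print(full_bytes(datastore) // GIB, "GiB full")
print(incremental_bytes(datastore, last_backup) // GIB, "GiB incremental")
```

Under this model the incremental only shrinks if the backup can see inside the vmdk, either at the guest file level or at the changed-block level.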
We have been looking into using VCB, but I am a little reluctant to jump in that direction just to get file level backup so my incrementals are truly incremental and small. These two articles are more like what I am looking for.
This article talks about mounting the NFS volume and doing backups that way
The question I have: is NetApp looking to do the same type of integration with other backup vendors like Commvault that they did with Tivoli? Are there any other choices I should be looking at for file level backup when I have an NFS datastore?
"VMware Datastore sizing -- easy datastore growth (possible with VMFS) and shrinking (not possible with VMFS)"
I'm curious, how are you going about shrinking the vmdks? While I appreciate the native thinness of NFS for new VMs, one loses that benefit when conducting storage vmotions like we had to for migrating off of DAS onto our new filers.
It would be nice if there was some easy tool like the space reclaimer in SnapDrive.
So far, the only solutions I've come up with are to try out the mbralign tool, which will require downtime, or to dedupe the volume. However, we're not totally ready to dedupe our fibre disks and are only testing it on SATA. I was told by someone at NetApp that the mbralign tool might do the trick but haven't had a chance to test it yet...
I did come across this page, but am unsure if it's a solution or not:
So...that point refers to datastore shrinking actually. That is, you can't shrink a VMFS datastore (provided either via iSCSI or FC) but can shrink an NFS datastore (just shrink the FlexVol on the NetApp).
And....if you use an NFS datastore where dedupe savings show up immediately, you get some benefits of the vmdks being empty as the identical zeros inside get deduped.....hopefully clear as mud.
The mbralign tool will take a thick type vmdk and make it thin (when --sparse is specified), however this isn't required to reclaim space. The use of the FAS dedupe feature will reclaim much more space and will do so while the VMs are running. With regard to the rtfm-ed.co.uk post, FAS dedupe will get you back much more space, without the pain of having to do an export/import. I have had customers leverage the SDelete tool (link in Mike's post) to increase the effects of FAS dedupe. The tool simply writes zeros to previously allocated space within the guest filesystem. You can imagine how nicely a bunch of zeros dedupes. 😉 When this is done across the whole datastore, the dedupe savings can be significant. Obviously the amount of savings will depend on the age of the guest filesystems and how much data has been added and removed over time. The other great feature of FAS dedupe is that it works on any type of vmdk (thin and thick) and on VMFS and NFS datastores.
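If it helps to picture why zero-filling pays off, here's a toy model of block-level dedupe. It is only a sketch and not the FAS implementation: it assumes fixed 4KB blocks and counts unique blocks by hash, with random bytes standing in for live data and for the stale leftovers of deleted files.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size for this toy model


def unique_blocks(data, block_size=BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) for data split into fixed-size blocks."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


live = os.urandom(64 * BLOCK_SIZE)   # stand-in for data the guest still uses
stale = os.urandom(64 * BLOCK_SIZE)  # stand-in for leftover blocks of deleted files
zeroed = bytes(64 * BLOCK_SIZE)      # the same free space after a zero-fill pass

before = unique_blocks(live + stale)
after = unique_blocks(live + zeroed)
print("before zero-fill (total, unique):", before)
print("after zero-fill  (total, unique):", after)
```

In this model the stale blocks each count as unique until they are overwritten, at which point the whole zeroed region collapses to a single unique block. Real savings depend on how much of the guest filesystem is actually stale, as noted above.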
The Tech Support person you spoke to is mistaken. If you want to push this, you can give them this internal link (it's only valid within NetApp) which when he/she reads it should not only contradict that statement but instruct the GSC how to handle such cases.
They still say it's not supported - see the new engineer's reply and my response below:
Ok, I don’t want to use RCU if I cannot get support. Regardless, my goal remains the same: I want to use file level FlexClone in ONTAP 188.8.131.52 to clone a file /vol/vms/vm1/vm1.vmdk to /vol/vms/vm2/vm1.vmdk (I can rename the file to vm2.vmdk later).
How do I accomplish this via the ONTAP commandline?
On 7/1/09 6:40 PM, "email@example.com> wrote:
I checked with our engineers on your questions who insisted that the no-support status of this utility remains in effect.
The only publicly accessible support documentation is available on the RCU2./01 Description Page where you may review Release Notes, Best Practices, and an Installation and Administration Guide.