We have a VMware datastore configured on our FAS2240. I'm getting an alert on our monitoring system due to usage of a particular volume. The volume in question shows 86% utilised and the LUN at 97% within System Manager, this is also reflected within VSC. However, the datastore itself, i.e. what VMware is seeing, is only 31% used.
My question is, why would there be such a difference between these figures? Is there any way to tell what is utilising this space? There are no snapshots in play.
We have other volumes configured in the same way with similar VDI load and there is nowhere near as much difference between these figures.
The Fractional Reserve setting is disabled by the looks of it.
Here's the output of the df -r command.
FAS2240-02> df -r
Filesystem                                kbytes       used      avail  reserved  Mounted on
/vol/vol0/                             199229440    3880512  195348928         0  /vol/vol0/
/vol/vol0/.snapshot                     10485760     237508   10248252         0  /vol/vol0/.snapshot
/vol/v_vdi_datastore9/                 432402476  186069996  246332480         0  /vol/v_vdi_datastore9/
/vol/v_vdi_datastore9/.snapshot                0          0          0         0  /vol/v_vdi_datastore9/.snapshot
/vol/v_vdi_view_datastore9/             70265404   12336924   57928480         0  /vol/v_vdi_view_datastore9/
/vol/v_vdi_view_datastore9/.snapshot           0          0          0         0  /vol/v_vdi_view_datastore9/.snapshot
/vol/v_vdi_datastore10/                432402476  384968280   47434196         0  /vol/v_vdi_datastore10/
/vol/v_vdi_datastore10/.snapshot               0          0          0         0  /vol/v_vdi_datastore10/.snapshot
/vol/v_vdi_view_datastore10/            70265404   11059304   59206100         0  /vol/v_vdi_view_datastore10/
/vol/v_vdi_view_datastore10/.snapshot          0          0          0         0  /vol/v_vdi_view_datastore10/.snapshot
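For anyone wanting to turn that output into percentages, here is a minimal Python sketch. It assumes the six-column df -r layout shown above (filesystem, kbytes, used, avail, reserved, mount point); the function name percent_used and the two-row sample are only for illustration.

```python
# Two rows copied from the df -r output above, used as sample input.
DF_OUTPUT = """\
Filesystem kbytes used avail reserved Mounted on
/vol/v_vdi_datastore9/ 432402476 186069996 246332480 0 /vol/v_vdi_datastore9/
/vol/v_vdi_datastore10/ 432402476 384968280 47434196 0 /vol/v_vdi_datastore10/
"""


def percent_used(df_text):
    """Return {filesystem: percent used} for each df -r data row."""
    result = {}
    for line in df_text.splitlines():
        fields = line.split()
        # Assumed layout: filesystem, kbytes, used, avail, reserved, mount point.
        if len(fields) != 6 or not fields[1].isdigit():
            continue  # skip the header and anything unexpected
        total_kb, used_kb = int(fields[1]), int(fields[2])
        if total_kb == 0:
            continue  # e.g. a .snapshot row with no snapshot reserve
        result[fields[0]] = 100.0 * used_kb / total_kb
    return result


usage = percent_used(DF_OUTPUT)
for filesystem, pct in usage.items():
    print(f"{filesystem}: {pct:.1f}% used")
```

This only reproduces the volume-level figure from the controller's point of view; it says nothing about what VMFS considers in use inside the LUN.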
Indeed, fractional reserve is not to blame.
OK, let's go back to basics. You said this in your original post:
The volume in question shows 86% utilised and the LUN at 97% within System Manager, this is also reflected within VSC. However, the datastore itself, i.e. what VMware is seeing, is only 31% used.
First of all, the volume shows as utilised because it contains a LUN that is space-reserved; it doesn't matter whether the LUN is actually filled with data. Secondly, where are you getting the information that the LUN itself is 97% full: the System Manager GUI?
VMware sees the LUN, not the volume, so it reports space utilisation (presumably correctly) within the LUN.
Thanks for the response.
It makes sense that the volume is utilised because a LUN is stored within it. I'm seeing the 97% LUN utilisation within System Manager and also in the Virtual Storage Console plugin for VMware vSphere, which is where the below screenshot is from.
As can be seen, the datastore usage (what VMware sees) is only 52%, yet the LUN usage is close to 98%. That is what is baffling me.
Has this particular datastore been thick provisioned on the VMware side? Maybe the LUN usage simply reflects the fact that VMware filled the LUN with zeroes during formatting?
Not that I'm aware of; I didn't know there was a way to do that within VMware. The volume is definitely thin provisioned on the NetApp side of things. I think I'll just shuffle some of the virtual machines around and keep an eye on it. It's very strange though.
Have a look at this for more details re provisioning on the VMware side:
Basically, a so-called eagerzeroedthick VMDK can give this result: all of the allocated space is zeroed out at creation time.
Did you find any resolution? We are seeing the same symptoms on one of our datastores in a 3-month-old VMware deployment. At first I thought it was because snapshots were retaining data from some large servers that used to be on this datastore, but on closer inspection those snapshots are already gone and the numbers do not add up. I do not see why datastore usage is 33% while LUN usage is 93%.
A LUN will reach 100% utilization over time unless you run space reclamation. The ESX host writes blocks and later deletes them from its own point of view, but the NetApp controller doesn't know which blocks are actually still in use and which have been freed. So the LUN will show 100% utilization over time.
It is the same with Windows/Linux and all other OS attached LUNs.
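To make that concrete, here is a toy Python model of the behaviour described above. It is only a sketch under simplifying assumptions, not how Data ONTAP or VMFS track blocks internally: the class ThinLun and its methods are hypothetical, and reclaim() stands in for whatever space reclamation mechanism the host offers.

```python
class ThinLun:
    """Toy model: the array marks a block as used the first time it is written."""

    def __init__(self, size_blocks):
        self.size_blocks = size_blocks
        self.host_in_use = set()      # blocks the host filesystem considers live
        self.array_allocated = set()  # blocks the array believes hold data

    def write(self, block):
        if not 0 <= block < self.size_blocks:
            raise ValueError("block outside LUN")
        self.host_in_use.add(block)
        self.array_allocated.add(block)

    def delete(self, block):
        # The host only updates its own filesystem metadata; the array is not told.
        self.host_in_use.discard(block)

    def reclaim(self):
        # Stand-in for space reclamation: the host reports which blocks are free.
        self.array_allocated &= self.host_in_use

    def host_pct(self):
        return 100.0 * len(self.host_in_use) / self.size_blocks

    def array_pct(self):
        return 100.0 * len(self.array_allocated) / self.size_blocks


lun = ThinLun(size_blocks=100)
for block in range(90):   # VMs are deployed: 90 blocks written
    lun.write(block)
for block in range(60):   # VMs are deleted or moved: 60 blocks freed on the host
    lun.delete(block)

before = (lun.host_pct(), lun.array_pct())
lun.reclaim()
after = (lun.host_pct(), lun.array_pct())
print("before reclaim (host %, array %):", before)
print("after reclaim  (host %, array %):", after)
```

Before reclamation the host reports 30% while the array still counts 90%; only after the host tells the array which blocks are free do the two views agree.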
Correct. This behavior is common when VMware deletes VMs or moves them out of the datastore. Since VMware owns the filesystem (VMFS), it has no mechanism to tell the storage controller that space has been freed by a deletion or a Storage vMotion.
Recent changes to VAAI have made it possible for VMware to notify the storage that space has been reclaimed. This could be termed "hole punching" or space reclamation. Hole punching is easily done with NFS datastores using the Virtual Storage Console plugin, but today (changes soon?) with VMFS it must be done using VMware vmkfstools.
See this KB article on how to address: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2014849
Basically, if the LUN is at 100% utilization, we shouldn't be too concerned as long as the volume is not full. If it is really bugging you, hit the ESXi/vMA CLI and use vmkfstools. Unfortunately VMware has not implemented an automatic method for doing this. Also keep in mind that you should only use the vmkfstools workaround if you are running ESXi 5.0U1 and a version of Data ONTAP that supports VAAI (8.0.1+).
Bonus: you can adjust the thresholds at which you get storage alerts for particular volumes by using these commands:
dfm volume list | grep -i <your volume name>
dfm volume set <volume ID> volumefullthreshold=95
dfm volume set <volume ID> volumenearlyfullthreshold=90
The global volume full thresholds are configured in the GUI or CLI too.
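As a rough illustration of what those two thresholds mean for alerting, here is a small Python sketch. It is not DFM code: the function volume_status is hypothetical, and whether the real product compares with "greater than" or "greater than or equal" is an assumption here.

```python
def volume_status(used_pct, nearly_full=90, full=95):
    """Classify a volume against two thresholds (percent of volume used).

    Assumption: an alert fires once usage reaches the threshold.
    """
    if not 0 <= nearly_full <= full <= 100:
        raise ValueError("expected 0 <= nearly_full <= full <= 100")
    if used_pct >= full:
        return "full"
    if used_pct >= nearly_full:
        return "nearly full"
    return "normal"


# With the thresholds from the dfm commands above (90 and 95):
for pct in (86, 92, 97):
    print(pct, volume_status(pct))
```

With these values, a volume at 86% would no longer raise an alert, while 92% and 97% would still be flagged as nearly full and full respectively.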
Very interesting thread.
I too have been investigating this in our own VMware environment and have managed to use the ESX tool successfully to reclaim space, as well as utilising ASIS on the NetApp.
My only question is this: once space reclamation has run, the volume usage correlates with the datastore usage (i.e. exactly what is shown on the VMware side), but the LUN usage still differs.
How would I go about ensuring the LUN usage matches the volume usage on the NetApp?