We have provisioned a 671 GB LUN that resides in a 1109 GB volume. The volume is thick and the LUN is thin, and it is presented to a VMware ESXi cluster. We have configured SMVI snapshots. This morning the LUN went offline due to a space issue. As a temporary fix I increased the volume size and brought the LUN back online. What best practices are available to prevent such incidents in future? If I set LUN reservation to ENABLED, will that stop it from going offline?
To ensure that the LUN does not go offline, space reservation must be enabled and fractional reserve set to 100%. In all other cases you must monitor the volume for free space and take appropriate action when required.
After changing the LUN from thin to thick and setting fractional reserve to 100%, there will be 438 GB of free space in the volume to hold the snapshots. How will new snapshot creation behave if space becomes constrained some time later? Will the new snapshot creation fail, or could the LUN go offline again?
With FR set to 100%, the full current LUN size is reserved in the volume when a snapshot is taken. If there is not enough space for the reservation, snapshot creation fails. So in the worst case the volume needs to be twice the LUN size, plus any space for snapshots.
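To put numbers on that rule for the sizes in the question, here is a rough sketch of the arithmetic. The function name and parameters are made up for illustration, and it assumes the simple "LUN size plus reserve plus snapshot space" model described above rather than anything the storage system reports.

```python
def worst_case_volume_gb(lun_gb, snapshot_gb=0.0, fractional_reserve_pct=100):
    """Worst-case volume size: LUN + fractional reserve + snapshot space.

    Hypothetical helper; it only encodes the rule of thumb from the post.
    """
    reserve_gb = lun_gb * fractional_reserve_pct / 100
    return lun_gb + reserve_gb + snapshot_gb


lun_gb = 671       # LUN size from the question
volume_gb = 1109   # current volume size from the question

required_gb = worst_case_volume_gb(lun_gb)
print(required_gb)              # 1342.0 GB before any snapshot data
print(volume_gb - lun_gb)       # 438 GB left once the LUN is fully reserved
print(required_gb - volume_gb)  # 233.0 GB short of the worst case
```

Under these assumptions the 1109 GB volume is smaller than the 1342 GB worst case, so snapshot creation could start failing once the reserve no longer fits.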
You can easily prevent a thin-provisioned LUN from going offline with two settings.
vol autosize - This automatically grows the volume, up to the maximum size you specify, when it starts to run out of space and before the LUN goes offline. For example, to grow your volume to 1500 GB in 100 GB increments you would enter:
> vol autosize VMVol -m 1500g -i 100g on
snap autodelete - Next, if the volume has grown to its maximum size and you are still running out of space, you can tell the storage system to delete snapshots from the volume before the LUN goes offline:
> snap autodelete VMVol on
This might cause you a headache with VSC (or whatever you use for your SMVI snapshots), as it won't know the snapshots have been deleted, but it's better than the LUN going offline. There are a few options you can configure, such as which snapshots to delete first (or exclude) and the target free space to reclaim.
You can also configure the basics of these two settings in the System Manager GUI.
You can also have the snapshots deleted before the volume is grown, if you prefer. I see VMVol has the default of autosize first: try_first=volume_grow
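The ordering between the two settings can be pictured with a small model. This is only a simplified sketch of the decision, not the storage system's actual logic: the function `next_action` and its parameters are hypothetical, and only the values `volume_grow` and `snap_delete` come from the try_first option.

```python
def next_action(size_gb, max_size_gb, has_snapshots, try_first="volume_grow"):
    """Pick what a nearly full volume would try next (simplified model)."""
    possible = {
        "volume_grow": size_gb < max_size_gb,  # autosize still has headroom
        "snap_delete": has_snapshots,          # autodelete has something to remove
    }
    order = ["volume_grow", "snap_delete"]
    if try_first == "snap_delete":
        order.reverse()
    for action in order:
        if possible[action]:
            return action
    return "none"  # nothing left to try; the LUN is at risk


print(next_action(1109, 1500, True))                 # volume_grow
print(next_action(1500, 1500, True))                 # snap_delete
print(next_action(1109, 1500, True, "snap_delete"))  # snap_delete
print(next_action(1500, 1500, False))                # none
```

With the default ordering, snapshots are only touched once the volume has reached its maximum size; reversing try_first gives up snapshots first and keeps the volume small for longer.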
If you use OnCommand Core (DFM), you can set up alerts for the autosize and autodelete events, so when you see the volume grow by 100 GB you can jump on and take action before it's too late.
Enabling space_alloc turns on the SCSI3 flag that supports native space reclamation for thin LUNs. The caveat is that it is only supported by recent releases of Windows, VMware, and UNIX operating systems.
The main advantage of the space_alloc setting is that the LUN is not taken offline when its capacity is used up. Instead, all write activity is stopped until the space issue is resolved.