ONTAP Discussions
Hi,
I am periodically having an issue where a LUN takes itself offline with a "write to LUN failed due to lack of space" error. The Windows server that has access to the LUN reports 4 GB of free space, and our monitoring tool also reports 4 GB free at the point the LUN goes offline. We do not run snapshots on the volume, although another volume in the same aggregate does. I have laid out the configuration at the end of the post.
The offline error has occurred three times over the last month, although I can't see a particular pattern to the failures. My knowledge of NAS in general is very limited, but from trawling through the configuration I have noticed that "space reserved" is turned off on the LUN. The online documentation states: "It is recommended that you create a LUN with space reservations enabled so that blocks in the LUN can be updated when they are written to". Could this be the cause of the LUN running out of space and going offline? If my calculations are correct, the volume is 12.7 TB and the LUN is around 12.69 TB, so the LUN shouldn't be running out of space?
Does anyone have any idea from my configuration what the problem could be or, if it is the space reserved setting, why the LUN is running out of space when the volume is big enough to hold it?
Any help would be appreciated.
LUN | |
Description | NL LUN |
Size | 13635859545 |
Space Reserved | OFF |
Volume | |
Name | NL |
Type | Flexible |
Status | Online,raid0 |
Used Capacity | 12.7 TB |
% Used | 100% |
Total Capacity | 12.7 TB |
Number of files | 107 |
Max files | 31.9 m |
SNAP | NO |
Fractional reserve | 100% |
Containing Aggregate | AGR1 |
Space Guarantee | Volume |
Total Size | 12.7 TB |
Max directory size | 20.5 MB |
Aggregate | |
Name | AGR1 |
Type | Aggregate |
Status | Online,raid0 |
Used Capacity | 100% |
Total Capacity | 13.3 TB |
Number of files | 107 |
Max files | 31.1k |
Raid Size | 9 |
Checksums | Block |
Number of disks | 42 |
There is also an additional volume on the aggregate.
Volume | |
Name | PD |
Type | Flexible |
Status | Online, raid0 |
Used Capacity | 115GB |
% Used | 19% |
Total Capacity | 600 GB |
Number of files | 138k |
Max files | 20.8m |
SNAP | YES |
Fractional Reserve | 100 |
Space Guarantee | Volume |
Total Size | 600 GB |
Max directory size | 20.5 MB |
If the LUN is thin provisioned, what the host sees is irrelevant - the host has been allocated capacity it has not consumed yet. What happens when it attempts to consume it?
Well, it depends on how the LUN and flexvol are configured. If the LUN is not space reserved, space is taken from the flexvol it sits in as the host writes. It looks like the flexvol is full, which is why the writes are failing. Can you shrink the other flexvol and grow this one?
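To illustrate the point about the host's view being irrelevant, here is a minimal toy model in Python. It is only a sketch: the class, field names, and block counts are made up for illustration and do not reflect how ONTAP tracks space internally. It shows a host that still sees free space on a non-reserved LUN while the containing volume can no longer back a new write.

```python
# Toy model only: names and numbers are illustrative, not ONTAP internals.
from dataclasses import dataclass


@dataclass
class ThinLun:
    advertised_blocks: int      # size the host was told it has
    volume_free_blocks: int     # free blocks left in the containing volume
    written_blocks: int = 0     # blocks the host has actually written

    def host_free_blocks(self) -> int:
        # The host works out free space from the advertised size alone.
        return self.advertised_blocks - self.written_blocks

    def write_new_blocks(self, count: int) -> bool:
        # A first write to a block needs backing space from the volume.
        if count > self.volume_free_blocks:
            return False  # the volume cannot back the write
        self.volume_free_blocks -= count
        self.written_blocks += count
        return True


lun = ThinLun(advertised_blocks=1000, volume_free_blocks=10, written_blocks=960)
print(lun.host_free_blocks())    # host still believes 40 blocks are free
print(lun.write_new_blocks(20))  # False: only 10 blocks are left in the volume
```

In this model the failed write is the analogue of the "lack of space" error: the host's free-space figure never enters the decision.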
Thanks for the quick reply. Yes, I will shrink the other flexvol and grow the thin provisioned one. I'm going to turn space reserved on too.
It looks like the aggregate is full.
What does the output of 'df -hA' show?
If there are snaps on the aggregate you can turn those off and reclaim some of the reserve space on the aggregate.
Yes, it definitely is. I have investigated a bit further and the LUN was overcommitted by 174 MB. Since the aggregate was full, this caused the LUN to go offline. Fortunately I can shrink another LUN in the aggregate, so I will use that space to grow the overcommitted LUN and turn on space reservation.
Thanks for all your help.
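For anyone checking similar numbers, the arithmetic in this thread can be sketched as below. One assumption is doing the work here: the post gives no unit for the LUN "Size" field, and the sketch treats it as KiB because that lands on the "around 12.69 TB" quoted in the question. The sketch also shows why a 174 MB shortfall cannot be seen in figures displayed to one decimal place of a terabyte.

```python
# Assumption: the LUN "Size" field is in KiB (the post does not state a unit).
LUN_SIZE_KIB = 13_635_859_545
KIB_PER_TIB = 1024 ** 3
MIB_PER_TIB = 1024 ** 2

lun_tib = LUN_SIZE_KIB / KIB_PER_TIB
print(f"LUN size: {lun_tib:.4f} TiB")

# The volume and aggregate sizes in the thread are shown to 0.1 TB.
display_step_mib = 0.1 * MIB_PER_TIB

# Overcommit reported at the end of the thread, in MB.
overcommit_mib = 174
fraction_of_step = overcommit_mib / display_step_mib
print(f"Overcommit is {fraction_of_step:.2%} of one display step")
```

Under that assumption the LUN is just under 12.7 TB, and the shortfall is a small fraction of one percent of the smallest step the rounded figures can show, so "12.7 TB volume, 12.69 TB LUN" can look safe while the LUN is in fact overcommitted.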