I have installed two instances of the ESX version of the 8.3 Simulator, and within a few days I run into space problems on the root volume. See the messages below from the console. I have also run into this same problem with instances installed in ESX in a partner's lab. It appears that there are logs or some other files that are continually being created and filling up the root volume. The partner dug a little deeper and reports that the Simulator appears to be built on a Linux image, and that it is the Linux image that is actually having the space problem.
CONSOLE OUTPUT:
login as: admin
Using keyboard-interactive authentication.
Password:
***********************
** SYSTEM MESSAGES **
***********************
CRITICAL: This node is not healthy because the root volume is low on space
(<10MB). The node can still serve data, but it cannot participate in cluster
operations until this situation is rectified. Free space using the nodeshell or
contact technical support for assistance.
cdot83::>
VM Characteristics: See Attachment.
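For reference, the root volume's fill level can still be checked from the node shell even when cluster-level commands fail; the numbers below are illustrative, not from my system:
cdot83::> run local df -h vol0
Filesystem               total       used      avail capacity  Mounted on
/vol/vol0/               807MB      797MB       10MB      99%  /vol/vol0/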
Rbrinson -
Yes, the sims run out of space given the default root volume size.
Common practice is to add another disk to the aggregate, grow the root volume, and turn off snapshots.
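For example, from the node shell, assuming the default aggr0/vol0 names and the default 1GB disks; the sizes here are just a sketch:
node> aggr add aggr0 1@1g
node> vol size vol0 +1g
node> snap sched vol0 0 0 0
node> vol options vol0 nosnap on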
I hope this response has been helpful to you.
At your service,
Eugene E. Kashpureff, Sr.
Independent NetApp Consultant http://www.linkedin.com/in/eugenekashpureff
Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp
(P.S. I appreciate 'kudos' on any helpful posts.)
At this point, I cannot issue any meaningful commands to make this happen. Every command I issue seems to get thwarted because databases, like the VLDB, are offline. Can you give me some additional guidance on what commands to issue to accomplish your suggestion?
Rbrinson -
Ouch! It may be easiest to reinstall the sim.
Can you delete any snapshots on the root vol?
Will it let you unlock the diag user and drop down into the system shell?
You could then try deleting log files ... ?
I hope this response has been helpful to you.
At your service,
Eugene E. Kashpureff, Sr.
Independent NetApp Consultant http://www.linkedin.com/in/eugenekashpureff
Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp
(P.S. I appreciate 'kudos' on any helpful posts.)
You could try deleting some unneeded files (like logs) from system shell.
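For instance, something along these lines from the clustershell and then the BSD-style system shell (you may need to unlock the diag user first, as Eugene mentions; which files are safe to delete will vary):
cdot83::> set -privilege diag
cdot83::*> systemshell -node local
% df -h /mroot
% cd /mroot/etc/log
% ls -lh
% rm -f <old rotated log files>
% exit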
Thanks - we removed a bunch of log files, which resolved the issue temporarily. However, the same problem came back within a couple of days.
The fundamental problem is the root aggregate and root volume are too small.
If you can get some files cleaned off and get back into the cluster shell, you can increase its size with the following procedure.
I wrote this from the cluster shell perspective as part of a larger document, but if you can't get back into the cluster shell, the node shell equivalents should work just as well.
Increasing the size of the simulator root volume
Steps:
- Log in to the cluster shell.
- Assign any unowned disks by entering the following command:
run * disk assign all
- Identify the root aggregate by entering the following command:
storage aggregate show -node node_name -root true
Example:
demo1::> storage aggregate show -node demo1-01 -root true
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0        900MB   42.72MB   95% online       1 demo1-01         raid_dp, normal
- Add 3 disks to the root aggregate by entering the following command:
storage aggregate add-disks -aggregate root_aggregate -diskcount 3
- Use the root aggregate name to identify the root volume by entering the following command:
volume show -node node_name -aggregate root_aggregate
Example:
demo1::> volume show -node demo1-01 -aggregate aggr0
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
demo1-01  vol0         aggr0        online     RW      851.5MB    529.8MB   37%
- Increase the size of the root volume by 2.5GB by entering the following command:
run -node node_name vol size root_volume +2560m
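To confirm the resize took effect, re-run the show command, or check from the node shell (names as in the examples above):
demo1::> volume show -node demo1-01 -aggregate aggr0
demo1::> run -node demo1-01 vol size vol0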
rbaetz has accepted the solution
And here's the quick & dirty node shell variant:
From the problem node's console:
- Log in to the cluster shell.
- run local
- disk assign all
- aggr status
The root aggr will have "root" in the options list; typically it's aggr0.
- aggr add aggregate_name 3@1g
Assuming the default 1GB disks were used; adjust as necessary.
- vol status
The root vol will have "root" in the options list; typically it's vol0.
- vol size root_volume +2290m
The size increase available may vary depending on the type of disks used; 2560m and 2290m are most common. Try 2560 first; if that fails, fall back to 2290; if that still fails, the error will give the maximum size in KB.
- exit
- reboot
You may or may not need a second reboot to remove the recovery flag in the loader. If required, it will tell you when you log in from the node shell.
After a clean reboot, go back and disable aggr snaps and vol snaps on the root, delete any existing snaps, and clean out old logs and ASUP files in the mroot.
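Put together, a session on a default-build sim looks roughly like this; the aggregate/volume names are assumed to be the defaults and the status output is trimmed to the relevant columns:
cdot83::> run local
node> disk assign all
node> aggr status
           Aggr State      Status            Options
          aggr0 online     raid_dp, aggr     root
node> aggr add aggr0 3@1g
node> vol status
         Volume State      Status            Options
           vol0 online     raid_dp, flex     root
node> vol size vol0 +2290m
node> exit
cdot83::> system node reboot -node local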
Good day,
I am having the same issue, but when trying to add a disk to my root aggregate I get the error: database is not open.
Can someone please help me with this?
Thank you.
Are you using the nodeshell commands (run local, etc.) from the post above?
I think in that case you will need to get into the systemshell as the diag user, go to /mroot/etc, and remove the log directory recursively (rm -rf /mroot/etc/log). Once this is done, do a df -h . on the /mroot directory and note the decreasing usage. Once it drops below 100%, exit the shell and reboot. It should come back up, and you can then add a disk to the aggr and space to the volume vol0 as previously mentioned.
You need to delete the snapshots for the root volume:
cluster> node run local
node> snap delete -a vol0
node> vol options vol0 nosnap on
node> Ctrl+D
cluster> reboot
Reboot the simulator and that should do the trick.
Thank you, the only easy and concise solution listed here!
Hi
I have a similar issue with a strange twist. I also get the low space problem, but when I query the root aggregate size, it shows 3.38GB available out of a total of 4.17GB. Only 19% used, so it's strange that it's complaining about space when there seems to be loads free. I cleared all my snapshots and changed the snap sched to vol0 0 0 0 a while ago, when I added extra disks and expanded the space on aggr0, and I'm still getting the same problem. All aggregates on my second node are showing a status of 'unknown'.
Help!!!
When you added disks to aggr0, did you also expand the size of vol0?
What does df say?
df -h /vol/vol0
Hi
Thanks for the quick reply. I added two disks but only expanded by 1g at the time, as I thought that would be plenty considering I disabled the snapshot schedule, etc. Interestingly, I just ran the command you suggested and got the response 'Error: show failed: Database is not open'.
Database not open means you are at the cluster shell, but the databases are offline because the root vol is full. Try it from the node shell on the node with the error condition:
run local df -h vol0
OK, that makes sense. Running it locally gives /vol/vol0: total 807MB, used 797MB, free 10MB, capacity 99%.
How do I go about growing the volume into the root aggregate's free space?
Sounds like you've got plenty of free space in the aggr, so try:
vol size vol0 +2g
Again, that's at the node shell. Then reboot. It'll still give you an error. Then reboot again, but stop at the loader and clear the root recovery flag:
VLOADER> unsetenv bootarg.init.boot_recovery
VLOADER> boot
This time it should come up clean.
Ok...that fixed it. Node 2 root vol is back online.
Thanks for your help, much appreciated.
Cheers
Having the same issue and can't add disks. I get this error message: Command failed: database is not open.
