Simulator Discussions
I have installed two instances of the ESX version of the 8.3 Simulator, and within a few days I run into space problems on the root volume. See the messages below from the console. I have also run into this same problem with instances installed in ESX in a Partner's lab. It appears that there are logs or some other file(s) continually being created and filling up the root volume. The Partner dug a little deeper and reports that it looks like the Simulator is built on a Linux image, and that it is the Linux image that is actually having the space problem.
CONSOLE OUTPUT:
login as: admin
Using keyboard-interactive authentication.
Password:
***********************
** SYSTEM MESSAGES **
***********************
CRITICAL: This node is not healthy because the root volume is low on space
(<10MB). The node can still serve data, but it cannot participate in cluster
operations until this situation is rectified. Free space using the nodeshell or
contact technical support for assistance.
cdot83::>
VM Characteristics: See Attachment.
Rbrinson -
Yes, the sims run out of space given the default root volume size.
Common practice is to add another disk to the aggregate, grow the root volume, and turn off snapshots.
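For reference, a rough sketch of that sequence from the node shell, assuming the default aggr0/vol0 names and that a spare virtual disk is available in your sim (check with aggr status -s first):

cluster::> node run local
node> aggr status -s
node> aggr add aggr0 1
node> vol size vol0 +1g
node> snap sched vol0 0 0 0
node> aggr options aggr0 nosnap on

The first two commands list the spares and add one disk to the root aggregate; the rest grow vol0 and stop snapshots from eating the reclaimed space. Sizes and names here are illustrative, not definitive.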
I hope this response has been helpful to you.
At your service,
Eugene E. Kashpureff, Sr.
Independent NetApp Consultant http://www.linkedin.com/in/eugenekashpureff
Senior NetApp Instructor, IT Learning Solutions http://sg.itls.asia/netapp
(P.S. I appreciate 'kudos' on any helpful posts.)
At this point, I cannot issue any meaningful commands to make this happen. Every command I issue seems to get thwarted because databases such as the VLDB are offline. Can you give me some additional guidance on what commands to issue to accomplish your suggestion?
Rbrinson -
Ouch! It may be easiest to reinstall the sim.
Can you delete any snapshots on the root vol?
Will it let you unlock the diag user and drop down into the system shell?
You could then try deleting log files...?
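For anyone stuck at this point, getting into the system shell usually looks roughly like the following, assuming the cluster shell still responds and the diag account exists but is locked:

cluster::> set diag
cluster::*> security login unlock -username diag
cluster::*> security login password -username diag
cluster::*> systemshell -node local

If the cluster shell itself is refusing commands because the databases are offline, this may fail too, in which case the loader/nodeshell route described later in the thread is the fallback.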
Thanks - we removed a bunch of log files, which resolved the issue temporarily. However, the same problem came back within a couple of days.
The fundamental problem is the root aggregate and root volume are too small.
If you can get some files cleaned off and get back into the cluster shell, you can increase its size with the following procedure.
I wrote this from the cluster shell perspective as part of a larger document, but if you can't get back into the cluster shell, the node shell equivalents should work just as well.
Steps:
And here's the quick&dirty node shell variant:
From the problem node's console:
You may or may not need a second reboot to remove the recovery flag in the loader. If required it will tell you when you log in from the node shell.
After a clean reboot, go back and disable aggr snaps and vol snaps on the root, delete any existing snaps, and clean out old logs and asup files in the mroot.
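Putting those steps together, the node shell variant typically amounts to something like this (vol0/aggr0 names assumed; this is a sketch, not the exact listing from the larger document):

node> snap delete -a vol0
node> snap sched vol0 0 0 0
node> aggr options aggr0 nosnap on
node> vol size vol0 +1g
node> reboot

Then, if the login banner still complains after the first reboot, reboot again and stop at the loader to clear the recovery flag:

VLOADER> unsetenv bootarg.init.boot_recovery
VLOADER> boot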
Good day,
I am having the same issue, but when trying to add a disk to my root aggregate I get 'ERROR: database is not open'.
Can someone please help me with this?
Thank you.
Are you using the nodeshell commands (run local, etc) from the post above?
I think in that case, you will need to get into the systemshell as the diag user and then go to /mroot/etc and remove the log directory recursively (rm -rf /mroot/etc/log). Once this is done, do a df -h . on the /mroot directory and note the decreasing usage. Once it drops below 100%, exit the shell and reboot. It should come back up; then add a disk to the aggregate and space to the volume vol0 as previously mentioned.
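As a sketch, assuming the diag user is already unlocked, the cleanup session would look something like:

cluster::> set diag
cluster::*> systemshell -node local
% df -h /mroot
% rm -rf /mroot/etc/log
% df -h /mroot
% exit

Running df before and after lets you confirm the usage is actually dropping before you reboot.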
You need to delete the snapshots for the root volume:
cluster> node run local
node> snap delete -a vol0
node> vol options vol0 nosnap on
node> ctrl+D
cluster> reboot
reboot the simulator and that should do the trick.
thank you, the only easy and concise solution listed here!
Hi
I have a similar issue with a strange twist. I also get the low-space problem, however when I query the root aggregate size, it shows 3.38GB available out of a total of 4.17GB. Only 19% used, so it's strange that it's complaining about space when there seems to be loads free. I cleared all my snapshots and changed the snap sched vol0 0 0 0 etc. a while ago when I added extra disks and expanded the space on aggr0, and I'm still getting the same problem. All aggregates on my second node are showing a status of 'unknown'.
Help!!!
When you added disks to aggr0, did you also expand the size of vol0?
What does df say?
df -h /vol/vol0
Hi
Thanks for the quick reply. I added two disks but only expanded by 1g at the time, as I thought it would be plenty considering that I disabled the snapshot schedule etc. Interestingly, I just ran the command you suggested and get a response of 'Error: show failed: Database is not open'.
Database not open means you are at the cluster shell, but the databases are offline because the root vol is full. Try it from the node shell on the node with the error condition:
run local df -h vol0
OK... that makes sense. Running it locally gives /vol/vol0: total 807MB, used 797MB, free 10MB, capacity 99%.
How do I go about extending the vol out from the root aggregate?
Sounds like you've got plenty of free space in the aggr, so try:
vol size vol0 +2g
Again, that's at the node shell. Then reboot. It'll still give you an error. Then reboot again, but stop at the loader and clear the root recovery flag:
VLOADER> unsetenv bootarg.init.boot_recovery
VLOADER> boot
This time it should come up clean.
Ok...that fixed it. Node 2 root vol is back online.
Thanks for your help, much appreciated.
Cheers
Having the same issue and can't add disks. I get this error message: 'Command failed: database is not open'.