Active IQ Unified Manager Discussions
Active IQ Unified Manager Discussions
| About DataFabric® Manager server | |
| Version | 3.6 (3.6R1) | 
| Serial Number | 1-50-003808 | 
| Administrator Name | root | 
| Host Name | miadfm01 | 
| Host Full Name | miadfm01 | 
| Node Limit | 15 (currently managing 4) | 
| Operating System | Red Hat Enterprise Linux ES release 3 (Taroon Update 9) 2.4.21-27.EL i686 | 
| CPU Count | 1 | 
| System Memory | 3014 MB (load: 19%) | 
cat /opt/NTAPdfm/conf/sybase.conf
-n MonitorDB_miadfm01
-o "/opt/NTAPdfm/log/sybase.log"
-os 10000000
-gd all
-gl all
-gk all
#-gn 64
-gn 32
#-ti 1440
-ti 240
-gp 8192
-ct-
-x SharedMemory,tcpip
-ud
Solved! See The Solution
Hi Joe --
I think you should open a case with NetApp support. From the dfmwatchdog snippet you posted, something is weird. Among other things, why was the database reported being up negative one year? NGS can help you diagnose the problem more effectively than this forum can.
-- Pete
Could you provide us
1. output from "top" command
2. conents of /proc/meminfo file?
Some theory related to it high memory usage:
The Linuxkernel maitains disk cache globally. So files can remain in cache
even after the process that was using them finish execution, because they
might be used by another process. Freeing the cache would mean discarding
cached data. So the kernel tries to keep the cached data as long as possible.
Hence the memory is not freed immediately, but gradually based on the need for
some other purpose/process.
Pages only become free when they're evicted to build up the free pool (like
through a garbage collector), or when nothing useful can be stored in them.
This causes the Linux kernel to hold the cache memory until an explicit
request comes. So the memory load looks big.
We were running with 2 gigs - and i was able to bump it up to 3 gigs - but i think the mem use may be creeping up again...
cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  3160662016 579219456 2581442560        0 178278400 185606144
Swap: 1077501952        0 1077501952
MemTotal:      3086584 kB
MemFree:       2520940 kB
MemShared:           0 kB
Buffers:        174100 kB
Cached:         181256 kB
SwapCached:          0 kB
Active:         366960 kB
ActiveAnon:     109424 kB
ActiveCache:    257536 kB
Inact_dirty:     74768 kB
Inact_laundry:   22656 kB
Inact_clean:         0 kB
Inact_target:    92876 kB
HighTotal:     2228160 kB
HighFree:      1932884 kB
LowTotal:       858424 kB
LowFree:        588056 kB
SwapTotal:     1052248 kB
SwapFree:      1052248 kB
Committed_AS:   550096 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB 
[root@miadfm01 root]# ps aux|grep dfm
root      1306  0.2  1.6 375176 50024 ?      S    Jan12   2:50 /opt/NTAPdfm/sbin/dbsrv9 @/opt/NTAPdfm/conf/sybase.conf
root      1357  0.0  0.0  5060 2464 ?        S    Jan12   0:00 /opt/NTAPdfm/sbin/httpd -f /opt/NTAPdfm/conf/httpd.conf
root      1358  0.0  0.0  3008 1024 ?        S    Jan12   0:00 /opt/NTAPdfm/sbin/rotatelogs /opt/NTAPdfm/log/error.log
root      1359  0.0  0.0  3008 1032 ?        S    Jan12   0:00 /opt/NTAPdfm/sbin/rotatelogs /opt/NTAPdfm/log/access.log
root      1365  0.0  0.1 73124 4360 ?        S    Jan12   0:00 /opt/NTAPdfm/sbin/dfmeventd start
root      1366  0.0  0.2 116976 7204 ?       S    Jan12   0:02 /opt/NTAPdfm/sbin/dfmmonitor
root      1367  0.0  0.1 37384 4872 ?        S    Jan12   0:00 /opt/NTAPdfm/sbin/dfmscheduler
root      1368  0.0  1.5 259888 48052 ?      S    Jan12   0:03 /opt/NTAPdfm/sbin/dfmserver
root      1369  0.7  0.0  6480 2708 ?        S    Jan12  10:23 /opt/NTAPdfm/sbin/dfmwatchdog 
11:49:08  up 21:49,  1 user,  load average: 0.00, 0.02, 0.00
65 processes: 63 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.0%    0.0%    1.4%   0.0%     0.0%    0.0%   98.6%
Mem:  3086584k av,  565644k used, 2520940k free,       0k shrd,  174092k buff
       367196k active,              74768k inactive
Swap: 1052248k av,       0k used, 1052248k free                  181256k cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 1369 root      16   0  2708 2708  2160 S     0.7  0.0  10:23   0 dfmwatchdog
    1 root      15   0   500  500   440 S     0.0  0.0   0:03   0 init
    2 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 keventd
    3 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kapmd
    4 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
    7 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 bdflush
    5 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kswapd
    6 root      15   0     0    0     0 SW    0.0  0.0   0:29   0 kscand
    8 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kupdated
    9 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
   20 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  620 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  621 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  622 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  623 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  624 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  625 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  944 root      15   0   588  588   504 S     0.0  0.0   0:00   0 syslogd
  948 root      22   0   468  468   404 S     0.0  0.0   0:00   0 klogd
  974 rpc       21   0   580  580   504 S     0.0  0.0   0:00   0 portmap
  993 rpcuser   25   0   716  716   636 S     0.0  0.0   0:00   0 rpc.statd
 1004 root      15   0   412  412   352 S     0.0  0.0   0:00   0 mdadm
 1052 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 vmmemctl
 1085 root      16   0   612  612   528 S     0.0  0.0   0:06   0 vmware-guestd
 1104 root      RT   0   660  660   520 S     0.0  0.0   0:00   0 auditd
 1156 root      24   0   516  516   460 S     0.0  0.0   0:00   0 apmd
 1197 root      15   0  3944 3944  2356 S     0.0  0.1   0:07   0 snmpd
 1220 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 khubd
 1234 root      16   0  1552 1552  1300 S     0.0  0.0   0:00   0 sshd
 1248 root      24   0   880  880   760 S     0.0  0.0   0:00   0 xinetd
 1276 root      15   0  2564 2564  1896 S     0.0  0.0   0:00   0 sendmail
 1285 smmsp     15   0  2288 2280  1740 S     0.0  0.0   0:00   0 sendmail
 1295 root      25   0   472  472   420 S     0.0  0.0   0:00   0 gpm
 1306 root      20   0 50024  48M  2240 S     0.0  1.6   2:50   0 dbsrv9 
Why do you say you've got high Sybase memory usage? I'm not disputing it, I just don't see anything in the data you posted which would lead to that conclusion. The "System Memory" line from "dfm diag" seems to indicate you're only using 19% of your memory.
Look in dfmwatchdog.log to see how much memory all of the DFM processes are using. It would be interesting to know if the Sybase memory spiked or it's been growing slowly over time. Typically it grows for a while after starting up then levels off after a day or so. If it spiked up, it would be interesting to know if some other process spiked at the same time.
More often than not, when memory usage spikes it's because we're getting behind processing events. That doesn't exactly match because what you typically see is a sharp growth in the memory used by dfmmonitor and eventd.
pete - thanks that watchdog file is very helpfull -here is a snip below [just a grep on dbsrv9]. I guess it will be ok If things do smooth out.
You ever work with the sybase.conf file to keep things in check? - Joe
Jan 12 13:44:14 [dfmwatchdog: INFO]: dbsrv9 is back up at 12 Jan 13:44; pid = 1306
Jan 12 13:44:14 [dfmwatchdog: INFO]: dbsrv9      up     0 seconds, mem = 26.1 MB, cpu = 0.0%, db = 262 MB, log = 67.3 MB
Jan 12 13:44:19 [dfmwatchdog: INFO]: dbsrv9      up     5 seconds, mem = 31.6 MB, cpu = 55.7%, db = 262 MB, log = 67.3 MB
Jan 12 13:44:29 [dfmwatchdog: INFO]: dbsrv9      up    15 seconds, mem = 36.2 MB, cpu = 76.5%, db = 263 MB, log = 67.3 MB
Jan 12 13:44:34 [dfmwatchdog: INFO]: dbsrv9      up    20 seconds, mem = 36.5 MB, cpu = 49.5%, db = 263 MB, log = 67.3 MB
Jan 12 13:44:54 [dfmwatchdog: INFO]: dbsrv9      up    40 seconds, mem = 38.0 MB, cpu = 17.2%, db = 265 MB, log = 67.6 MB
Jan 12 13:45:14 [dfmwatchdog: INFO]: dbsrv9      up   1.0 minutes, mem = 38.5 MB, cpu = 3.2%, db = 265 MB, log = 67.6 MB
Jan 12 14:02:56 [dfmwatchdog: INFO]: dbsrv9      up    -1.0 years, mem = 42.9 MB, cpu = 0.0%, db = 268 MB, log = 67.6 MB
Jan 12 14:03:11 [dfmwatchdog: INFO]: dbsrv9      up    -1.0 years, mem = 48.1 MB, cpu = 7.9%, db = 269 MB, log = 67.6 MB 
Memory Maxed out on this linux machine again - had to reboot - any suggestions?
Hi Joe --
I think you should open a case with NetApp support. From the dfmwatchdog snippet you posted, something is weird. Among other things, why was the database reported being up negative one year? NGS can help you diagnose the problem more effectively than this forum can.
-- Pete
-Will do and thanks to all for the help. -Joe
Just wanted to close up this thread...
I upgraded the HOST OS - Linux
I upgraded the Version of DFM
- And all is well
| About DataFabric® Manager server | |
| Version | 3.8 (3.8) | 
| Serial Number | 1-50-003808 | 
| Administrator Name | root | 
| Host Name | miadfm01.bftg.com | 
| Host IP Address | 10.10.17.38 | 
| Host Full Name | miadfm01.bftg.com | 
| Operations Manager Node limit | 15 (currently managing 4) | 
| Operating System | Red Hat Enterprise Linux Server release 5.4 (Tikanga) 2.6.18-164.15.1.el5 i686 | 
| CPU Count | 1 | 
| System Memory | 3042 MB (load excluding cached memory: 27%) | 
| Installation Directory | /opt/NTAPdfm 19.8 GB free (57.4%) | 
Hi,
I would suggest you to upgrade to 3.8.1 which is a GA release.
Regards
adai
