Loader backlog (files) in OEM
The Loader is the part of the Management Service that pushes metric data into the Management Repository at periodic intervals. If the Loader Backlog chart shows that the backlog is high and Loader output is low, there is data pending load, which may indicate a system bottleneck or the need for another Management Service. The chart shows the total backlog of files, summed over all Oracle Management Services, for the past 24 hours. Click the chart image in the console to display loader backlog charts for each individual Management Service over the past 24 hours.
Sometimes /ora becomes 100% full, and it becomes difficult to start the services using "opmnctl", which throws errors like:
ahc55(grid):/ora/product/oem/10203/oms10g/opmn/bin>opmnctl startall
opmnctl: starting opmn and all managed processes...
================================================================================
opmn id=ahc55:6200
5 of 6 processes started.
ias-instance id=EnterpriseManager0.ahc55
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ias-component/process-type/process-set:
HTTP_Server/HTTP_Server/HTTP_Server
Error
--> Process (pid=28195)
failed to start a managed process after the maximum retry limit
Log:
/ora/product/oem/10203/oms10g/opmn/logs/HTTP_Server~1
In this case we need to clear the Sysman and Apache logs to make room. If that fails, we need to restart the repository database.
We can run the "opmnctl status" command from /ora/product/oem/10203/oms10g/opmn/bin to check the status of the managed processes.
[opmnctl is the supported tool for starting and stopping all components in an Oracle instance, with the exception of the Fusion Middleware Control Console. opmnctl provides a centralized way to control and monitor Oracle Application Server components from the command line]
opmnctl status
It lists the managed processes and their current status:
ahc55():/ora/product/oem/10203/oms10g/opmn/bin>opmnctl status
Processes in Instance: EnterpriseManager0.ahc55
-------------------+--------------------+---------+---------
ias-component      | process-type       |     pid | status
-------------------+--------------------+---------+---------
DSA                | DSA                |     N/A | Down
HTTP_Server        | HTTP_Server        |   17007 | Alive
LogLoader          | logloaderd         |     N/A | Down
dcm-daemon         | dcm-daemon         |     N/A | Down
OC4J               | home               |   17008 | Alive
OC4J               | OC4J_EM            |   17010 | Alive
WebCache           | WebCache           |   17011 | Alive
WebCache           | WebCacheAdmin      |   17012 | Alive
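When the status listing is long, it can help to pull out only the components that are not running. The following is a minimal sketch, not part of opmnctl itself: `opmn_down` is a hypothetical helper name, and it assumes each process row has four columns (component, process type, pid, status), with or without `|` separators. Whether a Down component actually matters depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: read `opmnctl status` output on stdin and print
# "component/process-type" for every row whose status column is "Down".
# Any "|" column separators are turned into spaces before the fields are read.
opmn_down() {
    awk '{ gsub(/\|/, " ") } NF == 4 && $4 == "Down" { print $1 "/" $2 }'
}

# Example (run from the opmn/bin directory shown in the post):
#   ./opmnctl status | opmn_down
```

Reading from standard input keeps the helper independent of where opmnctl is installed, so it can also be run against a saved copy of the status output.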
Solving this issue
Normally we follow the steps below to solve this issue. If everything else fails, we restart the repository database.
Solution:
1) Clear the Apache/Sysman logs
2) Stop and start the OMS with opmnctl
3) File upload should start
4) If that fails, go to step 5
5) If everything fails, restart the repository database [pgrid]
Steps
1) Check the disk space
ahc55():/ora>df -k /ora
Filesystem kbytes used avail capacity Mounted on
/ora 18588650 18402767 0 100% /ora
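The disk space check can also be scripted so that it warns before /ora reaches 100%. This is a rough sketch under stated assumptions: `df_capacity` and `fs_nearly_full` are hypothetical names, the 95% default threshold is an arbitrary choice, and the parsing assumes `df -k` prints the filesystem on a single line with the capacity in the fifth column, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `df -k` output on stdin and print the capacity
# percentage (digits only) from the first filesystem line.
df_capacity() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Succeed when the filesystem holding path $1 is at or above the threshold $2.
# The default of 95 is an assumption, not a value from the post.
fs_nearly_full() {
    used=$(df -k "$1" | df_capacity)
    [ -n "$used" ] && [ "$used" -ge "${2:-95}" ]
}

# Example:
#   fs_nearly_full /ora 95 && echo "/ora is nearly full"
```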
2) Check the file upload status for backlog files at the loader console, or check it from the prompt by counting the files in the recv directory:
ahc55():/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
93455
ahc55():/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
93487
ahc55():/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
93530
ahc55():/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
93566
If the count is increasing, we need to follow the next steps.
[This number should decrease, not increase.]
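Instead of retyping the count by hand, the comparison can be done by a small script. The sketch below takes two samples of the file count and reports the trend. `count_files` and `backlog_trend` are hypothetical names, and the 60 second default interval is an assumption; pick an interval long enough for the Loader to make visible progress.

```shell
#!/bin/sh
# Count the entries in a directory (the same idea as `ls | wc -l`).
count_files() {
    ls "$1" | wc -l | tr -d ' '
}

# Sample the count twice, $2 seconds apart (default 60), and print the trend.
backlog_trend() {
    dir=$1
    interval=${2:-60}
    first=$(count_files "$dir")
    sleep "$interval"
    second=$(count_files "$dir")
    if [ "$second" -gt "$first" ]; then
        echo "increasing ($first -> $second)"
    elif [ "$second" -lt "$first" ]; then
        echo "decreasing ($first -> $second)"
    else
        echo "unchanged ($first)"
    fi
}

# Example:
#   backlog_trend /ora/product/oem/10203/oms10g/sysman/recv 60
```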
3) Stop and start the OMS
a) ahc55(grid):/ora/product/oem/10203/oms10g/opmn/bin>opmnctl stopall
b) ahc55(grid):/ora/product/oem/10203/oms10g/opmn/bin>opmnctl startall
4) Check whether the file count is decreasing. If not, continue with the next steps.
Clean /ora/product/oem/10203/oms10g/Apache/Apache/logs
NOTE: Except for fastcgi and httpd.pid, you can move everything to /tmp.
ahc55():/ora/product/oem/10203/oms10g/Apache/Apache/logs>ls -ltr
total 45502
drwx------ 3 oracle oinstall 512 Nov 27 15:35 fastcgi
-rw------- 1 oracle oinstall 1056768 Mar 22 11:35 mm.23113.mem
-rw------- 1 oracle oinstall 0 Mar 22 11:35 mm.23113.sem
-rw-r--r-- 1 oracle oinstall 0 Mar 22 11:35 ssl_request_log
-rw-r--r-- 1 oracle oinstall 6 Mar 22 11:35 httpd.pid
-rw------- 1 oracle oinstall 1056768 Mar 22 11:35 mod_oc4j.23113.shm.mem
-rw------- 1 oracle oinstall 0 Mar 22 11:35 mod_oc4j.23113.shm.sem
-rw------- 1 oracle oinstall 0 Mar 22 11:35 ssl_mutex.23113
-rw------- 1 oracle oinstall 0 Mar 22 11:35 ssl_scache.sem
-rw-r--r-- 1 oracle oinstall 257 Mar 22 11:35 ssl_engine_log
-rw------- 1 oracle oinstall 0 Mar 22 11:35 dms_metrics.23113.shm.sem
-rw------- 1 oracle oinstall 3072000 Mar 22 11:35 dms_metrics.23113.shm.mem
-rw------- 1 oracle oinstall 1572864 Mar 22 11:41 ssl_scache.mem
-rw-r--r-- 1 oracle oinstall 892604 Mar 22 12:01 error_log
-rw-r--r-- 1 oracle oinstall 11694 Mar 22 12:41 error_log.1269216000
-rw-r--r-- 1 oracle oinstall 454916 Mar 22 12:59 access_log.1269216000
-rw-r--r-- 1 oracle oinstall 5070917 Mar 22 19:06 access_log.1269259200
-rw-r--r-- 1 oracle oinstall 14718281 Mar 22 19:08 access_log
ahc55():/ora/product/oem/10203/oms10g/Apache/Apache/logs>
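Moving everything except a few named entries is easy to get wrong by hand, so here is a minimal sketch of that cleanup. `move_except` is a hypothetical helper, the destination directory name is an assumption, and the sketch assumes the OMS has been stopped first so that no running process still holds the files being moved.

```shell
#!/bin/sh
# Hypothetical helper: move every entry in directory $1 into directory $2,
# skipping any entry whose name matches one of the remaining arguments.
move_except() {
    src=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue      # empty directory: the glob matched nothing
        name=$(basename "$entry")
        skip=no
        for keep in "$@"; do
            [ "$name" = "$keep" ] && skip=yes
        done
        [ "$skip" = yes ] || mv "$entry" "$dest"/
    done
}

# Example (the backup directory name is hypothetical):
#   move_except /ora/product/oem/10203/oms10g/Apache/Apache/logs \
#       /tmp/apache_logs_backup fastcgi httpd.pid
```

The same function could be pointed at the sysman/log directory with pafLogs as the name to keep. Moving rather than deleting leaves a copy to inspect later, at the cost of filling /tmp instead.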
Next, go to sysman/log and clear the logs except pafLogs (so that we have enough space to restart the OMS).
/ora/product/oem/10203/oms10g/sysman/log
ahc55():/ora/product/oem/10203/oms10g/sysman/log>ls -ltr
total 9794
drwxr-xr-x 2 oracle oinstall 512 Jul 30 2008 pafLogs
-rw-r--r-- 1 oracle oinstall 2498438 Mar 22 19:03 emoms.log
-rw-r--r-- 1 oracle oinstall 2498438 Mar 22 19:03 emoms.trc
ahc55():/ora/product/oem/10203/oms10g/sysman/log>
Check whether uploading is happening:
ahc55(grid):/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
94045
ahc55(grid):/ora/product/oem/10203/oms10g/sysman/recv>ls | wc -l
93902
Well explained.
Thanks for sharing the real time experience.
Rajesh Yogi
Hi Vijay,
Your post is interesting. I have a similar issue:
I have a huge loader backlog. There are now some 67000 files in $OMS_HOME/sysman/recv and that number is increasing. I stopped the HTTP server in order to stop metrics coming in and give Oracle time to load those files, but that went very slowly. In about 1 hour I saw a decrease of 200 files. That is too slow, and I can't stop the HTTP server for a prolonged time.
I have OMS 10.2.0.1 and RDBMS 10.2.0.4 under Linux.
Any ideas how to decrease the backlog? I did create a TAR, but Oracle advised upgrading to OMS 10.2.0.5 or 11g. I think it's wise advice, but I can't upgrade right now.
regards,
Ivan