Data Collection for JVM issues

Some of the scripts will require you to modify the hard-coded values for your particular environment. Please review and test all scripts before implementing.

1. Please confirm the architecture of the problem environment: how many servers there are, and which Apps node runs on which server. Also confirm whether there are any firewalls between the Web Server and the DB server.

2. Additional information needs to be collated, both as a one-off exercise (to reconfirm the latest versions of the configuration files) and as additional ongoing monitoring.

i) Enable response time in access_log file

To enable the response time in the access_log file, add the following entry to httpd.conf. This format is required when using "rotatelogs", which is enabled by default.

LogFormat "%h %l %u %t \"%r\" %>s %b %T"

ii) Implement additional scripts to collect data

Run mzRunCollectData.sh before and during a problem period on the Apps Web Server tier. Start running this script beforehand, during normal running, and ensure the script completes successfully and output is being generated. Then leave it running until the problem recurs. After Apache is bounced, stop the script collection and upload the results.

In addition to this, whilst the problem is occurring, run mzJVM_ThreadDump.sh for the Java processes that are having problems.

NOTE - check you are not using "-Xrs" in your JVM parameters before doing this, otherwise running this script will kill the JVM process.

iii) Collate the latest versions of all configuration files

All files located in the $IAS_HOME/Apache/Apache/conf directory
All files located in the $IAS_HOME/Apache/Jserv/etc directory
$IAS_HOME/Apache/Apache/bin/java.sh
The following files from the $IAS_HOME/Apache/modplsql/cfg directory:
    plsql.cfg
    wdbsvr.app
$FND_TOP/secure/<SID>_<Hostname>/<SID>.dbc

iv) Enable additional Java GC logging

Check your manufacturer's web site to confirm the available options for the version of Java being used.
Enable the most detailed level of Garbage Collection monitoring that is available, with timestamps if available.
For example, on Solaris JDK 1.3.1, adding "-verbose:gc" to the command line twice will give additional information. Most J2SE 1.4 JVMs will allow "-XX:+PrintGCDetails" to add the time of GC.

Please ensure you test this change on a TEST environment first, to confirm it has the expected effect with your current version of the JDK.

v) For each incident, please collect a new set of the log files listed below as an "information pack" of data to analyse

The following files from the $IAS_ORACLE_HOME/Apache/Apache/logs directory:

a) access_log.<ID Number>
By default rotatelogs is enabled, so there is a separate log file for every 24 hours. You need to identify which log file relates to the problem period and provide the relevant file for this period only. For example, grep for the date of the incident in these files.

b) error_log
** Once you have uploaded this file, please archive or remove it, so we get a new log file for the next incident.

The following files from the $IAS_ORACLE_HOME/Apache/Jserv/logs directory:
** For all the files in this section, once you have uploaded them, please archive or remove them, so we get a new log file for the next incident.

c) mod_jserv.log

d) jserv.log
If jserv.log does not exist or is not currently being written to, then JServ logging needs to be enabled by setting the AutoConfig variable "s_oacorelog" to a value of "true" and re-running AutoConfig.

e) All files from the $IAS_ORACLE_HOME/Apache/Jserv/logs/jvm directory
These will have filenames of the form <JVM GroupName>.<JVM Instance>.stdout|stderr
For example, OACoreGroup.0.stderr

f) Alert.log file from the RDBMS server

g) Hourly reports from the Statspack output for the 2 hours before the incident, plus the hour up to and including the time of the incident

h) The output files from the mz scripts (implemented in step (ii) above)
DMS*.txt
OS*.txt
mzRUNSQL*.txt
mzPID*.txt

i) The mzJVM_ThreadDump.txt output file, if you ran the mzJVM_ThreadDump.sh script

j) Output file from "truss" if the JVM process is spinning (taking 100% CPU)
This output is only needed for about 1 minute of elapsed time.
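
Once the response time has been enabled as in step 2 (i), the access_log file identified in step 2 (v) (a) can be scanned for slow requests around the time of the incident. The following is only a rough sketch, assuming every line was written with exactly the LogFormat shown in step 2 (i), where the trailing %T field is the time taken in whole seconds. The function name slow_requests, the 10 second threshold and the sample log lines are hypothetical and are used here purely for illustration.

```python
import re

# Matches lines written with: LogFormat "%h %l %u %t \"%r\" %>s %b %T"
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) (?P<seconds>\d+)$'
)


def slow_requests(lines, date_prefix, min_seconds):
    """Return (timestamp, request, seconds) for requests on the given date
    (e.g. "12/Mar/2004") that took at least min_seconds to serve."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that are not in the expected format
        if not m.group("time").startswith(date_prefix):
            continue  # skip requests from other days
        seconds = int(m.group("seconds"))
        if seconds >= min_seconds:
            hits.append((m.group("time"), m.group("request"), seconds))
    return hits


# Hypothetical sample lines, invented for illustration only
SAMPLE = [
    '10.0.0.5 - - [12/Mar/2004:10:15:01 +0000] "GET /OA_HTML/a.jsp HTTP/1.1" 200 5120 1',
    '10.0.0.6 - - [12/Mar/2004:10:15:09 +0000] "GET /OA_HTML/b.jsp HTTP/1.1" 200 - 47',
    '10.0.0.7 - - [13/Mar/2004:09:00:00 +0000] "GET /OA_HTML/c.jsp HTTP/1.1" 500 312 90',
]

if __name__ == "__main__":
    for when, request, seconds in slow_requests(SAMPLE, "12/Mar/2004", 10):
        print(when, request, seconds)
```

To use this against a real log, pass an open file object for the relevant access_log.<ID Number> file in place of SAMPLE, and adjust the date and threshold to match the incident.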