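● Overview
On 15 January 2013, the YYY middleware on XXX showed memory usage above 90%, and at times responded slowly. The analysis below is based on the situation at that time.
● Symptoms
At 10:00 on 15 January 2013, the zcyy system raised an alert for memory usage above 90%. We then logged in to the host ... topas ... to check ...
● Files available for analysis
1. Several zcyy_server3.out logs (the node's nohup logs from before and after the problem)
2. No dump was collected at the time of the problem; routine dump files (WLS thread snapshots) were examined instead
● Analysis
1. The server logs contain a number of application exceptions. These exceptions lower code execution efficiency and slow down client responses, so we recommend fixing them. The details follow: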
<2013-1-14 下午09时48分41秒 CST> <Warning> <HTTP> <BEA-101196> <[web]: Error while parsing the Tag Library Descriptor at "/bea/wls103/user_projects/domains/zcyy_domain/servers/zcyy_server3/stage/EAR/EAR/web/WEB-INF/tld/wftrace-common.tld".
com.ctc.wstx.exc.WstxIOException: Tried all: '1' addresses, but could not connect over HTTP to server: 'java.sun.com', port: '80'
at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:683)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
at weblogic.servlet.internal.TldCacheHelper$TldIOHelper.parseXML(TldCacheHelper.java:134)
at weblogic.descriptor.DescriptorCache.parseXML(DescriptorCache.java:380)
at weblogic.servlet.internal.TldCacheHelper.parseTagLibraries(TldCacheHelper.java:65)
Truncated. see log file for complete stacktrace
java.net.ConnectException: Tried all: '1' addresses, but could not connect over HTTP to server: 'java.sun.com', port: '80'
at weblogic.net.http.HttpClient.openServer(HttpClient.java:312)
at weblogic.net.http.HttpClient.openServer(HttpClient.java:388)
at weblogic.net.http.HttpClient.New(HttpClient.java:238)
at weblogic.net.http.HttpURLConnection.connect(HttpURLConnection.java:172)
at weblogic.net.http.HttpURLConnection.getInputStream(HttpURLConnection.java:356)
Truncated. see log file for complete stacktrace
>
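This warning occurred 46 times during startup. The direct cause is that TLD files in the web project reference a URL on the Java website (http://java.sun.com:80), which the server cannot reach.
Causes and fixes: 1. The updated web project differs from the version under the stage directory. Delete the stage directory and redeploy, or switch to nostage mode.
2. Remove the references to the Java website from the web project.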
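To carry out fix 2, it helps to know which TLD files mention the unreachable host. The following is a minimal sketch, not the project's tooling: the class name `TldReferenceScanner` and its method are hypothetical, and a hit only marks a file for manual review, since a namespace URI that mentions java.sun.com is not necessarily fetched over the network.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists .tld files whose text mentions a given host. */
final class TldReferenceScanner {

    static List<Path> findTldsReferencing(Path root, String host) throws IOException {
        List<Path> tlds;
        // Files.walk holds directory handles, so close it with try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            tlds = paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".tld"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path tld : tlds) {
            // ISO-8859-1 maps every byte to a char, so decoding never fails.
            String text = new String(Files.readAllBytes(tld), StandardCharsets.ISO_8859_1);
            if (text.contains(host)) {
                hits.add(tld);
            }
        }
        return hits;
    }
}
```

Calling `TldReferenceScanner.findTldsReferencing(webInfDir, "java.sun.com")` on the application's WEB-INF directory would list the candidate files.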
<2013-1-14 下午07时01分34秒 CST> <Warning> <EJB> <BEA-012035> <The Remote interface method: 'public abstract java.util.Map com.comtop.lcam.pbms.service.finance.appservice.IPbmsBudgetService.queryViableBalance(java.util.Map) throws java.lang.Exception' in EJB 'PbmsBudgetAppService' contains a parameter of type: 'java.util.Map' which is not Serializable. Though the EJB 'PbmsBudgetAppService' has call-by-reference set to false, this parameter is not Serializable and hence will be passed by reference. A parameter can be passed using call-by-value only if the parameter type is Serializable.>
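This warning occurred 1040 times during startup. The declared parameter type (java.util.Map) is not Serializable, so the container passes the argument by reference even though call-by-reference is set to false for the EJB. The direct fix is to declare Serializable parameter types for the methods named in these warnings.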
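As an illustration of that fix, the sketch below declares a Serializable type (`HashMap`) instead of `java.util.Map` and shows what a by-value copy looks like. The interface name `BudgetQuery` and the class `CallByValueDemo` are hypothetical, and the serialization round trip is only a stand-in for by-value semantics, not a description of how WebLogic copies arguments internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

/** Hypothetical remote-style interface: the declared types are Serializable. */
interface BudgetQuery {
    HashMap<String, Object> queryViableBalance(HashMap<String, Object> criteria) throws Exception;
}

final class CallByValueDemo {

    /** Copies an argument by serializing and deserializing it. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyByValue(T argument)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(argument);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }
}
```

Changing a remote interface signature affects every caller, so the change would need to be coordinated with the client code.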
[ERROR]
2013-01-15 02:31:39,760
StackTrace : com.comtop.component.desktopremind.DesktopReminderConstants.getInstanceClass(DesktopReminderConstants.java:128)
***********************
提醒数量实现类实例化出现异常,类路径:com.comtop.lcam.ipms.plan.component.todolist.adapter.IpmsCommonTodoAdapter
***********************
java.lang.ClassNotFoundException: com.comtop.lcam.ipms.plan.component.todolist.adapter.IpmsCommonTodoAdapter
at weblogic.utils.classloaders.GenericClassLoader.findLocalClass(GenericClassLoader.java:283)
at weblogic.utils.classloaders.GenericClassLoader.findClass(GenericClassLoader.java:256)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:176)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at com.comtop.component.desktopremind.DesktopReminderConstants.getInstanceClass(DesktopReminderConstants.java:124)
at com.comtop.workflow.taskmessage.action.TaskMessageShowAction.modifityTaskCount(TaskMessageShowAction.java:542)
at com.comtop.workflow.taskmessage.action.TaskMessageShowAction.execute(TaskMessageShowAction.java:193)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:484)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:274)
This class instantiation error (ClassNotFoundException) occurred 661 times at runtime.
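The real fix is to deploy the missing adapter class or remove its configuration entry. Until then, the cost of the repeated failures can be reduced by remembering that a lookup has already failed. The sketch below shows one way to do that; `ClassLookupCache` is a hypothetical name and is not part of the application. Caching negative results has a drawback: a class deployed later is not picked up until the cache is cleared.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical cache that remembers both successful and failed class lookups. */
final class ClassLookupCache {
    private final ConcurrentMap<String, Optional<Class<?>>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loadAttempts = new AtomicInteger();

    /** Returns the class if it can be loaded; a failed lookup is tried only once. */
    Optional<Class<?>> resolve(String className) {
        return cache.computeIfAbsent(className, this::load);
    }

    /** Number of times Class.forName was actually invoked. */
    int loadAttempts() {
        return loadAttempts.get();
    }

    private Optional<Class<?>> load(String name) {
        loadAttempts.incrementAndGet();
        try {
            return Optional.<Class<?>>of(Class.forName(name));
        } catch (ClassNotFoundException e) {
            return Optional.empty(); // remembered, so the next call skips class loading
        }
    }
}
```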
2013-01-15 09:16:36,610
StackTrace : com.comtop.lcam.material.purchase.contract.action.PurchaseContractMultiOperateAction.judgeBudgetSwitch(PurchaseContractMultiOperateAction.java:1822)
***********************
调用大预算系统接口时出现异常
***********************
java.lang.NullPointerException
at com.comtop.lcam.material.purchase.contract.action.PurchaseContractMultiOperateAction.getAvailBudget(PurchaseContractMultiOperateAction.java:1886)
at com.comtop.lcam.material.purchase.contract.action.PurchaseContractMultiOperateAction.judgeBudgetSwitch(PurchaseContractMultiOperateAction.java:1808)
at com.comtop.lcam.material.purchase.contract.action.PurchaseContractMultiOperateAction.execute(PurchaseContractMultiOperateAction.java:129)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:484)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:274)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1482)
at com.comtop.struts.action.ComtopActionServlet.process(ComtopActionServlet.java:84)
at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:525)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
This NullPointerException occurred 71 times at runtime, and as many as 6367 times at peak.
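The log does not show which reference was null in getAvailBudget. Assuming the budget interface returned either nothing or a result without the expected entry, a defensive read could look like the sketch below. `BudgetLookup`, its method, and the key name passed by the caller are all hypothetical.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.Optional;

/** Hypothetical null-safe reader for a result map returned by a remote call. */
final class BudgetLookup {

    /** Returns the numeric value under key, or empty if the response or the entry is missing. */
    static Optional<BigDecimal> availableBudget(Map<String, ?> response, String key) {
        if (response == null) {
            return Optional.empty(); // remote call returned nothing
        }
        Object value = response.get(key);
        if (value instanceof BigDecimal) {
            return Optional.of((BigDecimal) value);
        }
        if (value instanceof Number) {
            try {
                return Optional.of(new BigDecimal(value.toString()));
            } catch (NumberFormatException e) {
                return Optional.empty(); // e.g. NaN or Infinity
            }
        }
        return Optional.empty(); // missing entry or unexpected type
    }
}
```

The caller then decides explicitly what an empty result means, rather than failing with an NPE deep inside the action.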
<2012-12-6 下午04时37分57秒 CST> <Error> <Cluster> <BEA-000126> <All session objects should be serializable to replicate. Check the objects in your session. Failed to replicate non-serializable object.>
<2012-12-6 下午04时37分57秒 CST> <Error> <Cluster> <BEA-000126> <All session objects should be serializable to replicate. Check the objects in your session. Failed to replicate non-serializable object.>
[ERROR]
2012-12-06 16:37:57,070
StackTrace : com.comtop.component.desktopremind.DesktopReminderConstants.getInstanceClass(DesktopReminderConstants.java:128)
***********************
提醒数量实现类实例化出现异常,类路径:com.comtop.lcam.ipms.plan.component.todolist.adapter.IpmsCommonTodoAdapter
***********************
java.lang.ClassNotFoundException: com.comtop.lcam.ipms.plan.component.todolist.adapter.IpmsCommonTodoAdapter
at weblogic.utils.classloaders.GenericClassLoader.findLocalClass(GenericClassLoader.java:283)
at weblogic.utils.classloaders.GenericClassLoader.findClass(GenericClassLoader.java:256)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:176)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at com.comtop.component.desktopremind.DesktopReminderConstants.getInstanceClass(DesktopReminderConstants.java:124)
at com.comtop.workflow.taskmessage.action.TaskMessageShowAction.modifityTaskCount(TaskMessageShowAction.java:542)
at com.comtop.workflow.taskmessage.action.TaskMessageShowAction.execute(TaskMessageShowAction.java:193)
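The <BEA-000126> non-serializable object error occurred 145552 times at runtime. The HTTP session contains objects that are not serializable, so replicating the session across the cluster fails.
For example, locate com.comtop.component.desktopremind.DesktopReminderConstants and fix the corresponding code.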
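One way to find the offending session attributes is to try serializing each of them and record the ones that fail. The sketch below works on a plain map so that it runs without the servlet API; in the application, the map would be filled from the session's attribute names and values. `SessionAttributeCheck` is a hypothetical name.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical checker: reports attributes that cannot be Java-serialized. */
final class SessionAttributeCheck {

    static List<String> findNonSerializable(Map<String, ?> attributes) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, ?> entry : attributes.entrySet()) {
            if (!canSerialize(entry.getValue())) {
                offenders.add(entry.getKey());
            }
        }
        return offenders;
    }

    private static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new DiscardingStream())) {
            out.writeObject(value); // walks the whole object graph
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    /** Sink that throws the bytes away; only success or failure matters here. */
    private static final class DiscardingStream extends OutputStream {
        @Override
        public void write(int b) {
            // intentionally empty
        }
    }
}
```

Because the check serializes the full object graph, it also catches a Serializable container that holds a non-serializable element.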
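2. Analysis of routine dump files
Because time was short, no dump was collected when the problem occurred. A set of routine dump files was collected afterwards to examine thread usage.
As many as 31% + 46% = 77% of the threads are in a waiting state, which suggests that the zcyy system spends much of its time requesting and processing objects.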
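Percentages like these can be recomputed from the text of a thread dump. The sketch below counts the `java.lang.Thread.State:` lines that HotSpot thread dumps contain. The post does not say which states the 31% and 46% figures correspond to, so treating them as WAITING and TIMED_WAITING would be an assumption. `ThreadStateSummary` is a hypothetical name.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical summary of thread states found in a thread dump. */
final class ThreadStateSummary {
    private static final Pattern STATE =
        Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    /** Counts threads per state name (RUNNABLE, WAITING, ...). */
    static Map<String, Integer> countStates(String dumpText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = STATE.matcher(dumpText);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    /** Share of threads, in percent, that are in any of the given states. */
    static double percent(Map<String, Integer> counts, String... states) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        int selected = 0;
        for (String s : states) {
            selected += counts.getOrDefault(s, 0);
        }
        return total == 0 ? 0.0 : 100.0 * selected / total;
    }
}
```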
In addition, the heap usage at that time shows that the perm size in the JVM settings is already large enough.
JVM: -Xms4096m -Xmx4096m -XX:PermSize=1024m -XX:MaxPermSize=1024m
Observed usage: the object space of both the old generation and the permanent generation is 87% used.
Heap
PSYoungGen total 1263936K, used 894741K [0x00002aab98dc0000, 0x00002aabee310000, 0x00002aabee310000)
eden space 1135552K, 75% used [0x00002aab98dc0000,0x00002aabcd779dd0,0x00002aabde2b0000)
from space 128384K, 25% used [0x00002aabe65b0000,0x00002aabe85bb9f0,0x00002aabee310000)
to space 131264K, 0% used [0x00002aabde2b0000,0x00002aabde2b0000,0x00002aabe62e0000)
PSOldGen total 2796224K, used 2447050K [0x00002aaaee310000, 0x00002aab98dc0000, 0x00002aab98dc0000)
object space 2796224K, 87% used [0x00002aaaee310000,0x00002aab838c2bc0,0x00002aab98dc0000)
PSPermGen total 1048576K, used 917221K [0x00002aaaae310000, 0x00002aaaee310000, 0x00002aaaee310000)
object space 1048576K, 87% used [0x00002aaaae310000,0x00002aaae62c9630,0x00002aaaee310000)
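The percentages can be checked against the raw `total` and `used` figures in the output above. The sketch below extracts them per generation and computes used/total; for PSOldGen and PSPermGen it should reproduce the reported 87%. `HeapSummaryParser` is a hypothetical name, and the pattern only targets the ParallelGC summary lines shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for "NAME total NNNK, used NNNK" heap summary lines. */
final class HeapSummaryParser {
    private static final Pattern GENERATION =
        Pattern.compile("(\\w+)\\s+total\\s+(\\d+)K,\\s+used\\s+(\\d+)K");

    /** Maps each generation name to its used percentage, in order of appearance. */
    static Map<String, Double> usedPercent(String heapText) {
        Map<String, Double> result = new LinkedHashMap<>();
        Matcher m = GENERATION.matcher(heapText);
        while (m.find()) {
            long total = Long.parseLong(m.group(2));
            long used = Long.parseLong(m.group(3));
            if (total > 0) {
                result.put(m.group(1), 100.0 * used / total);
            }
        }
        return result;
    }
}
```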
The per-thread details show that most threads are in an object wait state.
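● Optimization recommendations
Based on the analysis of the nohup logs and dump files above, we make the following recommendations for the observed problem:
1. Fix the application errors found in the logs to improve code execution efficiency.
2. The JVM settings indicate that the zcyy system uses a large number of objects and also a considerable amount of native code. For tuning purposes, we recommend monitoring GC behavior and the heap usage curve to decide whether the native memory size should be adjusted, and taking a full heap dump to find large objects and decide whether the code needs refining.
3. To improve processing efficiency and reduce the bug rate: the WLS version in use is 10.3.0.0, and we recommend upgrading to the stable release 10.3.6.0; the JDK in use is SUN JDK 1.6.0_20, and we recommend upgrading to the latest release, JDK 1.6.0_38.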
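For recommendation 2, one lightweight way to watch memory usage from inside the JVM is the standard `java.lang.management` API. The sketch below reports the used percentage of every memory pool that defines a maximum; sampling it periodically would give a rough usage curve. `MemoryPoolReport` is a hypothetical name, and pool names differ between JVM versions and garbage collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reporter: used percentage per JVM memory pool. */
final class MemoryPoolReport {

    static Map<String, Double> usedPercentByPool() {
        Map<String, Double> report = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            // getUsage() is null for an invalid pool; getMax() is -1 when undefined.
            if (usage == null || usage.getMax() <= 0) {
                continue;
            }
            report.put(pool.getName(), 100.0 * usage.getUsed() / usage.getMax());
        }
        return report;
    }
}
```

This complements GC logs and a full heap dump rather than replacing them.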
This post was edited by funny on 2014-2-27 15:34:04