[原创]Weblogic数据库连接占满导致的OOM问题分析实例_Tomcat, WebLogic及J2EE讨论区_Weblogic技术|Tuxedo技术|中间件技术|Oracle论坛|JAVA论坛|Linux/Unix技术|hadoop论坛_联动北方技术论坛  
网站首页 | 关于我们 | 服务中心 | 经验交流 | 公司荣誉 | 成功案例 | 合作伙伴 | 联系我们 |
联动北方-国内领先的云技术服务提供商
»  游客             当前位置:  论坛首页 »  自由讨论区 »  Tomcat, WebLogic及J2EE讨论区 »
总帖数
1
每页帖数
101/1页1
返回列表
0
发起投票  发起投票 发新帖子
查看: 10319 | 回复: 0   主题: [原创]Weblogic数据库连接占满导致的OOM问题分析实例        下一篇 
funny
注册用户
等级:中校
经验:1529
发帖:111
精华:4
注册:2013-3-13
状态:离线
发送短消息息给funny 加好友    发送短消息息给funny 发消息
发表于: IP:您无权察看 2014-3-6 13:14:18 | [全部帖] [楼主帖] 楼主

1. 问题描述

XXX系统11月5日下午weblogic server出现内存溢出错误,导致server宕掉,重启服务问题解决。

2. 问题分析

查看出现问题时的server日志,发现内存溢出错误,同时在域目录下生成了heapdump文件:

####<2013-11-5 下午041610秒 CST> <Error> <Socket> <NAMLap1> <appserver> <ExecuteThread: '3' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <> <1383639370694> <BEA-000405> <Uncaught Throwable in processSockets
java.lang.OutOfMemoryError: GC overhead limit exceeded.
java.lang.OutOfMemoryError: GC overhead limit exceeded
at weblogic.utils.collections.WeakConcurrentHashMap.put(WeakConcurrentHashMap.java:283)
at weblogic.server.channels.ServerChannelRuntimeMBeanImpl.addServerConnectionRuntime(ServerChannelRuntimeMBeanImpl.java:45)
at weblogic.socket.AbstractMuxableSocket.registerForRuntimeMonitoring(AbstractMuxableSocket.java:89)
at weblogic.socket.AbstractMuxableSocket.<init>(AbstractMuxableSocket.java:107)
at weblogic.servlet.internal.MuxableSocketHTTP.<init>(MuxableSocketHTTP.java:86)
at weblogic.servlet.internal.ProtocolHandlerHTTP.createSocket(ProtocolHandlerHTTP.java:65)
at weblogic.socket.MuxableSocketDiscriminator.dispatch(MuxableSocketDiscriminator.java:131)
at weblogic.socket.SocketMuxer.readReadySocketOnce(SocketMuxer.java:965)
at weblogic.socket.SocketMuxer.readReadySocket(SocketMuxer.java:897)
at weblogic.socket.PosixSocketMuxer.processSockets(PosixSocketMuxer.java:130)
at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:29)
at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:42)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:145)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117)
>


查看内存溢出前的日志发现数据库连接已经达到设置的最大值90:

####<2013-11-5 下午035653秒 CST> <Info> <Common> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '175' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1383638213230> <BEA-000627> <Reached maximum capacity of pool "amliq", making "0" new resource instances instead of "1".>
####<2013-11-5 下午040039秒 CST> <Info> <Common> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1383638439453> <BEA-000627> <Reached maximum capacity of pool "amliq", making "0" new resource instances instead of "1".>


还有频繁的数据库连接创建和关闭的信息:

####<2013-11-5 上午111946秒 CST> <Info> <Common> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '172' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1383621586448> <BEA-000628> <Created "1" resources for pool "amliq", out of which "1" are available and "0" are unavailable.> 
####<2013-11-5 上午111946秒 CST> <Info> <Common> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '178' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1383621586509> <BEA-000628> <Created "1" resources for pool "amliq", out of which "1" are available and "0" are unavailable.> 
####<2013-11-5 上午112023秒 CST> <Info> <JDBC> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '139' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1383621623543> <BEA-001128> <Connection for pool "amlase" has been closed.> 
####<2013-11-5 上午112023秒 CST> <Info> <JDBC> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '139' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1383621623546> <BEA-001128> <Connection for pool "amlase" has been closed.> 
####<2013-11-5 上午112023秒 CST> <Info> <JDBC> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '139' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1383621623547> <BEA-001128> <Connection for pool "amlase" has been closed.>


当前数据库连接的初始值30、最大值90、步增长1,上面的信息表示数据库连接在频繁的创建,而关闭表示连接空闲又被weblogic回收,频繁的创建和关闭数据库连接务必对系统性能造成一定的影响,尤其是并发量大的时候,建议将初始值设置大一些如80,由于当前已经达到设置的最大值90,如果最终排除应用程序问题,是由于客观的用户数增多,并发加大导致的连接不够用,也建议将最大连接数设置的再大一些。另外,步增长强烈建议设置的大一些,建议设置为10.

问题出现时还有大量的线程执行时间超时的告警,从下午1点38分开始:

####<2013-11-5 下午013825秒 CST> <Error> <WebLogicServer> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '195' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1383629905915> <BEA-000337> <[STUCK] ExecuteThread: '12' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "613" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 613019 ms
[
POST /information/client/t47_party_list.do?newsearchflag=1 HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, */*
Referer: http://10.200.181.99:7001/information/client/t47_party_list.do?newenterflag=1&ram=oi3gxVu47i
Accept-Language: zh-cn
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET4.0C; .NET4.0E)
Content-Length: 140
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: JSESSIONID=JnTnS4BYp9CyvvblTYvfcSLlWGR1JTd4MyRc9L51ZymjN1HFgGsD!-222941173
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
com.sybase.jdbc3.timedio.RawDbio.reallyRead(Unknown Source)
com.sybase.jdbc3.timedio.Dbio.doRead(Unknown Source)
com.sybase.jdbc3.timedio.InStreamMgr.a(Unknown Source)
com.sybase.jdbc3.timedio.InStreamMgr.doRead(Unknown Source)
com.sybase.jdbc3.tds.TdsProtocolContext.getChunk(Unknown Source)
com.sybase.jdbc3.tds.PduInputFormatter.a(Unknown Source)
com.sybase.jdbc3.tds.PduInputFormatter.read(Unknown Source)
com.sybase.jdbc3.tds.TdsInputStream.read(Unknown Source)
com.sybase.jdbc3.tds.TdsInputStream.read(Unknown Source)
com.sybase.jdbc3.tds.TdsInputStream.readUnsignedByte(Unknown Source)
com.sybase.jdbc3.tds.TdsDataObject.getSize(Unknown Source)
com.sybase.jdbc3.tds.TdsJdbcInputStream.if(Unknown Source)
com.sybase.jdbc3.tds.TdsJdbcInputStream.clear(Unknown Source)
com.sybase.jdbc3.tds.TdsResultSet.next(Unknown Source)
com.sybase.jdbc3.tds.TdsResultSet.close(Unknown Source)
com.sybase.jdbc3.jdbc.SybResultSet.markDead(Unknown Source)
com.sybase.jdbc3.jdbc.SybResultSet.close(Unknown Source)
com.sybase.jdbc3.jdbc.SybResultSet.close(Unknown Source)
weblogic.jdbc.wrapper.ResultSet.internalClose(ResultSet.java:181)
weblogic.jdbc.wrapper.Statement.closeAllResultSets(Statement.java:282)
weblogic.jdbc.wrapper.Statement.getMoreResults(Statement.java:570)
com.ibatis.sqlmap.engine.execution.SqlExecutor.executeQuery(SqlExecutor.java:189)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.sqlExecuteQuery(GeneralStatement.java:205)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.executeQueryWithCallback(GeneralStatement.java:173)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.executeQueryForList(GeneralStatement.java:123)
com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:610)
com.ibatis.sqlmap.engine.impl.SqlMapSessionImpl.queryForList(SqlMapSessionImpl.java:105)
com.ibatis.sqlmap.engine.impl.SqlMapClientImpl.queryForList(SqlMapClientImpl.java:82)
com.ist.aml.inves.dao.T47_partyDAO.getT47_customerList(T47_partyDAO.java:72)
com.ist.aml.information.controller.T47_partyAction.performGetT47_partyList(T47_partyAction.java:288)
com.ist.aml.information.controller.T47_partyAction.execute(T47_partyAction.java:94)
org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:419)
org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:224)
org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196)
org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:432)
javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:301)
weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
com.ist.common.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:189)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3739)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3705)
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2282)
weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2181)
weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1491)
weblogic.work.ExecuteThread.execute(ExecuteThread.java:256)
weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
>


执行请求POST /information/client/t47_party_list.do?newsearchflag=1 HTTP/1.1超时,单个请求执行时间超过10分钟。从堆栈信息中可以看出当前请求正在执行数据库相关操作,此时这个线程将一直拿着数据库连接不释放。同样执行时间超时的请求还包括:

####<2013-11-5 下午040017秒 CST> <Error> <WebLogicServer> <NAMLap1> <appserver> <[ACTIVE] ExecuteThread: '161' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1383638417575> <BEA-000337> <[STUCK] ExecuteThread: '35' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "665" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 665312 ms
[
GET /risk/v47_party_rg/party_summary_excel.do HTTP/1.1
Reverse-Via: SPDBBIZISA01
Cookie: LtpaToken2=5ZMtVqv7ErKBoeziEmLxqOCfOWFWH4lPD2NJntpS+zcqGEJDxMzNdCsPvEv0rSWWxFSoJqgmzOm77kTzkonlZoto0sQIjoZL7kC6iXsQhVJR3FDQqn2uUt7tcVLf6+dk9SMqaZsyU5gchdUa0tnm+RulQT/AUlfCElQlhJsVtjeMghe3Fx3hHLho5UEPxIznBk8+Coj3b7MfBQc612ZXzv6uwwRvyOMnrC0Epurw+mpi1lknKqjOhMACPOqEnJLR2ARijxtzpkXJywwa4nRXkhPy9EpJR69zNnRIEmwxdMHSCEhDafi7OgtzAV/aWBJCimMDjZSLk+S9+ZPeezXMbf7zkOuknuAkCunjYHEWKmG3o/SfSyytTjcz3xPFH6kg; LtpaToken=QBX0uNzb5EHt7K/LsOzSgiushA/fZN3Xkks5iCYsrmkzFO/nXAS1WSmgGZWuNLR9cSWKDPmmnfOWIL1ES0xWt2eP1IxzzwwI8/Qzz4TmDEb7GVExgq5bHTSs4V8IPD08eJsRw5TozdznmzC7/BqoScslLFLeFY+i5cx4rlLEMSFyjdZHlndjUI0GNyaDDY+LdCa4KPayQZOjt7D8+ooGMzlj9St3eZczcsx9PFhwolHm5Uk8eTIhfmSAhwCocZSQUSfFKi6iCzai0vKVV3wwX4TxSQNlg8+AC61JJOIqApeIev7gHjHFP1mcpzld0iTU68uC3K85atgWGp3C29Ucm71OgaotUQKg; JSESSIONID_portal=00006-45IVv3ae0h7oMPlnU1M6g:17dnv9td3:17dphk17u; JSESSIONID=Szb7S4NCbMPPM1TW8LY36yGg2MrPwkRQTRcQ6lj2JHKNFbGCph8W!-222941173
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)
Accept: */*
UA-CPU: x86
Connection: Keep-Alive
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
com.ibatis.sqlmap.engine.mapping.result.AutoResultMap.getResults(AutoResultMap.java:52)
com.ibatis.sqlmap.engine.execution.SqlExecutor.handleResults(SqlExecutor.java:394)
com.ibatis.sqlmap.engine.execution.SqlExecutor.executeQuery(SqlExecutor.java:185)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.sqlExecuteQuery(GeneralStatement.java:205)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.executeQueryWithCallback(GeneralStatement.java:173)
com.ibatis.sqlmap.engine.mapping.statement.GeneralStatement.executeQueryForList(GeneralStatement.java:123)
com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:610)
com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.queryForList(SqlMapExecutorDelegate.java:584)
com.ibatis.sqlmap.engine.impl.SqlMapSessionImpl.queryForList(SqlMapSessionImpl.java:101)
com.ibatis.sqlmap.engine.impl.SqlMapClientImpl.queryForList(SqlMapClientImpl.java:78)
com.ist.aml.risk.dao.T47_corporation_rgDAO.getV47_party_summaryList(T47_corporation_rgDAO.java:750)
com.ist.aml.risk.controller.T47_corporation_rgAction.performGetV47_party_summaryExcel(T47_corporation_rgAction.java:1525)
com.ist.aml.risk.controller.T47_corporation_rgAction.execute(T47_corporation_rgAction.java:91)
org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:419)
org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:224)
org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196)
org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414)
javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:301)
weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
com.ist.common.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:189)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:60)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3739)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3705)
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2282)
weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2181)
weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1491)
weblogic.work.ExecuteThread.execute(ExecuteThread.java:256)
weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
>


这个超时的线程在执行GET /risk/v47_party_rg/party_summary_excel.do HTTP/1.1请求,同样的堆栈信息显示在执行数据库相关操作。

    由于生成的heapdump文件太大一般个人电脑不能打开,需要在64位系统上安装64JDK,同时需要16G左右的空闲物理内存才能打开并分析该文件。目前,只能先让应用人员根据打印出的堆栈找到程序相关代码先做判断,另外在数据库层面,可以通过查看AWR报告,找到执行时间长的sql语句协助应用人员判断执行数据库操作的请求执行时间超时,然后导致数据库连接被占满,最终导致内存溢出的原因。




赞(0)    操作        顶端 
总帖数
1
每页帖数
101/1页1
返回列表
发新帖子
请输入验证码: 点击刷新验证码
您需要登录后才可以回帖 登录 | 注册
技术讨论