1.概述
两个问题:一个是部署的19个应用包中,PerformanceManagement包启动后的状态一直不对没有达到活动状态;另外登录界面输入用户名密码后点击登录,页面没有反应。
2.问题分析
针对PerformanceManagement包启动不成功问题,重新部署并监控日志后发现,启动过程中应用僵死,一直在等待初始化:
log4j:WARN Please initialize the log4j system properly.
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /Server.log (Permission denied)
at java.io.FileOutputStream.openAppend(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:192)
at java.io.FileOutputStream.<init>(FileOutputStream.java:116)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:272)
at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:156)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:151)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:247)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:123)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:87)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:645)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:603)
at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:524)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:408)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:432)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:460)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:145)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at org.apache.log4j.Logger.getLogger(Logger.java:85)
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:44)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:516)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)
at com.sun.faces.config.ConfigureListener.<clinit>(ConfigureListener.java:212)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at java.lang.J9VMInternals.newInstanceImpl(Native Method)
at java.lang.Class.newInstance(Class.java:1345)
at weblogic.servlet.internal.WebComponentContributor.getNewInstance(WebComponentContributor.java:225)
at weblogic.servlet.internal.WebComponentContributor.createListenerInstance(WebComponentContributor.java:265)
at weblogic.servlet.internal.EventsManager.registerEventListener(EventsManager.java:372)
at weblogic.servlet.internal.EventsManager.registerEventListeners(EventsManager.java:110)
at weblogic.servlet.internal.EventsManager.registerEventListeners(EventsManager.java:75)
at weblogic.servlet.internal.WebAppServletContext.preloadResources(WebAppServletContext.java:1861)
at weblogic.servlet.internal.WebAppServletContext.start(WebAppServletContext.java:3126)
at weblogic.servlet.internal.WebAppModule.startContexts(WebAppModule.java:1512)
at weblogic.servlet.internal.WebAppModule.start(WebAppModule.java:486)
at weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:425)
at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:41)
at weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:119)
at weblogic.application.internal.flow.ScopedModuleDriver.start(ScopedModuleDriver.java:200)
at weblogic.application.internal.flow.ModuleListenerInvoker.start(ModuleListenerInvoker.java:247)
at weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:425)
at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:41)
at weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:119)
at weblogic.application.internal.flow.StartModulesFlow.activate(StartModulesFlow.java:27)
at weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:1267)
at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:41)
at weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:409)
at weblogic.application.internal.EarDeployment.activate(EarDeployment.java:58)
at weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:161)
at weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:79)
at weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:569)
at weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:150)
at weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:116)
at weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:323)
at weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:844)
at weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1253)
at weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:440)
at weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:163)
at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:195)
at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:13)
at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:68)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:528)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
Initializing Performance Management
从上面的日志信息中可以看出是因为没有正常的初始化log4j导致该性能包也不能被正常启动,但应用方建议这个包对系统应用没有影响,可以暂时不部署,于是拿掉该包后测试,依旧出现不能登录的情况。点击登录后界面没有反应,一段时间后从日志中可以看出线程执行时间超过600的错误,同时线程池中hogging线程数增多,server状态变为”warning” :
####<Jun 25, 2012 5:11:24 PM CST> <Error> <WebLogicServer> <ZB0000WC01> <op1app11> <[ACTIVE] ExecuteThread: '18' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1340615484761> <BEA-000337> <[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "636" seconds working on the request "weblogic.servlet.internal.ServletRequestImpl@74717471[
POST /PartnerPlatformService/service/app/logon.object HTTP/1.1
Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Referer: http://10.133.193.10:8081/PartnerPlatformService/service/app/logon.do?service=%2FCmcApp%2FApp%2FappService.jsp&cms=ZB0000BOP1%3A6400&cmsVisible=true&authType=secSAPR3&authenticationVisible=true&sm=false&smAuth=secLDAP&sapSSOPrimary=false&sso=false&useLogonToken=false&sessionCookie=true&persistCookies=true&appName=BusinessObjects+Central+Management+Console&prodName=Business+Objects&appKind=CMC&backUrl=%2FApp%2Fhome.faces&backContext=%2FCmcApp&backUrlParents=1
Accept-Language: zh-cn
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; chromeframe/18.0.1025.162; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; QIHU 360EE)
Content-Length: 979
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: ADMINCONSOLESESSION=v9FhPyvD2mdDJpTsylQz66T3pBjSdTt6yrXcp4w5QzzGvQg4rPgd!-498231822; JSESSIONID=0XqJPyyLZHS7lsWM16KmxxsLR25cplNH4lvMzBKYLJd7rLhrwLWr!1060100871!NONE; InfoViewPLATFORMSVC_COOKIE_TOKEN=""
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:167)
com.crystaldecisions.celib.synchronization.RWLock.forReading(RWLock.java:253)
com.crystaldecisions.celib.synchronization.RWLock.forReading(RWLock.java:189)
com.crystaldecisions.enterprise.ocaframework.ServiceMgr.lock(ServiceMgr.java:1060)
com.crystaldecisions.enterprise.ocaframework.ServerController.redirectServer(ServerController.java:534)
com.crystaldecisions.enterprise.ocaframework.ServiceMgr.redirectServer(ServiceMgr.java:1099)
com.crystaldecisions.enterprise.ocaframework.ManagedSessions.get(ManagedSessions.java:256)
com.crystaldecisions.enterprise.ocaframework.ServiceMgr.getManagedService(ServiceMgr.java:702)
com.crystaldecisions.sdk.occa.security.internal.LogonService.ensureServiceStub(LogonService.java:507)
com.crystaldecisions.sdk.occa.security.internal.LogonService.doUserLogon(LogonService.java:663)
com.crystaldecisions.sdk.occa.security.internal.LogonService.userLogon(LogonService.java:629)
com.crystaldecisions.sdk.occa.security.internal.SecurityMgr.userLogon(SecurityMgr.java:246)
com.crystaldecisions.sdk.framework.internal.SessionMgr.logonEx(SessionMgr.java:700)
com.businessobjects.clientaction.shared.logon.LogonUtils.logon(LogonUtils.java:85)
com.businessobjects.clientaction.shared.logon.LogonAction.logon(LogonAction.java:330)
com.businessobjects.clientaction.partner.shared.logon.PartnerLogonAction.logon(PartnerLogonAction.java:211)
com.businessobjects.clientaction.partner.shared.logon.PartnerLogonAction.handleLogon(PartnerLogonAction.java:237)
com.businessobjects.clientaction.partner.shared.logon.PartnerLogonAction.perform(PartnerLogonAction.java:427)
org.apache.struts.action.ActionServlet.processActionPerform(ActionServlet.java:1787)
org.apache.struts.action.ActionServlet.process(ActionServlet.java:1586)
com.businessobjects.webutil.struts.CrystalUTF8InputActionServlet.process(CrystalUTF8InputActionServlet.java:32)
org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:510)
javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:300)
weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
com.businessobjects.webutil.BrowserRenderingModeFilter.doFilter(BrowserRenderingModeFilter.java:20)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
com.businessobjects.webutil.websessiontimeout.WebSessionTimeoutFilter.doFilter(WebSessionTimeoutFilter.java:161)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.doIt(WebAppServletContext.java:3684)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3650)
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2268)
weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2174)
weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1446)
weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
>
从上面打出的堆栈信息可以看出发送请求/PartnerPlatformService/service/app/logon.object没有响应,一直到weblogic告警线程执行时间超过600秒,同时在access日志中也看到post了logon.object的请求没有get回相应的结果。由于weblogic承载的应用主要负责页面显示,后台业务逻辑全部由SAP的BO实现,之后将排查重点转移到BO应用上。重启几次后发现BO并没有被真正停止重启,进程一直保留在之前启动的时间,同时相关的应用脚本也不能运行。之后发现应用的磁盘空间50G被占满,扩展到100G,由于用脚本不能杀掉进程,而进程数又比较多,所以重启主机。之后重新启动BO应用,再次测试应用正常访问,同时PerformanceManagement性能包状态变为“活动”。
3分析结论
BO应用由于磁盘空间被占满导致应用状态异常不能正常处理请求,同时由于应用状态的不正常导致进程假死,不能用脚本正常启动和停止。扩展磁盘空间后一切恢复正常。建议建立完善的系统监控架构,避免出现类似磁盘空间被沾满却没有一处报警的情况发生。同时查看被沾满的50G空间里应用日志和core日志达到37G之多,建议关闭应用info、debug相关信息并及时清理程序异常退出时操作系统生成的core文件。