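1. Problem Description
The Tuxedo service newly deployed at the YYY branch of XXX fails to start: no database connection can be established through the TMS, so none of the application servers can start.
2. Application Environment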
tuxedo 11.1.1.3.0, 64-bit, Patch Level 013
Linux cvmsap01vl64js 2.6.18-308.el5
3. Problem Analysis
When the service fails to start, the Tuxedo ULOG reports the following errors:
131216.cvmsap01vl64js!vms_0212.18077.3460687856.0: 10-18-2013: Tuxedo Version 11.1.1.3.0, 64-bit
131216.cvmsap01vl64js!vms_0212.18077.3460687856.0: LIBTUX_CAT:262: INFO: Standard main starting
131216.cvmsap01vl64js!vms_0212.18077.3460687856.0: LIBTUX_CAT:466: ERROR: tpopen TPERMERR xa_open returned XAER_RMERR
131216.cvmsap01vl64js!vms_0212.18077.3460687856.0: LIBTUX_CAT:6014: ERROR: tpopen failed - TPERMERR - resource manager error
131216.cvmsap01vl64js!vms_0212.18077.3460687856.0: LIBTUX_CAT:250: ERROR: tpsvrinit() failed
131216.cvmsap01vl64js!tmboot.17733.1449823328.-2: CMDTUX_CAT:825: ERROR: Process vms_0212 at jscvms failed with /T tperrno (TPESYSTEM - internal system error)
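The messages above show that startup failed because xa_open returned XAER_RMERR. The Oracle XA log reported the following errors, one after the other. First: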
ORACLE XA: Version 11.2.0.3.0. RM name = 'Oracle_XA'.
094231.16561.0:
ORA-12170: TNS:Connect timeout occurred
094231.16561.0:
xaolgn_help: XAER_RMERR; OCIServerAttach failed. ORA-12170.
After the database listener was configured, that error no longer appeared, but the following error was reported instead:
ORACLE XA: Version 11.2.0.3.0. RM name = 'Oracle_XA'.
131217.18083.0:
ORA-12537: TNS:connection closed
131217.18083.0:
xaolgn_help: XAER_RMERR; OCIServerAttach failed. ORA-12537.
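Before digging into the XA configuration, it can help to check whether the listener port is reachable at all from the Tuxedo host. The following is only a minimal sketch: the function name probe_listener and the address in the usage section are hypothetical and should be replaced with the HOST and PORT from tnsnames.ora. It tests TCP reachability only. An "open" result does not prove that the Oracle Net handshake succeeds, so an error such as ORA-12537 can still occur afterwards.

```python
import socket


def probe_listener(host, port, timeout=3.0):
    """Classify a plain TCP connection attempt to a database listener.

    Returns "open", "timeout", "refused", or "error: <reason>".
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer within the timeout: host unreachable or packets dropped.
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Hypothetical address; use the HOST and PORT from tnsnames.ora instead.
    print(probe_listener("127.0.0.1", 1521))
```

A "timeout" result would be consistent with a connect timeout such as ORA-12170, while "refused" suggests that no listener is running on that port.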
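Logging in to the database with sqlplus on the same host produced the same ORA-12537 error. After tnsnames.ora was reconfigured, the connection to the database worked, but startup still failed with: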
161928.cvmsap01vl64js!cvms_check.19556.3354093552.0: LIBTUX_CAT:681: ERROR: Failure to create message queue
161928.cvmsap01vl64js!cvms_check.19556.3354093552.0: LIBTUX_CAT:248: ERROR: System init function failed, Uunixerr = : msgget: No space left on device
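The message queue could not be created because of an operating system limit: msgget fails with "No space left on device" when the system-wide maximum number of message queues (kernel.msgmni) has been reached. Edit /etc/sysctl.conf and append the following line at the end: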
kernel.msgmni = 360
As root, run sysctl -p to apply the change, then start Tuxedo again. All servers now start normally.
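To see how close the system is to the message queue limit, the configured limit can be compared with the number of queues that currently exist. The sketch below assumes a Linux host that exposes /proc/sys/kernel/msgmni and /proc/sysvipc/msg. The function names are illustrative and not part of any Tuxedo tooling.

```python
def count_queues(sysvipc_msg_text):
    """Count System V message queues from the text of /proc/sysvipc/msg.

    The first line is a column header; every other non-empty line is one queue.
    """
    lines = [ln for ln in sysvipc_msg_text.splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def msgmni_headroom(msgmni_text, sysvipc_msg_text):
    """Return (limit, in_use, free) for message queue identifiers."""
    limit = int(msgmni_text.strip())
    in_use = count_queues(sysvipc_msg_text)
    return limit, in_use, limit - in_use


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/msgmni") as f:
            limit_text = f.read()
        with open("/proc/sysvipc/msg") as f:
            table_text = f.read()
    except OSError as exc:
        print(f"cannot read System V IPC information: {exc}")
    else:
        limit, in_use, free = msgmni_headroom(limit_text, table_text)
        print(f"kernel.msgmni={limit} in_use={in_use} free={free}")
```

If the free count is at or near zero while Tuxedo is booting, msgget fails as shown in the ULOG above. Running the check once before and once after the sysctl change confirms that the new limit is in effect.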
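4. Tuning Recommendations
The application servers currently run as MSSQ (Multiple Servers, Single Queue) sets, and most of them have different MIN and MAX values (the minimum and maximum number of server instances). Because no policy for dynamically spawning and removing servers is configured, however, the number of running instances always stays at the MIN value. We recommend adjusting this. For example, the current configuration: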
"cvms_uncheck" SRVGRP="CVMS_DB" SRVID=1400
CLOPT="-A"
RQADDR="cvms_uncheck"
RQPERM=0666 REPLYQ=Y RPPERM=0666 MIN=8 MAX=12 CONV=N
SYSTEM_ACCESS=FASTPATH
MAXGEN=10 GRACE=0 RESTART=Y
MINDISPATCHTHREADS=0 MAXDISPATCHTHREADS=1 THREADSTACKSIZE=0
SICACHEENTRIESMAX="500"
Change it to:
"cvms_uncheck" SRVGRP="CVMS_DB" SRVID=1400
CLOPT="-A -p 2,10:5,3"
RQADDR="cvms_uncheck"
RQPERM=0666 REPLYQ=Y RPPERM=0666 MIN=8 MAX=12 CONV=N
SYSTEM_ACCESS=FASTPATH
MAXGEN=10 GRACE=0 RESTART=Y
MINDISPATCHTHREADS=0 MAXDISPATCHTHREADS=1 THREADSTACKSIZE=0
SICACHEENTRIESMAX="500"
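Explanation: the -p value has the form low_water,terminate_time:high_water,create_time. Initially the minimum number of processes (8) is started. When the number of requests on the request queue cvms_uncheck stays above 5 for more than 3 seconds, one more process of this server is started, up to a maximum of 12. When the number of requests on cvms_uncheck stays below 2 for more than 10 seconds, one process of this server is stopped, but at least 8 always remain.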
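The behaviour described above can be modelled in a few lines. This is a simplified sketch for reasoning about the thresholds, not Tuxedo's actual implementation. The names SpawnPolicy, parse_policy and decide are hypothetical, only the fully specified form of the -p value (as in "2,10:5,3") is handled, and the handling of the exact boundary (more than N seconds) simply follows the explanation above.

```python
from typing import NamedTuple


class SpawnPolicy(NamedTuple):
    low_water: int       # stop a server when the queue stays below this
    terminate_time: int  # ... for more than this many seconds
    high_water: int      # start a server when the queue stays above this
    create_time: int     # ... for more than this many seconds


def parse_policy(spec):
    """Parse a fully specified -p value such as '2,10:5,3'."""
    low_part, high_part = spec.split(":")
    low_water, terminate_time = (int(x) for x in low_part.split(","))
    high_water, create_time = (int(x) for x in high_part.split(","))
    return SpawnPolicy(low_water, terminate_time, high_water, create_time)


def decide(policy, queued, seconds_in_state, running, min_servers, max_servers):
    """Return +1 to start one server, -1 to stop one, 0 to do nothing.

    queued is the current number of requests on the queue, and
    seconds_in_state is how long it has stayed above high_water
    or below low_water.
    """
    if queued > policy.high_water and seconds_in_state > policy.create_time:
        return 1 if running < max_servers else 0
    if queued < policy.low_water and seconds_in_state > policy.terminate_time:
        return -1 if running > min_servers else 0
    return 0


if __name__ == "__main__":
    policy = parse_policy("2,10:5,3")
    # 6 requests queued for 4 seconds with 8 servers running (MIN=8, MAX=12).
    print(decide(policy, queued=6, seconds_in_state=4,
                 running=8, min_servers=8, max_servers=12))
```

With MIN=8 and MAX=12, the model never returns +1 once 12 servers are running and never returns -1 when only 8 are running, which matches the bounds in the explanation.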