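This morning the server could not be reached remotely through CRT, although telnet to the port still connected, so the only option was to reboot it. The messages log is shown below. Could anyone help analyze the cause?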
May 24 12:53:13 BJ-XHM-D2-05-DL385-KX-S1ZJ ntpd[3880]: synchronized to LOCAL(0), stratum 10
May 24 13:09:11 BJ-XHM-D2-05-DL385-KX-S1ZJ ntpd[3880]: synchronized to 10.142.132.33, stratum 2
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: INFO: task ps:32619 blocked for more than 120 seconds.
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: ps D 000000000000002b 0 32619 32618 32620 (NOTLB)
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: ffff8101cf2b9dd8 0000000000000086 ffff8101cf2b9de8 ffffffff80063ff8
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: 0000000700b118fa 0000000000000001 ffff81017925e820 ffff81022ebd17a0
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: 0012a62abe573eb1 0000000000000376 ffff81017925ea08 0000000700000c0f
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: Call Trace:
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff80063ff8>] thread_return+0x62/0xfe
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff800656ac>] __down_read+0x7a/0x92
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff800c4197>] access_process_vm+0x47/0x18d
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff800493d0>] get_task_mm+0x17/0x36
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff80107065>] proc_pid_cmdline+0x69/0xf4
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff80107571>] proc_info_read+0x5f/0xb9
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff8000b6b0>] vfs_read+0xcb/0x171
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff80011c01>] sys_read+0x45/0x6e
May 25 05:14:51 BJ-XHM-D2-05-DL385-KX-S1ZJ kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0
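To see whether other tasks were reported as hung around the same time, the "blocked for more than" lines can be pulled out of the log. The following is only a minimal sketch: the function name hung_tasks is made up for illustration, and the default path /var/log/messages and the 15-character syslog timestamp are assumptions based on the excerpt above.

```shell
# Hypothetical helper: list hung-task reports as "timestamp name pid".
# $1: log file to scan (assumed default: /var/log/messages).
hung_tasks() {
    grep 'blocked for more than' "${1:-/var/log/messages}" |
        # Keep the 15-character syslog timestamp, the task name and the PID.
        sed -E 's/^(.{15}) .*INFO: task ([^:]+):([0-9]+) blocked.*/\1 \2 \3/'
}
```

Run against the excerpt above, hung_tasks would report the single entry for ps with PID 32619 at May 25 05:14:51.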
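Solution:
The system hung. The trace shows ps stuck in __down_read while reading another process's command line under /proc (proc_pid_cmdline), so ps is most likely a victim of the hang rather than its cause. Next time, reboot in a way that produces a dump, for example with the DL385's NMI function. Only then can you analyze the dump taken at the moment of the hang and find the cause.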
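For an NMI to actually produce a dump, the kernel generally has to be set up in advance to panic on the NMI and to have a crash kernel reserved for kdump. The following is a rough sketch for checking those two prerequisites. The function names are made up for illustration, and it assumes a RHEL/CentOS-style system where the relevant settings are the kernel.unknown_nmi_panic sysctl and a crashkernel= boot parameter; the right settings for this particular server may differ.

```shell
# Hypothetical helper: succeed if the kernel command line reserves memory
# for a crash kernel. $1: command line file (default: /proc/cmdline).
has_crashkernel() {
    grep -q 'crashkernel=' "${1:-/proc/cmdline}"
}

# Hypothetical helper: succeed if the sysctl config makes an unknown NMI
# panic the kernel. $1: config file (assumed default: /etc/sysctl.conf).
nmi_panic_enabled() {
    grep -Eq '^[[:space:]]*kernel\.unknown_nmi_panic[[:space:]]*=[[:space:]]*1[[:space:]]*$' \
        "${1:-/etc/sysctl.conf}"
}
```

If either check fails, the usual next step would be to add a crashkernel= reservation to the boot line, set kernel.unknown_nmi_panic = 1, and make sure the kdump service is running before the next hang occurs.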