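On Linux, expanding storage dynamically with udev rules requires adding new raw devices. Running start_udev to pick them up can take the node's listener OFFLINE and make the node's VIP fail over to another node.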
[root@node1 ~]# start_udev
Starting udev: [ OK ]
[grid@node1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.LISTENER.lsnr
ONLINE OFFLINE node1
ONLINE ONLINE node2
ora.OCRDG.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.asm
ONLINE ONLINE node1 Started
ONLINE ONLINE node2 Started
ora.gsd
OFFLINE OFFLINE node1
OFFLINE OFFLINE node2
ora.net1.network
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.ons
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.registry.acfs
ONLINE ONLINE node1
ONLINE ONLINE node2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE node2
ora.cvu
1 ONLINE ONLINE node2
ora.node1.vip
1 ONLINE INTERMEDIATE node2 FAILED OVER
ora.node2.vip
1 ONLINE ONLINE node2
ora.oc4j
1 ONLINE ONLINE node2
ora.scan1.vip
1 ONLINE ONLINE node2
ora.tommy.db
1 ONLINE ONLINE node1 Open
2 ONLINE ONLINE node2 Open
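This typically happens during a storage expansion, and it leaves one of the nodes inaccessible.
On the other node, you can see that the failed node's VIP and the SCAN IP have moved over; they show up among the eth0:N aliases.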
[root@node2 ~]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:FF:3E:00
inet addr:192.168.200.20 Bcast:192.168.200.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:feff:3e00/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1302 errors:0 dropped:0 overruns:0 frame:0
TX packets:810 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:238029 (232.4 KiB) TX bytes:116747 (114.0 KiB)
eth0:1 Link encap:Ethernet HWaddr 08:00:27:FF:3E:00
inet addr:192.168.200.29 Bcast:192.168.200.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:2 Link encap:Ethernet HWaddr 08:00:27:FF:3E:00
inet addr:192.168.200.11 Bcast:192.168.200.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:3 Link encap:Ethernet HWaddr 08:00:27:FF:3E:00
inet addr:192.168.200.21 Bcast:192.168.200.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 08:00:27:2E:25:6C
inet addr:172.10.200.20 Bcast:172.10.200.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe2e:256c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:92478 errors:0 dropped:0 overruns:0 frame:0
TX packets:154274 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:42887663 (40.9 MiB) TX bytes:115217418 (109.8 MiB)
eth1:1 Link encap:Ethernet HWaddr 08:00:27:2E:25:6C
inet addr:169.254.254.34 Bcast:169.254.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:37522 errors:0 dropped:0 overruns:0 frame:0
TX packets:37522 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:24002960 (22.8 MiB) TX bytes:24002960 (22.8 MiB)
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
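This happens because start_udev removes the public network interface, which crashes the listener on that node. The clusterware then moves the resources that depend on that interface, namely the SCAN listener and the VIPs, from node1 to node2.
Fix:
On all nodes, add the following setting to the configuration files of both the public and the private network interfaces: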
vi /etc/sysconfig/network-scripts/ifcfg-eth0
HOTPLUG="NO"
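If the change has to be applied to several interfaces or nodes, the edit can be scripted. The following is only a minimal sketch, not part of the original procedure: `set_hotplug_no` is a hypothetical helper name, and it assumes GNU `sed` and the usual ifcfg key=value layout. It replaces an existing HOTPLUG line or appends one, so it can be run more than once safely.

```shell
# Hypothetical helper: make sure one ifcfg file contains HOTPLUG="NO".
# Assumes GNU sed (for -i) and a plain key=value ifcfg file.
set_hotplug_no() {
    cfg="$1"
    [ -f "$cfg" ] || { echo "missing: $cfg" >&2; return 1; }
    if grep -q '^HOTPLUG=' "$cfg"; then
        # A HOTPLUG line already exists: rewrite it in place.
        sed -i 's/^HOTPLUG=.*/HOTPLUG="NO"/' "$cfg"
    else
        # Make sure the file ends with a newline before appending.
        [ -n "$(tail -c1 "$cfg")" ] && echo >> "$cfg"
        echo 'HOTPLUG="NO"' >> "$cfg"
    fi
}
```

On each node it could then be called once per interface file, for example `set_hotplug_no /etc/sysconfig/network-scripts/ifcfg-eth0` and the same for `ifcfg-eth1`, assuming eth0 and eth1 are the public and private interfaces as in the ifconfig output above. Backing up the files first is advisable.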
The setting takes effect without restarting the network interface.
Running start_udev again no longer causes the problem described above.
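To confirm this after running start_udev, the `crsctl stat res -t` output can be scanned for resources whose TARGET is ONLINE but whose STATE is something else, such as the OFFLINE listener and the INTERMEDIATE VIP shown earlier. The filter below is a rough sketch under the assumption that the output has the layout shown above (a resource name line starting with `ora.`, followed by one status line per server or instance); `find_degraded_resources` is a hypothetical name, not an Oracle tool.

```shell
# Hypothetical helper: read `crsctl stat res -t` output on stdin and print
# "resource state server" for entries with TARGET=ONLINE but STATE!=ONLINE.
find_degraded_resources() {
    awk '
        /^ora\./ { res = $1; next }   # remember the current resource name
        {
            # Cluster resources start with an instance number; skip it.
            i = ($1 ~ /^[0-9]+$/) ? 2 : 1
            if ($i == "ONLINE" && $(i+1) != "" && $(i+1) != "ONLINE")
                print res, $(i+1), $(i+2)
        }
    '
}
```

It would be used as `crsctl stat res -t | find_degraded_resources` as the grid user. Empty output means no resource with an ONLINE target is in a non-ONLINE state; resources whose target is OFFLINE, such as ora.gsd, are deliberately ignored.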