Red Hat Enterprise Linux 5 with Cluster Suite Software is inquorate

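One node of a two-node Red Hat Enterprise Linux 5 cluster running Cluster Suite is inquorate. The first node reports the cluster as quorate and sees the second node as offline. The second node reports itself as inquorate and sees neither the first node nor the quorum disk.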

root@linux01:~ # clustat
Cluster Status for clinformatica @ Tue Jul 18 14:52:42 2017
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 linux01.heartbeat.local               1 Online, Local, rgmanager
 linux02.heartbeat.local               2 Offline
 /dev/mapper/qdisk0                    0 Online, Quorum Disk

 Service Name              Owner (Last)              State
 ------- ----              ----- ------              -----
 service:PCenterETL_41     linux01.heartbeat.local   started
 service:PCenterETL_42     (none)                    stopped

root@linux02:~ # clustat
Cluster Status for clinformatica @ Tue Jul 18 14:53:10 2017
Member Status: Inquorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 linux01.heartbeat.local               1 Offline
 linux02.heartbeat.local               2 Online, Local
 /dev/mapper/qdisk0                    0 Offline

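To troubleshoot the problem, install omping (Open Multicast Ping), which tests both unicast and multicast connectivity between hosts. It is part of EPEL.

Download the package here: https://download.fedoraproject.org/pub/archive/epel/5/x86_64/omping-0.0.4-1.el5.x86_64.rpm
Install the package on both nodes: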

root@linux01:/tmp # rpm -ivh omping-0.0.4-1.el5.x86_64.rpm
warning: omping-0.0.4-1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 217521f6
Preparing...                ########################################### [100%]
   1:omping                 ########################################### [100%]

root@linux02:/tmp # rpm -ivh omping-0.0.4-1.el5.x86_64.rpm
warning: omping-0.0.4-1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 217521f6
Preparing...                ########################################### [100%]
   1:omping                 ########################################### [100%]

Check which node names, and therefore which IP addresses, the cluster uses for its interconnect:

root@linux01:~ # grep clusternode /etc/cluster/cluster.conf | grep name

linux01-hb is 142.40.81.128
linux02-hb is 142.40.81.129
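
The grep above only lists the node names; they still have to be resolved to addresses. Here is a minimal sketch that does both in one step. It assumes that cluster.conf keeps each clusternode tag and its name attribute on a single line, and that the names resolve through /etc/hosts or DNS. The function names cluster_node_names and resolve_cluster_nodes are made up for this example.

```shell
#!/bin/sh
# Print the name attribute of every <clusternode ...> element.
# Usage: cluster_node_names [path]  (default: /etc/cluster/cluster.conf)
# Assumption: the tag and its name attribute sit on one line.
cluster_node_names() {
    conf="${1:-/etc/cluster/cluster.conf}"
    sed -n 's/.*<clusternode[[:space:]][^>]*name="\([^"]*\)".*/\1/p' "$conf"
}

# Resolve each node name with the system resolver and print
# "<name> is <address>", or "<name> is unresolved" if the lookup fails.
resolve_cluster_nodes() {
    cluster_node_names "$1" | while read -r node; do
        # getent prints "address  name ..."; keep the first address only.
        addr=$(getent hosts "$node" | awk '{ print $1; exit }')
        printf '%s is %s\n' "$node" "${addr:-unresolved}"
    done
}
```

Running resolve_cluster_nodes without arguments on a cluster node reads /etc/cluster/cluster.conf and prints one line per node in the same "name is address" form used above.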

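Run omping on the first node, giving the IP of the machine followed by the IP of the other node. omping has to be running on both nodes at the same time; until the other side is started, it prints "waiting for response msg".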

root@linux01:~ # omping 142.40.81.128 142.40.81.129
142.40.81.129 : waiting for response msg
142.40.81.129 : waiting for response msg
142.40.81.129 : waiting for response msg
142.40.81.129 : waiting for response msg
142.40.81.129 : waiting for response msg
142.40.81.129 : waiting for response msg
142.40.81.129 : joined (S,G) = (*, 232.43.211.234), pinging
142.40.81.129 : unicast, seq=1, size=69 bytes, dist=0, time=0.246ms
142.40.81.129 : multicast, seq=1, size=69 bytes, dist=0, time=0.251ms

142.40.81.129 : multicast, seq=179, size=69 bytes, dist=0, time=0.269ms
142.40.81.129 : unicast, seq=180, size=69 bytes, dist=0, time=0.233ms
142.40.81.129 : multicast, seq=180, size=69 bytes, dist=0, time=0.239ms
142.40.81.129 : unicast, seq=181, size=69 bytes, dist=0, time=0.213ms
142.40.81.129 : multicast, seq=181, size=69 bytes, dist=0, time=0.219ms
142.40.81.129 : unicast, seq=182, size=69 bytes, dist=0, time=0.231ms
142.40.81.129 : multicast, seq=182, size=69 bytes, dist=0, time=0.236ms
142.40.81.129 : unicast, seq=183, size=69 bytes, dist=0, time=0.209ms
142.40.81.129 : multicast, seq=183, size=69 bytes, dist=0, time=0.286ms
142.40.81.129 : unicast, seq=184, size=69 bytes, dist=0, time=0.254ms
142.40.81.129 : unicast, seq=185, size=69 bytes, dist=0, time=0.176ms
142.40.81.129 : unicast, seq=186, size=69 bytes, dist=0, time=0.191ms
142.40.81.129 : unicast, seq=187, size=69 bytes, dist=0, time=0.291ms
142.40.81.129 : unicast, seq=188, size=69 bytes, dist=0, time=0.203ms
142.40.81.129 : unicast, seq=189, size=69 bytes, dist=0, time=0.199ms
142.40.81.129 : unicast, seq=190, size=69 bytes, dist=0, time=0.209ms
142.40.81.129 : unicast, seq=191, size=69 bytes, dist=0, time=0.145ms
142.40.81.129 : unicast, seq=192, size=69 bytes, dist=0, time=0.210ms
142.40.81.129 : unicast, seq=193, size=69 bytes, dist=0, time=0.281ms
142.40.81.129 : unicast, seq=194, size=69 bytes, dist=0, time=0.186ms
142.40.81.129 : unicast, seq=195, size=69 bytes, dist=0, time=0.195ms
142.40.81.129 : unicast, seq=196, size=69 bytes, dist=0, time=0.141ms
142.40.81.129 : unicast, seq=197, size=69 bytes, dist=0, time=0.205ms
142.40.81.129 : unicast, seq=198, size=69 bytes, dist=0, time=0.196ms
142.40.81.129 : unicast, seq=199, size=69 bytes, dist=0, time=0.179ms
142.40.81.129 : unicast, seq=200, size=69 bytes, dist=0, time=0.190ms

142.40.81.129 : unicast, xmt/rcv/%loss = 200/200/0%, min/avg/max/std-dev = 0.104/0.199/0.306/0.039
142.40.81.129 : multicast, xmt/rcv/%loss = 200/183/8%, min/avg/max/std-dev = 0.126/0.215/0.311/0.041
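
In a long omping run it is easy to miss the point where the multicast replies stop. The following is a small sketch that reads saved omping output and prints the highest sequence number answered per host and traffic type. It assumes the line format shown above; the function name omping_last_seq and the log file name in the usage line are hypothetical.

```shell
#!/bin/sh
# Read omping output on stdin and print, per host and traffic type,
# the highest sequence number for which a reply was received.
omping_last_seq() {
    awk '
        # Reply lines look like:
        #   142.40.81.129 : multicast, seq=183, size=69 bytes, ...
        ($3 == "unicast," || $3 == "multicast,") && $4 ~ /^seq=/ {
            kind = $3; sub(/,$/, "", kind)
            seq = $4;  sub(/^seq=/, "", seq); sub(/,$/, "", seq)
            key = $1 " " kind
            if (!(key in last) || seq + 0 > last[key])
                last[key] = seq + 0
        }
        END { for (key in last) print key, last[key] }
    ' | sort
}

# Usage (hypothetical log file):
#   omping 142.40.81.128 142.40.81.129 | tee /tmp/omping.log
#   omping_last_seq < /tmp/omping.log
```

For the run above this reports 183 for multicast and 200 for unicast, which shows that multicast stopped while unicast kept working.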

Run omping on the second node in the same way, with the IP of the machine followed by the IP of the other node:

root@linux02:~ # omping 142.40.81.129 142.40.81.128
142.40.81.128 : waiting for response msg
142.40.81.128 : joined (S,G) = (*, 232.43.211.234), pinging
142.40.81.128 : unicast, seq=1, size=69 bytes, dist=0, time=0.172ms
142.40.81.128 : multicast, seq=1, size=69 bytes, dist=0, time=0.287ms
142.40.81.128 : unicast, seq=2, size=69 bytes, dist=0, time=0.196ms
142.40.81.128 : multicast, seq=2, size=69 bytes, dist=0, time=0.252ms

142.40.81.128 : unicast, seq=184, size=69 bytes, dist=0, time=0.311ms
142.40.81.128 : multicast, seq=184, size=69 bytes, dist=0, time=0.367ms
142.40.81.128 : unicast, seq=185, size=69 bytes, dist=0, time=0.286ms
142.40.81.128 : multicast, seq=185, size=69 bytes, dist=0, time=0.338ms
142.40.81.128 : unicast, seq=186, size=69 bytes, dist=0, time=0.184ms
142.40.81.128 : unicast, seq=187, size=69 bytes, dist=0, time=0.193ms
142.40.81.128 : unicast, seq=188, size=69 bytes, dist=0, time=0.174ms
142.40.81.128 : unicast, seq=189, size=69 bytes, dist=0, time=0.192ms
142.40.81.128 : unicast, seq=190, size=69 bytes, dist=0, time=0.200ms
142.40.81.128 : unicast, seq=191, size=69 bytes, dist=0, time=0.241ms
142.40.81.128 : unicast, seq=192, size=69 bytes, dist=0, time=0.304ms
142.40.81.128 : unicast, seq=193, size=69 bytes, dist=0, time=0.259ms
142.40.81.128 : unicast, seq=194, size=69 bytes, dist=0, time=0.272ms
142.40.81.128 : unicast, seq=195, size=69 bytes, dist=0, time=0.246ms
142.40.81.128 : unicast, seq=196, size=69 bytes, dist=0, time=0.611ms
142.40.81.128 : unicast, seq=197, size=69 bytes, dist=0, time=0.208ms
142.40.81.128 : unicast, seq=198, size=69 bytes, dist=0, time=0.200ms
142.40.81.128 : unicast, seq=199, size=69 bytes, dist=0, time=0.194ms
142.40.81.128 : unicast, seq=200, size=69 bytes, dist=0, time=0.186ms
142.40.81.128 : unicast, seq=201, size=69 bytes, dist=0, time=0.190ms
142.40.81.128 : waiting for response msg
142.40.81.128 : server told us to stop

142.40.81.128 : unicast, xmt/rcv/%loss = 201/201/0%, min/avg/max/std-dev = 0.115/0.223/0.611/0.055
142.40.81.128 : multicast, xmt/rcv/%loss = 201/185/7%, min/avg/max/std-dev = 0.159/0.278/0.797/0.062

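In both runs, unicast traffic gets through without loss, but the multicast replies stop: after sequence 183 on linux01 and after sequence 185 on linux02. In other words, after a few minutes the network switch blocked multicast communication between the nodes. The cluster interconnect relies on multicast, so the nodes lost sight of each other.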
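
The comparison between the unicast and multicast summary lines can also be scripted, for example to repeat the check after a switch change. This is only a sketch that assumes the summary format printed by the omping version used here; omping_loss_report is a made-up name.

```shell
#!/bin/sh
# Read omping output on stdin and print one verdict per summary line.
# Summary lines look like:
#   142.40.81.129 : multicast, xmt/rcv/%loss = 200/183/8%, min/avg/... = ...
omping_loss_report() {
    awk '
        /xmt\/rcv\/%loss/ {
            kind = $3; sub(/,$/, "", kind)   # "multicast," -> "multicast"
            split($6, n, "/")                # "200/183/8%," -> 200, 183, ...
            lost = n[1] - n[2]               # sent minus received
            verdict = (lost > 0) ? "LOSS" : "ok"
            printf "%s %s sent=%d received=%d lost=%d %s\n", \
                   $1, kind, n[1], n[2], lost, verdict
        }
    '
}
```

Fed with the output from linux01, it prints "ok" for unicast (200 sent, 200 received) and "LOSS" for multicast (200 sent, 183 received, 17 lost). Unicast fine and multicast lossy is the pattern that points at the switch rather than at the hosts.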

The problem was solved by reconfiguring the network switch to allow multicast traffic between the nodes.

How to Avoid a Split-Brain Scenario with Cisco Switches by Enabling Multicast Communication
