[Pacemaker] cman multi-homed with udp-broadcast issues

Fri Jan 6 17:00:59 UTC 2012

So I'm trying to set up a cluster with a secondary communication ring in 
case the first ring fails. The cluster operates fine, but doesn't seem to 
handle path failure properly. When I break the path between the 2 nodes 
on ring 1, I get the following in the logs:

Jan  6 16:55:17 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] Incrementing problem counter for seqid 202 iface 165.212.15.49 to [1 of 3]
Jan  6 16:55:19 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1 active with no faults
Jan  6 16:55:24 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] Incrementing problem counter for seqid 204 iface 165.212.15.49 to [1 of 3]
Jan  6 16:55:26 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1 active with no faults
Jan  6 16:55:30 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] Incrementing problem counter for seqid 206 iface 165.212.15.49 to [1 of 3]
Jan  6 16:55:32 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1 active with no faults

And it just repeats over and over: the problem counter never gets past 
[1 of 3] before the ring is reported healthy again. From notes I've found 
from others, it appears this might be because both rings share the same 
broadcast address. Indeed this is the case, as `cman_tool status` shows:
Multicast addresses: 255.255.255.255 255.255.255.255
Node addresses: 165.212.64.49 165.212.15.49
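
For anyone wanting to check for the same condition on their own nodes, here 
is a minimal Python sketch that looks for a repeated address on one line of 
the `cman_tool status` output. The function names are made up for 
illustration, and the sample text is just the two lines quoted above, not the 
full command output.

```python
def ring_addresses(status_text, field):
    """Return the whitespace-separated values of a 'Field: a b' line."""
    for line in status_text.splitlines():
        # Split on the first colon only, so the value part is kept intact.
        name, sep, value = line.partition(":")
        if sep and name.strip() == field:
            return value.split()
    return []


def rings_share_address(status_text, field="Multicast addresses"):
    """True if more than one ring is listed and any address is repeated."""
    addrs = ring_addresses(status_text, field)
    return len(addrs) > 1 and len(set(addrs)) < len(addrs)


# Sample input: the two status lines quoted in this post.
STATUS = """\
Multicast addresses: 255.255.255.255 255.255.255.255
Node addresses: 165.212.64.49 165.212.15.49
"""

if __name__ == "__main__":
    print("rings share an address:", rings_share_address(STATUS))
```

On a live node the text could instead come from running `cman_tool status` 
and capturing its output.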

However, I've tried changing this address in cluster.conf and it 
seems to be completely ignored. I've also tried changing the port for 
the second ring and that's also ignored (tcpdump shows the packets still 
going to the same port as ring 0).

So, is this indeed the cause of it not properly detecting ring failure? 
And if so, how can I fix it?

cluster.conf:
<?xml version="1.0" ?>
<cluster name="syslog" config_version="6">
  <logging to_logfile="no" syslog_facility="local2" debug="on" />
  <cman expected_votes="1" two_node="1" transport="udpb" port="5408" />
  <totem rrp_mode="active" secauth="off" />
  <clusternodes>
    <clusternode name="syslog01" nodeid="1">
      <altname name="syslog01-cms" port="5406" mcast="165.212.15.255" />
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="pcmk-1" />
        </method>
      </fence>
    </clusternode>
    <clusternode name="syslog02" nodeid="2">
      <altname name="syslog02-cms" port="5406" mcast="165.212.15.255" />
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="pcmk-1" />
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk" />
  </fencedevices>
</cluster>
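
To make it easier to compare what the config asks for against what tcpdump 
shows, here is a rough Python sketch that pulls the per-ring settings out of 
a cluster.conf. It only reads the attributes used in the file above and says 
nothing about whether cman actually honors them. The function name is 
hypothetical, and the embedded sample is a trimmed copy of the config with 
the fence sections left out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf from this post (fence sections omitted).
CLUSTER_CONF = """<?xml version="1.0" ?>
<cluster name="syslog" config_version="6">
  <cman expected_votes="1" two_node="1" transport="udpb" port="5408" />
  <totem rrp_mode="active" secauth="off" />
  <clusternodes>
    <clusternode name="syslog01" nodeid="1">
      <altname name="syslog01-cms" port="5406" mcast="165.212.15.255" />
    </clusternode>
    <clusternode name="syslog02" nodeid="2">
      <altname name="syslog02-cms" port="5406" mcast="165.212.15.255" />
    </clusternode>
  </clusternodes>
</cluster>
"""


def ring1_settings(conf_text):
    """List the ring 1 (altname) settings requested for each cluster node."""
    root = ET.fromstring(conf_text)
    cman = root.find("cman")
    # The ring 0 port is taken from the <cman port="..."> attribute.
    ring0_port = cman.get("port") if cman is not None else None
    rows = []
    for node in root.iter("clusternode"):
        alt = node.find("altname")
        if alt is None:
            continue  # this node has no second ring configured
        rows.append({
            "node": node.get("name"),
            "ring0_port": ring0_port,
            "ring1_name": alt.get("name"),
            "ring1_port": alt.get("port"),
            "ring1_mcast": alt.get("mcast"),
        })
    return rows


if __name__ == "__main__":
    for row in ring1_settings(CLUSTER_CONF):
        print(row)
```

If the ports and addresses printed here differ from what shows up on the 
wire, that would confirm the altname attributes are not being applied.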
