[Pacemaker] No communication between nodes (setup problem)
Dan Frincu
df.cluster at gmail.com
Wed Jan 30 14:01:38 UTC 2013
On Wed, Jan 30, 2013 at 3:27 PM, Keith Ouellette
<Keith.Ouellette at airgas.com> wrote:
> Hans,
>
>
>
> Is the multicast port 5405 "opened" in the firewall? That has bitten me
> before.
5405 and 5404.
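For reference, a quick sketch of opening both corosync ports with firewalld (the default on Fedora 18); zone and interface settings are left at their defaults here:

```shell
# open the corosync UDP ports (5404-5405) in the running firewall
firewall-cmd --add-port=5404-5405/udp
# and make the rule persist across firewalld reloads
firewall-cmd --permanent --add-port=5404-5405/udp
```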
>
>
>
> Thanks,
>
> Keith
>
>
>
> ________________________________
> From: Hans Bert [dadeda2002 at yahoo.de]
> Sent: Wednesday, January 30, 2013 8:22 AM
> To: and k; The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] No communication between nodes (setup problem)
>
> Hi,
>
> in the meantime I modified the configuration to check if it works with
> multicast
>
> totem {
>     version: 2
>     secauth: off
>     cluster_name: mcscluster
>     interface {
>         ringnumber: 0
>         bindnetaddr: 192.168.100.0
>         mcastaddr: 239.255.1.12
>         mcastport: 5405
>         ttl: 1
>     }
> }
>
> but unfortunately it is still not working.
> I started Wireshark and, on both hosts, I can see multicast packets from both hosts.
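If Wireshark already shows the multicast traffic, omping can confirm that multicast is actually delivered end-to-end (a sketch; it assumes the omping package is installed and the command is run on both nodes at the same time):

```shell
# run simultaneously on BOTH nodes; omping reports unicast and
# multicast packet loss separately, so it shows whether multicast
# specifically is being dropped between the two hosts
omping -c 20 192.168.100.111 192.168.100.112
```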
>
> Yes, SELinux is disabled on both nodes:
>
> [root at server1 corosync]# selinuxenabled
> [root at server1 corosync]# echo $?
> 1
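(`selinuxenabled` exits non-zero when SELinux is disabled, so the `1` above does mean "disabled". If available, `getenforce` gives a more readable answer:)

```shell
# prints Enforcing, Permissive, or Disabled
getenforce
```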
>
>
> Something else I found out is:
>
> [root at server1 corosync]# pcs status nodes both
> Error mapping 192.168.100.111
> Error mapping 192.168.100.112
> Corosync Nodes:
> Online:
> Offline: 192.168.100.111 192.168.100.112
> Pacemaker Nodes:
> Online: server1
> Standby:
> Offline:
>
>
> [root at server2 corosync]# pcs status nodes both
> Error mapping 192.168.100.111
> Error mapping 192.168.100.112
> Corosync Nodes:
> Online:
> Offline: 192.168.100.111 192.168.100.112
> Pacemaker Nodes:
> Online: server2
> Standby:
> Offline:
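The "Error mapping" lines look like pcs failing to map the IP addresses to node names. One guess (not a confirmed fix): make sure both node names resolve identically on both nodes, e.g. via /etc/hosts. The entries below assume server1/server2 are the intended hostnames:

```
# /etc/hosts on both nodes (hypothetical entries)
192.168.100.111   server1
192.168.100.112   server2
```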
>
>
>
> Any further hints?
>
>
> Best regards,
> Hans
>
>
>
>
> ________________________________
>
>
> Hi,
>
> It seems to be a problem with network traffic.
>
> Have you tried sniffing the network traffic to make sure that UDP traffic
> reaches from one node to the other?
>
> Try on server1:
>
> tcpdump -i <interface> udp and src host 192.168.100.112
>
> on server2:
>
> tcpdump -i <interface> udp and src host 192.168.100.111
>
>
> If there is no packet traffic, that means you have a network issue.
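To rule out the cluster stack itself, plain netcat can test UDP reachability on the corosync port (a sketch; it assumes nc is installed and that corosync is stopped so the port is free):

```shell
# on server2: listen for UDP datagrams on the corosync port
# (traditional netcat variants may need: nc -u -l -p 5405)
nc -u -l 5405

# on server1: send a test datagram to server2
echo ping | nc -u 192.168.100.112 5405
```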
>
> BTW: Is SELinux enabled on the nodes?
>
> --
> Regards
> Andrew
>
>
>
> 2013/1/30 Hans Bert
>
> Hello,
>
> we had to move from Fedora 16 to Fedora 18 and wanted to set up Corosync
> with Pacemaker and pcs as the management tool.
> With F16 our cluster was running pretty well, but with F18, after 5 days, we
> have reached the point where we are out of ideas about what the problem(s)
> might be.
>
>
> The cluster is built of two servers (server1=192.168.100.111;
> server2=192.168.100.112)
>
> Based on the Howto for F18 with pcs we created the following corosync.conf:
>
> totem {
>     version: 2
>     secauth: off
>     cluster_name: mcscluster
>     transport: udpu
> }
>
> nodelist {
>     node {
>         ring0_addr: 192.168.100.111
>     }
>     node {
>         ring0_addr: 192.168.100.112
>     }
> }
>
> quorum {
>     provider: corosync_votequorum
> }
>
> logging {
>     fileline: off
>     to_stderr: no
>     to_logfile: yes
>     to_syslog: yes
>     logfile: /var/log/cluster/corosync.log
>     debug: on
>     timestamp: on
> }
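One thing that may be worth trying (a sketch, not verified against this setup): with udpu, giving each node an explicit nodeid stops corosync from deriving large IDs from the IP addresses, which matches the odd node IDs (1868867776, 1885644992) shown in the status output below:

```
nodelist {
    node {
        ring0_addr: 192.168.100.111
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.100.112
        nodeid: 2
    }
}
```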
>
>
>
> After we started the servers, a status check shows:
>
>
> [root at server1 corosync]# pcs status corosync
>
> Membership information
> ----------------------
> Nodeid Votes Name
> 1868867776 1 server1 (local)
>
> [root at server1 ~]# pcs status
> Last updated: Wed Jan 30 10:45:17 2013
> Last change: Wed Jan 30 10:18:56 2013 via cibadmin on server1
> Stack: corosync
> Current DC: server1 (1868867776) - partition WITHOUT quorum
> Version: 1.1.8-3.fc18-394e906
> 1 Nodes configured, unknown expected votes
> 0 Resources configured.
>
>
> Online: [ server1 ]
>
> Full list of resources:
>
>
>
> And on the other server:
>
>
> [root at server2 corosync]# pcs status corosync
>
> Membership information
> ----------------------
> Nodeid Votes Name
> 1885644992 1 server2 (local)
>
> [root at server2 corosync]# pcs status
> Last updated: Wed Jan 30 10:44:40 2013
> Last change: Wed Jan 30 10:19:36 2013 via cibadmin on server2
> Stack: corosync
> Current DC: server2 (1885644992) - partition WITHOUT quorum
> Version: 1.1.8-3.fc18-394e906
> 1 Nodes configured, unknown expected votes
> 0 Resources configured.
>
>
> Online: [ server2 ]
>
>
>
>
>
>
> The only warnings and errors in the logfile are:
>
> [root at server1 ~]# cat /var/log/cluster/corosync.log | egrep "warning|error"
> Jan 30 10:25:59 [1608] server1 crmd: warning: do_log: FSA: Input
> I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Jan 30 10:25:59 [1607] server1 pengine: warning: cluster_status: We
> do not have quorum - fencing and resource management disabled
> Jan 30 10:28:25 [1525] server1 corosync debug [QUORUM] getinfo response
> error: 1
> Jan 30 10:40:59 [1607] server1 pengine: warning: cluster_status: We
> do not have quorum - fencing and resource management disabled
>
>
> [root at server2 corosync]# cat /var/log/cluster/corosync.log | egrep
> "warning|error"
> Jan 30 10:27:18 [1458] server2 crmd: warning: do_log: FSA: Input
> I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Jan 30 10:27:18 [1457] server2 pengine: warning: cluster_status: We
> do not have quorum - fencing and resource management disabled
> Jan 30 10:29:19 [1349] server2 corosync debug [QUORUM] getinfo response
> error: 1
> Jan 30 10:42:18 [1457] server2 pengine: warning: cluster_status: We
> do not have quorum - fencing and resource management disabled
> Jan 30 10:44:36 [1349] server2 corosync debug [QUORUM] getinfo response
> error: 1
>
>
>
>
> We have installed the following packages:
>
> corosync-2.2.0-1.fc18.i686
> corosynclib-2.2.0-1.fc18.i686
> drbd-bash-completion-8.3.13-1.fc18.i686
> drbd-pacemaker-8.3.13-1.fc18.i686
> drbd-utils-8.3.13-1.fc18.i686
> pacemaker-1.1.8-3.fc18.i686
> pacemaker-cli-1.1.8-3.fc18.i686
> pacemaker-cluster-libs-1.1.8-3.fc18.i686
> pacemaker-libs-1.1.8-3.fc18.i686
> pcs-0.9.27-3.fc18.i686
>
>
>
> Firewalls are disabled; pinging and SSH communication work without any
> problems.
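Since "firewalls are disabled" is exactly the assumption worth double-checking, a quick way to verify that nothing is filtering on either node (a sketch; assumes firewalld and the iptables tools are present):

```shell
# confirm firewalld really is stopped
systemctl is-active firewalld
# dump the live iptables ruleset; ideally only default ACCEPT policies remain
iptables -S
```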
>
> With best regards
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
--
Dan Frincu
CCNA, RHCE