[Pacemaker] 4th node does not return to cluster

Andrew Beekhof andrew at beekhof.net
Thu Sep 22 23:07:22 EDT 2011


Are you using:

> service {
>        # Load the Pacemaker Cluster Resource Manager
>        name: pacemaker
>        ver:  1
> }

for all of the nodes?
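
One rough way to check this on each node is sketched below. It assumes the
stanza lives in /etc/corosync/corosync.conf (on some installs it may instead
be in a file under /etc/corosync/service.d/), and pacemaker_service_ver is a
hypothetical helper name, not a corosync or pacemaker tool.

```shell
#!/bin/sh
# Hypothetical helper: print the "ver" value of the service block whose
# name is "pacemaker" in the corosync config file given as $1.
# Prints nothing if no such block is found.
pacemaker_service_ver() {
    awk '
        # Entering a service { ... } block: reset what we have seen so far.
        /^[[:space:]]*service[[:space:]]*[{]/ { in_service = 1; name = ""; ver = ""; next }
        # Inside the block, remember the name: and ver: values.
        in_service && /^[[:space:]]*name:/ { name = $2 }
        in_service && /^[[:space:]]*ver:/  { ver = $2 }
        # Closing brace: report ver only for the pacemaker service.
        in_service && /[}]/ {
            if (name == "pacemaker") print ver
            in_service = 0
        }
    ' "$1"
}
```

Running `pacemaker_service_ver /etc/corosync/corosync.conf` on every node and
comparing the results should show whether any node has a different value or no
pacemaker service block at all.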

On Wed, Aug 17, 2011 at 8:27 AM, Gabriel Gomiz
<ggomiz at cooperativaobrera.com.ar> wrote:
> Hi to all... :)
>
> We are experiencing some difficulties with a 4-node pacemaker cluster. 3
> nodes are OK, but the 4th node, after some corosync failures (with core
> dumps) and pacemaker restarts, does not return to the cluster.
>
> On the other 3 nodes the 4th appears online, but on the 4th node the CIB is
> empty when I display it with crm.
>
> Something odd in the logs is this kind of message:
>
> Aug 16 19:07:15 lorien.cooperativaobrera.com.ar cib: [28120]: WARN:
> cib_peer_callback: Discarding cib_modify message (421) from
> mordor.cooperativaobrera.com.ar: not in our membership
>
> It seems the 4th node does not consider itself a member of the cluster.
> How can I get it to rejoin?
>
> Any help you can give me will be highly appreciated.
>
> Many thanks in advance
>
> PS: If you need any additional logs, or tests I can run, etc., I'm happy to
> provide them.
>
> -----
>
> DATA:
>
> OS is CENTOS 6.0 64 bits
> PACEMAKER version 1.1.5
> COROSYNC 1.2.3-21
>
> NODE 1:
>
> [DB1] gandalf # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:05 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>     fs_dashboard       (ocf::heartbeat:Filesystem):    Started
> isildur.cooperativaobrera.com.ar
>     ip_dashboard       (ocf::heartbeat:IPaddr):        Started
> isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard        (lsb:httpd.dashboard):  Started
> isildur.cooperativaobrera.com.ar
>     srv_dashjobs       (lsb:dashjobs): Started
> isildur.cooperativaobrera.com.ar
>
> NODE 2:
>
> [DB2] isildur # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:28 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>     fs_dashboard       (ocf::heartbeat:Filesystem):    Started
> isildur.cooperativaobrera.com.ar
>     ip_dashboard       (ocf::heartbeat:IPaddr):        Started
> isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard        (lsb:httpd.dashboard):  Started
> isildur.cooperativaobrera.com.ar
>     srv_dashjobs       (lsb:dashjobs): Started
> isildur.cooperativaobrera.com.ar
>
> NODE 3:
>
> [VM1] mordor # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:40 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>     fs_dashboard       (ocf::heartbeat:Filesystem):    Started
> isildur.cooperativaobrera.com.ar
>     ip_dashboard       (ocf::heartbeat:IPaddr):        Started
> isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard        (lsb:httpd.dashboard):  Started
> isildur.cooperativaobrera.com.ar
>     srv_dashjobs       (lsb:dashjobs): Started
> isildur.cooperativaobrera.com.ar
>
> NODE 4:
>
> [VM2] lorien # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:54 2011
> Current DC: NONE
> 0 Nodes configured, unknown expected votes
> 0 Resources configured.
> ============
>
> LOGS ON NODE 4:
>
> <attached>
>
> COROSYNC CONFIG (NODE 4; the other nodes are the same apart from bindnetaddr):
>
> compatibility: whitetank
>
> totem {
>        version: 2
>        secauth: off
>        threads: 0
>        interface {
>                ringnumber: 0
>                bindnetaddr: 192.168.238.43
>                mcastaddr: 226.94.2.1
>                mcastport: 5405
>        }
> }
>
> logging {
>        fileline: off
>        to_stderr: no
>        to_logfile: yes
>        to_syslog: yes
>        logfile: /var/log/cluster/corosync.log
>        debug: off
>        timestamp: on
>        logger_subsys {
>                subsys: AMF
>                debug: off
>        }
> }
>
> amf {
>        mode: disabled
> }
>
> service {
>        # Load the Pacemaker Cluster Resource Manager
>        name: pacemaker
>        ver:  1
> }
>
> --
>      .^.    Lic. Gabriel Gomiz - Red Hat Certified Engineer (RHCE)
>      /V\    Jefe de Sistemas - Administrador Red y Servidores
>     // \\   Gerencia de Sistemas - Cooperativa Obrera Ltda.
>    /(   )\  Tel (0291) 456-0084
>     ^^-^^   s/Window[$s]/LINUX!!/g or die;
>
> PGP: http://admin.cooperativaobrera.com.ar/pgp/ggomiz.txt