[Pacemaker] 4th node does not return to cluster
Andrew Beekhof
andrew at beekhof.net
Fri Sep 23 03:07:22 UTC 2011
Are you using:
> service {
> # Load the Pacemaker Cluster Resource Manager
> name: pacemaker
> ver: 1
> }
for all of the nodes?
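
One way to check this across nodes, as a rough sketch rather than anything Pacemaker or Corosync ships: pull the pacemaker service stanza out of each node's corosync configuration and compare the "ver" values. The function names below are made up for illustration, and the parser only handles the simple flat "key: value" layout shown in this thread. Nodes that disagree on "ver", or that have no pacemaker stanza at all, are worth looking at first.

```python
import re


def service_blocks(conf_text):
    """Return one dict per top-level `service { ... }` block in the text."""
    blocks = []
    current = None  # dict being filled while inside a service block
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if current is None:
            # Only a `service {` opener is of interest outside a block.
            if re.fullmatch(r"service\s*\{", line):
                current = {}
        elif line == "}":
            blocks.append(current)
            current = None
        elif ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return blocks


def pacemaker_ver(conf_text):
    """Return the `ver` of the pacemaker service block, or None if absent."""
    for block in service_blocks(conf_text):
        if block.get("name") == "pacemaker":
            return block.get("ver")
    return None


# Hypothetical sample, modelled on the configuration quoted in this thread.
SAMPLE = """
totem {
    version: 2
}
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
"""

if __name__ == "__main__":
    print(pacemaker_ver(SAMPLE))
```

To use it, read each node's configuration file into a string and pass it to pacemaker_ver(); the file location is an assumption to verify on your own install.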
On Wed, Aug 17, 2011 at 8:27 AM, Gabriel Gomiz
<ggomiz at cooperativaobrera.com.ar> wrote:
> Hi to all... :)
>
> We are experiencing some difficulties with a 4-node Pacemaker cluster. 3
> nodes are OK, but the 4th node, after some corosync failures (with core
> dumps) and Pacemaker restarts, does not return to the cluster.
>
> On the other 3 nodes the 4th appears online, but on the 4th node itself the
> CIB is empty when I display it with crm.
>
> Something odd in the logs is this kind of message:
>
> Aug 16 19:07:15 lorien.cooperativaobrera.com.ar cib: [28120]: WARN:
> cib_peer_callback: Discarding cib_modify message (421) from
> mordor.cooperativaobrera.com.ar: not in our membership
>
> It seems the 4th node does not consider itself a member of the cluster.
> How can I get it to rejoin?
>
> Any help you can give me will be highly appreciated.
>
> Many thanks in advance
>
> PS: If you need any additional logs or want me to run any tests, I'm
> willing to do so.
>
> -----
>
> DATA:
>
> OS is CentOS 6.0, 64-bit
> PACEMAKER version 1.1.5
> COROSYNC 1.2.3-21
>
> NODE 1:
>
> [DB1] gandalf # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:05 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
> Resource Group: dashboard
>     fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>     ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>     srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 2:
>
> [DB2] isildur # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:28 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
> Resource Group: dashboard
>     fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>     ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>     srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 3:
>
> [VM1] mordor # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:40 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
> Resource Group: dashboard
>     fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>     ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>     srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>     srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 4:
>
> [VM2] lorien # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:54 2011
> Current DC: NONE
> 0 Nodes configured, unknown expected votes
> 0 Resources configured.
> ============
>
> LOGS ON NODE 4:
>
> <attached>
>
> CONFIG COROSYNC (NODE 4, other nodes are the same but changing bindnetaddr):
>
> compatibility: whitetank
>
> totem {
>     version: 2
>     secauth: off
>     threads: 0
>     interface {
>         ringnumber: 0
>         bindnetaddr: 192.168.238.43
>         mcastaddr: 226.94.2.1
>         mcastport: 5405
>     }
> }
>
> logging {
>     fileline: off
>     to_stderr: no
>     to_logfile: yes
>     to_syslog: yes
>     logfile: /var/log/cluster/corosync.log
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: off
>     }
> }
>
> amf {
>     mode: disabled
> }
>
> service {
>     # Load the Pacemaker Cluster Resource Manager
>     name: pacemaker
>     ver: 1
> }
>
> --
> .^. Lic. Gabriel Gomiz - Red Hat Certified Engineer (RHCE)
> /V\ Jefe de Sistemas - Administrador Red y Servidores
> // \\ Gerencia de Sistemas - Cooperativa Obrera Ltda.
> /( )\ Tel (0291) 456-0084
> ^^-^^ s/Window[$s]/LINUX!!/g or die;
>
> PGP: http://admin.cooperativaobrera.com.ar/pgp/ggomiz.txt