[Pacemaker] problem with pacemaker/corosync on CentOS 6.3
fatcharly at gmx.de
Fri Jul 20 15:50:52 UTC 2012
Hi Jake,
I erased the files as mentioned and started the services. This is what I get on pilotpound from crm_mon:
============
Last updated: Fri Jul 20 17:45:58 2012
Last change:
Current DC: NONE
0 Nodes configured, unknown expected votes
0 Resources configured.
============
It looks like the system didn't join the cluster.
Any suggestions are welcome.
Kind regards
fatcharly
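
A quick way to check for the symptom shown above is to look for the "Current DC: NONE" line in crm_mon output. The following is only a minimal sketch: the function name has_dc is made up for illustration, and it assumes the "Current DC:" line has the same layout as in the output pasted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads crm_mon-style text on stdin and reports
# whether a DC (designated controller) is listed.
has_dc() {
    # Keep only the first "Current DC:" line; awk reads all input and
    # exits 0 even when nothing matches.
    dc_line="$(awk '/^Current DC:/ && !seen { print; seen = 1 }')"
    case "$dc_line" in
        "")
            echo "no DC line found"
            return 2
            ;;
        "Current DC: NONE"*)
            echo "no DC elected"
            return 1
            ;;
        *)
            echo "DC present: ${dc_line#Current DC: }"
            return 0
            ;;
    esac
}

# Example with the two lines pasted above (the node that did not join).
printf 'Current DC: NONE\n0 Nodes configured, unknown expected votes\n' | has_dc || true
```

On a live node this could be fed from a one-shot run, e.g. `crm_mon -1 | has_dc`, assuming crm_mon is on the PATH.
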
-------- Original Message --------
> Date: Fri, 20 Jul 2012 10:49:15 -0400 (EDT)
> From: Jake Smith <jsmith at argotec.com>
> To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> Subject: Re: [Pacemaker] problem with pacemaker/corosync on CentOS 6.3
>
> ----- Original Message -----
> > From: fatcharly at gmx.de
> > To: pacemaker at oss.clusterlabs.org
> > Sent: Friday, July 20, 2012 6:08:45 AM
> > Subject: [Pacemaker] problem with pacemaker/corosync on CentOS 6.3
> >
> > Hi,
> >
> > I'm using a pacemaker+corosync bundle to run a pound-based
> > load balancer. After an update to CentOS 6.3 there is a mismatch
> > in the node status. Via crm_mon on one node everything looks fine,
> > while on the other node everything is offline. Everything was fine
> > on CentOS 6.2.
> >
> > Node powerpound:
> >
> > ============
> > Last updated: Fri Jul 20 12:04:29 2012
> > Last change: Thu Jul 19 17:58:31 2012 via crm_attribute on pilotpound
> > Stack: openais
> > Current DC: powerpound - partition with quorum
> > Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
> > 2 Nodes configured, 2 expected votes
> > 7 Resources configured.
> > ============
> >
> > Online: [ powerpound pilotpound ]
> >
> > HA_IP_1 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_2 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_3 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_4 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_5 (ocf::heartbeat:IPaddr2): Started powerpound
> > Clone Set: pingclone [ping-gateway]
> > Started: [ pilotpound powerpound ]
> >
> >
> > Node pilotpound:
> >
> > ============
> > Last updated: Fri Jul 20 12:04:32 2012
> > Last change: Thu Jul 19 17:58:17 2012 via crm_attribute on pilotpound
> > Stack: openais
> > Current DC: NONE
> > 2 Nodes configured, 2 expected votes
> > 7 Resources configured.
> > ============
> >
> > OFFLINE: [ powerpound pilotpound ]
> >
> >
> >
> >
> >
> > from /var/log/messages on pilotpound:
> >
> > Jul 20 12:06:12 pilotpound cib[24755]: warning: cib_peer_callback:
> > Discarding cib_apply_diff message (35909) from powerpound: not in
> > our membership
> > Jul 20 12:06:12 pilotpound cib[24755]: warning: cib_peer_callback:
> > Discarding cib_apply_diff message (35910) from powerpound: not in
> > our membership
> >
> >
> >
> > How could this have happened, and what can I do to solve this problem?
>
> Pretty sure it had nothing to do with the upgrade. I had this the other day
> on Ubuntu 12.04 after a reboot of both nodes. I believe a couple of experts
> called it a "transient" bug. See:
> https://bugzilla.redhat.com/show_bug.cgi?id=820821
> https://bugzilla.redhat.com/show_bug.cgi?id=5040
>
> >
> > Any suggestions are welcome
>
> I fixed it by stopping/killing pacemaker/corosync on the offending node
> (pilotpound). Then I cleared these files out on the same node:
> rm /var/lib/heartbeat/crm/cib*
> rm /var/lib/pengine/*
>
> Then I restarted corosync/pacemaker and the node rejoined fine.
>
> HTH
>
> Jake
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
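
For reference, the recovery steps Jake describes above (stop the cluster stack on the offending node, clear the local CIB and pengine files, start again) could be sketched roughly as below. This is not a tested procedure. It assumes the services are controlled through init scripts named pacemaker and corosync; if Pacemaker is launched by the corosync plugin rather than by its own init script, the pacemaker lines do not apply. The run helper, the reset_node_state function and the DRY_RUN variable are made up for illustration, and `rm -f` is used instead of plain `rm` so that missing files are not an error. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: run on the offending node (pilotpound in this thread), never on
# the healthy one. DRY_RUN=1 (the default here) prints the commands instead of
# executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

reset_node_state() {
    # Assumption: SysV init scripts named "pacemaker" and "corosync" exist.
    run service pacemaker stop
    run service corosync stop

    # Paths taken from the thread. The globs are passed to "sh -c" so they
    # expand only when the command really runs.
    run sh -c 'rm -f /var/lib/heartbeat/crm/cib*'
    run sh -c 'rm -f /var/lib/pengine/*'

    run service corosync start
    run service pacemaker start
}

reset_node_state
```

Running it with `DRY_RUN=0` as root would execute the commands for real. The point of clearing the files is that the node should fetch a fresh copy of the configuration from its peer when it rejoins.
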