[Pacemaker] never ending election

Tue Aug 5 15:42:39 CEST 2008

Unfortunately I cannot change anything at the moment, as the system
went into production.

As soon as I have the chance, I'll do it :-)
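
For reference, a minimal sketch of the change being asked for, assuming
ha.cf sits in the usual /etc/ha.d/ location (the path is an assumption,
not something confirmed in this thread):

```
# /etc/ha.d/ha.cf (assumed location)
# Raise heartbeat's debug level so the election debug messages are logged.
debug 1
```

Heartbeat would presumably need a restart on both nodes to pick this up,
which is why it has to wait.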

Best regards,
David
________________________________________________________________________

David Riccitelli

e-mail: david at interact.it
skype: ziodave
phone: +39.0658318336

  roma - tel.+39.0658318301 fax.+39.0658318303 P.I. 04856801008

Please consider the environment and don't print this e-mail unless you
really need to.

________________________________________________________________________

On 05/Aug/08, at 13:51, Andrew Beekhof wrote:

> On Sun, Aug 3, 2008 at 11:18, David Riccitelli <david at interact.it> wrote:
>> Hello there,
>> Can somebody help me with this problem?
>> I have 2 identical nodes, node #1 and node #2. Nodes are installed with
>> CentOS 5 and the current version of heartbeat (2.1.3) and pacemaker (0.6.5).
>> Each node has 2 network ports bonded together (mode 1). Bonding is
>> configured and working fine.
>> The nodes have one resource configured. And I must say everything works
>> fine. All the tests I'm running show perfect failovers, except one test:
>> 1. node #1 has the resource, node #2 is waiting,
>> 2. I remove both network cables from node #1,
>> 3. node #2 doesn't sense node #1 anymore and believes it is dead,
>> 4. node #2 brings up the resource,
>> 5. then I put node #1 back in the network - I believe the nodes should see
>> each other and one of the two will release the resource,
>> 6. node #1 and node #2 see each other and start counting election votes,
>> but for an indefinite time, and the resource is active on two nodes at the
>> same time:
>> logs (same on both nodes - this pattern repeats forever, until heartbeat is
>> manually stopped on one of the nodes):
>
> Is there any chance you could add "debug 1" to ha.cf and retest?
> It seems that the log messages that would shed light on this (the ones
> that indicate why each side felt they "win") are debug ones :(
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at clusterlabs.org
> http://list.clusterlabs.org/mailman/listinfo/pacemaker
