[Pacemaker] token lost - need clarification
Michael Schwartzkopff
ms at sys4.de
Tue Dec 17 08:28:51 UTC 2013
On Tuesday, 17 December 2013, 09:17:31, marco at nucleus.it wrote:
> Hi to all,
> I set up a 2-node cluster with a crossover cable between the two nodes,
> without STONITH; I know this is not the best way, but this is the
> scenario I needed at the time.
>
> I know the releases are old:
> corosync-1.2.7-1.2
> libcorosync-1.2.7-1.2
> pacemaker-1.0.10-1.4
> libpacemaker3-1.0.10-1.4
>
> Everything was OK for some days/months, but a few days ago, without any
> network interruption (no messages related to the ethernet modules, no
> errors in the network statistics, no notifications from the nagios ping
> checks), something went wrong between the two nodes.
>
> From what I can understand from the attached logs:
> Token Timeout (10000 ms) retransmit timeout (980 ms)
> token hold (774 ms) retransmits before loss (10 retrans)
>
>
> The 2 nodes lost a token and tried to resolve the situation, but
> node1 thinks node2 is up:
>
> Dec 7 05:01:41 node1 pengine: [1138]: info: determine_online_status:
> Node node2 is online
> Dec 7 05:01:41 node1 pengine: [1138]: info:
> determine_online_status: Node node1 is online
>
> and then lost
>
> Dec 7 05:01:54 node1 corosync[1128]: [pcmk ] info:
> ais_mark_unseen_peer_dead: Node node2 was not seen in the previous
> transition
> Dec 7 05:01:54 node1 corosync[1128]: [pcmk ] info: update_member:
> Node 33559980/node2 is now: lost
>
> while node2 thinks node1 is gone:
>
> Dec 7 05:01:34 node2 corosync[6356]: [pcmk ] info:
> ais_mark_unseen_peer_dead: Node node1 was not seen in the previous
> transition
> Dec 7 05:01:34 node2 corosync[6356]: [pcmk ] info: update_member:
> Node 16782764/node1 is now: lost
>
> Then they go into split brain.
> Any suggestion as to why node1 still saw node2 at first, while node2
> immediately declared node1 lost?
This depends on who initiates the round. Both nodes recognized the failure
within 20 seconds. That is OK, especially if you allow 10 seconds for a token
timeout.
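
To make the arithmetic concrete, here is a minimal sketch in Python that takes the two "is now: lost" timestamps from the quoted logs and compares the gap between them with the configured token timeout. The helper name parse_syslog_time and the year 2013 are assumptions for illustration (syslog timestamps carry no year); nothing here is part of corosync or pacemaker.

```python
from datetime import datetime

# "Token Timeout (10000 ms)" as reported in the quoted log.
TOKEN_TIMEOUT_MS = 10000

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def parse_syslog_time(stamp, year=2013):
    """Parse a 'Dec 7 05:01:34' style stamp (hypothetical helper).

    The year is assumed from the date of the thread.
    """
    month, day, clock = stamp.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(year, MONTHS[month], int(day), hour, minute, second)


# Time at which each node logged the other as "lost".
node2_lost_node1 = parse_syslog_time("Dec 7 05:01:34")
node1_lost_node2 = parse_syslog_time("Dec 7 05:01:54")

# Gap between the two detections, in seconds and in token timeouts.
gap_s = (node1_lost_node2 - node2_lost_node1).total_seconds()
token_timeouts = gap_s / (TOKEN_TIMEOUT_MS / 1000)

print(f"gap between the two 'lost' events: {gap_s:.0f} s")
print(f"that is {token_timeouts:.1f} token timeouts of {TOKEN_TIMEOUT_MS} ms")
```

With a token timeout this long, a gap of a couple of timeout periods between the two nodes' views is what one would expect rather than a sign of a second fault.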
Kind regards,
Michael Schwartzkopff
--
[*] sys4 AG
http://sys4.de, +49 (89) 30 90 46 64, +49 (162) 165 0044
Franziskanerstraße 15, 81669 München
Registered office: München, District Court München: HRB 199263
Executive board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
Chairman of the supervisory board: Florian Kirstein