[Pacemaker] pacemaker node stuck offline

pacemaker at feystorm.net pacemaker at feystorm.net
Fri Mar 22 02:39:14 UTC 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/21/2013 11:15 AM, Andreas Kurz wrote:
> On 2013-03-21 14:31, Patrick Hemmer wrote:
>> I've got a 2-node cluster where it seems last night one of the nodes
>> went offline, and I can't see any reason why.
>>
>> Attached are the logs from the 2 nodes (the relevant timeframe seems to
>> be 2013-03-21 between 06:05 and 06:10).
>> This is on ubuntu 12.04
>
> Looks like your non-redundant cluster-communication was interrupted at
> around that time for whatever reason and your cluster split-brained.
>
> Does the drbd-replication use a different network-connection? If yes,
> why not using it for a redundant ring setup ... and you should use
> STONITH.
>
> I also wonder why you have defined "expected_votes='1'" in your
> cluster.conf.
>
> Regards,
> Andreas
But shouldn't it have recovered? The node shows as "OFFLINE", even
though it's clearly communicating with the rest of the cluster. What is
the procedure for getting the node back online? Is there anything other
than bouncing pacemaker?

Unfortunately, no: drbd does not use a different network connection.
These are 2 EC2 instances, so redundant connections aren't available.
Though since it is EC2, I could set up STONITH to whack the other
instance. The only problem here would be a race condition: the EC2 API
for shutting down or rebooting an instance isn't instantaneous, so both
nodes could end up sending the signal to reboot the other node before
either request takes effect.
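
To make the race concrete, here is a rough sketch of one common way to
break the tie: give each node a different delay before it shoots. This
is only a toy simulation, not Pacemaker or EC2 code. The names
`fence_delay` and `simulate_split`, the node names, and the 30 second
step are all made up for illustration, and the API latency is simply
assumed to be a fixed number of seconds.

```python
def fence_delay(node: str, nodes: list[str], step_seconds: float = 30.0) -> float:
    """Hypothetical tie-breaker: rank nodes by name, rank 0 shoots at once,
    every later rank waits one more step before shooting."""
    ranked = sorted(set(nodes))
    return ranked.index(node) * step_seconds


def simulate_split(nodes: list[str], api_latency: float,
                   step_seconds: float = 30.0) -> set[str]:
    """Simulate a split where every node tries to fence every other node.

    api_latency is the assumed time between sending the shutdown request
    and the target actually going down. Returns the surviving nodes.
    """
    never = float("inf")
    down_at: dict[str, float] = {}  # node -> time its shutdown completes

    # Walk the nodes in the order in which their delays expire.
    for shooter in sorted(nodes, key=lambda n: fence_delay(n, nodes, step_seconds)):
        fire_at = fence_delay(shooter, nodes, step_seconds)
        if down_at.get(shooter, never) <= fire_at:
            continue  # already shut down before it got to send its request
        for target in nodes:
            if target != shooter:
                landed = fire_at + api_latency
                down_at[target] = min(down_at.get(target, never), landed)

    return {n for n in nodes if n not in down_at}


if __name__ == "__main__":
    pair = ["node-a", "node-b"]
    print(simulate_split(pair, api_latency=5.0, step_seconds=0.0))   # no delay
    print(simulate_split(pair, api_latency=5.0, step_seconds=30.0))  # delay > latency
    print(simulate_split(pair, api_latency=45.0, step_seconds=30.0)) # delay < latency
```

With no delay both nodes fire before either shutdown lands, so neither
survives. With a delay longer than the API latency only the first
shooter survives. If the latency exceeds the delay, the race is back, so
the delay would have to be sized against the worst-case API latency.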

As for expected_votes=1, it's because it's a two-node cluster. Though I
apparently forgot to set the `two_node` attribute :-(

- -Patrick
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRS8RSAAoJED0CF0ckHb4J5/4IAIBTh92ySD9NatBjanOtvwIZ
G7ldoPD/o//pOD8A76ZzJnbN+m5PQ1cykpwuC6j+l+fHbkYlDHYEnjbrdRS2dJFY
i1PibEIIOjeEAiK9PmCphKQ2qbkrKJXB0QdFD0EZjFFeatNfx/MBHInTBVdFa5MI
wZ19qcNELxHZHsrAfgFxYGzKvA1mCVZuRhFXpMoZJ9vo3RUFT1GaLbLA/k8+NHgQ
qPbmiYR0RI1cB+HqWl/Hn+PpWnV9zrF/vcZXISHp+cWpZ+IxzmDowR6iIHP+tC7N
AslkXAfz4BlH0cuM2kjA9ZdkApzGttH7GkMyOrOQ4Rv8rV4teQjMtPogMcqdFuc=
=lYXu
-----END PGP SIGNATURE-----




