[ClusterLabs] Sudden stop of pacemaker functions
Jan Pokorný
jpokorny at redhat.com
Wed Feb 17 13:48:46 UTC 2016
On 17/02/16 15:15 +0200, Klechomir wrote:
> Here is the output from your command:
>
> attrd: 609413
> cib: 609409
> corosync: 608778
> crmd: 609415
> lrmd: 609412
> pengine: 609414
> pacemakerd: 609407
> stonithd: 609411
This may mean that you are triggering this nasty bug in libqb:
https://github.com/ClusterLabs/libqb/pull/162
(fixed in libqb-0.17.2)
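
To check quickly whether an installed libqb already carries that fix,
the version can be compared against 0.17.2. Below is only a minimal
sketch: the helper names (parse_version, has_fix) are made up for
illustration, and it assumes the version string starts with the
upstream dotted number. Distribution packages may backport the fix
into an older upstream version, so a negative answer is a hint, not
a proof.

```python
import re

# Release named in this thread as containing the fix ("fixed in libqb-0.17.2").
FIXED_IN = (0, 17, 2)


def parse_version(text):
    """Return the leading dotted numeric part of a version string as a tuple of ints.

    Hypothetical helper: anything after the numeric part (e.g. "-1.el7") is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_fix(installed):
    """True if the given upstream version is at least 0.17.2."""
    return parse_version(installed) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("0.17.1", "0.17.2", "0.17.2-1.el7", "1.0"):
        print(candidate, has_fix(candidate))
```

The version string itself has to come from the package manager; on
RPM-based systems something like
rpm -q --queryformat '%{VERSION}' libqb
should print it.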
> Regarding using a newer version, that's what I've been thinking about, but
> I've been using this combination of corosync/pacemaker for many years on
> different hardware and never had a similar problem.
> The main difference is that I have stonith enabled only on the problematic
> cluster, but I also suspect that the node which causes this problem may
> have some hardware issues.
Stonith/fencing should be configured on every cluster in order to fully
deliver what HA clusters are for, full stop.
> BTW my last few tests with the newest corosync/pacemaker gave me a very
> annoying delay when committing configuration changes (maybe it's a known
> problem?).
I cannot comment on this, but it is definitely good to be aware of possible
performance regressions.
--
Jan (Poki)