[Pacemaker] very slow pacemaker/corosync shutdown

Thu Sep 19 23:52:29 UTC 2013

On 19/09/2013, at 7:45 PM, David Lang <david at lang.hm> wrote:

> On Thu, 19 Sep 2013, Florian Crouzat wrote:
> 
>> On 19/09/2013 00:25, David Lang wrote:
>>> I'm frequently running into a problem that shutting down
>>> pacemaker/corosync takes a very long time (several minutes)
>> 
>> Just to be 100% sure, do you always respect the stop order? Pacemaker *then* CMAN/corosync?
> 
> 'service pacemaker stop' seems to take down cman as well, but frequently stalls before that.

logs?
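
To help collect those logs, the stop order asked about above (Pacemaker first, then CMAN) could be wrapped in a small script that times each step. This is only a sketch: `stop_in_order` and the `SERVICE_CMD` override are hypothetical names introduced here for illustration, and it assumes the init scripts are named `pacemaker` and `cman` as in this thread. If 'service pacemaker stop' has already taken cman down, the second step should simply report that.

```shell
#!/bin/sh
# Sketch: stop Pacemaker first, then CMAN, and time each step.
# SERVICE_CMD is a hypothetical override so the function can be dry-run
# (e.g. SERVICE_CMD=echo); it defaults to the real `service` command.
SERVICE_CMD="${SERVICE_CMD:-service}"

stop_in_order() {
    for svc in pacemaker cman; do
        start=$(date +%s)
        # Stop one service and wait for it to return before moving on.
        if ! $SERVICE_CMD "$svc" stop; then
            echo "stopping $svc failed" >&2
            return 1
        fi
        end=$(date +%s)
        echo "$svc stopped in $((end - start))s"
    done
}
```

Calling `stop_in_order` as root and keeping its output next to the system log from the same time window should show which of the two steps is the one that stalls.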

> 
> We are definitely not taking down cman ahead of time.
> 
> But we are seeing problems on some systems where we start everything up, verify that both nodes are seen, and then a day or so later notice that the two boxes are not communicating. This is one of the reasons we are looking at disabling multicast: the local networking people have 'interesting' ideas about multicast, and they may be causing problems.

This is quite likely the problem.
Multicast support in various parts of the hardware and software stacks seems to be getting worse and worse over time :(
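
One way to catch the "both nodes seen at startup, not communicating a day later" case sooner would be to check membership periodically. The sketch below counts the nodes on the `Online: [ ... ]` line that `crm_mon -1` prints. `count_online_nodes` is a hypothetical helper, and the exact line format is an assumption that may differ between Pacemaker versions.

```shell
#!/bin/sh
# Sketch: count the nodes listed on the "Online: [ ... ]" line.
# Assumption: the status output contains a line such as
#   Online: [ node1 node2 ]
count_online_nodes() {
    # Reads crm_mon-style text on stdin, prints the number of online nodes.
    awk '/^[ \t]*Online: \[/ {
             for (i = 1; i <= NF; i++)
                 if ($i != "Online:" && $i != "[" && $i != "]") n++
         }
         END { print n + 0 }'
}
```

Run from cron as `crm_mon -1 | count_online_nodes`, a result below 2 on a two-node cluster would indicate that the nodes have lost sight of each other.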

> 
> David Lang
> 