[Pacemaker] Time out issue while stopping resource in pacemaker

Mon Oct 13 23:46:26 UTC 2014

On 14 Oct 2014, at 5:11 am, Lax <lkota at cisco.com> wrote:

> Andrew Beekhof <andrew at ...> writes:
> 
> 
>> I'm guessing you don't have stonith?
>> 
>> The underlying philosophy is that the services pacemaker manages need to
>> exit before pacemaker can.
>> If the service can't stop, it would be dishonest of pacemaker to do so.
>> 
>> If you had fencing, it would have been able to clean up after a failed
>> stop and allow the rest of the cluster to continue.
> 
> Thanks Andrew. I have a 2-node setup, so I had to turn off stonith.

One does not imply the other. Stonith is arguably even more important for 2-node clusters.

> 
> One more thing: on another setup with the same configuration, while running
> pacemaker I keep getting 'gfs_controld[10744]: daemon cpg_join error
> retrying'. Even after I force-kill the pacemaker processes, reboot the
> server, and bring pacemaker back up, it keeps giving the cpg_join error. Is
> there any way to fix this issue?

That would be something for the gfs and/or corosync guys, I'm afraid.

> 
> Thanks
> Lax
