[Pacemaker] Question about the behavior when a pacemaker's process crashed

Andrew Beekhof andrew at beekhof.net
Thu Jul 18 06:23:09 EDT 2013


On 17/07/2013, at 6:53 PM, Kazunori INOUE <inouekazu at intellilink.co.jp> wrote:

> (13.07.16 21:18), Andrew Beekhof wrote:
>> 
>> On 16/07/2013, at 7:04 PM, Kazunori INOUE <inouekazu at intellilink.co.jp> wrote:
>> 
>>> (13.07.15 11:00), Andrew Beekhof wrote:
>>>> 
>>>> On 12/07/2013, at 6:28 PM, Kazunori INOUE <inouekazu at intellilink.co.jp> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I'm using pacemaker-1.1.10.
>>>>> When one of Pacemaker's processes crashes, the node is sometimes fenced and sometimes not.
>>>>> Is this the expected behavior?
>>>> 
>>>> Yes.
>>>> 
>>>> Sometimes dev1 respawns the processes fast enough that dev2 gets the "hey, I'm back" notification before the PE (Policy Engine) runs and fencing can be initiated.
>>>> In such cases, there is nothing to be gained from fencing - dev1 is reachable and responding.
>>> 
>>> OK... but I want Pacemaker to behave consistently (either always fence or never fence), because the unpredictable outcome makes operation troublesome.
>>> I think it would be better if the user could specify the behavior as an option.
>> 
>> This makes no sense. Sorry.
>> It is wrong to induce more downtime than absolutely necessary just to make a test pass.
> 
> If the concern is the increase in downtime, wouldn't it be better to prevent fencing in this case?

With hindsight, yes.
But we have no way of knowing at the time.
If you want Pacemaker to wait some time for it to come back, you can set crmd-transition-delay, which will achieve the same thing it does for attrd.
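
For illustration, one way to set that option is with crm_attribute against the crm_config section. This is only a sketch: the 2s value is an assumption (pick a delay that suits your cluster), and the run helper and DRY_RUN variable are hypothetical conveniences so that the commands are printed by default instead of being executed.

```shell
#!/bin/sh
# Assumed delay value; not taken from this thread.
DELAY="2s"

# Hypothetical helper: with DRY_RUN=1 (the default) it only prints the
# command; with DRY_RUN=0 it actually runs it on a cluster node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the cluster option crmd-transition-delay.
run crm_attribute --type crm_config --name crmd-transition-delay --update "$DELAY"

# Read the value back to confirm it was stored.
run crm_attribute --type crm_config --name crmd-transition-delay --query
```

Running it with DRY_RUN=0 on a cluster node applies the setting. A longer delay gives the respawned process more time to rejoin before the transition is computed, at the cost of delaying every transition by the same amount.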

> Because pacemakerd respawns the crashed child process, the cluster will return to an online state.
> If so, doesn't the subsequent fencing increase downtime?

Yes, but we know that only because we have more knowledge than the cluster.

> 
> Best regards.
> 
>> 
>>>> 
>>>> It makes writing CTS tests hard, but it is not incorrect.
>>>> 
>>>>> 
>>>>> procedure:
>>>>> $ systemctl start pacemaker
>>>>> $ crm configure load update test.cli
>>>>> $ pkill -9 lrmd
>>>>> 
>>>>> attachment:
>>>>> STONITH.tar.bz2 : it's crm_report when fenced
>>>>> notSTONITH.tar.bz2 : it's crm_report when not fenced
>>>>> 
>>>>> Best regards.
>>>>> <notSTONITH.tar.bz2><STONITH.tar.bz2>_______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>> 
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org