[Pacemaker] When STONITH is not completed, a resource starts.

Andrew Beekhof beekhof at gmail.com
Wed Jan 14 12:37:56 UTC 2009


On Wed, Jan 14, 2009 at 09:59,  <renayama19661014 at ybb.ne.jp> wrote:
> Hi,
>
>> > 1)I make it the state that a resource starts in a standby node.
>> > 2)I change it so that a stop error occurs in a dummy resource.
>> > 3)I generate the monitor error of the dummy resource in a standby
>> > node.
>> > 4)After a stop error, STONITH is carried out by a partner node.
>> > 5)Keep STONITH from a standby node waiting.
>> > 6)While STONITH is not completed, I reboot a standby node.
>>
>> Is this in a two-node cluster?
> Yes.
>
>> > Though STONITH from a DC node does not succeed, a resource is started.
>> > When STONITH did not succeed, the resource was not started at a non-
>> > DC node.
>>
>> I don't understand what you're saying here.
>> The first statement says a resource was started and the second says it
>> wasn't... they can't both be true.
>
> I'm sorry.
> It caused misunderstanding.
>
> It is time when STONITH is carried out in the environment of two nodes by a standby node.
>
> A resource is started without waiting for completion of STONITH from a DC node.
> While STONITH is not completed, this problem happens if an active node fell.

So let me see if I understand this correctly...

You start with two healthy nodes.

You cause a resource on A to fail, at which point B tries to shoot A.

The stonith op never completes and, before it times out, you restart B.

Resources get started on B.
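
To make the expectation behind this scenario concrete, here is a minimal sketch of the rule being tested: a resource whose stop failed on A should not be started on B until fencing of A is confirmed. This is only an illustrative model, not Pacemaker's actual code; the names `Fencing`, `Node` and `can_start_elsewhere` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Fencing(Enum):
    """Hypothetical fencing states for a node (not Pacemaker's internal states)."""
    NOT_NEEDED = "not needed"   # node is healthy, nothing to do
    PENDING = "pending"         # stonith op was requested but has not completed
    CONFIRMED = "confirmed"     # stonith op completed successfully


@dataclass
class Node:
    name: str
    fencing: Fencing = Fencing.NOT_NEEDED


def can_start_elsewhere(previous_host: Node) -> bool:
    """Return True only if it is safe to start the resource on another node.

    Assumption: while fencing of the previous host is still pending, the
    resource may still be active there, so starting it elsewhere is unsafe.
    """
    return previous_host.fencing in (Fencing.NOT_NEEDED, Fencing.CONFIRMED)


# Scenario from this thread: the stop fails on A, so A needs to be shot.
node_a = Node("A", Fencing.PENDING)
start_while_pending = can_start_elsewhere(node_a)

# Only after the stonith op is confirmed may B start the resource.
node_a.fencing = Fencing.CONFIRMED
start_after_confirm = can_start_elsewhere(node_a)
```

In this model, restarting B must not reset A's state to `NOT_NEEDED`; if it did, the resource would start on B while A is still unfenced, which is the behaviour being reported.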

Questions:

Is the above accurate?
Is only the dummy resource started, or are other ones started too?
When B comes up again, does it form a two-node cluster with A?
Is A still up or has it become the DC and shot itself?

>
> I confirmed the same confirmation based on OpenAIS.
> However, in OpenAIS, the same problem did not occur.
> In OpenAIS, the start of the resource is evaded well.

Sorry, parsing error... I can't tell if you're saying the problem also
exists for clusters based on OpenAIS.
I think you're saying it does not happen if you use OpenAIS instead of
Heartbeat.

>
> --- Andrew Beekhof <beekhof at gmail.com> wrote:
>
>>
>> On Jan 14, 2009, at 2:52 AM, <renayama19661014 at ybb.ne.jp> wrote:
>>
>> > Hi,
>> >
>> > About movement of STONITH, I tested it.
>> > (heartbeat 2.99.2 + Pacemaker-1-0-6fd0eebd186e.tar.gz on
>> > RHEL5.2(i386VM))
>> >
>> > When what I confirmed carries out STONITH from a DC node and a non-
>> > DC node.
>> >
>> > I confirmed it in the next flow.
>> >
>> > 1)I make it the state that a resource starts in a standby node.
>> > 2)I change it so that a stop error occurs in a dummy resource.
>> > 3)I generate the monitor error of the dummy resource in a standby
>> > node.
>> > 4)After a stop error, STONITH is carried out by a partner node.
>> > 5)Keep STONITH from a standby node waiting.
>> > 6)While STONITH is not completed, I reboot a standby node.
>>
>> Is this in a two-node cluster?
>>
>> > I watched log.
>>
>> >
>> > Though STONITH from a DC node does not succeed, a resource is started.
>> > When STONITH did not succeed, the resource was not started at a non-
>> > DC node.
>>
>> I don't understand what you're saying here.
>> The first statement says a resource was started and the second says it
>> wasn't... they can't both be true.
>>
>> >
>> >
>> > ---------------------------------------------------------------------------
>> > Jan 13 16:01:25 ais-1 crmd: [6003]: info: send_rsc_command: Initiating action 7: start prmDummy1_start_0 on ais-1
>> > ---------------------------------------------------------------------------
>> >
>> > When STONITH did not succeed, I thought that the resource did not
>> > start.
>> > Does not the behavior when STONITH failed from a DC node have a
>> > problem?
>> >
>> > I attach a result of hb_report.
>> > - stonith_exec_dc.tar.gz (A result when STONITH was carried out by a
>> > DC node(ais-1))
>> > - stonith_exec_nodc.tar.gz(A result when STONITH was carried out by
>> > a non-DC node(ais-1))
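
When comparing the two hb_report results, it may help to pull the "Initiating action" entries out of the logs and check which node each start was sent to. The following is a rough sketch that parses the crmd log line quoted above; the regular expression is an assumption based on that single line, and `parse_initiated_action` is a hypothetical helper, not part of any Pacemaker tool.

```python
import re

# Pattern inferred from the one crmd log line quoted in this thread;
# other Pacemaker versions may format this message differently.
ACTION_RE = re.compile(
    r"crmd: \[(?P<pid>\d+)\]: info: send_rsc_command: "
    r"Initiating action (?P<action>\d+): (?P<op>\w+)\s+"
    r"(?P<key>\S+) on (?P<node>\S+)"
)


def parse_initiated_action(line):
    """Return a dict describing the initiated action, or None if no match."""
    # Collapse whitespace so a log line wrapped by a mail client still matches.
    normalized = " ".join(line.split())
    match = ACTION_RE.search(normalized)
    return match.groupdict() if match else None


line = (
    "Jan 13 16:01:25 ais-1 crmd: [6003]: info: send_rsc_command: "
    "Initiating action 7: start prmDummy1_start_0 on ais-1"
)
parsed = parse_initiated_action(line)
```

Filtering the parsed entries for `op == "start"` and comparing their timestamps with the stonith messages would show whether the start was initiated before the stonith op completed.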



