[Pacemaker] [Problem] The movement of the resource is not possible.

Andrew Beekhof andrew at beekhof.net
Tue Nov 30 09:46:45 UTC 2010


On Mon, Nov 29, 2010 at 5:11 AM,  <renayama19661014 at ybb.ne.jp> wrote:
> Hi Andrew,
>
> Sorry for the late response.
>
>> I think the smartest thing to do here is drop the cib_scope_local flag from -f
>
>        if(do_force) {
>                crm_debug("Forcing...");
> /*              cib_options |= cib_scope_local|cib_quorum_override; */
>                cib_options |= cib_quorum_override;
>        }
>
>
> I tested the revision you suggested.
> The resource now moves correctly.
>
> Can this revision be applied to 1.0 as well?
> Or is that not possible because it would affect other things?

I have no objection to it being added to 1.0; it should be safe.
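
To make the effect of the revision concrete, here is a minimal standalone sketch of the flag handling before and after the change. It is not the crm_resource source: the enum values are placeholders chosen for illustration (the real definitions live in Pacemaker's CIB headers and may differ), and force_options_old() and force_options_new() are hypothetical helper names that only mirror the quoted if(do_force) block.

```c
/* Placeholder bit values for illustration only; the real definitions
 * are in Pacemaker's CIB headers and may differ. */
enum cib_call_options {
    cib_none            = 0x0,
    cib_scope_local     = 0x1,  /* assumed meaning: apply the update to the local CIB */
    cib_quorum_override = 0x2   /* assumed meaning: allow the update without quorum */
};

/* Hypothetical helper: behavior before the revision, -f sets both flags. */
static int force_options_old(int cib_options, int do_force)
{
    if (do_force) {
        cib_options |= cib_scope_local | cib_quorum_override;
    }
    return cib_options;
}

/* Hypothetical helper: behavior after the revision, -f only overrides quorum. */
static int force_options_new(int cib_options, int do_force)
{
    if (do_force) {
        cib_options |= cib_quorum_override;
    }
    return cib_options;
}
```

With these placeholders, force_options_new(cib_none, 1) has cib_quorum_override set and cib_scope_local clear, which is the intent of the revision: -f still works without quorum, but the update is no longer flagged as local-only.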

>
> Best Regards,
> Hideo Yamauchi.
>
> --- Andrew Beekhof <andrew at beekhof.net> wrote:
>
>> 2010/11/8  <renayama19661014 at ybb.ne.jp>:
>> > Hi,
>> >
>> > In a simple two-node configuration, I caused a failure (monitor error) in a resource.
>> >
>> > ============
>> > Last updated: Mon Nov  8 10:16:50 2010
>> > Stack: Heartbeat
>> > Current DC: srv02 (f80f87fd-cc09-43c7-80bc-8d9e96de376b) - partition WITHOUT quorum
>> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
>> > 2 Nodes configured, unknown expected votes
>> > 1 Resources configured.
>> > ============
>> >
>> > Online: [ srv01 srv02 ]
>> >
>> >  Resource Group: grpDummy
>> >     prmDummy1-1        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-2        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-3        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-4        (ocf::heartbeat:Dummy): Started srv02
>> >
>> > Migration summary:
>> > * Node srv02:
>> > * Node srv01:
>> >   prmDummy1-1: migration-threshold=1 fail-count=1
>> >
>> > Failed actions:
>> >    prmDummy1-1_monitor_30000 (node=srv01, call=7, rc=7, status=complete): not running
>> >
>> >
>> > After the resource failed over, I ran the following commands back to back.
>> >
>> > [root@srv01 ~]# crm_resource -C -r prmDummy1-1 -N srv01; crm_resource -M -r grpDummy -N srv01 -f -Q
>> >
>> > ============
>> > Last updated: Mon Nov  8 10:17:33 2010
>> > Stack: Heartbeat
>> > Current DC: srv02 (f80f87fd-cc09-43c7-80bc-8d9e96de376b) - partition WITHOUT quorum
>> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
>> > 2 Nodes configured, unknown expected votes
>> > 1 Resources configured.
>> > ============
>> >
>> > Online: [ srv01 srv02 ]
>> >
>> >  Resource Group: grpDummy
>> >     prmDummy1-1        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-2        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-3        (ocf::heartbeat:Dummy): Started srv02
>> >     prmDummy1-4        (ocf::heartbeat:Dummy): Started srv02
>> >
>> > Migration summary:
>> > * Node srv02:
>> > * Node srv01:
>> >
>> > But the resource does not move to node srv01.
>> >
>> > Does the "crm_resource -M" command have to be run only after waiting for the S_IDLE state?
>> >
>> > Or is this behavior a bug?
>> >
>> >  * I have attached the hb_report archive.
>>
>> So the problem here is that -f not only enables logic in
>> move_resource(), but also sets
>>
>>               cib_options |= cib_scope_local|cib_quorum_override;
>>
>> Combined with the fact that crm_resource -C is not synchronous in 1.0,
>> if you run -M on a non-DC node, the updates hit the local cib while
>> the cluster is re-probing the resource(s).
>> This results in the two CIBs getting out of sync:
>> Nov  8 10:17:15 srv01 crmd: [5367]: WARN: cib_native_callback: CIB command failed: Application of an update diff failed
>> Nov  8 10:17:15 srv01 crmd: [5367]: WARN: cib_native_callback: CIB command failed: Application of an update diff failed
>>
>> and the process of re-syncing them results in the behavior you saw.
>>
>> I think the smartest thing to do here is drop the cib_scope_local flag from -f
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>


