[Pacemaker] master-slave resource repeats restart
Kazunori INOUE
inouekazu at intellilink.co.jp
Tue Oct 16 07:08:31 UTC 2012
Hi David,
I opened a Bugzilla entry for this issue.
* http://bugs.clusterlabs.org/show_bug.cgi?id=5111
Best Regards,
Kazunori INOUE
(12.10.15 23:46), David Vossel wrote:
> ----- Original Message -----
>> From: "Kazunori INOUE" <inouekazu at intellilink.co.jp>
>> To: "pacemaker at oss" <pacemaker at oss.clusterlabs.org>
>> Cc: shimazakik at intellilink.co.jp
>> Sent: Monday, October 15, 2012 4:21:27 AM
>> Subject: [Pacemaker] master-slave resource repeats restart
>>
>> Hi,
>>
>> I am using Pacemaker-1.1.
>> - pacemaker f722cf1ff9 (2012 Oct10)
>> - corosync dc7002195a (2012 Oct11)
>>
>> If the monitor operation of a master resource (with on-fail="stop") fails,
>> the resource restarts repeatedly on the other node.
>
> Weird, so we stop the resource on all nodes, but then recover it on the nodes that didn't have the failure. That doesn't seem right. Please open a new issue in bugs.clusterlabs.org for this.
>
> -- Vossel
>
>> [test case]
>> 1. Use the Stateful RA with on-fail="stop" set for the Master's monitor operation.
>>
>> [configuration of master-slave resource]
>> ms msAP prmAP \
>> meta master-max="1" master-node-max="1" \
>> clone-max="2" clone-node-max="1"
>>
>> primitive prmAP ocf:pacemaker:Stateful \
>> :
>> op monitor role="Master" interval="10s" timeout="20s" on-fail="stop" \
>> :
>>
>> # crm_mon -rfA1
>> Last updated: Mon Oct 15 16:09:57 2012
>> Last change: Mon Oct 15 16:09:49 2012 via cibadmin on vm5
>> Stack: corosync
>> Current DC: vm5 (2439358656) - partition with quorum
>> Version: 1.1.8-f722cf1
>> 2 Nodes configured, unknown expected votes
>> 4 Resources configured.
>>
>>
>> Online: [ vm5 vm6 ]
>>
>> Full list of resources:
>>
>> Master/Slave Set: msAP [prmAP]
>> Masters: [ vm5 ]
>> Slaves: [ vm6 ]
>> Clone Set: clnPingd [prmPingd]
>> Started: [ vm5 vm6 ]
>>
>> Node Attributes:
>> * Node vm5:
>> + default_ping_set : 100
>> + master-prmAP : 10
>> * Node vm6:
>> + default_ping_set : 100
>> + master-prmAP : 5
>>
>> Migration summary:
>> * Node vm5:
>> * Node vm6:
>>
>> 2. Make the master resource on vm5 fail:
>>
>> # echo a >> /var/run/Stateful-prmAP.state
>>
>> The master-slave resource then restarts repeatedly on vm6,
>> cycling through states (a) to (c) below.
>>
>> (a)
>> Full list of resources:
>>
>> Master/Slave Set: msAP [prmAP]
>> Stopped: [ prmAP:0 prmAP:1 ]
>> Clone Set: clnPingd [prmPingd]
>> Started: [ vm5 vm6 ]
>>
>> Node Attributes:
>> * Node vm5:
>> + default_ping_set : 100
>> * Node vm6:
>> + default_ping_set : 100
>>
>> (b)
>> Full list of resources:
>>
>> Master/Slave Set: msAP [prmAP]
>> Slaves: [ vm6 ]
>> Stopped: [ prmAP:1 ]
>> Clone Set: clnPingd [prmPingd]
>> Started: [ vm5 vm6 ]
>>
>> Node Attributes:
>> * Node vm5:
>> + default_ping_set : 100
>> * Node vm6:
>> + default_ping_set : 100
>> + master-prmAP : 5
>>
>> (c)
>> Full list of resources:
>>
>> Master/Slave Set: msAP [prmAP]
>> Masters: [ vm6 ]
>> Stopped: [ prmAP:1 ]
>> Clone Set: clnPingd [prmPingd]
>> Started: [ vm5 vm6 ]
>>
>> Node Attributes:
>> * Node vm5:
>> + default_ping_set : 100
>> * Node vm6:
>> + default_ping_set : 100
>> + master-prmAP : 10
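
To make the (a) to (c) cycle above easier to watch, one snapshot of the crm_mon output can be reduced to a single word. This is only a rough sketch: the helper name msap_state is hypothetical, the set name msAP is hard-coded from the configuration above, and the parsing assumes the crm_mon text layout shown in this report.

```shell
# Hypothetical helper: classify one crm_mon snapshot (read from stdin)
# for the msAP master-slave set as "master", "slave" or "stopped".
# Assumes the text layout shown in this report (Pacemaker 1.1.8).
msap_state() {
    awk '
        # Start of the msAP block.
        /Master\/Slave Set: msAP/ { in_set = 1; next }
        # Any other "... Set:" line ends the msAP block.
        in_set && /Set:/          { in_set = 0 }
        in_set && /Masters:/      { m = 1 }
        in_set && /Slaves:/       { s = 1 }
        END {
            if (m) { print "master" }
            else if (s) { print "slave" }
            else { print "stopped" }
        }'
}
```

Polling with something like `while sleep 2; do crm_mon -rfA1 | msap_state; done` should then print stopped, slave, master over and over while the problem is occurring (the 2-second interval is an arbitrary choice).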
>>
>> Best Regards,
>> Kazunori INOUE
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started:
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>