[Pacemaker] [Problem] About the replacement of the master/slave resource.
Andrew Beekhof
andrew at beekhof.net
Tue Sep 11 11:21:44 UTC 2012
On Mon, Sep 10, 2012 at 4:42 PM, <renayama19661014 at ybb.ne.jp> wrote:
> Hi All,
>
> We tested how a failure of a clone resource affects a Master/Slave
> resource that is combined with it.
>
> When the clone resource fails, the Master and Slave instances of the
> Master/Slave resource swap nodes.
>
> We reproduced the problem with the following procedure.
>
>
> Step1) We start the cluster and load the CIB.
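>
> For example, a prepared CIB can be loaded with (the file name here is
> an assumption):
>
>   cibadmin --replace --xml-file cib.xml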
>
> ============
> Last updated: Mon Sep 10 15:26:25 2012
> Stack: Heartbeat
> Current DC: drbd2 (08607c71-da7b-4abf-b6d5-39ee39552e89) - partition with quorum
> Version: 1.0.12-c6770b8
> 2 Nodes configured, unknown expected votes
> 6 Resources configured.
> ============
>
> Online: [ drbd1 drbd2 ]
>
> Resource Group: grpPostgreSQLDB
>     prmApPostgreSQLDB (ocf::pacemaker:Dummy): Started drbd1
> Resource Group: grpStonith1
>     prmStonith1-2 (stonith:external/ssh): Started drbd2
>     prmStonith1-3 (stonith:meatware): Started drbd2
> Resource Group: grpStonith2
>     prmStonith2-2 (stonith:external/ssh): Started drbd1
>     prmStonith2-3 (stonith:meatware): Started drbd1
> Master/Slave Set: msDrPostgreSQLDB
>     Masters: [ drbd1 ]
>     Slaves: [ drbd2 ]
> Clone Set: clnDiskd1
>     Started: [ drbd1 drbd2 ]
> Clone Set: clnPingd
>     Started: [ drbd1 drbd2 ]
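>
> For reference, a minimal sketch (crm shell syntax) of the kind of
> constraint this configuration implies; the resource and attribute
> names match the output above, but the constraint ID and the
> threshold are assumptions:
>
>   # keep the Master role away from any node whose ping attribute
>   # is missing or too low (threshold assumed)
>   location loc-msDrPostgreSQLDB msDrPostgreSQLDB \
>       rule $role=Master -inf: not_defined default_ping_set \
>       or default_ping_set lt 100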
>
> Step2) We cause a monitor failure in pingd on drbd1.
>
> [root@drbd1 ~]# rm -rf /var/run/pingd-default_ping_set
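>
> Given the path, this is presumably the state file the pingd daemon
> keeps at runtime, so the next recurring monitor on drbd1 reports
> rc=7 (OCF_NOT_RUNNING), as seen in Step 3. The failure can be
> observed with, for example:
>
>   crm_mon -1 -f
>
> which prints the cluster status once, including fail counts.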
>
> Step3) Failover completes.
>
> ============
> Last updated: Mon Sep 10 15:27:08 2012
> Stack: Heartbeat
> Current DC: drbd2 (08607c71-da7b-4abf-b6d5-39ee39552e89) - partition with quorum
> Version: 1.0.12-c6770b8
> 2 Nodes configured, unknown expected votes
> 6 Resources configured.
> ============
>
> Online: [ drbd1 drbd2 ]
>
> Resource Group: grpPostgreSQLDB
>     prmApPostgreSQLDB (ocf::pacemaker:Dummy): Started drbd2
> Resource Group: grpStonith1
>     prmStonith1-2 (stonith:external/ssh): Started drbd2
>     prmStonith1-3 (stonith:meatware): Started drbd2
> Resource Group: grpStonith2
>     prmStonith2-2 (stonith:external/ssh): Started drbd1
>     prmStonith2-3 (stonith:meatware): Started drbd1
> Master/Slave Set: msDrPostgreSQLDB
>     Masters: [ drbd2 ]
>     Stopped: [ prmDrPostgreSQLDB:1 ]
> Clone Set: clnDiskd1
>     Started: [ drbd1 drbd2 ]
> Clone Set: clnPingd
>     Started: [ drbd2 ]
>     Stopped: [ prmPingd:0 ]
>
> Failed actions:
> prmPingd:0_monitor_10000 (node=drbd1, call=14, rc=7, status=complete): not running
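>
> For reference, rc=7 is OCF_NOT_RUNNING. After the cause is repaired,
> the failure can be cleared (so that pingd is restarted on drbd1)
> with, for example:
>
>   crm resource cleanup prmPingd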
>
>
>
> However, the log shows that the Master and Slave instances were swapped:
>
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Move resource prmApPostgreSQLDB (Started drbd1 -> drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmStonith1-2 (Started drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmStonith1-3 (Started drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmStonith2-2 (Started drbd1)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmStonith2-3 (Started drbd1)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Move resource prmDrPostgreSQLDB:0 (Master drbd1 -> drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Stop resource prmDrPostgreSQLDB:1 (drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmDiskd1:0 (Started drbd1)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmDiskd1:1 (Started drbd2)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Stop resource prmPingd:0 (drbd1)
> Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave resource prmPingd:1 (Started drbd2)
>
> This swap is unnecessary: the Slave should simply be promoted to Master, and the failed Master should only be stopped.
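>
> The decision can be replayed offline from the pengine input that
> produced this transition; crm_simulate is the Pacemaker 1.1 tool,
> ptest plays the same role on 1.0, and the file name below is an
> assumption:
>
>   crm_simulate -S -x /var/lib/pengine/pe-input-42.bz2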
>
> However, this problem seems to be fixed in Pacemaker 1.1.
>
> Will a fix be possible for Pacemaker 1.0?
> Since the placement processing differs greatly between Pacemaker 1.0 and 1.1, I suspect that backporting a fix would be difficult.
You're probably right. I will have a look soon.
>
> * This problem may already have been reported as a known issue.
> * I have registered it in Bugzilla:
> * http://bugs.clusterlabs.org/show_bug.cgi?id=5103
great :)
>
> Best Regards,
> Hideo Yamauchi.
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org