[Pacemaker] active standby failover
Andrew Beekhof
andrew at beekhof.net
Thu Apr 11 22:11:12 UTC 2013
On 10/04/2013, at 7:30 PM, Rus Hughes <russell.hughes at gmail.com> wrote:
> Hi,
>
> I hope I've got the right list, I'm still a little confused about where CMAN ends and Pacemaker begins!
Think of CMAN as some extra APIs for corosync.
Anything you would configure in Pacemaker when using corosync is still configured there when using CMAN.
Glad to hear you found the problem.
> We're using Pacemaker and CMAN on Oracle 6.3 to try and create an active/standby failover pair, but seem to have some annoying conditions that are making this tricky.
>
> We have 2 nodes, vfontopensips1 and vfontopensips2, that we want a VIP to float between based on the availability of a single daemon we have called OSP.
>
> We have a daemon called OSP that we want running at all times on both nodes. We don't want Pacemaker to stop/start it, so I believe the correct thing to do is configure it as unmanaged?
>
> We have one virtual IP that we want on one of the nodes running the OSP daemon.
>
> The ideal condition is vfontopensips1 has an instance of OSP and the VIP on it and vfontopensips2 has a running instance of OSP on it. If OSP dies or fails on vfontopensips1 we want the VIP to move to vfontopensips2 immediately; we don't want Pacemaker/CMAN to try to restart it.
>
> If OSP is then restarted/fixed manually on vfontopensips1 we'd like Pacemaker/CMAN to detect that monitor events are now working and mark the node as available, but not to move the VIP back to it unless there's a failure on vfontopensips2.
>
> Here's the output of crm configure show
>
> node vfontopensips1
> node vfontopensips2
> primitive ClusterIPPres ocf:heartbeat:IPaddr2 \
>         params ip="10.30.0.176" cidr_netmask="32" \
>         op monitor interval="5s"
> primitive osp ocf:netdev:osp \
>         params interval="1s" \
>         op monitor interval="5s" \
>         meta is-managed="false" migration-threshold="1" on-fail="standby"
> colocation osp-with-ip 200: osp ClusterIPPres
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.8-7.el6-394e906" \
>         cluster-infrastructure="cman" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1365509847"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100" \
>         migration-threshold="1" \
>         allow-migrate="true" \
>         failure-timeout="5s"
>
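
Two things stand out in this configuration. osp is a plain primitive, yet the crm_mon output further down reports it active on both nodes, and the colocation reads "place osp with ClusterIPPres" rather than the other way round. Also, on-fail is normally set on an operation rather than in meta. What follows is a minimal, untested sketch of one alternative. It assumes that cloning osp and colocating the IP with the clone gives the intended behaviour. The names osp-clone and ip-with-osp are made up for illustration, and this is not necessarily the fix that was eventually found.

```
# Sketch only, untested: osp-clone and ip-with-osp are hypothetical names.
primitive ClusterIPPres ocf:heartbeat:IPaddr2 \
        params ip="10.30.0.176" cidr_netmask="32" \
        op monitor interval="5s"
primitive osp ocf:netdev:osp \
        params interval="1s" \
        op monitor interval="5s" \
        meta is-managed="false"
# One osp instance per node, instead of one primitive active in two places.
clone osp-clone osp
# Reads "ClusterIPPres with osp-clone": the IP may only run where osp is active.
colocation ip-with-osp inf: ClusterIPPres osp-clone
```

With the constraint in this direction, a failed osp monitor on the node holding the VIP should leave the other node as the only valid location for ClusterIPPres, while resource-stickiness keeps it there once osp recovers.
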
> I've attached the OSP OCF file to this email.
>
> This is the output of crm_mon when both OSP instances are up and vfontopensips1 has the VIP
>
> *********************
>
> Last updated: Wed Apr 10 10:21:11 2013
> Last change: Tue Apr 9 16:39:00 2013 via cibadmin on vfontopensips1
> Stack: cman
> Current DC: vfontopensips1 - partition with quorum
> Version: 1.1.8-7.el6-394e906
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
>
>
> Online: [ vfontopensips1 vfontopensips2 ]
>
> ClusterIPPres (ocf::heartbeat:IPaddr2): Started vfontopensips1
> osp (ocf::netdev:osp): Started (unmanaged) [ vfontopensips1 vfontopensips2 ]
>
> *********************
>
> If OSP fails, vfontopensips1 loses the VIP but the VIP doesn't move to vfontopensips2. crm_mon outputs:
>
> *********************
>
> Last updated: Wed Apr 10 10:22:20 2013
> Last change: Tue Apr 9 16:39:00 2013 via cibadmin on vfontopensips1
> Stack: cman
> Current DC: vfontopensips1 - partition with quorum
> Version: 1.1.8-7.el6-394e906
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
>
>
> Online: [ vfontopensips1 vfontopensips2 ]
>
> osp (ocf::netdev:osp): Started (unmanaged) FAILED [ vfontopensips1 vfontopensips2 ]
>
> Failed actions:
> osp_monitor_5000 (node=vfontopensips1, call=96, rc=7, status=complete): not running
> ClusterIPPres_migrate_to_0 (node=vfontopensips1, call=161, rc=3, status=complete): unimplemented feature
> ClusterIPPres_migrate_from_0 (node=vfontopensips2, call=138, rc=3, status=complete): unimplemented feature
>
> *********************
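
The two ClusterIPPres failures carry rc=3, "unimplemented feature". The IPaddr2 agent apparently does not implement migrate_to or migrate_from, and allow-migrate="true" in rsc_defaults asks Pacemaker to use those actions for every resource. A sketch of the defaults without that setting, assuming a plain stop on one node and start on the other is acceptable for the VIP:

```
# Assumption: live migration is not needed for the VIP, so allow-migrate is dropped.
rsc_defaults $id="rsc-options" \
        resource-stickiness="100" \
        migration-threshold="1" \
        failure-timeout="5s"
```

The osp failure on the first line is a different case: rc=7 is the agent's ordinary "not running" result from the monitor operation.
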
>
> If we fix OSP the VIP then comes back up on vfontopensips1
>
> *********************
>
> Last updated: Wed Apr 10 10:23:26 2013
> Last change: Tue Apr 9 16:39:00 2013 via cibadmin on vfontopensips1
> Stack: cman
> Current DC: vfontopensips1 - partition with quorum
> Version: 1.1.8-7.el6-394e906
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
>
>
> Online: [ vfontopensips1 vfontopensips2 ]
>
> ClusterIPPres (ocf::heartbeat:IPaddr2): Started vfontopensips1
> osp (ocf::netdev:osp): Started (unmanaged) [ vfontopensips1 vfontopensips2 ]
>
> *********************
>
> Obviously this isn't the behaviour I'm after as OSP is up and available on vfontopensips2 so we'd like the VIP to move there..
>
> Any hints would be great please as this has been confusing me for a few days now!
>
> The versions we are using are:
>
> cman-3.0.12.1-49.el6.x86_64
> pacemaker-libs-1.1.8-7.el6.x86_64
> pacemaker-1.1.8-7.el6.x86_64
> pacemaker-cluster-libs-1.1.8-7.el6.x86_64
> pacemaker-cli-1.1.8-7.el6.x86_64
>
> Linux vfontopensips1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Cheers,
>
> Rus
> <osp>