[Pacemaker] resource agent starting out-of-order

Andrew Beekhof andrew at beekhof.net
Wed Mar 9 03:33:10 EST 2011


On Thu, Mar 3, 2011 at 7:05 AM, AP <pacemaker at inml.weebeastie.net> wrote:
> Hi,
>
> Having deep issues with my cluster setup. Everything works OK until
> I add a VirtualDomain RA. Then things go pear-shaped: it seems to
> ignore the "order" crm config for it and starts as soon as it can.
>
> The crm config is provided below. Basically p-vd_vg.test1 attempts to
> start despite p-libvirtd not being started and p-drbd_vg.test1 not
> being master (or slave for that matter - ie it's not configured at all).
>
> Eventually p-libvirtd and p-drbd_vg.test1 start and p-vd_vg.test1 attempts
> to start as well. pengine on the node where p-vd_vg.test1 is already running
> then complains with:
>
> Mar  3 16:49:16 breadnut pengine: [2097]: ERROR: native_create_actions: Resource p-vd_vg.test1 (ocf::VirtualDomain) is active on 2 nodes attempting recovery
> Mar  3 16:49:16 breadnut pengine: [2097]: WARN: See http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.

Well... did you read that link?

>
> Then mass slaughter occurs and p-vd_vg.test1 is restarted where it was
> running previously whilst the other node gets an error for it.
>
> Essentially I cannot restart the 2nd node without it breaking the 1st.
>
> Now, as I understand it, a lone primitive will run once on any node - this
> is just fine by me.
>
> colo-vd_vg.test1 indicates that p-vd_vg.test1 should run where ms-drbd_vg.test1
> is master. ms-drbd_vg.test1 should only be master where clone-libvirtd is
> started.
>
> order-vg.test1 indicates that ms-drbd_vg.test1 should start after clone-lvm_gh
> is started (successfully). (This used to have a promote for ms-drbd_vg.test1
> but then ms-drbd_vg.test1 would be demoted and not stopped on shutdown which
> would cause clone-lvm_gh to error out on stop)
>
> order-vd_vg.test1 indicates p-vd_vg.test1 should only start where
> ms-drbd_vg.test1 and clone-libvirtd have both successfully started (the
> order of their starting being irrelevant).
>
> cli-standby-p-vd_vg.test1 was put there by my migrating p-vd_vg.test1
> about the place.
>
> This happens with or without fencing and with fencing configured as below
> or as just a single primitive with both nodes in the hostlist.
>
> Help with this would be awesome and appreciated. I do not know what I am
> missing here. The config makes sense to me so I don't even know where
> to start poking and prodding. I be flailing.
>
> Config and s/w version list is below:
>
> OS: Debian Squeeze
> Kernel: 2.6.37.2
>
> PACKAGES:
>
> ii  cluster-agents                      1:1.0.4-0ubuntu1~custom1     The reusable cluster components for Linux HA
> ii  cluster-glue                        1.0.7-3ubuntu1~custom1       The reusable cluster components for Linux HA
> ii  corosync                            1.3.0-1ubuntu1~custom1       Standards-based cluster framework (daemon and modules)
> ii  libccs3                             3.1.0-0ubuntu1~custom1       Red Hat cluster suite - cluster configuration libraries
> ii  libcib1                             1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - CIB
> ii  libcman3                            3.1.0-0ubuntu1~custom1       Red Hat cluster suite - cluster manager libraries
> ii  libcorosync4                        1.3.0-1ubuntu1~custom1       Standards-based cluster framework (libraries)
> ii  libcrmcluster1                      1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - CRM
> ii  libcrmcommon2                       1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - common CRM
> ii  libfence4                           3.1.0-0ubuntu1~custom1       Red Hat cluster suite - fence client library
> ii  liblrm2                             1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- liblrm2
> ii  libpe-rules2                        1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - rules for P-Engine
> ii  libpe-status3                       1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - status for P-Engine
> ii  libpengine3                         1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - P-Engine
> ii  libpils2                            1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libpils2
> ii  libplumb2                           1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libplumb2
> ii  libplumbgpl2                        1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libplumbgpl2
> ii  libstonith1                         1.0.7-3ubuntu1~custom1       Reusable cluster libraries -- libstonith1
> ii  libstonithd1                        1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - stonith
> ii  libtransitioner1                    1.1.5-0ubuntu1~ppa1~custom1  The Pacemaker libraries - transitioner
> ii  pacemaker                           1.1.5-0ubuntu1~ppa1~custom1  HA cluster resource manager
>
> CONFIG:
>
> node breadnut
> node breadnut2 \
>        attributes standby="off"
> primitive fencing-bn stonith:meatware \
>        params hostlist="breadnut" \
>        op start interval="0" timeout="60s" \
>        op stop interval="0" timeout="70s" \
>        op monitor interval="10" timeout="60s"
> primitive fencing-bn2 stonith:meatware \
>        params hostlist="breadnut2" \
>        op start interval="0" timeout="60s" \
>        op stop interval="0" timeout="70s" \
>        op monitor interval="10" timeout="60s"
> primitive p-drbd_vg.test1 ocf:linbit:drbd \
>        params drbd_resource="vg.test1" \
>        operations $id="ops-drbd_vg.test1" \
>        op start interval="0" timeout="240s" \
>        op stop interval="0" timeout="100s" \
>        op monitor interval="20" role="Master" timeout="20s" \
>        op monitor interval="30" role="Slave" timeout="20s"
> primitive p-libvirtd ocf:local:libvirtd \
>        meta allow-migrate="off" \
>        op start interval="0" timeout="200s" \
>        op stop interval="0" timeout="100s" \
>        op monitor interval="10" timeout="200s"
> primitive p-lvm_gh ocf:heartbeat:LVM \
>        params volgrpname="gh" \
>        meta allow-migrate="off" \
>        op start interval="0" timeout="90s" \
>        op stop interval="0" timeout="100s" \
>        op monitor interval="10" timeout="100s"
> primitive p-vd_vg.test1 ocf:heartbeat:VirtualDomain \
>        params config="/etc/libvirt/qemu/vg.test1.xml" \
>        params migration_transport="tcp" \
>        meta allow-migrate="true" is-managed="true" \
>        op start interval="0" timeout="120s" \
>        op stop interval="0" timeout="120s" \
>        op migrate_to interval="0" timeout="120s" \
>        op migrate_from interval="0" timeout="120s" \
>        op monitor interval="10s" timeout="120s"
> ms ms-drbd_vg.test1 p-drbd_vg.test1 \
>        meta resource-stickines="100" notify="true" master-max="2" target-role="Master"
> clone clone-libvirtd p-libvirtd \
>        meta interleave="true"
> clone clone-lvm_gh p-lvm_gh \
>        meta interleave="true"
> location cli-standby-p-vd_vg.test1 p-vd_vg.test1 \
>        rule $id="cli-standby-rule-p-vd_vg.test1" -inf: #uname eq breadnut2
> location loc-fencing-bn fencing-bn -inf: breadnut
> location loc-fencing-bn2 fencing-bn2 -inf: breadnut2
> colocation colo-vd_vg.test1 inf: p-vd_vg.test1:Started ms-drbd_vg.test1:Master clone-libvirtd:Started
> order order-vd_vg.test1 inf: ( ms-drbd_vg.test1:start clone-libvirtd:start ) p-vd_vg.test1:start
> order order-vg.test1 inf: clone-lvm_gh:start ms-drbd_vg.test1:start
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>        cluster-infrastructure="openais" \
>        default-resource-stickiness="1000" \
>        stonith-enabled="true" \
>        expected-quorum-votes="2" \
>        no-quorum-policy="ignore" \
>        last-lrm-refresh="1299128317"
>
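
A minimal sketch, not a verified fix: one way to state the intent described
above is to replace the three-resource colocation and the set-based order
with pairwise constraints, and to order the VM after the DRBD promote rather
than the DRBD start. The constraint IDs are made up for illustration, the
resource names are taken from the config above, and none of this has been
tested against this cluster:

    # Hypothetical constraint IDs, untested; adjust before use.
    colocation colo-vd-with-drbd inf: p-vd_vg.test1 ms-drbd_vg.test1:Master
    colocation colo-vd-with-libvirtd inf: p-vd_vg.test1 clone-libvirtd
    order order-drbd-then-vd inf: ms-drbd_vg.test1:promote p-vd_vg.test1:start
    order order-libvirtd-then-vd inf: clone-libvirtd:start p-vd_vg.test1:start

Each line names exactly one dependency, so it is easier to check which
constraint is or is not being honoured than with a single resource set.
These would replace colo-vd_vg.test1 and order-vd_vg.test1; order-vg.test1
stays as it is.
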



