[Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

Gianluca Cecchi gianluca.cecchi at gmail.com
Tue Mar 11 19:32:59 EDT 2014


On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>
> On 8 Mar 2014, at 11:31 am, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:
>
>> I provoke a power off of ovirteng01. The fencing agent works ok on
>> ovirteng02 and reboots it.
>> I stop the boot of ovirteng01 at the grub prompt to simulate a problem
>> during boot (for example the system dropping to console mode due to a
>> filesystem problem).
>> In the meantime ovirteng02 becomes master of the drbd resource, but
>> doesn't start the group
>
> Can you attach the following file from ovirteng02:
>    /var/lib/pacemaker/pengine/pe-input-1082.bz2
>
> That will hold the answer
>

Thanks for your time, Andrew.
Here it is:
https://drive.google.com/file/d/0BwoPbcrMv8mvNXI0M0dYenlRUFU/edit?usp=sharing

I note this inside the file:
    <constraints>
      <rsc_colocation id="colocation-ovirt-ms_OvirtData-INFINITY"
                      rsc="ovirt" rsc-role="Started" score="INFINITY"
                      with-rsc="ms_OvirtData" with-rsc-role="Master"/>
      <rsc_order first="ms_OvirtData" first-action="promote"
                 id="order-ms_OvirtData-ovirt-mandatory" then="ovirt"
                 then-action="start"/>
      <rsc_location id="cli-ban-ovirt-on-ovirteng02.localdomain.local"
                    rsc="ovirt" role="Started" node="ovirteng02.localdomain.local"
                    score="-INFINITY"/>
      <rsc_location rsc="ms_OvirtData"
                    id="drbd-fence-by-handler-ovirt-ms_OvirtData">
        <rule role="Master" score="-INFINITY"
              id="drbd-fence-by-handler-ovirt-rule-ms_OvirtData">
          <expression attribute="#uname" operation="ne"
                      value="ovirteng02.localdomain.local"
                      id="drbd-fence-by-handler-ovirt-expr-ms_OvirtData"/>
        </rule>
      </rsc_location>
    </constraints>

Does this mean that a constraint was left over from a previous test,
so that ovirteng02 is unable to run the ovirt group?

Can I check previous pe-input files to find out when the constraint
was added?
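
One possible way to answer that question is to scan the saved policy
engine inputs for the constraint id. The following is only a minimal
sketch: it assumes the pe-input files are bzip2-compressed CIB XML, and
the function name find_constraint is made up for illustration.

```python
import bz2
import glob
import os
import xml.etree.ElementTree as ET


def find_constraint(pengine_dir, constraint_id):
    """Return the pe-input files (oldest first, by modification time)
    that contain an rsc_location constraint with the given id.

    Hypothetical helper: assumes each pe-input-*.bz2 file holds a
    bzip2-compressed CIB XML document.
    """
    hits = []
    paths = glob.glob(os.path.join(pengine_dir, "pe-input-*.bz2"))
    for path in sorted(paths, key=os.path.getmtime):
        # Decompress and parse the saved CIB snapshot.
        with bz2.open(path, "rb") as fh:
            root = ET.parse(fh).getroot()
        # Look for the location constraint anywhere in the document.
        for elem in root.iter("rsc_location"):
            if elem.get("id") == constraint_id:
                hits.append(path)
                break
    return hits
```

Calling find_constraint("/var/lib/pacemaker/pengine",
"cli-ban-ovirt-on-ovirteng02.localdomain.local") on ovirteng02 would
list the snapshots that contain the ban; the first entry would be the
earliest saved transition in which it appears.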

By the way, I just tested again powering off each node while it was
primary, and failover works as expected in both cases.
If I reproduce the scenario that failed above (power off of ovirteng01
while it is master and running the group), the group now starts
correctly on ovirteng02.
While keeping ovirteng01 (rebooted by the fencing agent) at the grub
prompt, the command "pcs cluster edit" shows this on ovirteng02:

    <constraints>
      <rsc_colocation id="colocation-ovirt-ms_OvirtData-INFINITY"
                      rsc="ovirt" rsc-role="Started" score="INFINITY"
                      with-rsc="ms_OvirtData" with-rsc-role="Master"/>
      <rsc_order first="ms_OvirtData" first-action="promote"
                 id="order-ms_OvirtData-ovirt-mandatory" then="ovirt"
                 then-action="start"/>
      <rsc_location rsc="ms_OvirtData"
                    id="drbd-fence-by-handler-ovirt-ms_OvirtData">
        <rule role="Master" score="-INFINITY"
              id="drbd-fence-by-handler-ovirt-rule-ms_OvirtData">
          <expression attribute="#uname" operation="ne"
                      value="ovirteng02.localdomain.local"
                      id="drbd-fence-by-handler-ovirt-expr-ms_OvirtData"/>
        </rule>
      </rsc_location>
    </constraints>

So the problem seems to be this constraint, which is present in the
pe-input file but not in the current configuration:

      <rsc_location id="cli-ban-ovirt-on-ovirteng02.localdomain.local"
                    rsc="ovirt" role="Started" node="ovirteng02.localdomain.local"
                    score="-INFINITY"/>

Is that correct?
Could it be the effect of a "pcs resource move ovirt" that was not
followed by a "pcs resource clear ovirt"?
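
To spot such leftovers in a CIB dump, something like the sketch below
might help. It assumes that constraints created by a move or ban carry
an id starting with "cli-ban-", as the one above does; the function
name cli_bans is invented for this example.

```python
import xml.etree.ElementTree as ET


def cli_bans(cib_xml):
    """Return (resource, node) pairs for -INFINITY location constraints
    whose id starts with "cli-ban-".

    Hypothetical helper: the "cli-ban-" prefix is taken from the id seen
    in this thread and is treated here as an assumption.
    """
    root = ET.fromstring(cib_xml)
    return [
        (elem.get("rsc"), elem.get("node"))
        for elem in root.iter("rsc_location")
        if (elem.get("id") or "").startswith("cli-ban-")
        and elem.get("score") == "-INFINITY"
    ]


sample = """
<constraints>
  <rsc_location id="cli-ban-ovirt-on-ovirteng02.localdomain.local"
                rsc="ovirt" role="Started"
                node="ovirteng02.localdomain.local" score="-INFINITY"/>
  <rsc_location rsc="ms_OvirtData"
                id="drbd-fence-by-handler-ovirt-ms_OvirtData"/>
</constraints>
"""

print(cli_bans(sample))
# prints [('ovirt', 'ovirteng02.localdomain.local')]
```

An empty list would suggest that no such ban is left in the
configuration being inspected.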

Gianluca



