[Pacemaker] Order of resources in a group and crm_diff

Gao,Yan ygao at suse.com
Fri Jun 6 01:21:29 EDT 2014


On 01/29/14 13:44, Andrew Beekhof wrote:
> 
> On 28 Jan 2014, at 10:11 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 
>> Hi all,
>>
>> I just discovered that when I add a resource to the middle of a
>> (running) group, it is added to the end.
>>
>> I mean, if I update the following (crmsh syntax)
>>
>> group dhcp-server vip-10-5-200-244 dhcpd
>>
>> with
>>
>> group dhcp-server vip-10-5-200-244 vip-10-5-201-244 dhcpd
>>
>> with 'crm configure load update', the actual definition becomes
>>
>> group dhcp-server vip-10-5-200-244 dhcpd vip-10-5-201-244
>>
>> Also, strangely enough, if I get the XML CIB with cibadmin -Q and then edit the
>> order of the primitives with a text editor, crm_diff doesn't show any differences:
>>
>> cib-orig.xml:
>> ...
>>      <group id="dhcp-server">
>>        <primitive id="vip-10-5-200-244" class="ocf" provider="heartbeat" type="IPaddr2">
>>          <instance_attributes id="vip-10-5-200-244-instance_attributes">
>>            <nvpair name="ip" value="10.5.200.244" id="vip-10-5-200-244-instance_attributes-ip"/>
>>            <nvpair name="cidr_netmask" value="32" id="vip-10-5-200-244-instance_attributes-cidr_netmask"/>
>>            <nvpair name="nic" value="vlan1" id="vip-10-5-200-244-instance_attributes-nic"/>
>>          </instance_attributes>
>>          <operations>
>>            <op name="start" interval="0" timeout="20" id="vip-10-5-200-244-start-0"/>
>>            <op name="stop" interval="0" timeout="20" id="vip-10-5-200-244-stop-0"/>
>>            <op name="monitor" interval="30" id="vip-10-5-200-244-monitor-30"/>
>>          </operations>
>>        </primitive>
>>        <primitive id="dhcpd" class="lsb" type="dhcpd">
>>          <operations>
>>            <op name="monitor" interval="10" timeout="15" id="dhcpd-monitor-10"/>
>>            <op name="start" interval="0" timeout="90" id="dhcpd-start-0"/>
>>            <op name="stop" interval="0" timeout="90" id="dhcpd-stop-0"/>
>>          </operations>
>>          <meta_attributes id="dhcpd-meta_attributes">
>>            <nvpair id="dhcpd-meta_attributes-target-role" name="target-role" value="Started"/>
>>          </meta_attributes>
>>        </primitive>
>>        <primitive id="vip-10-5-201-244" class="ocf" provider="heartbeat" type="IPaddr2">
>>          <instance_attributes id="vip-10-5-201-244-instance_attributes">
>>            <nvpair name="ip" value="10.5.201.244" id="vip-10-5-201-244-instance_attributes-ip"/>
>>            <nvpair name="cidr_netmask" value="24" id="vip-10-5-201-244-instance_attributes-cidr_netmask"/>
>>            <nvpair name="nic" value="vlan201" id="vip-10-5-201-244-instance_attributes-nic"/>
>>          </instance_attributes>
>>          <operations>
>>            <op name="start" interval="0" timeout="20" id="vip-10-5-201-244-start-0"/>
>>            <op name="stop" interval="0" timeout="20" id="vip-10-5-201-244-stop-0"/>
>>            <op name="monitor" interval="30" id="vip-10-5-201-244-monitor-30"/>
>>          </operations>
>>        </primitive>
>>      </group>
>> ...
>>
>> cib.xml:
>> ...
>>      <group id="dhcp-server">
>>        <primitive id="vip-10-5-200-244" class="ocf" provider="heartbeat" type="IPaddr2">
>>          <instance_attributes id="vip-10-5-200-244-instance_attributes">
>>            <nvpair name="ip" value="10.5.200.244" id="vip-10-5-200-244-instance_attributes-ip"/>
>>            <nvpair name="cidr_netmask" value="32" id="vip-10-5-200-244-instance_attributes-cidr_netmask"/>
>>            <nvpair name="nic" value="vlan1" id="vip-10-5-200-244-instance_attributes-nic"/>
>>          </instance_attributes>
>>          <operations>
>>            <op name="start" interval="0" timeout="20" id="vip-10-5-200-244-start-0"/>
>>            <op name="stop" interval="0" timeout="20" id="vip-10-5-200-244-stop-0"/>
>>            <op name="monitor" interval="30" id="vip-10-5-200-244-monitor-30"/>
>>          </operations>
>>        </primitive>
>>        <primitive id="vip-10-5-201-244" class="ocf" provider="heartbeat" type="IPaddr2">
>>          <instance_attributes id="vip-10-5-201-244-instance_attributes">
>>            <nvpair name="ip" value="10.5.201.244" id="vip-10-5-201-244-instance_attributes-ip"/>
>>            <nvpair name="cidr_netmask" value="24" id="vip-10-5-201-244-instance_attributes-cidr_netmask"/>
>>            <nvpair name="nic" value="vlan201" id="vip-10-5-201-244-instance_attributes-nic"/>
>>          </instance_attributes>
>>          <operations>
>>            <op name="start" interval="0" timeout="20" id="vip-10-5-201-244-start-0"/>
>>            <op name="stop" interval="0" timeout="20" id="vip-10-5-201-244-stop-0"/>
>>            <op name="monitor" interval="30" id="vip-10-5-201-244-monitor-30"/>
>>          </operations>
>>        </primitive>
>>        <primitive id="dhcpd" class="lsb" type="dhcpd">
>>          <operations>
>>            <op name="monitor" interval="10" timeout="15" id="dhcpd-monitor-10"/>
>>            <op name="start" interval="0" timeout="90" id="dhcpd-start-0"/>
>>            <op name="stop" interval="0" timeout="90" id="dhcpd-stop-0"/>
>>          </operations>
>>          <meta_attributes id="dhcpd-meta_attributes">
>>            <nvpair id="dhcpd-meta_attributes-target-role" name="target-role" value="Started"/>
>>          </meta_attributes>
>>        </primitive>
>>      </group>
>> ...
>>
>> # crm_diff --original cib-orig.xml --new cib.xml
>>
>> shows nothing.
>>
>> And, 'cibadmin --replace --xml-file cib.xml' does nothing:
>>
>> Jan 28 11:01:21 booter-0 cib[2693]:   notice: cib:diff: Diff: --- 0.427.2
>> Jan 28 11:01:21 booter-0 cib[2693]:   notice: cib:diff: Diff: +++ 0.427.19 df366a02885285cc95529f402bfdac12
>> Jan 28 11:01:21 booter-0 cib[2693]:   notice: cib:diff: --           <nvpair id="status-2-shutdown" name="shutdown" value="0"/>
>> Jan 28 11:01:21 booter-0 cib[2693]:   notice: cib:diff: ++ <cib epoch="427" num_updates="19" admin_epoch="0" validate-with="pacemaker-1.2" cib-last-written="Tue Jan 28 10:46:06 2014" update-origin="booter-0" update-client="cibadmin" crm_feature_set="3.0.8" have-quorum="1" dc-uuid="1"/>
> 
> That's a known deficiency in the v1 diff format (and why we need costly digests to detect ordering changes).
> Happily, 1.1.12 will have a new and improved diff format that will handle this correctly.
> 
>>
>> But, after I do
>>
>> # crm_shadow --create-empty myShadow
>> shadow[myShadow] # cibadmin -E --force
>> shadow[myShadow] # cibadmin --replace --xml-file cib.xml
>> shadow[myShadow] # crm_shadow --commit myShadow --force
>> Now type Ctrl-D to exit the crm_shadow shell
>> shadow[myShadow] # exit
>>
>> the group becomes defined in the proper order.
>>
>> That's why my only suspect is the XML diff algorithm.
>>
>> Andrew, David, could you please look?
> 
> It's also partly how crmsh is using diffs.
> It could check that the diff produces the correct result by verifying the above-mentioned digest.
> Or it could do a replace for the group instead...
I'm a bit surprised that even a replace cannot successfully reorder
resources in a group. I tried it on 1.1.9 through 1.1.11.
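
To illustrate Andrew's point about ordering changes, here is a minimal
Python sketch. It is not Pacemaker's diff code: index_by_id() and
digest() are hypothetical helpers, the XML is a trimmed copy of the
group from this thread, and MD5 is used only as a convenient hash. A
comparison keyed on element ids sees no change when siblings are
reordered, while a digest of the serialized tree does.

```python
import hashlib
import xml.etree.ElementTree as ET

# Trimmed-down versions of the two group definitions from the thread.
ORIG = (
    '<group id="dhcp-server">'
    '<primitive id="vip-10-5-200-244"/>'
    '<primitive id="dhcpd"/>'
    '<primitive id="vip-10-5-201-244"/>'
    '</group>'
)
NEW = (
    '<group id="dhcp-server">'
    '<primitive id="vip-10-5-200-244"/>'
    '<primitive id="vip-10-5-201-244"/>'
    '<primitive id="dhcpd"/>'
    '</group>'
)


def index_by_id(root):
    """Map each element id to (tag, attributes, parent id); sibling order is lost."""
    index = {}

    def walk(node, parent_id):
        index[node.get("id")] = (node.tag, dict(node.attrib), parent_id)
        for child in node:
            walk(child, node.get("id"))

    walk(root, None)
    return index


def digest(root):
    """Hash the serialized tree, so sibling order affects the result."""
    return hashlib.md5(ET.tostring(root)).hexdigest()


orig, new = ET.fromstring(ORIG), ET.fromstring(NEW)
id_keyed_equal = index_by_id(orig) == index_by_id(new)
digests_differ = digest(orig) != digest(new)

print("id-keyed comparison sees a change:", not id_keyed_equal)
print("digest comparison sees a change:  ", digests_differ)
```

This is only meant to show why an extra digest check is needed on top
of an id-keyed diff, not how the v1 format is actually implemented.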

On DC:
Jun  6 12:18:51 sles11-1 cib[1814]:   notice: cib_perform_op: Configuration ordering change detected
Jun  6 12:18:51 sles11-1 cib[1814]:   notice: cib:diff: Diff: --- 0.3835.86
Jun  6 12:18:51 sles11-1 cib[1814]:   notice: cib:diff: Diff: +++ 0.3835.1 21300207d1fe995ea0475be3dc60718f


On non-DC:
Jun  6 12:16:50 sles11-2 cib[32053]:  warning: cib_process_diff: Diff 0.3835.81 -> 0.3835.1 from sles11-1 not applied to 0.3835.81: Failed application of an update diff
Jun  6 12:16:50 sles11-2 cib[32053]:  warning: cib_process_replace: Replacement 0.3835.1 from sles11-1 not applied to 0.3835.81: current num_updates is greater than the replacement


I think the crm_shadow way mentioned above works because it bumps
"epoch" itself.
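
A rough Python sketch of the version check behind the warning above.
It assumes the CIB version is compared field by field as
admin_epoch.epoch.num_updates and that a replacement older than the
current CIB is refused; parse_version() and replacement_accepted() are
hypothetical names, not Pacemaker functions.

```python
def parse_version(text):
    """Split 'admin_epoch.epoch.num_updates' into a tuple of ints."""
    admin_epoch, epoch, num_updates = (int(part) for part in text.split("."))
    return (admin_epoch, epoch, num_updates)


def replacement_accepted(current, replacement):
    """Assumed rule: refuse a replacement whose version is lower than the current one."""
    return parse_version(replacement) >= parse_version(current)


# Versions taken from the non-DC log: same epoch, num_updates reset to 1.
print(replacement_accepted("0.3835.81", "0.3835.1"))

# Same replacement, but with "epoch" bumped as well.
print(replacement_accepted("0.3835.81", "0.3836.1"))
```

Under these assumptions the first replacement is refused because only
num_updates went backwards, while one carrying a bumped epoch is newer
than anything the peer holds.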


If we replace only the group's snippet with

cibadmin -R -o resources -x group.xml

the change is applied to the DC's cib, but the non-DC's cib is left
out of sync.


1.1.12-rc takes a different path in cib_perform_op() and works.

So, for 1.1.10/1.1.11, is the fix supposed to be something like:

--- pacemaker.orig/lib/cib/cib_utils.c
+++ pacemaker/lib/cib/cib_utils.c
@@ -565,6 +565,7 @@ cib_perform_op(const char *op, int call_
         } else if (crm_str_eq(new_digest, last_digest, TRUE) == FALSE) {

             crm_notice("Configuration ordering change detected");
+            cib_update_counter(scratch, XML_ATTR_GENERATION, FALSE);
             cib_update_counter(scratch, XML_ATTR_NUMUPDATES, TRUE);

             crm_trace("Old: %s, New: %s", last_digest, new_digest);

?
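
For illustration only, a small Python model of what the proposed
change would do to the version counters. It assumes that
cib_update_counter() increments the field when its last argument is
FALSE and sets it to 1 when it is TRUE, and that XML_ATTR_GENERATION
and XML_ATTR_NUMUPDATES map to "epoch" and "num_updates".
update_counter(), on_ordering_change() and version() are hypothetical
stand-ins, and the starting values are the ones from the logs above.

```python
def update_counter(cib, field, reset):
    """Assumed model of cib_update_counter(): reset sets 1, otherwise add 1."""
    if reset or field not in cib:
        cib[field] = 1
    else:
        cib[field] += 1


def on_ordering_change(cib, patched):
    """Return the counters after an ordering change, with or without the patch."""
    result = dict(cib)
    if patched:
        # The line added by the proposed patch: bump "epoch".
        update_counter(result, "epoch", reset=False)
    # Existing behaviour: reset "num_updates".
    update_counter(result, "num_updates", reset=True)
    return result


def version(cib):
    return (cib["admin_epoch"], cib["epoch"], cib["num_updates"])


dc = {"admin_epoch": 0, "epoch": 3835, "num_updates": 86}
peer = {"admin_epoch": 0, "epoch": 3835, "num_updates": 81}

unpatched = version(on_ordering_change(dc, patched=False))
patched = version(on_ordering_change(dc, patched=True))

print("unpatched:", unpatched, "older than peer:", unpatched < version(peer))
print("patched:  ", patched, "older than peer:", patched < version(peer))
```

If those assumptions hold, the unpatched result is older than what the
non-DC already has, which matches the refused replacement in the log,
while the patched result is newer.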

Regards,
  Yan

> 
>>
>> Thank you,
>> Vladislav
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org

-- 
Gao,Yan <ygao at suse.com>
Software Engineer
China Server Team, SUSE.



