<div dir="ltr">Well, I&#39;m not getting failures right now simply by setting attributes, but I can induce a failure by stopping vm-db02 (this puts db02 into an unclean state, and pacemaker attempts to migrate the unrelated vm-compute-test).  I&#39;ve collected the commands from my latest interactions, a crm_report, and a gdb traceback from the core file that crmd dumped, and attached them to bug 5164.</div>
<div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jul 2, 2013 at 8:40 PM, David Vossel <span dir="ltr">&lt;<a href="mailto:dvossel@redhat.com" target="_blank">dvossel@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
<br>
<br>
<br>
----- Original Message -----<br>
&gt; From: &quot;Lindsay Todd&quot; &lt;<a href="mailto:rltodd.ml1@gmail.com">rltodd.ml1@gmail.com</a>&gt;<br>
&gt; To: &quot;The Pacemaker cluster resource manager&quot; &lt;<a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a>&gt;<br>
</div><div class="im">&gt; Sent: Tuesday, July 2, 2013 5:36:43 PM<br>
&gt; Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
&gt;<br>
</div><div class="im">&gt; You didn&#39;t notice that after setting attributes on &quot;db02&quot;, the remote node<br>
&gt; &quot;db02&quot; went offline as &quot;unclean&quot;, even though vm-db02 was still running?<br>
<br>
</div>nope... apparently I&#39;m blind :)<br>
<div class="im"><br>
&gt; That strikes me as wrong! Once it gets into this state, I can order vm-db02<br>
&gt; to stop, but it never will. Indeed, pacemaker doesn&#39;t do much at this point<br>
<br>
</div>I&#39;m really confused about how a remote-node could manage to get into an &quot;UNCLEAN&quot; state. Interesting.  Can you reproduce it easily? A crm_report attached to a <a href="http://bugs.clusterlabs.org" target="_blank">bugs.clusterlabs.org</a> issue would be helpful.  If you haven&#39;t erased your logs, you could still retrieve everything in the report for the specific time period it occurred in.  I definitely need to get that worked out.<br>

<div class="im"><br>
&gt; -- I can put everything into standby mode, and services don&#39;t shut down.<br>
&gt; That is why the forcible reboot. Also, why I don&#39;t know (yet) what would<br>
&gt; happen to a service on db02 when this happens -- it takes too long to<br>
&gt; restart the cluster to carry out too many tests in one day!<br>
&gt;<br>
&gt; I&#39;ll review asymmetrical clusters -- I think my mistake was thinking an<br>
&gt; infinite score location constraint to put DummyOnVM on db02 would prevent it<br>
&gt; from running anywhere else, but of course if db02 isn&#39;t running, my one rule<br>
&gt; isn&#39;t equivalent to having -inf scores elsewhere. Still odd that shutting<br>
&gt; down vm-db02 would trigger a migration of an unrelated VM.<br>
<br>
</div>Look into resource stickiness.  Setting a default resource stickiness should prevent this.  It might be that shutting down vm-db02 somehow led pacemaker to rebalance the resources in a way that involved migrating the other VM.<br>
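<br>
A rough way to picture both points (the infinite-score location rule and stickiness) is the toy model below.<br>
It is only a sketch, not Pacemaker&#39;s actual allocation code: the place() function and its parameters are made up for illustration, and the only value taken from Pacemaker is INFINITY being 1000000.<br>
<pre>
```python
# Toy model of choosing a node by score. NOT Pacemaker's real allocation
# logic; place() and its arguments are hypothetical names for illustration.
INFINITY = 1000000  # Pacemaker represents INFINITY as 1000000


def place(online, prefs, current=None, stickiness=0, symmetric=True):
    """Return the best-scoring online node, or None if no node is allowed."""
    scores = {}
    for node in online:
        if not symmetric and node not in prefs:
            # Opt-in (asymmetric) cluster: a node without a location
            # constraint is not allowed to run the resource at all.
            continue
        # Opt-out (symmetric) cluster: every node starts at score 0.
        score = prefs.get(node, 0)
        if node == current:
            # Stickiness is a bonus for staying on the current node.
            score += stickiness
        scores[node] = min(score, INFINITY)
    if not scores:
        return None
    # Ties are broken by node name here, purely to keep the toy deterministic.
    return max(sorted(scores), key=scores.get)


hosts = ["cvmh01", "cvmh02", "cvmh03", "cvmh04"]
prefs = {"db02": INFINITY}  # like "DummyOnVM prefers db02"

# db02 is online: the preferred node wins.
print(place(hosts + ["db02"], prefs))
# db02 is offline in a symmetric cluster: some other host is still allowed.
print(place(hosts, prefs))
# db02 is offline in an asymmetric cluster: nothing is allowed, so it stops.
print(place(hosts, prefs, symmetric=False))
# An unconstrained VM on cvmh03 stays put once stickiness outweighs the tie.
print(place(hosts, {}, current="cvmh03", stickiness=100))
```
</pre>
In this model the single +INFINITY rule only ranks db02 first; it never excludes the physical hosts, which is why the resource lands on one of them when db02 is down unless the cluster is opt-in. Likewise, with a stickiness of 0 the unconstrained VM has no reason to stay where it is.<br>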

<span class="HOEnZb"><font color="#888888"><br>
-- Vossel<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
&gt; (The fact that it<br>
&gt; would also stop vm-swbuild is the known problem that constraints don&#39;t work<br>
&gt; well with migration.)<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt; On Tue, Jul 2, 2013 at 6:20 PM, David Vossel &lt; <a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a> &gt; wrote:<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt; ----- Original Message -----<br>
&gt; &gt; From: &quot;Lindsay Todd&quot; &lt; <a href="mailto:rltodd.ml1@gmail.com">rltodd.ml1@gmail.com</a> &gt;<br>
&gt; &gt; To: &quot;The Pacemaker cluster resource manager&quot; &lt;<br>
&gt; &gt; <a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a> &gt;<br>
&gt; &gt; Sent: Tuesday, July 2, 2013 4:05:22 PM<br>
&gt; &gt; Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
&gt; &gt;<br>
&gt; &gt; Sorry for the delayed response, but I was out last week. I&#39;ve applied this<br>
&gt; &gt; patch to 1.1.10-rc5 and have been testing:<br>
&gt; &gt;<br>
&gt;<br>
&gt; Thanks for testing :)<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; # crm_attribute --type status --node &quot;db02&quot; --name &quot;service_postgresql&quot;<br>
&gt; &gt; --update &quot;true&quot;<br>
&gt; &gt; # crm_attribute --type status --node &quot;db02&quot; --name &quot;service_postgresql&quot;<br>
&gt; &gt; scope=status name=service_postgresql value=true<br>
&gt; &gt; # crm resource stop vm-db02<br>
&gt; &gt; # crm resource start vm-db02<br>
&gt; &gt; ### Wait a bit<br>
&gt; &gt; # crm_attribute --type status --node &quot;db02&quot; --name &quot;service_postgresql&quot;<br>
&gt; &gt; scope=status name=service_postgresql value=(null)<br>
&gt; &gt; Error performing operation: No such device or address<br>
&gt; &gt; # crm_attribute --type status --node &quot;db02&quot; --name &quot;service_postgresql&quot;<br>
&gt; &gt; --update &quot;true&quot;<br>
&gt; &gt; # crm_attribute --type status --node &quot;db02&quot; --name &quot;service_postgresql&quot;<br>
&gt; &gt; scope=status name=service_postgresql value=true<br>
&gt; &gt;<br>
&gt; &gt; Good so far. But now look at this (every node was clean, and all services<br>
&gt; &gt; were running, before we started):<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; # crm status<br>
&gt; &gt; Last updated: Tue Jul 2 16:15:14 2013<br>
&gt; &gt; Last change: Tue Jul 2 16:15:12 2013 via crmd on cvmh02<br>
&gt; &gt; Stack: cman<br>
&gt; &gt; Current DC: cvmh02 - partition with quorum<br>
&gt; &gt; Version: 1.1.10rc5-1.el6.ccni-2718638<br>
&gt; &gt; 9 Nodes configured, unknown expected votes<br>
&gt; &gt; 59 Resources configured.<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; Node db02: UNCLEAN (offline)<br>
&gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01<br>
&gt; &gt; ldap02:vm-ldap02 ]<br>
&gt; &gt; OFFLINE: [ swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt;<br>
&gt; &gt; Full list of resources:<br>
&gt; &gt;<br>
&gt; &gt; fence-cvmh01 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh02 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01<br>
&gt; &gt; Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-p-libvirtd [p-libvirtd]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-watch-ib0 [p-watch-ib0]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-gpfs [p-fs-gpfs]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh03<br>
&gt; &gt; vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Stopped<br>
&gt; &gt; vm-db02 (ocf::ccni:xcatVirtualDomain): Started cvmh02<br>
&gt; &gt; vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03<br>
&gt; &gt; vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04<br>
&gt; &gt; DummyOnVM (ocf::pacemaker:Dummy): Started cvmh01<br>
&gt; &gt;<br>
&gt; &gt; Not so good, and I&#39;m not sure how to clean this up. I can&#39;t seem to stop<br>
&gt;<br>
&gt; clean what up? I don&#39;t understand what I&#39;m expected to notice out of place<br>
&gt; here?! The remote-node is up, everything looks happy.<br>
&gt;<br>
&gt; &gt; vm-db02 any more, even after I&#39;ve entered:<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; # crm_node -R db02 --force<br>
&gt;<br>
&gt; That won&#39;t stop the remote-node. &#39;crm resource stop vm-db02&#39; should though.<br>
&gt;<br>
&gt; &gt; # crm resource start vm-db02<br>
&gt;<br>
&gt; ha, I&#39;m so confused. why are you trying to start it? I thought you were<br>
&gt; trying to stop the resource?<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; ### Wait a bit<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; # crm status<br>
&gt; &gt; Last updated: Tue Jul 2 16:32:38 2013<br>
&gt; &gt; Last change: Tue Jul 2 16:27:28 2013 via cibadmin on cvmh01<br>
&gt; &gt; Stack: cman<br>
&gt; &gt; Current DC: cvmh02 - partition with quorum<br>
&gt; &gt; Version: 1.1.10rc5-1.el6.ccni-2718638<br>
&gt; &gt; 8 Nodes configured, unknown expected votes<br>
&gt; &gt; 54 Resources configured.<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 ldap01:vm-ldap01 ldap02:vm-ldap02<br>
&gt; &gt; swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt; OFFLINE: [ db02:vm-db02 ]<br>
&gt; &gt;<br>
&gt; &gt; fence-cvmh01 (stonith:fence_ipmilan): Started cvmh03<br>
&gt; &gt; fence-cvmh02 (stonith:fence_ipmilan): Started cvmh03<br>
&gt; &gt; fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01<br>
&gt; &gt; Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-p-libvirtd [p-libvirtd]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-watch-ib0 [p-watch-ib0]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-gpfs [p-fs-gpfs]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh02<br>
&gt; &gt; vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Started cvmh01<br>
&gt; &gt; vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03<br>
&gt; &gt; vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04<br>
&gt; &gt; DummyOnVM (ocf::pacemaker:Dummy): Started cvmh01<br>
&gt; &gt;<br>
&gt; &gt; My only recourse has been to reboot the cluster.<br>
&gt; &gt;<br>
&gt; &gt; So let&#39;s do that and try<br>
&gt; &gt; setting a location constraint on DummyOnVM, to force it on db02...<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; Last updated: Tue Jul 2 16:43:46 2013<br>
&gt; &gt; Last change: Tue Jul 2 16:27:28 2013 via cibadmin on cvmh01<br>
&gt; &gt; Stack: cman<br>
&gt; &gt; Current DC: cvmh02 - partition with quorum<br>
&gt; &gt; Version: 1.1.10rc5-1.el6.ccni-2718638<br>
&gt; &gt; 8 Nodes configured, unknown expected votes<br>
&gt; &gt; 54 Resources configured.<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01<br>
&gt; &gt; ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt;<br>
&gt; &gt; fence-cvmh01 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh02 (stonith:fence_ipmilan): Started cvmh03<br>
&gt; &gt; fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04<br>
&gt; &gt; fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01<br>
&gt; &gt; Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-p-libvirtd [p-libvirtd]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-watch-ib0 [p-watch-ib0]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; Clone Set: c-fs-gpfs [p-fs-gpfs]<br>
&gt; &gt; Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]<br>
&gt; &gt; Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]<br>
&gt; &gt; vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh01<br>
&gt; &gt; vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Started cvmh01<br>
&gt; &gt; vm-db02 (ocf::ccni:xcatVirtualDomain): Started cvmh02<br>
&gt; &gt; vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03<br>
&gt; &gt; vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04<br>
&gt; &gt; DummyOnVM (ocf::pacemaker:Dummy): Started cvmh03<br>
&gt; &gt;<br>
&gt; &gt; # pcs constraint location DummyOnVM prefers db02<br>
&gt; &gt; # crm status<br>
&gt; &gt; ...<br>
&gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01<br>
&gt; &gt; ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt; ...<br>
&gt; &gt; DummyOnVM (ocf::pacemaker:Dummy): Started db02<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; That&#39;s what we want to see. It would be interesting to stop db02. I expect<br>
&gt; &gt; DummyOnVM to stop.<br>
&gt;<br>
&gt; OH, okay, so you wanted DummyOnVM to start on db02.<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; # crm resource stop vm-db02<br>
&gt; &gt; # crm status<br>
&gt; &gt; ...<br>
&gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 ldap01:vm-ldap01 ldap02:vm-ldap02 ]<br>
&gt; &gt; OFFLINE: [ db02:vm-db02 swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt; ...<br>
&gt; &gt; DummyOnVM (ocf::pacemaker:Dummy): Started cvmh02<br>
&gt; &gt;<br>
&gt; &gt; Failed actions:<br>
&gt; &gt; vm-compute-test_migrate_from_0 (node=cvmh02, call=147, rc=1, status=Timed<br>
&gt; &gt; Out, last-rc-change=Tue Jul 2 16:48:17 2013<br>
&gt; &gt; , queued=20003ms, exec=0ms<br>
&gt; &gt; ): unknown error<br>
&gt; &gt;<br>
&gt; &gt; Well, that is odd. (It is the case that vm-swbuildsl6 has an order<br>
&gt; &gt; dependency<br>
&gt; &gt; on vm-compute-test, as I was trying to understand how migrations worked<br>
&gt; &gt; with<br>
&gt; &gt; order dependencies (not very well).<br>
&gt;<br>
&gt; I don&#39;t think this failure has anything to do with the order dependencies. If<br>
&gt; pacemaker attempted to live migrate the vm and it fails, that&#39;s a resource<br>
&gt; problem. Do you have your virtual machine images on shared storage?<br>
&gt;<br>
&gt; &gt; Once vm-compute-test recovers,<br>
&gt; &gt; vm-swbuildsl6 does come back up.) This isn&#39;t really very good -- if I am<br>
&gt; &gt; running services in VM or other containers, I need them to run only in that<br>
&gt; &gt; container!<br>
&gt;<br>
&gt; Read about the differences between asymmetrical and symmetrical clusters. I<br>
&gt; think this will help this make sense. By default resources can run anywhere,<br>
&gt; you just gave more weight to db02 for the Dummy resource, meaning it prefers<br>
&gt; that node when it is around.<br>
&gt;<br>
&gt; <a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on" target="_blank">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on</a><br>

&gt;<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt; If I start vm-db02 back up, I see that DummyOnVM is stopped and moved to<br>
&gt; &gt; db02.<br>
&gt;<br>
&gt; Yep, this is what I&#39;d expect for a symmetrical cluster.<br>
&gt;<br>
&gt; Thanks again for the feedback, hope the asymmetrical/symmetrical cluster<br>
&gt; stuff helps :)<br>
&gt;<br>
&gt; -- Vossel<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; On Thu, Jun 20, 2013 at 4:16 PM, David Vossel &lt; <a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a> &gt; wrote:<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt; ----- Original Message -----<br>
&gt; &gt; &gt; From: &quot;David Vossel&quot; &lt; <a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a> &gt;<br>
&gt; &gt; &gt; To: &quot;The Pacemaker cluster resource manager&quot; &lt;<br>
&gt; &gt; &gt; <a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a> &gt;<br>
&gt; &gt; &gt; Sent: Thursday, June 20, 2013 1:35:44 PM<br>
&gt; &gt; &gt; Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; ----- Original Message -----<br>
&gt; &gt; &gt; &gt; From: &quot;David Vossel&quot; &lt; <a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a> &gt;<br>
&gt; &gt; &gt; &gt; To: &quot;The Pacemaker cluster resource manager&quot;<br>
&gt; &gt; &gt; &gt; &lt; <a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a> &gt;<br>
&gt; &gt; &gt; &gt; Sent: Wednesday, June 19, 2013 4:47:58 PM<br>
&gt; &gt; &gt; &gt; Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; ----- Original Message -----<br>
&gt; &gt; &gt; &gt; &gt; From: &quot;Lindsay Todd&quot; &lt; <a href="mailto:rltodd.ml1@gmail.com">rltodd.ml1@gmail.com</a> &gt;<br>
&gt; &gt; &gt; &gt; &gt; To: &quot;The Pacemaker cluster resource manager&quot;<br>
&gt; &gt; &gt; &gt; &gt; &lt; <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a> &gt;<br>
&gt; &gt; &gt; &gt; &gt; Sent: Wednesday, June 19, 2013 4:11:58 PM<br>
&gt; &gt; &gt; &gt; &gt; Subject: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; I built a set of rpms for pacemaker 1.1.10-rc4 and updated my test<br>
&gt; &gt; &gt; &gt; &gt; cluster<br>
&gt; &gt; &gt; &gt; &gt; (hopefully won&#39;t be a &quot;test&quot; cluster forever), as well as my VMs<br>
&gt; &gt; &gt; &gt; &gt; running<br>
&gt; &gt; &gt; &gt; &gt; pacemaker-remote. The OS everywhere is Scientific Linux 6.4. I am<br>
&gt; &gt; &gt; &gt; &gt; wanting<br>
&gt; &gt; &gt; &gt; &gt; to<br>
&gt; &gt; &gt; &gt; &gt; set some attributes on remote nodes, which I can use to control where<br>
&gt; &gt; &gt; &gt; &gt; services run.<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; The first deviation I note from the documentation is the naming of<br>
&gt; &gt; &gt; &gt; &gt; the<br>
&gt; &gt; &gt; &gt; &gt; remote<br>
&gt; &gt; &gt; &gt; &gt; nodes. I see:<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; Last updated: Wed Jun 19 16:50:39 2013<br>
&gt; &gt; &gt; &gt; &gt; Last change: Wed Jun 19 16:19:53 2013 via cibadmin on cvmh04<br>
&gt; &gt; &gt; &gt; &gt; Stack: cman<br>
&gt; &gt; &gt; &gt; &gt; Current DC: cvmh02 - partition with quorum<br>
&gt; &gt; &gt; &gt; &gt; Version: 1.1.10rc4-1.el6.ccni-d19719c<br>
&gt; &gt; &gt; &gt; &gt; 8 Nodes configured, unknown expected votes<br>
&gt; &gt; &gt; &gt; &gt; 49 Resources configured.<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01<br>
&gt; &gt; &gt; &gt; &gt; ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; Full list of resources:<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; and so forth. The &quot;remote-node&quot; names are simply the hostname, so the<br>
&gt; &gt; &gt; &gt; &gt; vm-db02<br>
&gt; &gt; &gt; &gt; &gt; VirtualDomain resource has a remote-node name of db02. The &quot;Pacemaker<br>
&gt; &gt; &gt; &gt; &gt; Remote&quot; manual suggests this should be displayed as &quot;db02&quot;, not<br>
&gt; &gt; &gt; &gt; &gt; &quot;db02:vm-db02&quot;, although I can see how the latter format would be<br>
&gt; &gt; &gt; &gt; &gt; useful.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Yep, this got changed since the documentation was published. We wanted<br>
&gt; &gt; &gt; &gt; people to be able to recognize which remote-node went with which<br>
&gt; &gt; &gt; &gt; resource<br>
&gt; &gt; &gt; &gt; easily.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; So now let&#39;s set an attribute on this remote node. What name do I<br>
&gt; &gt; &gt; &gt; &gt; use?<br>
&gt; &gt; &gt; &gt; &gt; How<br>
&gt; &gt; &gt; &gt; &gt; about:<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; # crm_attribute --node &quot;db02:vm-db02&quot; \<br>
&gt; &gt; &gt; &gt; &gt; --name &quot;service_postgresql&quot; \<br>
&gt; &gt; &gt; &gt; &gt; --update &quot;true&quot;<br>
&gt; &gt; &gt; &gt; &gt; Could not map name=db02:vm-db02 to a UUID<br>
&gt; &gt; &gt; &gt; &gt; Please choose from one of the matches above and suppy the &#39;id&#39; with<br>
&gt; &gt; &gt; &gt; &gt; --attr-id<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; Perhaps not the most informative output, but obviously it fails.<br>
&gt; &gt; &gt; &gt; &gt; Let&#39;s<br>
&gt; &gt; &gt; &gt; &gt; try<br>
&gt; &gt; &gt; &gt; &gt; the unqualified name:<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; &gt; # crm_attribute --node &quot;db02&quot; \<br>
&gt; &gt; &gt; &gt; &gt; --name &quot;service_postgresql&quot; \<br>
&gt; &gt; &gt; &gt; &gt; --update &quot;true&quot;<br>
&gt; &gt; &gt; &gt; &gt; Remote-nodes do not maintain permanent attributes,<br>
&gt; &gt; &gt; &gt; &gt; &#39;service_postgresql=true&#39;<br>
&gt; &gt; &gt; &gt; &gt; will be removed after db02 reboots.<br>
&gt; &gt; &gt; &gt; &gt; Error setting service_postgresql=true (section=status,<br>
&gt; &gt; &gt; &gt; &gt; set=status-db02):<br>
&gt; &gt; &gt; &gt; &gt; No<br>
&gt; &gt; &gt; &gt; &gt; such device or address<br>
&gt; &gt; &gt; &gt; &gt; Error performing operation: No such device or address<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; I just tested this and ran into the same errors you did. Turns out this<br>
&gt; &gt; &gt; happens when the remote-node&#39;s status section is empty. If you start a<br>
&gt; &gt; &gt; resource on the node and then set the attribute it will work... obviously<br>
&gt; &gt; &gt; this is a bug. I&#39;m working on a fix.<br>
&gt; &gt;<br>
&gt; &gt; This should help with the attributes bit.<br>
&gt; &gt;<br>
&gt; &gt; <a href="https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204" target="_blank">https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204</a><br>

&gt; &gt;<br>
&gt; &gt; -- Vossel<br>
&gt; &gt;<br>
&gt; &gt; _______________________________________________<br>
&gt; &gt; Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
&gt; &gt; <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
&gt; &gt;<br>
&gt; &gt; Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
&gt; &gt; Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
&gt; &gt; Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
&gt; &gt;<br>
</div></div></blockquote></div><br></div>