[Pacemaker] Stopping resource using pcs

David Vossel dvossel at redhat.com
Fri Feb 28 12:19:49 EST 2014

----- Original Message -----
> From: "K Mehta" <kiranmehta1981 at gmail.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Friday, February 28, 2014 7:05:47 AM
> Subject: Re: [Pacemaker] Stopping resource using pcs
> 
> Can anyone tell me why the --wait parameter always causes pcs resource disable
> to return a failure even though the resource actually stops within the allotted
> time?

Does it only show an error with multi-state resources? It is probably a bug.

-- Vossel

> 
> 
> On Wed, Feb 26, 2014 at 10:45 PM, K Mehta < kiranmehta1981 at gmail.com > wrote:
> 
> Deleting the master resource id does not work; I see the same issue.
> However, uncloning helps: the delete works after disabling and uncloning.
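> 
> (For reference, the working sequence is sketched below using this cluster's
> ids; the exact unclone target may differ depending on the pcs version.)
> 
>   pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
>   pcs resource unclone vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
>   pcs resource delete vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8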
> 
> I also see an issue when using the --wait option with disable. The resource
> moves into the stopped state, but an error message is still printed.
> When the --wait option is not provided, no error message is seen.
> 
> [root@sys11 ~]# pcs resource
>  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
>      Masters: [ sys11 ]
>      Slaves: [ sys12 ]
> [root@sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 --wait
> Error: unable to stop: 'ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8', please check logs for failure information
> [root@sys11 ~]# pcs resource
>  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
>      Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0 vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> [root@sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 --wait
> Error: unable to stop: 'ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8', please check logs for failure information    <<<<< error message
> [root@sys11 ~]# pcs resource enable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> [root@sys11 ~]# pcs resource
>  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
>      Masters: [ sys11 ]
>      Slaves: [ sys12 ]
> [root@sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> [root@sys11 ~]# pcs resource
>  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
>      Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0 vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> 
> On Wed, Feb 26, 2014 at 8:55 PM, David Vossel < dvossel at redhat.com > wrote:
> 
> ----- Original Message -----
> > From: "Frank Brendel" < frank.brendel at eurolog.com >
> > To: pacemaker at oss.clusterlabs.org
> > Sent: Wednesday, February 26, 2014 8:53:19 AM
> > Subject: Re: [Pacemaker] Stopping resource using pcs
> > 
> > I guess we need some real experts here.
> > 
> > I think it's because you're attempting to delete the resource and not the
> > Master.
> > Try deleting the Master instead of the resource.
> 
> Yes, delete the Master resource id, not the primitive resource within the
> master. When using pcs, you should always refer to the resource's topmost
> parent id, not the ids of the child resources within the parent. If you
> make a resource a clone, start using the clone id; the same goes for a
> master. If you add a resource to a group, reference the group id from then
> on, not any of the child resources within the group.
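> 
> For example, with a hypothetical Dummy primitive (a sketch of the pattern,
> not a command sequence from this cluster):
> 
>   pcs resource create my_res ocf:pacemaker:Dummy   # primitive id: my_res
>   pcs resource master ms-my_res my_res             # topmost id is now ms-my_res
>   pcs resource disable ms-my_res                   # operate on ms-my_res ...
>   pcs resource delete ms-my_res                    # ... never on my_res directly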
> 
> As a general practice, it is always better to stop a resource (pcs resource
> disable) and only delete the resource after the stop has completed.
> 
> This is especially important for group resources, where stop order matters. If
> you delete a group outright, the cluster has no information about the order in
> which to stop the resources in that group, which can cause stop failures when
> the orphaned resources are cleaned up.
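> 
> A sketch with a hypothetical two-member group (the names are made up; the
> point is that the group id carries the ordering information):
> 
>   pcs resource group add groupApp resFS resDB   # starts resFS, then resDB
>   pcs resource disable groupApp                 # stops resDB, then resFS
>   pcs resource delete groupApp                  # safe once both are Stopped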
> 
> Recently pcs gained the ability to attempt to stop resources before deleting
> them, in order to avoid scenarios like the one described above. Pcs will block
> for a period of time, waiting for the resource to stop before deleting it. Even
> with this logic in place, it is preferable to stop the resource manually and to
> delete it only after you have verified that it has stopped.
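> 
> Applied to the resource in this thread, the safe pattern would be roughly:
> 
>   pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
>   pcs resource    # repeat until the set reports Stopped on both nodes
>   pcs resource delete ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8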
> 
> -- Vossel
> 
> > 
> > I had a similar problem with a cloned group and solved it by uncloning it
> > before deleting the group.
> > Maybe uncloning the multi-state resource could help too.
> > It's easy to reproduce:
> > 
> > # pcs resource create resPing ping host_list="10.0.0.1 10.0.0.2" op monitor on-fail="restart"
> > # pcs resource group add groupPing resPing
> > # pcs resource clone groupPing clone-max=3 clone-node-max=1
> > # pcs resource
> >  Clone Set: groupPing-clone [groupPing]
> >      Started: [ node1 node2 node3 ]
> > # pcs resource delete groupPing-clone
> > Deleting Resource (and group) - resPing
> > Error: Unable to remove resource 'resPing' (do constraints exist?)
> > # pcs resource unclone groupPing
> > # pcs resource delete groupPing
> > Removing group: groupPing (and all resources within group)
> > Stopping all resources in group: groupPing...
> > Deleting Resource (and group) - resPing
> > 
> > Log:
> > Feb 26 15:43:16 node1 cibadmin[2368]: notice: crm_log_args: Invoked: /usr/sbin/cibadmin -o resources -D --xml-text <group id="groupPing">#012 <primitive class="ocf" id="resPing" provider="pacemaker" type="ping">#012 <instance_attributes id="resPing-instance_attributes">#012 <nvpair id="resPing-instance_attributes-host_list" name="host_list" value="10.0.0.1 10.0.0.2"/>#012 </instance_attributes>#012 <operations>#012 <op id="resPing-monitor-on-fail-restart" interval="60s" name="monitor" on-fail="restart"/>#012 </operations>#012 </primi
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Expecting an element meta_attributes, got nothing
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Invalid sequence in interleave
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element clone failed to validate content
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element resources has extra content: primitive
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Invalid sequence in interleave
> > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element cib failed to validate content
> > Feb 26 15:43:16 node1 cib[1820]: warning: cib_perform_op: Updated CIB does not validate against pacemaker-1.2 schema/dtd
> > Feb 26 15:43:16 node1 cib[1820]: warning: cib_diff_notify: Update (client: cibadmin, call:2): 0.516.7 -> 0.517.1 (Update does not conform to the configured schema)
> > Feb 26 15:43:16 node1 stonith-ng[1821]: warning: update_cib_cache_cb: [cib_diff_notify] ABORTED: Update does not conform to the configured schema (-203)
> > Feb 26 15:43:16 node1 cib[1820]: warning: cib_process_request: Completed cib_delete operation for section resources: Update does not conform to the configured schema (rc=-203, origin=local/cibadmin/2, version=0.516.7)
> > 
> > 
> > Frank
> > 
> > On 26.02.2014 15:00, K Mehta wrote:
> > 
> > Here is the config and output of few commands
> > 
> > [root@sys11 ~]# pcs config
> > Cluster Name: kpacemaker1.1
> > Corosync Nodes:
> > 
> > Pacemaker Nodes:
> >  sys11 sys12
> > 
> > Resources:
> >  Master: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> >   Meta Attrs: clone-max=2 globally-unique=false target-role=Started
> >   Resource: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 (class=ocf provider=heartbeat type=vgc-cm-agent.ocf)
> >    Attributes: cluster_uuid=de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> >    Operations: monitor interval=30s role=Master timeout=100s (vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-monitor-interval-30s)
> >                monitor interval=31s role=Slave timeout=100s (vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-monitor-interval-31s)
> > 
> > Stonith Devices:
> > Fencing Levels:
> > 
> > Location Constraints:
> >   Resource: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> >     Enabled on: sys11 (score:200) (id:location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200)
> >     Enabled on: sys12 (score:200) (id:location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200)
> >   Resource: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> >     Enabled on: sys11 (score:200) (id:location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200)
> >     Enabled on: sys12 (score:200) (id:location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200)
> > Ordering Constraints:
> > Colocation Constraints:
> > 
> > Cluster Properties:
> >  cluster-infrastructure: cman
> >  dc-version: 1.1.8-7.el6-394e906
> >  no-quorum-policy: ignore
> >  stonith-enabled: false
> >  symmetric-cluster: false
> > 
> > [root@sys11 ~]# pcs resource
> >  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> >      Masters: [ sys11 ]
> >      Slaves: [ sys12 ]
> > 
> > [root@sys11 ~]# pcs resource disable vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > 
> > [root@sys11 ~]# pcs resource
> >  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> >      Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0 vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> > 
> > [root@sys11 ~]# pcs resource delete vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > Removing Constraint - location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200
> > Removing Constraint - location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200
> > Removing Constraint - location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200
> > Removing Constraint - location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200
> > Attempting to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8...Error: Unable to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 before deleting (re-run with --force to force deletion)
> > 
> > [root@sys11 ~]# pcs resource delete vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > Attempting to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8...Error: Unable to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 before deleting (re-run with --force to force deletion)
> > 
> > [root@sys11 ~]# pcs resource
> >  Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> >      Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0 vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> > 
> > [root@sys11 ~]# pcs resource delete vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 --force
> > Deleting Resource - vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [root@sys11 ~]# pcs resource
> > NO resources configured
> > [root@sys11 ~]#
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org