[Pacemaker] Stopping resource using pcs

Fri Feb 28 13:21:32 EST 2014

Yes, the issue is seen only with multi-state resources. Non-multi-state
resources work fine. It looks like the is_resource_started function in
utils.py does not compare the resource name properly. Let fs be the resource
name: is_resource_started compares fs with fs:0 and fs:1, so no match is
found and False is returned.

def resource_disable(argv):
    if len(argv) < 1:
        utils.err("You must specify a resource to disable")

    resource = argv[0]
    args = ["crm_resource", "-r", argv[0], "-m", "-p", "target-role", "-v",
            "Stopped"]
    output, retval = utils.run(args)
    if retval != 0:
        utils.err(output)

    if "--wait" in utils.pcs_options:
        wait = utils.pcs_options["--wait"]
        if not wait.isdigit():
            utils.err("%s is not a valid number of seconds to wait" % wait)
            sys.exit(1)
        did_stop = utils.is_resource_started(resource,int(wait),True)  # <<< did_stop is false

        if did_stop:
            return True
        else:
            utils.err("unable to stop: '%s', please check logs for failure "
                      "information" % resource)

def is_resource_started(resource,wait,stopped=False):
    expire_time = int(time.time()) + wait
    while True:
        state = getClusterState()
        resources = state.getElementsByTagName("resource")
        for res in resources:
            if res.getAttribute("id") == resource:  # <<<< never succeeds
                if ((res.getAttribute("role") == "Started" and not stopped)
                        or (res.getAttribute("role") == "Stopped" and stopped)):
                    return True
                break
        if (expire_time < int(time.time())):
            break
        time.sleep(1)
    return False    # <<< False is returned
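
A minimal sketch of how the comparison could tolerate the instance suffix,
assuming the status XML exposes the instances as <resource> elements whose ids
are fs:0, fs:1 and so on. The helper names base_id and all_instances_in_role
and the sample XML are made up for illustration; this is not the actual pcs
code, nor necessarily how it should be fixed.

```python
from xml.dom import minidom

# Hypothetical cluster state. Only the "resource" tag and the "id" and "role"
# attributes come from the snippet above; the rest is assumed for illustration.
SAMPLE_STATE = """<crm_mon>
  <resources>
    <resource id="fs:0" role="Stopped"/>
    <resource id="fs:1" role="Stopped"/>
  </resources>
</crm_mon>"""


def base_id(res_id):
    """Strip a trailing ':<digits>' instance suffix, e.g. 'fs:1' -> 'fs'."""
    name, sep, suffix = res_id.rpartition(":")
    if sep and suffix.isdigit():
        return name
    return res_id


def all_instances_in_role(state, resource, role):
    """True if at least one instance matches and every match has the role."""
    matches = [
        res for res in state.getElementsByTagName("resource")
        if base_id(res.getAttribute("id")) == resource
    ]
    return bool(matches) and all(
        res.getAttribute("role") == role for res in matches
    )


state = minidom.parseString(SAMPLE_STATE)
print(all_instances_in_role(state, "fs", "Stopped"))
print(all_instances_in_role(state, "fs", "Started"))
```

With a check along these lines, the wait loop would report fs as stopped only
once every instance reports Stopped, instead of never finding a match.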

On Fri, Feb 28, 2014 at 10:49 PM, David Vossel <dvossel at redhat.com> wrote:

>
>
>
>
> ----- Original Message -----
> > From: "K Mehta" <kiranmehta1981 at gmail.com>
> > To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> > Sent: Friday, February 28, 2014 7:05:47 AM
> > Subject: Re: [Pacemaker] Stopping resource using pcs
> >
> > Can anyone tell me why the --wait parameter always causes pcs resource
> > disable to return failure even though the resource actually stops in time?
>
> Does it only show an error with multi-state resources? It is probably a
> bug.
>
> -- Vossel
>
> >
> >
> > On Wed, Feb 26, 2014 at 10:45 PM, K Mehta <kiranmehta1981 at gmail.com> wrote:
> >
> >
> >
> > Deleting master resource id does not work. I see the same issue.
> > However, uncloning helps. Delete works after disabling and uncloning.
> >
> > I see an issue when using the --wait option with disable. The resource moves
> > into the stopped state, but an error message is still printed.
> > When the --wait option is not provided, no error message is seen.
> >
> > [root at sys11 ~]# pcs resource
> > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > Masters: [ sys11 ]
> > Slaves: [ sys12 ]
> > [root at sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 --wait
> > Error: unable to stop: 'ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8', please
> > check logs for failure information
> > [root at sys11 ~]# pcs resource
> > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0
> > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> > [root at sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 --wait
> > Error: unable to stop: 'ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8', please
> > check logs for failure information <<<<<error message
> > [root at sys11 ~]# pcs resource enable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [root at sys11 ~]# pcs resource
> > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > Masters: [ sys11 ]
> > Slaves: [ sys12 ]
> > [root at sys11 ~]# pcs resource disable ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [root at sys11 ~]# pcs resource
> > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0
> > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> >
> >
> >
> >
> >
> > On Wed, Feb 26, 2014 at 8:55 PM, David Vossel <dvossel at redhat.com> wrote:
> >
> >
> >
> > ----- Original Message -----
> > > From: "Frank Brendel" < frank.brendel at eurolog.com >
> > > To: pacemaker at oss.clusterlabs.org
> > > Sent: Wednesday, February 26, 2014 8:53:19 AM
> > > Subject: Re: [Pacemaker] Stopping resource using pcs
> > >
> > > I guess we need some real experts here.
> > >
> > > I think it's because you're attempting to delete the resource and not the
> > > Master.
> > > Try deleting the Master instead of the resource.
> >
> > Yes, delete the Master resource id, not the primitive resource within the
> > master. When using pcs, you should always refer to the resource's topmost
> > parent id, not the id of the child resources within the parent. If you
> > make a resource a clone, start using the clone id. Same with master. If you
> > add a resource to a group, reference the group id from then on and not any
> > of the child resources within the group.
> >
> > As a general practice, it is always better to stop a resource (pcs resource
> > disable) and only delete the resource after the stop has completed.
> >
> > This is especially important for group resources where stop order matters.
> > If you delete a group, then we have no information on what order to stop
> > the resources in that group. This can cause stop failures when the orphaned
> > resources are cleaned up.
> >
> > Recently pcs gained the ability to attempt to stop resources before
> > deleting them in order to avoid scenarios like the one I described above.
> > Pcs will block for a period of time waiting for the resource to stop before
> > deleting it. Even with this logic in place, it is preferred to stop the
> > resource manually and then delete it once you have verified it stopped.
> >
> > -- Vossel
> >
> > >
> > > I had a similar problem with a cloned group and solved it by un-cloning
> > > before deleting the group.
> > > Maybe un-cloning the multi-state resource could help too.
> > > It's easy to reproduce.
> > >
> > > # pcs resource create resPing ping host_list="10.0.0.1 10.0.0.2" op monitor on-fail="restart"
> > > # pcs resource group add groupPing resPing
> > > # pcs resource clone groupPing clone-max=3 clone-node-max=1
> > > # pcs resource
> > > Clone Set: groupPing-clone [groupPing]
> > > Started: [ node1 node2 node3 ]
> > > # pcs resource delete groupPing-clone
> > > Deleting Resource (and group) - resPing
> > > Error: Unable to remove resource 'resPing' (do constraints exist?)
> > > # pcs resource unclone groupPing
> > > # pcs resource delete groupPing
> > > Removing group: groupPing (and all resources within group)
> > > Stopping all resources in group: groupPing...
> > > Deleting Resource (and group) - resPing
> > >
> > > Log:
> > > Feb 26 15:43:16 node1 cibadmin[2368]: notice: crm_log_args: Invoked:
> > > /usr/sbin/cibadmin -o resources -D --xml-text <group id="groupPing">#012
> > > <primitive class="ocf" id="resPing" provider="pacemaker" type="ping">#012
> > > <instance_attributes id="resPing-instance_attributes">#012 <nvpair
> > > id="resPing-instance_attributes-host_list" name="host_list"
> > > value="10.0.0.1 10.0.0.2"/>#012 </instance_attributes>#012 <operations>#012
> > > <op id="resPing-monitor-on-fail-restart" interval="60s" name="monitor"
> > > on-fail="restart"/>#012 </operations>#012 </primi
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Expecting an element
> > > meta_attributes, got nothing
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Invalid sequence in
> > > interleave
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element clone failed to
> > > validate content
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element resources has
> > > extra content: primitive
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Invalid sequence in
> > > interleave
> > > Feb 26 15:43:16 node1 cib[1820]: error: xml_log: Element cib failed to
> > > validate content
> > > Feb 26 15:43:16 node1 cib[1820]: warning: cib_perform_op: Updated CIB does
> > > not validate against pacemaker-1.2 schema/dtd
> > > Feb 26 15:43:16 node1 cib[1820]: warning: cib_diff_notify: Update (client:
> > > cibadmin, call:2): 0.516.7 -> 0.517.1 (Update does not conform to the
> > > configured schema)
> > > Feb 26 15:43:16 node1 stonith-ng[1821]: warning: update_cib_cache_cb:
> > > [cib_diff_notify] ABORTED: Update does not conform to the configured schema
> > > (-203)
> > > Feb 26 15:43:16 node1 cib[1820]: warning: cib_process_request: Completed
> > > cib_delete operation for section resources: Update does not conform to the
> > > configured schema (rc=-203, origin=local/cibadmin/2, version=0.516.7)
> > >
> > >
> > > Frank
> > >
> > > On 26.02.2014 15:00, K Mehta wrote:
> > >
> > >
> > >
> > > Here is the config and the output of a few commands:
> > >
> > > [root at sys11 ~]# pcs config
> > > Cluster Name: kpacemaker1.1
> > > Corosync Nodes:
> > >
> > > Pacemaker Nodes:
> > > sys11 sys12
> > >
> > > Resources:
> > > Master: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Meta Attrs: clone-max=2 globally-unique=false target-role=Started
> > > Resource: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 (class=ocf
> > > provider=heartbeat type=vgc-cm-agent.ocf)
> > > Attributes: cluster_uuid=de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Operations: monitor interval=30s role=Master timeout=100s
> > > (vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-monitor-interval-30s)
> > > monitor interval=31s role=Slave timeout=100s
> > > (vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-monitor-interval-31s)
> > >
> > > Stonith Devices:
> > > Fencing Levels:
> > >
> > > Location Constraints:
> > > Resource: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Enabled on: sys11 (score:200)
> > > (id:location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200)
> > > Enabled on: sys12 (score:200)
> > > (id:location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200)
> > > Resource: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Enabled on: sys11 (score:200)
> > > (id:location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200)
> > > Enabled on: sys12 (score:200)
> > > (id:location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200)
> > > Ordering Constraints:
> > > Colocation Constraints:
> > >
> > > Cluster Properties:
> > > cluster-infrastructure: cman
> > > dc-version: 1.1.8-7.el6-394e906
> > > no-quorum-policy: ignore
> > > stonith-enabled: false
> > > symmetric-cluster: false
> > >
> > >
> > >
> > > [root at sys11 ~]# pcs resource
> > > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > > Masters: [ sys11 ]
> > > Slaves: [ sys12 ]
> > >
> > >
> > >
> > > [root at sys11 ~]# pcs resource disable
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > >
> > > [root at sys11 ~]# pcs resource
> > > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > > Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> > >
> > >
> > > [root at sys11 ~]# pcs resource delete
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Removing Constraint -
> > > location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200
> > > Removing Constraint -
> > > location-ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200
> > > Removing Constraint -
> > > location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys11-200
> > > Removing Constraint -
> > > location-vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8-sys12-200
> > > Attempting to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8...Error:
> > > Unable to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 before deleting
> > > (re-run with --force to force deletion)
> > >
> > >
> > > [root at sys11 ~]# pcs resource delete
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > Attempting to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8...Error:
> > > Unable to stop: vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8 before deleting
> > > (re-run with --force to force deletion)
> > >
> > > [root at sys11 ~]# pcs resource
> > > Master/Slave Set: ms-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > [vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8]
> > > Stopped: [ vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:0
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8:1 ]
> > >
> > > [root at sys11 ~]# pcs resource delete
> > > vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > --force
> > > Deleting Resource - vha-de5566b1-c2a3-4dc6-9712-c82bb43f19d8
> > > [root at sys11 ~]# pcs resource
> > > NO resources configured
> > > [root at sys11 ~]#
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> > >