[Pacemaker] Issue in deleting multi state resource

K Mehta kiranmehta1981 at gmail.com
Mon May 19 03:43:19 EDT 2014


Please see my reply inline. Attached is the crm_report output.


On Thu, May 8, 2014 at 5:45 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> On 8 May 2014, at 12:38 am, K Mehta <kiranmehta1981 at gmail.com> wrote:
>
> > I created a multi state resource ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2).
> >
> > Here is the configuration:
> > ==========================
> > [root at vsanqa11 ~]# pcs config
> > Cluster Name: vsanqa11_12
> > Corosync Nodes:
> >
> > Pacemaker Nodes:
> >  vsanqa11 vsanqa12
> >
> > Resources:
> >  Master: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >   Meta Attrs: clone-max=2 globally-unique=false target-role=started
> >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (class=ocf
> provider=heartbeat type=vgc-cm-agent.ocf)
> >    Attributes: cluster_uuid=2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >    Operations: monitor interval=30s role=Master timeout=100s
> (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-30s)
> >                monitor interval=31s role=Slave timeout=100s
> (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-31s)
> >
> > Stonith Devices:
> > Fencing Levels:
> >
> > Location Constraints:
> >   Resource: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >     Enabled on: vsanqa11 (score:INFINITY)
> (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> >     Enabled on: vsanqa12 (score:INFINITY)
> (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >     Enabled on: vsanqa11 (score:INFINITY)
> (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> >     Enabled on: vsanqa12 (score:INFINITY)
> (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> > Ordering Constraints:
> > Colocation Constraints:
> >
> > Cluster Properties:
> >  cluster-infrastructure: cman
> >  dc-version: 1.1.10-14.el6_5.2-368c726
> >  last-lrm-refresh: 1399466204
> >  no-quorum-policy: ignore
> >  stonith-enabled: false
> >
> > ==============================================
> > When I try to create and delete this resource in a loop,
>
> Why would you do that? :-)
>

Just to test that things behave correctly when the resource is created and
deleted in quick succession. However, the issue is also seen arbitrarily;
sometimes it shows up even in the first iteration of the loop.
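
For context, the test driver is essentially a loop of the following shape
(create_resource and delete_resource are only illustrative names here; they
wrap the command sequences listed further down):

        # Hypothetical driver loop; create_resource and delete_resource wrap
        # the creation and deletion command sequences shown below.
        for i in $(seq 1 50); do
                echo "iteration $i"
                create_resource "$uuid" || { echo "create failed"; exit 1; }
                sleep 5
                delete_resource "$uuid" || { echo "delete failed"; exit 1; }
        done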

>
> > after a few iterations, the delete fails as shown below. This can be
> reproduced easily. I make sure to unclone the resource before deleting
> it, and the unclone succeeds.
>
> Can you tell us the exact commands you ran?
>
Here is the list of commands used for creation:

        pcs cluster cib $CLUSTER_CREATE_LOG || exit 1

        pcs -f $CLUSTER_CREATE_LOG property set stonith-enabled=false || exit 1
        pcs -f $CLUSTER_CREATE_LOG property set no-quorum-policy=ignore || exit 1
        #syntax for the following command is different across pcs 9.26 and 9.90
        pcs -f $CLUSTER_CREATE_LOG resource defaults resource-stickiness=100 > /dev/null 2>&1
        if [ $? -ne 0 ]; then
                pcs -f $CLUSTER_CREATE_LOG resource rsc defaults resource-stickiness=100 || exit 1
        fi
        pcs -f $CLUSTER_CREATE_LOG resource create vha-$uuid ocf:heartbeat:vgc-cm-agent.ocf \
                cluster_uuid=$uuid \
                op monitor role="Master" interval=30s timeout=100s \
                op monitor role="Slave" interval=31s timeout=100s || exit 1
        pcs -f $CLUSTER_CREATE_LOG resource master ms-${uuid} vha-${uuid} meta clone-max=2 \
                globally-unique=false target-role=started || exit 1

        pcs -f $CLUSTER_CREATE_LOG constraint location vha-${uuid} prefers $node1 || exit 1
        pcs -f $CLUSTER_CREATE_LOG constraint location vha-${uuid} prefers $node2 || exit 1
        pcs -f $CLUSTER_CREATE_LOG constraint location ms-${uuid} prefers $node1 || exit 1
        pcs -f $CLUSTER_CREATE_LOG constraint location ms-${uuid} prefers $node2 || exit 1

        #syntax for the following command is different across pcs 9.26 and 9.90
        pcs cluster cib-push $CLUSTER_CREATE_LOG > /dev/null 2>&1
        if [ $? -ne 0 ]; then
                pcs cluster push cib $CLUSTER_CREATE_LOG
        fi

        if [ $? -eq 0 ]; then
                echo "Success"
        else
                echo "Failure"
                exit 1
        fi
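
(In case it helps with reproducing this: the script does not currently
validate the shadow CIB before pushing, but a quick sanity check could be
added along these lines, assuming crm_verify is available on the node:)

        # Optional sanity check (not in the script above): validate the
        # shadow CIB file before pushing it to the live cluster.
        crm_verify --xml-file $CLUSTER_CREATE_LOG -V || exit 1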


And here is the list of commands used for deletion:

        pcs resource show vha-${uuid} > /dev/null 2>&1
        if [ $? -eq 0 ]; then
                pcs resource unclone ms-${uuid} > /dev/null 2>&1
                if [ $? -ne 0 ]; then
                        echo "Failed to unclone resource with uuid: $uuid"
                        #do not exit because this command always fails
                        #in pcs 9.26; attempt the delete anyway
                fi

                pcs resource delete vha-${uuid} > /dev/null 2>&1
                if [ $? -ne 0 ]; then
                        echo "Failed to delete resource with uuid: $uuid"
                        exit 1
                fi
        else
                if [ $verbose -eq 1 ]; then
                        echo "Could not find resource with uuid ${uuid}"
                fi

        fi
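
(As an aside, a possible workaround, only a sketch and not something
verified here, would be to disable the master and poll until the instances
are reported stopped before attempting the delete; that still would not
explain the diff failures seen in the logs:)

        # Workaround sketch (not in the current script): stop the master
        # explicitly and wait for both instances to stop before deleting.
        pcs resource disable ms-${uuid}
        for i in $(seq 1 30); do
                crm_resource --resource vha-${uuid} --locate 2>&1 \
                        | grep -q "is running on" || break
                sleep 2
        done
        pcs resource delete vha-${uuid}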



>
> >
> > Removing Constraint -
> location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY
> > Removing Constraint -
> location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY
> > Attempting to stop: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2...Error:
> Unable to stop: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 before deleting
> (re-run with --force to force deletion)
> > Failed to delete resource with uuid: 2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >
> > ==============================================
> >
> > Log file snippet of relevant time
>
> And on the other node?  Did you configure a log file?  That would also be
> interesting.
> Actually, better make that a crm_report... the PE files will likely be
> interesting too.
>
> > ============================================
> >
> > May  7 07:20:12 vsanqa12 vgc-vha-config: /usr/bin/vgc-vha-config --stop
> /dev/vgca0_vha
> > May  7 07:20:12 vsanqa12 crmd[4319]:   notice: do_state_transition:
> State transition S_NOT_DC -> S_PENDING [ input=I_PENDING
> cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> > May  7 07:20:12 vsanqa12 kernel: VGC: [0000006711341b03:I] Stopped
> vHA/vShare instance /dev/vgca0_vha
> > May  7 07:20:12 vsanqa12 stonith-ng[4315]:   notice: unpack_config: On
> loss of CCM Quorum: Ignore
> > May  7 07:20:12 vsanqa12 vgc-vha-config: Success
> > May  7 07:20:13 vsanqa12 stonith-ng[4315]:   notice: unpack_config: On
> loss of CCM Quorum: Ignore
> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_trigger_update:
> Sending flush op to all hosts for:
> master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (<null>)
> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_perform_update:
> Sent delete 4404: node=vsanqa12,
> attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null),
> section=status
> > May  7 07:20:13 vsanqa12 crmd[4319]:   notice: process_lrm_event: LRM
> operation vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2_stop_0 (call=1379, rc=0,
> cib-update=1161, confirmed=true) ok
> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_perform_update:
> Sent delete 4406: node=vsanqa12,
> attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null),
> section=status
> > May  7 07:20:13 vsanqa12 attrd[4317]:  warning: attrd_cib_callback:
> Update 4404 for master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2=(null)
> failed: Application of an update diff failed
> > May  7 07:20:13 vsanqa12 cib[4314]:  warning: cib_process_diff: Diff
> 0.6804.2 -> 0.6804.3 from vsanqa11 not applied to 0.6804.2: Failed
> application of an update diff
> > May  7 07:20:13 vsanqa12 cib[4314]:   notice: cib_server_process_diff:
> Not applying diff 0.6804.3 -> 0.6804.4 (sync in progress)
> >
> >
> > [root at vsanqa12 ~]# pcs status
> > Cluster name: vsanqa11_12
> > Last updated: Wed May  7 07:24:29 2014
> > Last change: Wed May  7 07:20:13 2014 via crm_resource on vsanqa11
> > Stack: cman
> > Current DC: vsanqa11 - partition with quorum
> > Version: 1.1.10-14.el6_5.2-368c726
> > 2 Nodes configured
> > 1 Resources configured
> >
> >
> > Online: [ vsanqa11 vsanqa12 ]
> >
> > Full list of resources:
> >
> >  vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> (ocf::heartbeat:vgc-cm-agent.ocf):      Stopped
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pcmk-Mon-19-May-2014.tar.bz2
Type: application/x-bzip2
Size: 221096 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140519/f8d2690a/attachment-0003.bz2>

