[Pacemaker] resource's start/stop not getting called

Shravan Mishra shravan.mishra at gmail.com
Thu Oct 29 01:16:17 UTC 2009


# crm_verify -LV
crm_verify[5605]: 2009/10/28_21:04:28 ERROR: unpack_resources:
Resource start-up disabled since no STONITH resources have been
defined
crm_verify[5605]: 2009/10/28_21:04:28 ERROR: unpack_resources: Either
configure some or disable STONITH with the stonith-enabled option
crm_verify[5605]: 2009/10/28_21:04:28 ERROR: unpack_resources: NOTE:
Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid



crm_attribute -t crm_config -n stonith-enabled -v false
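
To double-check the change took effect before re-testing the resource (just a sketch, using the same tools as above):

# confirm the cluster property is now set
crm_attribute -t crm_config -n stonith-enabled -G
# re-validate the CIB; the STONITH errors above should be gone
crm_verify -LV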

Thanks, man. I owe you a beer.

-Shravan


On Wed, Oct 28, 2009 at 9:02 PM, Luke Bigum <lbigum at iseek.com.au> wrote:
> I don't *think* it's a problem with your RA. What's the output of 'crm_mon -f1' and 'crm_verify -LV'?
>
> Luke Bigum
> Systems Administrator
>  (p) 1300 661 668
>  (f)  1300 661 540
> (e)  lbigum at iseek.com.au
> http://www.iseek.com.au
> Level 1, 100 Ipswich Road Woolloongabba QLD 4102
>
>
>
>
>
>
> -----Original Message-----
> From: Shravan Mishra [mailto:shravan.mishra at gmail.com]
> Sent: Thursday 29 October 2009 10:48 AM
> To: pacemaker at oss.clusterlabs.org
> Subject: Re: [Pacemaker] resource's start/stop not getting called
>
> Hi Luke,
>
> I had tried a variation of what you suggested, which was:
>
>  monitor_()
>  {
>      touch /monitor
>      return $OCF_NOT_RUNNING
>  }
>
> just to see if start is getting called at all, but to no avail.
>
> I just did exactly what you suggested, but I still only see the /monitor file.
>
> One interesting thing is that in the <status/> section I only see
> monitor-related entries for my resource on a node, without any start or
> stop entries.
> If start had run, it should have shown up there as well, since that is
> vital status information.
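>
> A quick way to check which operations the LRM has actually recorded there
> (a sketch; the grep pattern is only illustrative, assuming the cibadmin
> from this install):
>
> cibadmin --query --obj_type status | grep 'operation="\(start\|stop\|monitor\)"'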
>
>
> On Wed, Oct 28, 2009 at 8:14 PM, Luke Bigum <lbigum at iseek.com.au> wrote:
>> Hi Shravan,
>>
>> Your monitor operation always returns OCF_SUCCESS, which tells Pacemaker the resource is running, always.
>>
>> Try something like this in your RA:
>>
>> monitor_()
>> {
>>      touch /monitor
>>      if [ -f "/start" ]; then
>>              return $OCF_SUCCESS
>>      fi
>>      return $OCF_NOT_RUNNING
>> }
>>
>> And in your stop operation, remove the 'start' file to indicate your resource is not running:
>>
>> stop_()
>> {
>>        rm /start
>>        return $OCF_SUCCESS
>> }
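>>
>> For completeness, the matching start operation only needs to create that marker file (a sketch along the same lines; it mirrors the touch /start you already have):
>>
>> start_()
>> {
>>      touch /start
>>      return $OCF_SUCCESS
>> }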
>>
>> Luke Bigum
>> Systems Administrator
>>  (p) 1300 661 668
>>  (f)  1300 661 540
>> (e)  lbigum at iseek.com.au
>> http://www.iseek.com.au
>> Level 1, 100 Ipswich Road Woolloongabba QLD 4102
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Shravan Mishra [mailto:shravan.mishra at gmail.com]
>> Sent: Thursday 29 October 2009 8:11 AM
>> To: pacemaker at oss.clusterlabs.org
>> Subject: [Pacemaker] resource's start/stop not getting called
>>
>> Hello guys,
>>
>> I'm a little confused here.
>>
>> My resource's start and stop operations are not getting called, or so I
>> understand from the behavior of my script, but the monitor operation is
>> getting called.
>>
>> My resource agent, called "safe", is only being invoked by Pacemaker for
>> monitoring, not for starting or stopping.
>>
>> I only see the /monitor file getting created, not the /start or /stop files.
>>
>>
>> For now, I temporarily have my do-nothing script located here:
>>
>> /usr/lib/ocf/resource.d/pacemaker/safe
>>
>> Its contents are:
>>
>> ======================
>> #!/bin/sh
>>
>> # initialization
>> . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>
>> usage_()
>> {
>>        return $OCF_SUCCESS
>> }
>>
>> isrunning_safe()
>> {
>>        return $OCF_SUCCESS
>> }
>>
>> monitor_()
>> {
>>        touch /monitor
>>        return $OCF_SUCCESS
>> }
>>
>> start_()
>> {
>>        touch /start
>>        monitor_
>>        return $?
>> }
>>
>> stop_()
>> {
>>        touch /stop
>>        return $OCF_SUCCESS
>> }
>>
>> status_()
>> {
>>        monitor_
>>        return $?
>> }
>>
>>
>> metadata()
>> {
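>>        # NOTE: a real OCF agent must print its meta-data XML to stdout for the
>>        # meta-data action; this stub only exercises start/stop/monitor dispatch.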
>>        return $OCF_SUCCESS
>> }
>>
>> validate_all_()
>> {
>>        return $OCF_SUCCESS
>> }
>>
>>
>> COMMAND=$1
>>
>> case "$COMMAND" in
>>        start)
>>                start_
>>                exit $?
>>                ;;
>>        stop)
>>                stop_
>>                exit $?
>>                ;;
>>        status)
>>                status_
>>                exit $?
>>                ;;
>>        monitor)
>>                monitor_
>>                func_status=$?
>>                exit $func_status
>>                ;;
>>        meta-data)
>>                exit 0
>>                ;;
>>        validate-all)
>>                validate_all_
>>                exit $?
>>                ;;
>>        *)
>>                usage_
>>                ;;
>> esac
>> =========================
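>>
>> One way to exercise the agent by hand, outside the cluster, is something like this (a sketch; assumes the usual OCF_ROOT of /usr/lib/ocf):
>>
>> export OCF_ROOT=/usr/lib/ocf
>> /usr/lib/ocf/resource.d/pacemaker/safe start;   echo "start rc=$?"
>> /usr/lib/ocf/resource.d/pacemaker/safe monitor; echo "monitor rc=$?"
>> /usr/lib/ocf/resource.d/pacemaker/safe stop;    echo "stop rc=$?"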
>>
>> The output of cibadmin --query shows my config:
>>
>> ===========================
>> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1"
>> have-quorum="1" admin_epoch="0" epoch="144" dc-uuid="host_128"
>> num_updates="6">
>>  <configuration>
>>    <crm_config>
>>      <cluster_property_set id="cib-bootstrap-options">
>>        <nvpair id="cib-bootstrap-options-dc-version"
>> name="dc-version"
>> value="1.0.5-9e9faaab40f3f97e3c0d623e4a4c47ed83fa1601"/>
>>        <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>> name="cluster-infrastructure" value="openais"/>
>>        <nvpair id="cib-bootstrap-options-expected-quorum-votes"
>> name="expected-quorum-votes" value="2"/>
>>        <nvpair name="symmetric-cluster"
>> id="cib-bootstrap-options-symmetric-cluster" value="true"/>
>>        <nvpair id="cib-bootstrap-options-is-managed-default"
>> name="is-managed-default" value="true"/>
>>      </cluster_property_set>
>>    </crm_config>
>>    <nodes>
>>      <node id="host_145" uname="host_145" type="normal"/>
>>      <node id="host_128" uname="host_128" type="normal"/>
>>    </nodes>
>>    <resources>
>>      <primitive class="ocf" type="safe" provider="pacemaker" id="safe_SVCS">
>>        <operations>
>>          <op name="start" interval="0" id="op-safe_SVCS-1" timeout="1min"/>
>>          <op interval="0" id="op-safe_SVCS-2" name="stop" timeout="1min"/>
>>          <op id="op-safe_SVCS-3" name="monitor" timeout="5s" interval="30s"/>
>>        </operations>
>>        <instance_attributes id="safe_SVCS-instance_attributes">
>>          <nvpair id="safe_SVCS-instance_attributes-target-role"
>> name="target-role" value="Started"/>
>>          <nvpair id="safe_SVCS-instance_attributes-is-managed"
>> name="is-managed" value="true"/>
>>        </instance_attributes>
>>      </primitive>
>>    </resources>
>>    <constraints>
>>      <rsc_location rsc="safe_SVCS" node="host_145" id="safe_SVCS_run"
>> score="INFINITY"/>
>>      <rsc_location rsc="safe_SVCS" node="host_128"
>> id="safe_SVCS-dont-run" score="50"/>
>>    </constraints>
>>  </configuration>
>>  <status>
>>    <node_state uname="host_128" ha="active" in_ccm="true"
>> crmd="online" shutdown="0" join="member" expected="member"
>> id="host_128" crm-debug-origin="do_state_transition">
>>      <transient_attributes id="host_128">
>>        <instance_attributes id="status-host_128">
>>          <nvpair id="status-host_128-probe_complete"
>> name="probe_complete" value="true"/>
>>        </instance_attributes>
>>      </transient_attributes>
>>      <lrm id="host_128">
>>        <lrm_resources>
>>          <lrm_resource id="safe_SVCS" type="safe" class="ocf"
>> provider="pacemaker">
>>            <lrm_rsc_op id="safe_SVCS_monitor_0" operation="monitor"
>> crm-debug-origin="build_active_RAs" crm_feature_set="3.0.1"
>> transition-key="5:1:7:9b57f404-ae10-4f8a-9e81-4f02c28f71be"
>> transition-magic="0:0;5:1:7:9b57f404-ae10-4f8a-9e81-4f02c28f71be"
>> call-id="2" rc-code="0" op-status="0" interval="0"
>> last-run="1256759783" last-rc-change="1256759783" exec-time="20"
>> queue-time="0" op-digest="b43714e34c3a33fee83d41f2016b1d71"/>
>>            <lrm_rsc_op id="safe_SVCS_monitor_30000"
>> operation="monitor" crm-debug-origin="build_active_RAs"
>> crm_feature_set="3.0.1"
>> transition-key="9:2:0:9b57f404-ae10-4f8a-9e81-4f02c28f71be"
>> transition-magic="0:0;9:2:0:9b57f404-ae10-4f8a-9e81-4f02c28f71be"
>> call-id="3" rc-code="0" op-status="0" interval="30000"
>> last-run="1256761194" last-rc-change="1256759784" exec-time="10"
>> queue-time="0" op-digest="c6cdeb51fad8244dc5200a2f34d54796"/>
>>          </lrm_resource>
>>        </lrm_resources>
>>      </lrm>
>>    </node_state>
>>    <node_state uname="host_145" ha="active" in_ccm="true"
>> crmd="online" join="member" shutdown="0" id="host_145"
>> expected="member" crm-debug-origin="do_update_resource">
>>      <lrm id="host_145">
>>        <lrm_resources>
>>          <lrm_resource id="safe_SVCS" type="safe" class="ocf"
>> provider="pacemaker">
>>            <lrm_rsc_op id="safe_SVCS_monitor_0" operation="monitor"
>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.1"
>> transition-key="5:1:7:aebd004e-a447-43bf-9dc3-ad276b15302b"
>> transition-magic="0:2;5:1:7:aebd004e-a447-43bf-9dc3-ad276b15302b"
>> call-id="2" rc-code="2" op-status="0" interval="0"
>> last-run="1256765055" last-rc-change="1256765055" exec-time="20"
>> queue-time="0" op-digest="b43714e34c3a33fee83d41f2016b1d71"/>
>>          </lrm_resource>
>>        </lrm_resources>
>>      </lrm>
>>      <transient_attributes id="host_145">
>>        <instance_attributes id="status-host_145">
>>          <nvpair id="status-host_145-probe_complete"
>> name="probe_complete" value="true"/>
>>        </instance_attributes>
>>      </transient_attributes>
>>    </node_state>
>>  </status>
>> </cib>
>> ==============================
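>>
>> For readability, roughly the same resource and constraints in crm shell syntax (just a sketch of the equivalent, not how it was entered; note the XML keeps target-role/is-managed under instance_attributes, hence params rather than meta):
>>
>> crm configure primitive safe_SVCS ocf:pacemaker:safe \
>>         params target-role=Started is-managed=true \
>>         op start timeout=60s op stop timeout=60s \
>>         op monitor interval=30s timeout=5s
>> crm configure location safe_SVCS_run safe_SVCS inf: host_145
>> crm configure location safe_SVCS-dont-run safe_SVCS 50: host_128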
>>
>>
>>
>> Please advise.
>>
>> Sincerely
>> Shravan
>>
>> _______________________________________________
>> Pacemaker mailing list
>> Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>



