[Pacemaker] When pacemaker expects resource to go directly to Master after start?
Andrei Borzenkov
arvidjaar at gmail.com
Thu Oct 2 12:41:59 CEST 2014
On Thu, Oct 2, 2014 at 2:36 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> I don't know if you can use the Dummy primitive as an MS resource.
>
> egrep "promote|demote" /usr/lib/ocf/resource.d/pacemaker/Dummy
> echo $?
> 1
>
Yes, I know I'm not bright, but still not *that* stupid :)
cn1:/usr/lib/ocf/resource.d # grep -E 'promote|demote' test/Dummy
<action name="promote" timeout="20" />
<action name="demote" timeout="20" />
promote) echo MASTER > ${OCF_RESKEY_state};;
demote) echo SLAVE > ${OCF_RESKEY_state};;
cn1:/usr/lib/ocf/resource.d # ocf-tester -n XXX $PWD/test/Dummy
Beginning tests for /usr/lib/ocf/resource.d/test/Dummy...
* Your agent does not support the notify action (optional)
/usr/lib/ocf/resource.d/test/Dummy passed all tests
cn1:/usr/lib/ocf/resource.d #
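For reference, here is a minimal sketch of how a monitor action for a stateful agent of this kind might map the state file to OCF return codes. It is not the actual test/Dummy code: the function name dummy_monitor and passing the state file as an argument are assumptions made for illustration (the real agent reads ${OCF_RESKEY_state}). The three numeric values are the standard OCF return codes.

```shell
#!/bin/sh
# Standard OCF return codes.
OCF_SUCCESS=0          # running (for a master/slave resource: running as Slave)
OCF_NOT_RUNNING=7      # cleanly stopped
OCF_RUNNING_MASTER=8   # running as Master

# Hypothetical monitor for a Dummy-like stateful agent.
# $1: path to the state file written by the promote/demote actions.
dummy_monitor() {
    state_file=$1

    # No state file at all: treat the resource as stopped.
    [ -f "$state_file" ] || return $OCF_NOT_RUNNING

    # promote writes MASTER; any other content is reported as plain "running".
    case "$(cat "$state_file")" in
        MASTER) return $OCF_RUNNING_MASTER ;;
        *)      return $OCF_SUCCESS ;;
    esac
}
```

With a mapping like this, a freshly started instance reports 0, while a monitor operation defined with role="Master" expects 8. That would be consistent with the "target: 8 vs. rc: 0" message in the log quoted below.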
>
> 2014-10-02 12:02 GMT+02:00 Andrei Borzenkov <arvidjaar at gmail.com>:
>> According to the documentation (Pacemaker 1.1.x Explained), "when
>> [Master/Slave] the resource is started, it must come up in the
>> mode called Slave". But what I observe here is that in some cases pacemaker
>> treats the Slave state as an error. For example (pacemaker 1.1.9):
>>
>> Oct 2 13:23:34 cn1 pengine[9446]: notice: unpack_rsc_op: Operation
>> monitor found resource test_Dummy:0 active in master mode on cn1
>>
>> So the resource is currently Master on node cn1. The second node boots and
>> starts pacemaker, which now decides to restart the resource on the first
>> node (I know why this happens, so it is not relevant to this question :) )
>>
>> Oct 2 13:23:34 cn1 pengine[9446]: notice: LogActions: Restart
>> test_Dummy:0 (Master cn1)
>> Oct 2 13:23:34 cn1 pengine[9446]: notice: LogActions: Start
>> test_Dummy:1 (cn2)
>> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 31: monitor test_Dummy:1_monitor_0 on cn2
>> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 84: demote test_Dummy_demote_0 on cn1 (local)
>> Oct 2 13:23:34 cn1 crmd[9447]: notice: process_lrm_event: LRM
>> operation test_Dummy_demote_0 (call=1227, rc=0, cib-update=7826,
>> confirmed=true) ok
>> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 85: stop test_Dummy_stop_0 on cn1 (local)
>> Oct 2 13:23:34 cn1 crmd[9447]: notice: process_lrm_event: LRM
>> operation test_Dummy_stop_0 (call=1234, rc=0, cib-update=7827,
>> confirmed=true) ok
>>
>> As expected, it calls demote first and stop next. At this point the
>> resource is stopped.
>>
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 83: start test_Dummy_start_0 on cn1 (local)
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 87: start test_Dummy:1_start_0 on cn2
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: process_lrm_event: LRM
>> operation test_Dummy_start_0 (call=1244, rc=0, cib-update=7830,
>> confirmed=true) ok
>>
>> The resource is started again. In full conformance with the requirement
>> above, it is now Slave.
>>
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 88: monitor test_Dummy:1_monitor_11000 on cn2
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
>> action 3: monitor test_Dummy_monitor_10000 on cn1 (local)
>> Oct 2 13:23:35 cn1 crmd[9447]: notice: process_lrm_event: LRM
>> operation test_Dummy_monitor_10000 (call=1247, rc=0, cib-update=7831,
>> confirmed=false) ok
>> Oct 2 13:23:35 cn1 crmd[9447]: warning: status_from_rc: Action 3
>> (test_Dummy_monitor_10000) on cn1 failed (target: 8 vs. rc: 0): Error
>>
>> Oops! Why does pacemaker expect the resource to be Master on cn1? It had
>> been stopped, then started, and it had not been promoted yet. Only after
>> recovery from the above "error" does it get promoted:
>>
>> Oct 2 13:23:41 cn1 pengine[9446]: notice: LogActions: Promote
>> test_Dummy:0 (Slave -> Master cn1)
>>
>> primitive pcm_Dummy ocf:pacemaker:Dummy
>> primitive test_Dummy ocf:test:Dummy \
>>     op monitor interval="10" role="Master" \
>>     op monitor interval="11" \
>>     op start interval="0" timeout="30" \
>>     op stop interval="0" timeout="120" \
>>     op promote interval="0" timeout="20" \
>>     op demote interval="0" timeout="20"
>> ms ms_Dummy test_Dummy \
>>     meta target-role="Master"
>> clone cln_Dummy pcm_Dummy
>> order ms_Dummy-after-cln_Dummy 2000: cln_Dummy ms_Dummy
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
>
>
> --
> this is my life and I live it for as long as God wills