[Pacemaker] When pacemaker expects resource to go directly to Master after start?
Andrew Beekhof
andrew at beekhof.net
Mon Oct 6 04:08:22 CEST 2014
On 2 Oct 2014, at 8:02 pm, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> According to documentation (Pacemaker 1.1.x explained) "when
> [Master/Slave] the resource is started, it must come up in the
> mode called Slave". But what I observe here - in some cases pacemaker
> treats Slave state as error. As example (pacemaker 1.1.9):
>
> Oct 2 13:23:34 cn1 pengine[9446]: notice: unpack_rsc_op: Operation
> monitor found resource test_Dummy:0 active in master mode on cn1
>
> So resource currently is Master on node cn1. Second node boots and
> starts pacemaker which now decides to restart it on the first node (I
> know why it happens, so it is not relevant to this question :) )
>
> Oct 2 13:23:34 cn1 pengine[9446]: notice: LogActions: Restart
> test_Dummy:0 (Master cn1)
> Oct 2 13:23:34 cn1 pengine[9446]: notice: LogActions: Start
> test_Dummy:1 (cn2)
> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 31: monitor test_Dummy:1_monitor_0 on cn2
> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 84: demote test_Dummy_demote_0 on cn1 (local)
> Oct 2 13:23:34 cn1 crmd[9447]: notice: process_lrm_event: LRM
> operation test_Dummy_demote_0 (call=1227, rc=0, cib-update=7826,
> confirmed=true) ok
> Oct 2 13:23:34 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 85: stop test_Dummy_stop_0 on cn1 (local)
> Oct 2 13:23:34 cn1 crmd[9447]: notice: process_lrm_event: LRM
> operation test_Dummy_stop_0 (call=1234, rc=0, cib-update=7827,
> confirmed=true) ok
>
> As expected it calls demote first and stop next. At this point
> resource is stopped.
>
> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 83: start test_Dummy_start_0 on cn1 (local)
> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 87: start test_Dummy:1_start_0 on cn2
> Oct 2 13:23:35 cn1 crmd[9447]: notice: process_lrm_event: LRM
> operation test_Dummy_start_0 (call=1244, rc=0, cib-update=7830,
> confirmed=true) ok
>
> Resource is started again. In full conformance with requirement above,
> it is now slave.
>
> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 88: monitor test_Dummy:1_monitor_11000 on cn2
> Oct 2 13:23:35 cn1 crmd[9447]: notice: te_rsc_command: Initiating
> action 3: monitor test_Dummy_monitor_10000 on cn1 (local)
> Oct 2 13:23:35 cn1 crmd[9447]: notice: process_lrm_event: LRM
> operation test_Dummy_monitor_10000 (call=1247, rc=0, cib-update=7831,
> confirmed=false) ok
> Oct 2 13:23:35 cn1 crmd[9447]: warning: status_from_rc: Action 3
> (test_Dummy_monitor_10000) on cn1 failed (target: 8 vs. rc: 0): Error
>
> Oops! Why pacemaker expects resource to be Master on cn1? It had been
> stopped, it was started, it was not promoted yet.
true. more than likely a bug that has been fixed since 1.1.9.
if you send through a crm_report i can verify whether the current code would have done the right thing
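
For anyone reading the "target: 8 vs. rc: 0" part of that warning: 8 and 0 are OCF return codes. The interval="10" monitor is the one configured with role="Master", so it expects OCF_RUNNING_MASTER (8), while a freshly started (not yet promoted) instance reports OCF_SUCCESS (0). Below is a rough Python sketch that decodes such a line. The helper name explain_mismatch and the regular expression are purely illustrative assumptions, not anything pacemaker ships, and the log line is the one from the quoted log joined onto a single line.

```python
import re

# OCF return codes as defined by the OCF resource agent API.
OCF_RC_NAMES = {
    0: "OCF_SUCCESS (running; Slave for a master/slave resource)",
    7: "OCF_NOT_RUNNING (cleanly stopped)",
    8: "OCF_RUNNING_MASTER (running as Master)",
}

# Illustrative pattern (an assumption): matches the "(target: X vs. rc: Y)"
# fragment of a status_from_rc warning.
_RC_PATTERN = re.compile(r"target:\s*(\d+)\s*vs\.\s*rc:\s*(\d+)")


def explain_mismatch(log_line):
    """Return (expected, reported) descriptions, or None if no codes are found."""
    match = _RC_PATTERN.search(log_line)
    if match is None:
        return None
    target, actual = (int(group) for group in match.groups())
    return (
        OCF_RC_NAMES.get(target, "rc %d" % target),
        OCF_RC_NAMES.get(actual, "rc %d" % actual),
    )


# The warning from the quoted log, joined onto one line.
line = ("Action 3 (test_Dummy_monitor_10000) on cn1 failed "
        "(target: 8 vs. rc: 0): Error")
expected, reported = explain_mismatch(line)
print("expected:", expected)
print("reported:", reported)
```

So the monitor itself succeeded; it was the Master-role monitor being scheduled on an instance that had only been started, not promoted, that produced the "failure".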
> Only after recovery
> from above "error" does it get promoted:
>
> Oct 2 13:23:41 cn1 pengine[9446]: notice: LogActions: Promote
> test_Dummy:0 (Slave -> Master cn1)
>
> primitive pcm_Dummy ocf:pacemaker:Dummy
> primitive test_Dummy ocf:test:Dummy \
> op monitor interval="10" role="Master" \
> op monitor interval="11" \
> op start interval="0" timeout="30" \
> op stop interval="0" timeout="120" \
> op promote interval="0" timeout="20" \
> op demote interval="0" timeout="20"
> ms ms_Dummy test_Dummy \
> meta target-role="Master"
> clone cln_Dummy pcm_Dummy
> order ms_Dummy-after-cln_Dummy 2000: cln_Dummy ms_Dummy