[Pacemaker] Action from a different CRMD transition results in restarting services

Latrous, Youssef YLatrous at BroadViewNet.com
Wed Dec 12 14:31:57 EST 2012


Hi,

 

I run into the following issue and I couldn't find what it really means:

 

        Detected action msgbroker_monitor_10000 from a different
transition: 16048 vs. 18014

 

I can see that its impact is to stop/start a service but I'd like to
understand it a bit more.

 

Thank you in advance for any information.

 

 

Logs about this issue:

...

Dec  6 22:55:05 Node1 crmd: [5235]: info: process_graph_event: Detected
action msgbroker_monitor_10000 from a different transition: 16048 vs.
18014

Dec  6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph:
process_graph_event:477 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=msgbroker_monitor_10000,
magic=0:7;104:16048:0:5fb57f01-3397-45a8-905f-c48cecdc8692, cib=0.971.5)
: Old event

Dec  6 22:55:05 Node1 crmd: [5235]: WARN: update_failcount: Updating
failcount for msgbroker on Node0 after failed monitor: rc=7
(update=value++, time=1354852505)

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=abort_transition_graph ]

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_state_transition: All 2
cluster nodes are eligible to run resources.

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28069:
Requesting the current CIB: S_POLICY_ENGINE

Dec  6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph:
te_update_diff:142 - Triggered transition abort (complete=1, tag=nvpair,
id=status-Node0-fail-count-msgbroker, magic=NA, cib=0.971.6) : Transient
attribute: update

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28070:
Requesting the current CIB: S_POLICY_ENGINE

Dec  6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph:
te_update_diff:142 - Triggered transition abort (complete=1, tag=nvpair,
id=status-Node0-last-failure-msgbroker, magic=NA, cib=0.971.7) :
Transient attribute: update

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28071:
Requesting the current CIB: S_POLICY_ENGINE

Dec  6 22:55:05 Node1 attrd: [5232]: info: find_hash_entry: Creating
hash entry for last-failure-msgbroker

Dec  6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke_callback:
Invoking the PE: query=28071, ref=pe_calc-dc-1354852505-39407, seq=12,
quorate=1

Dec  6 22:55:05 Node1 pengine: [5233]: notice: unpack_config: On loss of
CCM Quorum: Ignore

Dec  6 22:55:05 Node1 pengine: [5233]: notice: unpack_rsc_op: Operation
txpublisher_monitor_0 found resource txpublisher active on Node1

Dec  6 22:55:05 Node1 pengine: [5233]: WARN: unpack_rsc_op: Processing
failed op msgbroker_monitor_10000 on Node0: not running (7)

...

Dec  6 22:55:05 Node1 pengine: [5233]: notice: common_apply_stickiness:
msgbroker can fail 999999 more times on Node0 before being forced off

...

Dec  6 22:55:05 Node1 pengine: [5233]: notice: RecurringOp:  Start
recurring monitor (10s) for msgbroker on Node0

...

Dec  6 22:55:05 Node1 pengine: [5233]: notice: LogActions: Recover
msgbroker  (Started Node0)

...

Dec  6 22:55:05 Node1 crmd: [5235]: info: te_rsc_command: Initiating
action 37: stop msgbroker_stop_0 on Node0

 

 

Transition 18014 details:

 

Dec  6 22:52:18 Node1 pengine: [5233]: notice: process_pe_message:
Transition 18014: PEngine Input stored in:
/var/lib/pengine/pe-input-3270.bz2

Dec  6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Dec  6 22:52:18 Node1 crmd: [5235]: info: unpack_graph: Unpacked
transition 18014: 0 actions in 0 synapses

Dec  6 22:52:18 Node1 crmd: [5235]: info: do_te_invoke: Processing graph
18014 (ref=pe_calc-dc-1354852338-39406) derived from
/var/lib/pengine/pe-input-3270.bz2

Dec  6 22:52:18 Node1 crmd: [5235]: info: run_graph:
====================================================

Dec  6 22:52:18 Node1 crmd: [5235]: notice: run_graph: Transition 18014
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-3270.bz2): Complete

Dec  6 22:52:18 Node1 crmd: [5235]: info: te_graph_trigger: Transition
18014 is now complete

Dec  6 22:52:18 Node1 crmd: [5235]: info: notify_crmd: Transition 18014
status: done - <null>

Dec  6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Dec  6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: Starting
PEngine Recheck Timer

 

 

Youssef

 

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20121212/da7f07ce/attachment-0002.html>


More information about the Pacemaker mailing list