[Pacemaker] Failure after intermittent network outage
Pavel Levshin
pavel at levshin.spb.ru
Fri Mar 11 12:31:14 UTC 2011
Hi Andrew.
I'm sorry, but I cannot agree.
Look again at the DC log. It says "Action lost", which is why I use
this term.
It then declares every monitor action as having failed with rc=1, which
is not true. Note that even actions directed at a nonexistent RA are
listed as failed with rc=1. (DRBD is not installed on the target server,
so there is no ocf:linbit:drbd there.)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 30]: In-flight (id:
ilo-wapgw1-1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
30: ilo-wapgw1-1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 31]: In-flight (id:
ilo-wapgw1-2:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
31: ilo-wapgw1-2:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 32]: In-flight (id:
ilo-wapgw1-log:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
32: ilo-wapgw1-log:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 33]: In-flight (id:
p-drbd-mdirect1-1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
33: p-drbd-mdirect1-1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 34]: In-flight (id:
p-drbd-mdirect1-2:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
34: p-drbd-mdirect1-2:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 35]: In-flight (id:
p-drbd-mproxy1-1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
35: p-drbd-mproxy1-1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 36]: In-flight (id:
p-drbd-mproxy1-2:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
36: p-drbd-mproxy1-2:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 37]: In-flight (id:
p-drbd-mrouter1-1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
37: p-drbd-mrouter1-1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 38]: In-flight (id:
p-drbd-mrouter1-2:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
38: p-drbd-mrouter1-2:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 39]: In-flight (id:
vm-mdirect1-1_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
39: vm-mdirect1-1_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 40]: In-flight (id:
vm-mdirect1-2_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
40: vm-mdirect1-2_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 41]: In-flight (id:
vm-mproxy1-1_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
41: vm-mproxy1-1_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 42]: In-flight (id:
vm-mproxy1-2_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
42: vm-mproxy1-2_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 43]: In-flight (id:
vm-mrouter1-1_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
43: vm-mrouter1-1_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 44]: In-flight (id:
vm-mrouter1-2_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
44: vm-mrouter1-2_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 45]: In-flight (id:
ip-puppetmaster_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
45: ip-puppetmaster_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 46]: In-flight (id:
ip-logserver_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
46: ip-logserver_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 47]: In-flight (id:
vm-vradius1_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
47: vm-vradius1_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 48]: In-flight (id:
p-drbd-vradius1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
48: p-drbd-vradius1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 49]: In-flight (id: vm-ppg1_monitor_0,
loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
49: vm-ppg1_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: action_timer_callback:
Timer popped (timeout=20000, abort_level=1000000, complete=false)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: ERROR: print_elem: Aborting
transition, action lost: [Action 50]: In-flight (id:
p-drbd-ppg1:0_monitor_0, loc: wapgw1-log, priority: 0)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
action_timer_callback:486 - Triggered transition abort (complete=0) :
Action lost
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: cib_action_update: rsc_op
50: p-drbd-ppg1:0_monitor_0 on wapgw1-log timed out
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: WARN: status_from_rc: Action 30
(ilo-wapgw1-1:0_monitor_0) on wapgw1-log failed (target: 7 vs. rc: 1): Error
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
match_graph_event:272 - Triggered transition abort (complete=0,
tag=lrm_rsc_op, id=ilo-wapgw1-1:0_monitor_0,
magic=2:1;30:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.8) :
Event failed
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: match_graph_event: Action
ilo-wapgw1-1:0_monitor_0 (30) confirmed on wapgw1-log (rc=4)
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: te_rsc_command: Initiating
action 29: probe_complete probe_complete on wapgw1-log - no waiting
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: run_graph:
====================================================
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: notice: run_graph: Transition
1353 (Complete=22, Pending=0, Fired=0, Skipped=13, Incomplete=2,
Source=/var/lib/pengine/pe-input-1525.bz2): Stopped
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: te_graph_trigger:
Transition 1353 is now complete
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
do_te_invoke:191 - Triggered transition abort (complete=1) : Peer Cancelled
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3669:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3670:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
ilo-wapgw1-2:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=ilo-wapgw1-2:0_monitor_0,
magic=2:1;31:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.9) :
Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3671:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
ilo-wapgw1-log:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=ilo-wapgw1-log:0_monitor_0,
magic=2:1;32:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.10)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3672:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mdirect1-1:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mdirect1-1:0_monitor_0,
magic=2:1;33:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.11)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3673:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mdirect1-2:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mdirect1-2:0_monitor_0,
magic=2:1;34:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.12)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3674:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mproxy1-1:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mproxy1-1:0_monitor_0,
magic=2:1;35:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.13)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3675:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mproxy1-2:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mproxy1-2:0_monitor_0,
magic=2:1;36:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.14)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3676:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mrouter1-1:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mrouter1-1:0_monitor_0,
magic=2:1;37:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.15)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3677:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-mrouter1-2:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-mrouter1-2:0_monitor_0,
magic=2:1;38:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.16)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3678:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mdirect1-1_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mdirect1-1_monitor_0,
magic=2:1;39:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.17)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3679:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mdirect1-2_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mdirect1-2_monitor_0,
magic=2:1;40:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.18)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3680:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mproxy1-1_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mproxy1-1_monitor_0,
magic=2:1;41:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.19)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3681:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mproxy1-2_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mproxy1-2_monitor_0,
magic=2:1;42:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.20)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3682:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mrouter1-1_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mrouter1-1_monitor_0,
magic=2:1;43:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.21)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3683:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-mrouter1-2_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-mrouter1-2_monitor_0,
magic=2:1;44:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.22)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3684:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
ip-puppetmaster_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=ip-puppetmaster_monitor_0,
magic=2:1;45:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.23)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3685:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
ip-logserver_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=ip-logserver_monitor_0,
magic=2:1;46:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.24)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3686:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-vradius1_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-vradius1_monitor_0,
magic=2:1;47:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.25)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3687:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-vradius1:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-vradius1:0_monitor_0,
magic=2:1;48:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.26)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3688:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
vm-ppg1_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=vm-ppg1_monitor_0,
magic=2:1;49:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.27)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3689:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: process_graph_event: Action
p-drbd-ppg1:0_monitor_0 arrived after a completed transition
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: abort_transition_graph:
process_graph_event:467 - Triggered transition abort (complete=1,
tag=lrm_rsc_op, id=p-drbd-ppg1:0_monitor_0,
magic=2:1;50:1353:7:22dc5497-478f-49ff-b07f-9fcd6da325cd, cib=0.799.28)
: Inactive graph
Mar 1 11:17:20 wapgw1-2 crmd: [5749]: info: do_pe_invoke: Query 3690:
Requesting the current CIB: S_POLICY_ENGINE
Mar 1 11:17:21 wapgw1-2 crmd: [5749]: info: do_pe_invoke_callback:
Invoking the PE: query=3690, ref=pe_calc-dc-1298967441-1625, seq=3504,
quorate=1
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: info: determine_online_status:
Node wapgw1-log is online
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ip-puppetmaster_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-log:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mproxy1-2:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mdirect1-1:0_monitor_0 on wapgw1-log:
unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mrouter1-1:0_monitor_0 on wapgw1-log:
unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-ppg1:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-1:0_start_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-1:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mdirect1-1_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mproxy1-1_monitor_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mdirect1-2_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mproxy1-2_monitor_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mrouter1-1_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mdirect1-2:0_monitor_0 on wapgw1-log:
unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mrouter1-2:0_monitor_0 on wapgw1-log:
unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-mrouter1-2_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-mproxy1-1:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-vradius1_monitor_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op vm-ppg1_monitor_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-2:0_start_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-2:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ip-logserver_monitor_0 on wapgw1-log: unknown error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op p-drbd-vradius1:0_monitor_0 on wapgw1-log: unknown
error (1)
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: info: determine_online_status:
Node wapgw1-1 is online
Mar 1 11:17:21 wapgw1-2 pengine: [5748]: WARN: unpack_rsc_op:
Processing failed op ilo-wapgw1-log:1_start_0 on wapgw1-1: unknown error (1)
This is the part of the code in te_callbacks.c that is responsible for this:
===============
gboolean
action_timer_callback(gpointer data)
{
    crm_action_timer_t *timer = NULL;

    CRM_CHECK(data != NULL, return FALSE);

    timer = (crm_action_timer_t*)data;
    stop_te_timer(timer);

    crm_warn("Timer popped (timeout=%d, abort_level=%d, complete=%s)",
             timer->timeout,
             transition_graph->abort_priority,
             transition_graph->complete?"true":"false");

    CRM_CHECK(timer->action != NULL, return FALSE);

    if(transition_graph->complete) {
        crm_warn("Ignoring timeout while not in transition");

    } else if(timer->reason == timeout_action_warn) {
        print_action(
            LOG_WARNING,"Action missed its timeout: ",
            timer->action);

        /* Don't check the FSA state
         *
         * We might also be in S_INTEGRATION or some other state
         * waiting for this action so we can close the transition
         * and continue
         */

    } else {
        /* fail the action */
        gboolean send_update = TRUE;
        const char *task = crm_element_value(timer->action->xml,
                                             XML_LRM_ATTR_TASK);

        print_action(LOG_ERR, "Aborting transition, action lost: ",
                     timer->action);

        timer->action->failed = TRUE;
        timer->action->confirmed = TRUE;
        abort_transition(INFINITY, tg_restart, "Action lost", NULL);

        update_graph(transition_graph, timer->action);
        trigger_graph();

        if(timer->action->type != action_type_rsc) {
            send_update = FALSE;
        } else if(safe_str_eq(task, "cancel")) {
            /* we dont need to update the CIB with these */
            send_update = FALSE;
        }

        if(send_update) {
            /* cib_action_update(timer->action, LRM_OP_PENDING,
                                 EXECRA_STATUS_UNKNOWN); */
            cib_action_update(timer->action, LRM_OP_TIMEOUT,
                              EXECRA_UNKNOWN_ERROR);
        }
    }
==========
So the CIB was updated with EXECRA_UNKNOWN_ERROR (rc=1) for every action whose timer popped, whether or not the agent even exists on that node.
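To make the effect easier to see, here is a minimal sketch in C of the behaviour described above. It is only a model, not Pacemaker code: the enum and function names are hypothetical, and only the numeric values (1, 5, 7) follow the OCF return codes and the "target: 7 vs. rc: 1" lines in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Return codes relevant to a probe. The names are made up for this
 * sketch; the values match the OCF codes seen in the log above. */
enum probe_rc {
    RC_OK            = 0,
    RC_UNKNOWN_ERROR = 1,  /* what the timer path writes to the CIB */
    RC_NOT_INSTALLED = 5,  /* OCF "not installed" code */
    RC_NOT_RUNNING   = 7   /* what a probe expects ("target: 7") */
};

/* Hypothetical model of the result the DC ends up recording.
 * reply_arrived: did the node answer before the action timer popped?
 * agent_rc:      the code the node would have reported otherwise. */
static int recorded_rc(bool reply_arrived, int agent_rc)
{
    assert(agent_rc >= 0);
    if (!reply_arrived) {
        /* Timer path: the real result is unknown, yet rc=1 is stored. */
        return RC_UNKNOWN_ERROR;
    }
    return agent_rc;
}

/* True when the recorded code equals the probe's expected result. */
static bool matches_target(int rc)
{
    return rc == RC_NOT_RUNNING;
}
```

In this model, `recorded_rc(false, RC_NOT_INSTALLED)` and `recorded_rc(false, RC_NOT_RUNNING)` both yield 1, so a lost reply is indistinguishable from a real agent error, which is the complaint here.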
> Either remove the RA, or make sure it returns something sensible when
> tools or configuration it needs are not available.
This is what I mean by "error-prone". Such an RA may appear again from a fresh RPM. And errors in RAs just happen.
OK, I see, there is a way: I could copy each RA to a new location (like ocf:safe:VirtualDomain), so that they will not be touched by RPMs.
I could even give each resource its own RA, such as VirtualDomain-X, VirtualDomain-Y and so on, and place them only on those nodes where the resource can run.
I just don't think this is the best way to go.
> No. For safety we still need to verify that X is not running on node
> C before we allow it to be active anywhere else.
> That you know the X is unavailable on C is one thing, but the cluster
> needs to know too.
Therefore, I propose an addition to Pacemaker: a way to tell the cluster that resource X can never run on node C. Currently, this is done through the status section of the CIB. I wish there were a way to do the same via configuration. Then the cluster could get rid of the quirks with unneeded RAs.
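As a rough illustration of the idea only, here is a sketch in C of a configured "never runs here" table that is consulted before a probe is scheduled. Every name in it is hypothetical and none of it is taken from Pacemaker; the table entries just reuse resource and node names from the log as examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration entry: "resource can never run on node". */
struct never_runs_here {
    const char *resource;
    const char *node;
};

/* Illustrative entries only; which pairs to list is an assumption. */
static const struct never_runs_here exclusions[] = {
    { "p-drbd-mdirect1-1", "wapgw1-log" },
    { "vm-mdirect1-1",     "wapgw1-log" },
};

/* Returns false when configuration says the resource cannot run on the
 * node, so no probe (and no RA) would be needed there. */
static bool probe_needed(const char *resource, const char *node)
{
    size_t n = sizeof(exclusions) / sizeof(exclusions[0]);

    assert(resource != NULL && node != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(exclusions[i].resource, resource) == 0 &&
            strcmp(exclusions[i].node, node) == 0) {
            return false;
        }
    }
    return true;
}
```

With such a table, `probe_needed("vm-mdirect1-1", "wapgw1-log")` would be false, while the same resource on wapgw1-1 would still be probed as usual.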
Would anyone support my proposal?
--
Pavel Levshin //flicker