[Pacemaker] Pacemaker failover problem
Erich Weiler
weiler at soe.ucsc.edu
Tue Mar 9 15:04:45 UTC 2010
Thanks for the reply! Yes, I have checked that my LSB scripts are
compliant. In case it provides any insight, here is a clip from
/var/log/messages on genome-ldap2 when genome-ldap1 goes down (with the
order constraint in place):
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info:
handle_shutdown_request: Creating shutdown request for genome-ldap1
(state=S_IDLE)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap1, magic=NA, cib=0.117.3) :
Transient attribute: update
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
126: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=126, ref=pe_calc-dc-1268146762-101, seq=304,
quorate=1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: short_print:
Started: [ genome-ldap2 genome-ldap1 ]
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1 times on genome-ldap2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice:
common_apply_stickiness: LDAP-clone can fail 999999 more times on
genome-ldap2 before being forced off
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: RecurringOp:
Start recurring monitor (30s) for LDAP-IP on genome-ldap2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Move
resource LDAP-IP (Started genome-ldap1 -> genome-ldap2)
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions:
Restart resource LDAP:0 (Started genome-ldap2)
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP:1 (genome-ldap1)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 65: 12 actions in 12 synapses
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 65 (ref=pe_calc-dc-1268146762-101) derived from
/var/lib/pengine/pe-warn-98.bz2
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 16 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 13: stop LDAP:1_stop_0 on genome-ldap1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 65: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-98.bz2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:1_stop_0 (13) confirmed on genome-ldap1 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 17 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 8: stop LDAP-IP_stop_0 on genome-ldap1
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_stop_0 (8) confirmed on genome-ldap1 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 9: start LDAP-IP_start_0 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=9:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_start_0 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:12: start
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 4 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_crm_command:
Executing crm-event (20): do_shutdown on genome-ldap1
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip -f inet addr add
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip -f inet addr add
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip link set eth0 up
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip link set eth0 up
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO:
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83
auto not_used not_used
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO:
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83
auto not_used not_used
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_start_0 (call=12, rc=0, cib-update=127, confirmed=true) ok
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_start_0 (9) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 10: monitor LDAP-IP_monitor_30000 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=10:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_monitor_30000 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:13: monitor
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 14 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 12: start LDAP:0_start_0 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=12:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP:0_start_0 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:14: start
Mar 9 06:59:22 genome-ldap2 lrmd: [4333]: WARN: For LSB init script, no
additional parameters are needed.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_monitor_30000 (call=13, rc=0, cib-update=128,
confirmed=false) ok
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_monitor_30000 (10) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) Checking configuration files for slapd:
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stderr) config file testing succeeded
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) OK ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) Starting slapd:
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) FAILED
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_start_0 (call=14, rc=1, cib-update=129, confirmed=true)
unknown error
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: WARN: status_from_rc: Action
12 (LDAP:0_start_0) on genome-ldap2 failed (target: 0 vs. rc: 1): Error
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: WARN: update_failcount:
Updating failcount for LDAP:0 on genome-ldap2 after failed start: rc=1
(update=INFINITY, time=1268146763)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
match_graph_event:272 - Triggered transition abort (complete=0,
tag=lrm_rsc_op, id=LDAP:0_start_0,
magic=0:1;12:65:0:0cc80735-d478-48c0-8260-02b627bed719, cib=0.117.8) :
Event failed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort priority upgraded from 0 to 1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort action done superceeded by restart
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:0_start_0 (12) confirmed on genome-ldap2 (rc=4)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 15 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
65 (Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0,
Source=/var/lib/pengine/pe-warn-98.bz2): Stopped
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 65 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
130: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-LDAP:0 (INFINITY)
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info:
cib_process_shutdown_req: Shutdown REQ from genome-ldap1
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_shutdown_req for section 'all'
(origin=genome-ldap1/genome-ldap1/(null), version=0.117.8): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update:
Sent update 31: fail-count-LDAP:0=INFINITY
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-LDAP:0 (1268146763)
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update:
Sent update 33: last-failure-LDAP:0=1268146763
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=130, ref=pe_calc-dc-1268146763-108, seq=304,
quorate=1
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.9) :
Transient attribute: update
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.10) :
Transient attribute: update
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
131: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
132: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP:0 (lsb:ldap): Started genome-ldap2 FAILED
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice:
common_apply_stickiness: LDAP-clone can fail 999999 more times on
genome-ldap2 before being forced off
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: RecurringOp:
Start recurring monitor (10s) for LDAP:0 on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP-IP (Started genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions:
Recover resource LDAP:0 (Started genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=132, ref=pe_calc-dc-1268146763-109, seq=304,
quorate=1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: handle_response:
pe_calc calculation pe_calc-dc-1268146763-108 is obsolete
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 66: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-99.bz2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP:0 (lsb:ldap): Started genome-ldap2 FAILED
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1000000 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN:
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after
1000000 failures (max=1000000)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:0 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP-IP cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP-IP (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP:0 (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 67: 6 actions in 6 synapses
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 67 (ref=pe_calc-dc-1268146763-109) derived from
/var/lib/pengine/pe-warn-100.bz2
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 10 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_crm_command:
Executing crm-event (14): do_shutdown on genome-ldap1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 2: stop LDAP:0_stop_0 on genome-ldap2 (local)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=2:67:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP:0_stop_0 )
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:15: stop
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_monitor_10000 (call=11, status=1, cib-update=0,
confirmed=true) Cancelled
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] CLM
CONFIGURATION CHANGE
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] New Configuration:
Mar 9 06:59:23 genome-ldap2 cib: [2034]: notice: ais_dispatch:
Membership 308: quorum lost
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: ais_dispatch:
Membership 308: quorum lost
Mar 9 06:59:23 genome-ldap2 lrmd: [4390]: WARN: For LSB init script, no
additional parameters are needed.
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.85)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 67: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-100.bz2
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: crm_update_peer: Node
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84)
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: ais_status_callback:
status: genome-ldap1 is now lost (was member)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Left:
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_peer: Node
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84)
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.84)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: erase_node_from_join:
Removed node genome-ldap1 from join calculations: welcomed=0 itegrated=0
finalized=0 confirmed=1
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Joined:
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_quorum:
Updating quorum status to false (call=135)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) Stopping slapd:
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/133, version=0.117.10): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 308: memb=1,
new=0, lost=1
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: memb: genome-ldap2 1426129162
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: lost: genome-ldap1 1409351946
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] CLM
CONFIGURATION CHANGE
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] New Configuration:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.85)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Left:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Joined:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 308: memb=1, new=0, lost=0
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: MEMB: genome-ldap2 1426129162
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node genome-ldap1 was not seen in the
previous transition
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
update_member: Node 1409351946/genome-ldap1 is now: lost
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
send_member_notification: Sending membership update 308 to 2 children
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [TOTEM ] A processor
joined or left the membership and a new membership was formed.
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [MAIN ] Completed
service synchronization, ready to provide service.
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element:
cib:diff: - <cib have-quorum="1" admin_epoch="0" epoch="117"
num_updates="11" />
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element:
cib:diff: + <cib have-quorum="0" admin_epoch="0" epoch="118"
num_updates="1" />
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section cib
(origin=local/crmd/135, version=0.118.1): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
need_abort:59 - Triggered transition abort (complete=0) : Non-status change
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort priority upgraded from 0 to 1000000
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort action done superceeded by restart
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: need_abort: Aborting on
change to have-quorum
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents:
Archived previous version as /var/lib/heartbeat/crm/cib-42.raw
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents:
Wrote version 0.118.0 of the CIB to disk (digest:
5ab39d8c6134011247378d7b7a8e8cb9)
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: retrieveCib: Reading
cluster configuration from: /var/lib/heartbeat/crm/cib.3Fcmhq (digest:
/var/lib/heartbeat/crm/cib.sbJeBo)
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/137, version=0.118.1): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) OK
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_stop_0 (call=15, rc=0, cib-update=138, confirmed=true) ok
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:0_stop_0 (2) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 11 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
67 (Complete=4, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pengine/pe-warn-100.bz2): Stopped
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 67 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 1 cluster nodes are eligible to run resources.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
139: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=139, ref=pe_calc-dc-1268146763-113, seq=308,
quorate=0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:0 LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1000000 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN:
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after
1000000 failures (max=1000000)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:0 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP-IP cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP-IP (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:0 (Stopped)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 68: 2 actions in 2 synapses
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 68 (ref=pe_calc-dc-1268146763-113) derived from
/var/lib/pengine/pe-warn-101.bz2
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 5: stop LDAP-IP_stop_0 on genome-ldap2 (local)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=5:68:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_stop_0 )
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:16: stop
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_monitor_30000 (call=13, status=1, cib-update=0,
confirmed=true) Cancelled
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 68: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-101.bz2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -f inet addr delete
10.1.1.83/16 dev eth0
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -f inet addr delete
10.1.1.83/16 dev eth0
Mar 9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -o -f inet addr
show eth0
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -o -f inet addr show
eth0
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_stop_0 (call=16, rc=0, cib-update=140, confirmed=true) ok
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_stop_0 (5) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 2 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
68 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-warn-101.bz2): Complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 68 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: notify_crmd: Transition
68 status: done - <null>
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
Starting PEngine Recheck Timer
Mar 9 06:59:26 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) ARPING 10.1.1.83 from 10.1.1.83 eth0 Sent 5
probes (5 broadcast(s)) Received 0 response(s)
I also thought the expected behavior was to stop LDAP first, start the
IP, then restart LDAP. I think I'd rather have it behave that way. Any
clue as to what may be causing the problem?
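For what it's worth, the kind of check I ran against the init script was
roughly along these lines (assuming lsb:ldap maps to /etc/init.d/ldap,
which is where it lives on my boxes). Per the LSB appendix Andrew linked
below, a second "start" on an already-running service must still exit 0,
which is exactly the case that shows up as "Starting slapd: FAILED" in the
log above:

/etc/init.d/ldap start  ; echo "first start:  rc=$?"
/etc/init.d/ldap start  ; echo "second start: rc=$?"   # must be 0 per LSB; rc=1 here would match the failed LDAP:0_start_0
/etc/init.d/ldap status ; echo "status:       rc=$?"   # 0 = running, 3 = stopped
/etc/init.d/ldap stop   ; echo "first stop:   rc=$?"
/etc/init.d/ldap stop   ; echo "second stop:  rc=$?"   # must also be 0 when already stopped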
Here again is my config:
node genome-ldap1
node genome-ldap2
primitive LDAP lsb:ldap \
        op monitor interval="10s" timeout="15s" \
        meta target-role="Started"
primitive LDAP-IP ocf:heartbeat:IPaddr2 \
        params ip="10.1.1.83" nic="eth0" cidr_netmask="16" \
        op monitor interval="30s" timeout="20s" \
        meta target-role="Started"
clone LDAP-clone LDAP \
        meta clone-max="2" clone-node-max="1" globally-unique="false"
location LDAP-IP-placement-1 LDAP-IP 100: genome-ldap1
location LDAP-IP-placement-2 LDAP-IP 50: genome-ldap2
location LDAP-placement-1 LDAP-clone 100: genome-ldap1
location LDAP-placement-2 LDAP-clone 100: genome-ldap2
colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
order LDAP-after-IP inf: LDAP-IP LDAP-clone
property $id="cib-bootstrap-options" \
        dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        symmetric-cluster="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1268087338"
[root@genome-ldap2 ~]#
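One other data point from the log: once that start fails,
fail-count-LDAP:0 jumps to INFINITY and LDAP-clone gets forced away from
genome-ldap2, so between test runs the fail count has to be cleared before
the clone will run there again. What I've been doing between attempts is
roughly this (using the crm shell that ships with pacemaker 1.0):

# one-shot status including fail counts
crm_mon -1 -f
# clear the failed start so LDAP-clone is allowed back on genome-ldap2
crm resource cleanup LDAP-clone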
Thanks!
-erich
Andrew Beekhof wrote:
> On Tue, Mar 9, 2010 at 12:27 AM, Erich Weiler <weiler at soe.ucsc.edu> wrote:
>> I think I may have found an answer. I had this in my config:
>>
>> order LDAP-after-IP inf: LDAP-IP LDAP-clone
>>
>> And, according to the logs, it *looks* like what happens when genome-ldap1
>> goes down is that the IP goes over to genome-ldap2, AND THEN it tries to start LDAP
>> there, even though LDAP is already started there because it is an anonymous
>> clone. LDAP cannot start (because it is already started) and throws an
>> error exit code, and presumably pacemaker freaks out because of that and
>> shuts down LDAP on all nodes. Then the floating IP disappears because of
>> the line:
>>
>> colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
>>
>> which is expected at that point. It seems that when I tested this with
>> older versions of pacemaker, this didn't happen. Should 'order' statements
>> be avoided entirely when dealing with anonymous clones? Is that behavior
>> expected?
>
> The ordering constraint should have caused the cluster to stop LDAP first.
> Have you checked both scripts are fully LSB compliant?
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker