[Pacemaker] Stuck in a STONITH cycle

David Parker dparker at utica.edu
Fri Oct 12 16:58:11 EDT 2012


Hello,

I have two nodes set up in a cluster to provide a MySQL server (mysqld) 
in HA on a virtual IP address.  This was working fine until I had to 
reboot the servers.  All I did was change the interface each node uses 
for its primary IP address (changed from eth1 to eth0 on each node).  
Now I'm stuck in a cycle.  Let's say node 1 has the virtual IP and is 
running mysqld, and node 2 is down.  When node 2 boots up, it STONITHs 
node 1 for no apparent reason and takes over the resources.  When node 1 
boots back up, it STONITHs node 2 and takes the resources back, and so 
on.  Neither of these takeovers should be happening.
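
For context, heartbeat itself is pointed at an interface by name in 
/etc/ha.d/ha.cf.  Mine is the standard sort of thing; roughly this, 
though I'm reconstructing it from memory rather than pasting the exact 
file, and the eth1 -> eth0 comment is my assumption about what changed 
there:

root at ha1:~# cat /etc/ha.d/ha.cf
autojoin none
bcast eth0        # heartbeat link (assumption: this said eth1 before the change)
node ha1
node ha2
crm respawn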

Obviously, the goal is to have both nodes running, and whichever one 
boots up second should see that the first one has the resources and then 
stand by.  I'll include the relevant log entries and my crm config at 
the end of this message.  In the logs, the only thing I see that seems 
unusual (other than the fact that a STONITH is executed against a 
running node) is this:

Oct 12 16:27:22 ha1 crmd: [1176]: info: populate_cib_nodes_ha: 
Requesting the list of configured nodes
Oct 12 16:27:23 ha1 crmd: [1176]: WARN: get_uuid: Could not calculate 
UUID for ha2
Oct 12 16:27:23 ha1 crmd: [1176]: WARN: populate_cib_nodes_ha: Node ha2: 
no uuid found
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_state_transition: All 1 
cluster nodes are eligible to run resources.

The exact opposite shows up on node "ha2" (it says ha1 has no UUID).  
Did the UUID of each node change because the physical interface 
changed?  Any other ideas?  Thanks in advance.
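
In case it's useful: as far as I know, heartbeat generates each node's 
UUID once and caches it on disk rather than deriving it from an 
interface, so the cached value on each node should still match the node 
ids in the <nodes> section of the CIB below (1b48f410-... for ha1, 
9790fe6e-... for ha2).  Assuming the cache is the small binary file 
/var/lib/heartbeat/hb_uuid (that location is an assumption on my part), 
something like this should dump it for comparison:

root at ha1:~# od -An -tx1 /var/lib/heartbeat/hb_uuid    # raw UUID bytes to compare against the CIB node id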

     - Dave

Here are the log lines:

Oct 12 16:26:13 ha1 heartbeat: [1176]: info: Starting 
"/usr/lib/heartbeat/crmd" as uid 103  gid 105 (pid 1176)
Oct 12 16:26:13 ha1 crmd: [1176]: info: crm_log_init: Changed active 
directory to /var/lib/heartbeat/cores/hacluster
Oct 12 16:26:13 ha1 crmd: [1176]: info: Invoked: /usr/lib/heartbeat/crmd
Oct 12 16:26:13 ha1 crmd: [1176]: info: main: CRM Hg Version: 
74392a28b7f31d7ddc86689598bd23114f58978b
Oct 12 16:26:13 ha1 crmd: [1176]: info: crmd_init: Starting crmd
Oct 12 16:26:13 ha1 crmd: [1176]: info: G_main_add_SignalHandler: Added 
signal handler for signal 17
Oct 12 16:26:14 ha1 crmd: [1176]: info: do_cib_control: CIB connection 
established
Oct 12 16:26:14 ha1 crmd: [1176]: info: crm_cluster_connect: Connecting 
to Heartbeat
Oct 12 16:26:15 ha1 crmd: [1176]: info: register_heartbeat_conn: 
Hostname: ha1
Oct 12 16:26:15 ha1 crmd: [1176]: info: register_heartbeat_conn: UUID: 
1b48f410-44d1-4e89-8b52-ff23b32db1bc
Oct 12 16:26:15 ha1 crmd: [1176]: info: do_ha_control: Connected to the 
cluster
Oct 12 16:26:15 ha1 crmd: [1176]: info: do_ccm_control: CCM connection 
established... waiting for first callback
Oct 12 16:26:15 ha1 crmd: [1176]: info: do_started: Delaying start, CCM 
(0000000000100000) not connected
Oct 12 16:26:15 ha1 crmd: [1176]: info: crmd_init: Starting crmd's mainloop
Oct 12 16:26:15 ha1 crmd: [1176]: info: config_query_callback: Checking 
for expired actions every 900000ms
Oct 12 16:26:15 ha1 crmd: [1176]: notice: crmd_client_status_callback: 
Status update: Client ha1/crmd now has status [online] (DC=false)
Oct 12 16:26:15 ha1 crmd: [1176]: info: crm_new_peer: Node 0 is now 
known as ha1
Oct 12 16:26:15 ha1 crmd: [1176]: info: crm_update_peer_proc: ha1.crmd 
is now online
Oct 12 16:26:15 ha1 crmd: [1176]: info: crmd_client_status_callback: Not 
the DC
Oct 12 16:26:15 ha1 crmd: [1176]: notice: crmd_client_status_callback: 
Status update: Client ha1/crmd now has status [online] (DC=false)
Oct 12 16:26:16 ha1 crmd: [1176]: info: crmd_client_status_callback: Not 
the DC
Oct 12 16:26:16 ha1 crmd: [1176]: info: mem_handle_event: Got an event 
OC_EV_MS_NEW_MEMBERSHIP from ccm
Oct 12 16:26:16 ha1 crmd: [1176]: info: mem_handle_event: instance=1, 
nodes=1, new=1, lost=0, n_idx=0, new_idx=0, old_idx=3
Oct 12 16:26:16 ha1 crmd: [1176]: info: crmd_ccm_msg_callback: Quorum 
(re)attained after event=NEW MEMBERSHIP (id=1)
Oct 12 16:26:16 ha1 crmd: [1176]: info: ccm_event_detail: NEW 
MEMBERSHIP: trans=1, nodes=1, new=1, lost=0 n_idx=0, new_idx=0, old_idx=3
Oct 12 16:26:16 ha1 crmd: [1176]: info: ccm_event_detail: #011CURRENT: 
ha1 [nodeid=0, born=1]
Oct 12 16:26:16 ha1 crmd: [1176]: info: ccm_event_detail: #011NEW:     
ha1 [nodeid=0, born=1]
Oct 12 16:26:16 ha1 crmd: [1176]: info: crm_update_peer: Node ha1: id=0 
state=member (new) addr=(null) votes=-1 born=1 seen=1 
proc=00000000000000000000000000000200
Oct 12 16:26:16 ha1 crmd: [1176]: info: crm_update_peer_proc: ha1.ais is 
now online
Oct 12 16:26:16 ha1 crmd: [1176]: info: do_started: The local CRM is 
operational
Oct 12 16:26:16 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_STARTING -> S_PENDING [ input=I_PENDING 
cause=C_FSA_INTERNAL origin=do_started ]
Oct 12 16:27:17 ha1 crmd: [1176]: info: crm_timer_popped: Election 
Trigger (I_DC_TIMEOUT) just popped!
Oct 12 16:27:17 ha1 crmd: [1176]: WARN: do_log: FSA: Input I_DC_TIMEOUT 
from crm_timer_popped() received in state S_PENDING
Oct 12 16:27:17 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT 
cause=C_TIMER_POPPED origin=crm_timer_popped ]
Oct 12 16:27:17 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC 
cause=C_FSA_INTERNAL origin=do_election_check ]
Oct 12 16:27:17 ha1 crmd: [1176]: info: do_te_control: Registering TE 
UUID: 4f352feb-6be6-4af1-bb9e-af8b7ca28f58
Oct 12 16:27:17 ha1 crmd: [1176]: WARN: cib_client_add_notify_callback: 
Callback already present
Oct 12 16:27:17 ha1 crmd: [1176]: info: set_graph_functions: Setting 
custom graph functions
Oct 12 16:27:17 ha1 crmd: [1176]: info: unpack_graph: Unpacked 
transition -1: 0 actions in 0 synapses
Oct 12 16:27:17 ha1 crmd: [1176]: info: start_subsystem: Starting 
sub-system "pengine"
Oct 12 16:27:20 ha1 crmd: [1176]: info: do_dc_takeover: Taking over DC 
status for this partition
Oct 12 16:27:20 ha1 crmd: [1176]: info: join_make_offer: Making join 
offers based on membership 1
Oct 12 16:27:20 ha1 crmd: [1176]: info: do_dc_join_offer_all: join-1: 
Waiting on 1 outstanding join acks
Oct 12 16:27:20 ha1 crmd: [1176]: info: te_connect_stonith: Attempting 
connection to fencing daemon...
Oct 12 16:27:21 ha1 crmd: [1176]: info: te_connect_stonith: Connected
Oct 12 16:27:21 ha1 crmd: [1176]: info: config_query_callback: Checking 
for expired actions every 900000ms
Oct 12 16:27:21 ha1 crmd: [1176]: info: update_dc: Set DC to ha1 (3.0.1)
Oct 12 16:27:21 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED 
cause=C_FSA_INTERNAL origin=check_join_state ]
Oct 12 16:27:21 ha1 crmd: [1176]: info: do_state_transition: All 1 
cluster nodes responded to the join offer.
Oct 12 16:27:21 ha1 crmd: [1176]: info: do_dc_join_finalize: join-1: 
Syncing the CIB from ha1 to the rest of the cluster
Oct 12 16:27:22 ha1 crmd: [1176]: info: update_attrd: Connecting to attrd...
Oct 12 16:27:22 ha1 crmd: [1176]: info: erase_xpath_callback: Deletion 
of "//node_state[@uname='ha1']/transient_attributes": ok (rc=0)
Oct 12 16:27:22 ha1 crmd: [1176]: info: do_dc_join_ack: join-1: Updating 
node state to member for ha1
Oct 12 16:27:22 ha1 crmd: [1176]: info: erase_xpath_callback: Deletion 
of "//node_state[@uname='ha1']/lrm": ok (rc=0)
Oct 12 16:27:22 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED 
cause=C_FSA_INTERNAL origin=check_join_state ]
Oct 12 16:27:22 ha1 crmd: [1176]: info: populate_cib_nodes_ha: 
Requesting the list of configured nodes
Oct 12 16:27:23 ha1 crmd: [1176]: WARN: get_uuid: Could not calculate 
UUID for ha2
Oct 12 16:27:23 ha1 crmd: [1176]: WARN: populate_cib_nodes_ha: Node ha2: 
no uuid found
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_state_transition: All 1 
cluster nodes are eligible to run resources.
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_dc_join_final: Ensuring DC, 
quorum and node attributes are up-to-date
Oct 12 16:27:23 ha1 crmd: [1176]: info: crm_update_quorum: Updating 
quorum status to true (call=21)
Oct 12 16:27:23 ha1 crmd: [1176]: info: abort_transition_graph: 
do_te_invoke:191 - Triggered transition abort (complete=1) : Peer Cancelled
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_pe_invoke: Query 22: 
Requesting the current CIB: S_POLICY_ENGINE
Oct 12 16:27:23 ha1 crmd: [1176]: info: abort_transition_graph: 
need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Oct 12 16:27:23 ha1 crmd: [1176]: info: need_abort: Aborting on change 
to admin_epoch
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_pe_invoke: Query 23: 
Requesting the current CIB: S_POLICY_ENGINE
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_pe_invoke_callback: Invoking 
the PE: query=23, ref=pe_calc-dc-1350073643-7, seq=1, quorate=1
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 12 16:27:23 ha1 crmd: [1176]: info: unpack_graph: Unpacked 
transition 0: 20 actions in 20 synapses
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_te_invoke: Processing graph 0 
(ref=pe_calc-dc-1350073643-7) derived from /var/lib/pengine/pe-warn-1370.bz2
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 4: monitor stonith-ha1_monitor_0 on ha1 (local)
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=4:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=stonith-ha1_monitor_0 )
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 5: monitor stonith-ha2_monitor_0 on ha1 (local)
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=5:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=stonith-ha2_monitor_0 )
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 6: monitor MysqlIP_monitor_0 on ha1 (local)
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=6:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=MysqlIP_monitor_0 )
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 7: monitor mysqld_monitor_0 on ha1 (local)
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=7:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=mysqld_monitor_0 )
Oct 12 16:27:23 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
stonith-ha1_monitor_0 (call=2, rc=7, cib-update=24, confirmed=true) not 
running
Oct 12 16:27:23 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
stonith-ha2_monitor_0 (call=3, rc=7, cib-update=25, confirmed=true) not 
running
Oct 12 16:27:23 ha1 crmd: [1176]: info: match_graph_event: Action 
stonith-ha1_monitor_0 (4) confirmed on ha1 (rc=0)
Oct 12 16:27:23 ha1 crmd: [1176]: info: match_graph_event: Action 
stonith-ha2_monitor_0 (5) confirmed on ha1 (rc=0)
Oct 12 16:27:23 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
mysqld_monitor_0 (call=5, rc=7, cib-update=26, confirmed=true) not running
Oct 12 16:27:23 ha1 crmd: [1176]: info: match_graph_event: Action 
mysqld_monitor_0 (7) confirmed on ha1 (rc=0)
Oct 12 16:27:23 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
MysqlIP_monitor_0 (call=4, rc=7, cib-update=27, confirmed=true) not running
Oct 12 16:27:23 ha1 crmd: [1176]: info: match_graph_event: Action 
MysqlIP_monitor_0 (6) confirmed on ha1 (rc=0)
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on ha1 (local) - no waiting
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
2 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 8: start stonith-ha2_start_0 on ha1 (local)
Oct 12 16:27:23 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=8:0:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=stonith-ha2_start_0 )
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
14 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
20 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
19 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
15 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
12 fired and confirmed
Oct 12 16:27:23 ha1 crmd: [1176]: info: abort_transition_graph: 
te_update_diff:146 - Triggered transition abort (complete=0, 
tag=transient_attributes, id=1b48f410-44d1-4e89-8b52-ff23b32db1bc, 
magic=NA, cib=0.61.6) : Transient attribute: update
Oct 12 16:27:23 ha1 crmd: [1176]: info: update_abort_priority: Abort 
priority upgraded from 0 to 1000000
Oct 12 16:27:23 ha1 crmd: [1176]: info: update_abort_priority: Abort 
action done superceeded by restart
Oct 12 16:27:34 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
stonith-ha2_start_0 (call=6, rc=0, cib-update=28, confirmed=true) ok
Oct 12 16:27:34 ha1 crmd: [1176]: info: match_graph_event: Action 
stonith-ha2_start_0 (8) confirmed on ha1 (rc=0)
Oct 12 16:27:34 ha1 crmd: [1176]: info: run_graph: 
====================================================
Oct 12 16:27:34 ha1 crmd: [1176]: notice: run_graph: Transition 0 
(Complete=12, Pending=0, Fired=0, Skipped=8, Incomplete=0, 
Source=/var/lib/pengine/pe-warn-1370.bz2): Stopped
Oct 12 16:27:34 ha1 crmd: [1176]: info: te_graph_trigger: Transition 0 
is now complete
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_state_transition: All 1 
cluster nodes are eligible to run resources.
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_pe_invoke: Query 29: 
Requesting the current CIB: S_POLICY_ENGINE
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_pe_invoke_callback: Invoking 
the PE: query=29, ref=pe_calc-dc-1350073654-14, seq=1, quorate=1
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 12 16:27:34 ha1 crmd: [1176]: info: unpack_graph: Unpacked 
transition 1: 9 actions in 9 synapses
Oct 12 16:27:34 ha1 crmd: [1176]: info: do_te_invoke: Processing graph 1 
(ref=pe_calc-dc-1350073654-14) derived from 
/var/lib/pengine/pe-warn-1371.bz2
Oct 12 16:27:34 ha1 stonithd: [1174]: info: client tengine [pid: 1176] 
requests a STONITH operation RESET on node ha2
Oct 12 16:27:34 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
9 fired and confirmed
Oct 12 16:27:34 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
13 fired and confirmed
Oct 12 16:27:34 ha1 crmd: [1176]: info: te_fence_node: Executing reboot 
fencing operation (15) on ha2 (timeout=60000)
Oct 12 16:28:08 ha1 crmd: [1176]: info: tengine_stonith_callback: 
call=1266, optype=1, node_name=ha2, result=0, node_list=ha1, 
action=15:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
1 fired and confirmed
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
14 fired and confirmed
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 6: start MysqlIP_start_0 on ha1 (local)
Oct 12 16:28:08 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=6:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=MysqlIP_start_0 )
Oct 12 16:28:08 ha1 crmd: [1176]: info: erase_xpath_callback: Deletion 
of "//node_state[@uname='ha2']/lrm": ok (rc=0)
Oct 12 16:28:08 ha1 crmd: [1176]: info: erase_xpath_callback: Deletion 
of "//node_state[@uname='ha2']/transient_attributes": ok (rc=0)
Oct 12 16:28:08 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
MysqlIP_start_0 (call=7, rc=0, cib-update=33, confirmed=true) ok
Oct 12 16:28:08 ha1 crmd: [1176]: info: match_graph_event: Action 
MysqlIP_start_0 (6) confirmed on ha1 (rc=0)
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 7: monitor MysqlIP_monitor_10000 on ha1 (local)
Oct 12 16:28:08 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=7:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=MysqlIP_monitor_10000 )
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_rsc_command: Initiating 
action 8: start mysqld_start_0 on ha1 (local)
Oct 12 16:28:08 ha1 crmd: [1176]: info: do_lrm_rsc_op: Performing 
key=8:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58 op=mysqld_start_0 )
Oct 12 16:28:08 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
mysqld_start_0 (call=9, rc=0, cib-update=34, confirmed=true) ok
Oct 12 16:28:08 ha1 crmd: [1176]: info: match_graph_event: Action 
mysqld_start_0 (8) confirmed on ha1 (rc=0)
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_pseudo_action: Pseudo action 
10 fired and confirmed
Oct 12 16:28:08 ha1 crmd: [1176]: info: process_lrm_event: LRM operation 
MysqlIP_monitor_10000 (call=8, rc=0, cib-update=35, confirmed=false) ok
Oct 12 16:28:08 ha1 crmd: [1176]: info: match_graph_event: Action 
MysqlIP_monitor_10000 (7) confirmed on ha1 (rc=0)
Oct 12 16:28:08 ha1 crmd: [1176]: info: run_graph: 
====================================================
Oct 12 16:28:08 ha1 crmd: [1176]: notice: run_graph: Transition 1 
(Complete=9, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-warn-1371.bz2): Complete
Oct 12 16:28:08 ha1 crmd: [1176]: info: te_graph_trigger: Transition 1 
is now complete
Oct 12 16:28:08 ha1 crmd: [1176]: info: notify_crmd: Transition 1 
status: done - <null>
Oct 12 16:28:08 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 12 16:28:08 ha1 crmd: [1176]: info: do_state_transition: Starting 
PEngine Recheck Timer
Oct 12 16:43:08 ha1 crmd: [1176]: info: crm_timer_popped: PEngine 
Recheck Timer (I_PE_CALC) just popped!
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_TIMER_POPPED origin=crm_timer_popped ]
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: Progressed 
to state S_POLICY_ENGINE after C_TIMER_POPPED
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: All 1 
cluster nodes are eligible to run resources.
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_pe_invoke: Query 36: 
Requesting the current CIB: S_POLICY_ENGINE
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_pe_invoke_callback: Invoking 
the PE: query=36, ref=pe_calc-dc-1350074588-18, seq=1, quorate=1
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 12 16:43:08 ha1 crmd: [1176]: info: unpack_graph: Unpacked 
transition 2: 0 actions in 0 synapses
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_te_invoke: Processing graph 2 
(ref=pe_calc-dc-1350074588-18) derived from 
/var/lib/pengine/pe-input-4140.bz2
Oct 12 16:43:08 ha1 crmd: [1176]: info: run_graph: 
====================================================
Oct 12 16:43:08 ha1 crmd: [1176]: notice: run_graph: Transition 2 
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-input-4140.bz2): Complete
Oct 12 16:43:08 ha1 crmd: [1176]: info: te_graph_trigger: Transition 2 
is now complete
Oct 12 16:43:08 ha1 crmd: [1176]: info: notify_crmd: Transition 2 
status: done - <null>
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 12 16:43:08 ha1 crmd: [1176]: info: do_state_transition: Starting 
PEngine Recheck Timer

Here's my config:

root at ha1:~# cibadmin --query
<cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" 
have-quorum="1" admin_epoch="0" epoch="61" num_updates="11" 
cib-last-written="Fri Oct 12 16:07:33 2012" 
dc-uuid="1b48f410-44d1-4e89-8b52-ff23b32db1bc">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version" 
value="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure" 
name="cluster-infrastructure" value="Heartbeat"/>
<nvpair id="cib-bootstrap-options-stonith-enabled" 
name="stonith-enabled" value="true"/>
</cluster_property_set>
</crm_config>
<nodes>
<node id="1b48f410-44d1-4e89-8b52-ff23b32db1bc" uname="ha1" type="normal"/>
<node id="9790fe6e-67b2-4817-abf4-966b5aa6948c" uname="ha2" type="normal"/>
</nodes>
<resources>
<primitive class="stonith" id="stonith-ha1" type="external/riloe">
<instance_attributes id="stonith-ha1-instance_attributes">
<nvpair id="stonith-ha1-instance_attributes-hostlist" name="hostlist" 
value="ha1"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_hostname" 
name="ilo_hostname" value="10.0.1.111"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_user" name="ilo_user" 
value="Administrator"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_password" 
name="ilo_password" value="xxx"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_can_reset" 
name="ilo_can_reset" value="1"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_protocol" 
name="ilo_protocol" value="2.0"/>
<nvpair id="stonith-ha1-instance_attributes-ilo_powerdown_method" 
name="ilo_powerdown_method" value="button"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="stonith-ha2" type="external/riloe">
<instance_attributes id="stonith-ha2-instance_attributes">
<nvpair id="stonith-ha2-instance_attributes-hostlist" name="hostlist" 
value="ha2"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_hostname" 
name="ilo_hostname" value="10.0.1.112"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_user" name="ilo_user" 
value="Administrator"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_password" 
name="ilo_password" value="xxx"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_can_reset" 
name="ilo_can_reset" value="1"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_protocol" 
name="ilo_protocol" value="2.0"/>
<nvpair id="stonith-ha2-instance_attributes-ilo_powerdown_method" 
name="ilo_powerdown_method" value="button"/>
</instance_attributes>
</primitive>
<group id="mysql-resources">
<primitive class="ocf" id="MysqlIP" provider="heartbeat" type="IPaddr2">
<instance_attributes id="MysqlIP-instance_attributes">
<nvpair id="MysqlIP-instance_attributes-ip" name="ip" value="192.168.25.9"/>
<nvpair id="MysqlIP-instance_attributes-cidr_netmask" 
name="cidr_netmask" value="32"/>
</instance_attributes>
<operations>
<op id="MysqlIP-monitor-10s" interval="10s" name="monitor"/>
</operations>
</primitive>
<primitive id="mysqld" class="lsb" type="mysqld"/>
</group>
</resources>
<constraints>
<rsc_location id="loc-3" rsc="stonith-ha1" node="ha1" score="-INFINITY"/>
<rsc_location id="loc-4" rsc="stonith-ha2" node="ha2" score="-INFINITY"/>
<rsc_location id="loc-5" rsc="mysql-resources" node="ha1" score="100"/>
</constraints>
<rsc_defaults/>
<op_defaults/>
</configuration>
<status>
<node_state id="1b48f410-44d1-4e89-8b52-ff23b32db1bc" uname="ha1" 
ha="active" in_ccm="true" crmd="online" join="member" expected="member" 
crm-debug-origin="do_update_resource" shutdown="0">
<lrm id="1b48f410-44d1-4e89-8b52-ff23b32db1bc">
<lrm_resources>
<lrm_resource id="stonith-ha1" type="external/riloe" class="stonith">
<lrm_rsc_op id="stonith-ha1_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="4:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:7;4:0:7:4f352feb-6be6-4
af1-bb9e-af8b7ca28f58" call-id="2" rc-code="7" op-status="0" 
interval="0" last-run="1350073642" last-rc-change="1350073642" 
exec-time="10" queue-time="0" op-digest="f7c04bf4ee3eb60cf4f9e841449a0fe6"/>
</lrm_resource>
<lrm_resource id="stonith-ha2" type="external/riloe" class="stonith">
<lrm_rsc_op id="stonith-ha2_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="5:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:7;5:0:7:4f352feb-6be6-4
af1-bb9e-af8b7ca28f58" call-id="3" rc-code="7" op-status="0" 
interval="0" last-run="1350073643" last-rc-change="1350073643" 
exec-time="0" queue-time="0" op-digest="c8773e5caa7ae2a75f71a79adc582d54"/>
<lrm_rsc_op id="stonith-ha2_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="8:0:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:0;8:0:0:4f352feb-6be6-4af1-
bb9e-af8b7ca28f58" call-id="6" rc-code="0" op-status="0" interval="0" 
last-run="1350073643" last-rc-change="1350073643" exec-time="10920" 
queue-time="0" op-digest="c8773e5caa7ae2a75f71a79adc582d54"/>
</lrm_resource>
<lrm_resource id="mysqld" type="mysqld" class="lsb">
<lrm_rsc_op id="mysqld_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="7:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:7;7:0:7:4f352feb-6be6-4af1-b
b9e-af8b7ca28f58" call-id="5" rc-code="7" op-status="0" interval="0" 
last-run="1350073642" last-rc-change="1350073642" exec-time="40" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
<lrm_rsc_op id="mysqld_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="8:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:0;8:1:0:4f352feb-6be6-4af1-bb9e-
af8b7ca28f58" call-id="9" rc-code="0" op-status="0" interval="0" 
last-run="1350073688" last-rc-change="1350073688" exec-time="0" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
</lrm_resource>
<lrm_resource id="MysqlIP" type="IPaddr2" class="ocf" provider="heartbeat">
<lrm_rsc_op id="MysqlIP_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="6:0:7:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:7;6:0:7:4f352feb-6be6-4af1-
bb9e-af8b7ca28f58" call-id="4" rc-code="7" op-status="0" interval="0" 
last-run="1350073642" last-rc-change="1350073642" exec-time="150" 
queue-time="0" op-digest="9611b7026c2dc135fbd13d3537b42d16"/>
<lrm_rsc_op id="MysqlIP_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="6:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:0;6:1:0:4f352feb-6be6-4af1-bb9e
-af8b7ca28f58" call-id="7" rc-code="0" op-status="0" interval="0" 
last-run="1350073687" last-rc-change="1350073687" exec-time="30" 
queue-time="0" op-digest="9611b7026c2dc135fbd13d3537b42d16"/>
<lrm_rsc_op id="MysqlIP_monitor_10000" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.1" 
transition-key="7:1:0:4f352feb-6be6-4af1-bb9e-af8b7ca28f58" 
transition-magic="0:0;7:1:0:4f352feb-6be6-4
af1-bb9e-af8b7ca28f58" call-id="8" rc-code="0" op-status="0" 
interval="10000" last-run="1350073687" last-rc-change="1350073687" 
exec-time="10" queue-time="0" op-digest="caf3454e2233540200e7a6a5319c5362"/>
</lrm_resource>
</lrm_resources>
</lrm>
<transient_attributes id="1b48f410-44d1-4e89-8b52-ff23b32db1bc">
<instance_attributes id="status-1b48f410-44d1-4e89-8b52-ff23b32db1bc">
<nvpair id="status-1b48f410-44d1-4e89-8b52-ff23b32db1bc-probe_complete" 
name="probe_complete" value="true"/>
</instance_attributes>
</transient_attributes>
</node_state>
<node_state id="9790fe6e-67b2-4817-abf4-966b5aa6948c" uname="ha2" 
ha="dead" in_ccm="false" crmd="offline" join="down" expected="down" 
crm-debug-origin="send_stonith_update"/>
</status>
</cib>

-- 

Dave Parker
Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177




