[ClusterLabs] GlusterFS all apache instances stopped after failover

lukas lukas.kostyan at gmail.com
Thu May 7 13:23:50 CEST 2015


Hi,

I have a two-node configuration in which a GlusterFS volume is mounted as the
Apache document root. After producing a kernel panic on node 2 with
echo c > /proc/sysrq-trigger, node 2 is turned off by STONITH as expected.
Ping should continue to work on node 1, but it stops because Apache is stopped
on node 1 as well. Why is Apache also stopped on node 1? Does anyone have an
idea? The status, configuration and logs are below.

Setup: Pacemaker 1.1.7, Corosync 1.4
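
For clarity, the test sequence looks roughly like this (a sketch; only the
sysrq trigger is literally what was run, the check commands on node 1 are
illustrative):

    # on vm-2: force a kernel panic so the node drops out of the cluster
    echo c > /proc/sysrq-trigger

    # on vm-1: vm-2 should be fenced via external/libvirt, while the
    # cluster IP and apache are expected to keep running locally
    crm_mon -1                        # vm-2 goes OFFLINE, resources stay on vm-1
    ping -c 3 192.168.122.200         # the cluster IP still answers
    wget -qO- http://192.168.122.200/ # fails once apache is stopped on vm-1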



Online: [ vm-1 vm-2 ]

  Clone Set: cl_gluster_mnt [p_gluster_mnt]
      Started: [ vm-1 vm-2 ]
  Clone Set: cl_apache [p_apache]
      Started: [ vm-1 vm-2 ]
  Clone Set: cl_IP [IP] (unique)
      IP:0       (ocf::heartbeat:IPaddr2):       Started vm-1
      IP:1       (ocf::heartbeat:IPaddr2):       Started vm-2
p_fence_N1      (stonith:external/libvirt):     Started vm-2
p_fence_N2      (stonith:external/libvirt):     Started vm-1

root@vm-1:~# crm configure show
node vm-1 \
     attributes standby="off"
node vm-2 \
     attributes standby="off"
primitive IP ocf:heartbeat:IPaddr2 \
     params ip="192.168.122.200" nic="eth0" clusterip_hash="sourceip-sourceport" \
     op monitor interval="10s"
primitive p_apache ocf:heartbeat:apache \
     params configfile="/etc/apache2/apache2.conf" statusurl="http://localhost/server-status" \
     op monitor interval="60" timeout="20" \
     op start interval="0" timeout="40s" start-delay="0" \
     meta is-managed="true"
primitive p_fence_N1 stonith:external/libvirt \
     params hostlist="vm-1:N1" hypervisor_uri="qemu+tcp://192.168.122.1/system" pcmk_reboot_action="off" \
     op monitor interval="60" \
     meta target-role="Started"
primitive p_fence_N2 stonith:external/libvirt \
     params hostlist="vm-2:N2" hypervisor_uri="qemu+tcp://192.168.122.1/system" pcmk_reboot_action="off" \
     op monitor interval="60"
primitive p_gluster_mnt ocf:heartbeat:Filesystem \
     params device="localhost:/gvolrep" directory="/var/www/html" fstype="glusterfs" \
     op monitor interval="10"
clone cl_IP IP \
     meta globally-unique="true" clone-max="2" clone-node-max="2" \
     params resource-stickiness="0"
clone cl_apache p_apache \
     meta target-role="Started"
clone cl_gluster_mnt p_gluster_mnt \
     meta target-role="Started"
location l_fence_N1 p_fence_N1 -inf: vm-1
location l_fence_N2 p_fence_N2 -inf: vm-2
colocation c_apache_gluster inf: cl_IP cl_gluster_mnt
colocation c_ip_apache inf: cl_apache cl_IP
order o_apache inf: cl_gluster_mnt cl_IP cl_apache
property $id="cib-bootstrap-options" \
     dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
     cluster-infrastructure="openais" \
     expected-quorum-votes="2" \
     stonith-enabled="true" \
     no-quorum-policy="ignore" \
     last-lrm-refresh="1430996556"
rsc_defaults $id="rsc-options" \
     resource-stickiness="100"
op_defaults $id="op-options" \
     timeout="240s"

############################
root@vm-1:~# tail -f /var/log/syslog
May  7 13:19:58 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 27: monitor p_apache:1_monitor_60000 on vm-2
May  7 13:19:58 vm-1 lrmd: [3036]: info: operation monitor[54] on 
p_apache:0 for client 3039: pid 27310 exited with return code 0
May  7 13:19:58 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation p_apache:0_monitor_60000 (call=54, rc=0, cib-update=322, 
confirmed=false) ok
May  7 13:19:58 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 30 
(Complete=34, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-input-1906.bz2): Complete
May  7 13:19:58 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
May  7 13:20:12 vm-1 stonith-ng: [3035]: info: stonith_command: 
Processed st_execute from lrmd: rc=-1
May  7 13:20:12 vm-1 external/libvirt[27429]: [27444]: notice: 
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May  7 13:20:13 vm-1 stonith: [27422]: info: external/libvirt device OK.
May  7 13:20:13 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: Performing: stonith -t external/libvirt -S
May  7 13:20:13 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: success:  0
May  7 13:21:08 vm-1 corosync[2971]:   [TOTEM ] A processor failed, 
forming new configuration.
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] notice: 
pcmk_peer_update: Transitional membership event on ring 7024: memb=1, 
new=0, lost=1
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: pcmk_peer_update: 
memb: vm-1 2138745024
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: pcmk_peer_update: 
lost: vm-2 930785472
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] notice: 
pcmk_peer_update: Stable membership event on ring 7024: memb=1, new=0, 
lost=0
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: pcmk_peer_update: 
MEMB: vm-1 2138745024
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node vm-2 was not seen in the previous transition
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: update_member: 
Node 930785472/vm-2 is now: lost
May  7 13:21:12 vm-1 corosync[2971]:   [pcmk  ] info: 
send_member_notification: Sending membership update 7024 to 2 children
May  7 13:21:12 vm-1 cib: [3034]: notice: ais_dispatch_message: 
Membership 7024: quorum lost
May  7 13:21:12 vm-1 cib: [3034]: info: crm_update_peer: Node vm-2: 
id=930785472 state=lost (new) addr=r(0) ip(192.168.122.183)  votes=1 
born=7020 seen=7020 proc=00000000000000000000000000111312
May  7 13:21:12 vm-1 crmd: [3039]: notice: ais_dispatch_message: 
Membership 7024: quorum lost
May  7 13:21:12 vm-1 crmd: [3039]: info: ais_status_callback: status: 
vm-2 is now lost (was member)
May  7 13:21:12 vm-1 corosync[2971]:   [TOTEM ] A processor joined or 
left the membership and a new membership was formed.
May  7 13:21:12 vm-1 crmd: [3039]: info: crm_update_peer: Node vm-2: 
id=930785472 state=lost (new) addr=r(0) ip(192.168.122.183)  votes=1 
born=7020 seen=7020 proc=00000000000000000000000000111312
May  7 13:21:12 vm-1 corosync[2971]:   [CPG   ] chosen downlist: sender 
r(0) ip(192.168.122.127) ; members(old:2 left:1)
May  7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/323, 
version=0.502.55): ok (rc=0)
May  7 13:21:12 vm-1 corosync[2971]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
May  7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section cib (origin=local/crmd/325, 
version=0.502.57): ok (rc=0)
May  7 13:21:12 vm-1 crmd: [3039]: info: crmd_ais_dispatch: Setting 
expected votes to 2
May  7 13:21:12 vm-1 crmd: [3039]: WARN: match_down_event: No match for 
shutdown action on vm-2
May  7 13:21:12 vm-1 crmd: [3039]: info: te_update_diff: 
Stonith/shutdown of vm-2 not matched
May  7 13:21:12 vm-1 crmd: [3039]: info: abort_transition_graph: 
te_update_diff:234 - Triggered transition abort (complete=1, 
tag=node_state, id=vm-2, magic=NA, cib=0.502.56) : Node failure
May  7 13:21:12 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
May  7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section crm_config (origin=local/crmd/327, 
version=0.502.58): ok (rc=0)
May  7 13:21:12 vm-1 pengine: [3038]: notice: unpack_config: On loss of 
CCM Quorum: Ignore
May  7 13:21:12 vm-1 pengine: [3038]: WARN: pe_fence_node: Node vm-2 
will be fenced because it is un-expectedly down
May  7 13:21:12 vm-1 pengine: [3038]: WARN: determine_online_status: 
Node vm-2 is unclean
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action 
p_gluster_mnt:1_stop_0 on vm-2 is unrunnable (offline)
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node 
vm-2 unclean
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action 
p_apache:1_stop_0 on vm-2 is unrunnable (offline)
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node 
vm-2 unclean
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action 
IP:1_stop_0 on vm-2 is unrunnable (offline)
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node 
vm-2 unclean
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action 
p_fence_N1_stop_0 on vm-2 is unrunnable (offline)
May  7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node 
vm-2 unclean
May  7 13:21:12 vm-1 pengine: [3038]: WARN: stage6: Scheduling Node vm-2 
for STONITH
May  7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop 
p_gluster_mnt:1#011(vm-2)
May  7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Restart 
p_apache:0#011(Started vm-1)
May  7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop 
p_apache:1#011(vm-2)
May  7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Move 
IP:1#011(Started vm-2 -> vm-1)
May  7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop 
p_fence_N1#011(vm-2)
May  7 13:21:12 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
May  7 13:21:12 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph 
31 (ref=pe_calc-dc-1430997672-228) derived from 
/var/lib/pengine/pe-warn-661.bz2
May  7 13:21:12 vm-1 crmd: [3039]: notice: te_fence_node: Executing 
reboot fencing operation (36) on vm-2 (timeout=60000)
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: 
initiate_remote_stonith_op: Initiating remote operation reboot for vm-2: 
9a25d1de-b42d-44b1-9d49-368f36fa57fd
May  7 13:21:12 vm-1 pengine: [3038]: WARN: process_pe_message: 
Transition 31: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-661.bz2
May  7 13:21:12 vm-1 pengine: [3038]: notice: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: 
can_fence_host_with_device: Refreshing port list for p_fence_N2
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: 
can_fence_host_with_device: p_fence_N2 can fence vm-2: dynamic-list
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: call_remote_stonith: 
Requesting that vm-1 perform op reboot vm-2
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: 
can_fence_host_with_device: p_fence_N2 can fence vm-2: dynamic-list
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: stonith_fence: Found 1 
matching devices for 'vm-2'
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: stonith_command: 
Processed st_fence from vm-1: rc=-1
May  7 13:21:12 vm-1 stonith-ng: [3035]: info: make_args: Substituting 
action 'off' for requested operation 'reboot'
May  7 13:21:12 vm-1 external/libvirt[27944]: [27957]: notice: Domain N2 
was stopped
May  7 13:21:13 vm-1 stonith-ng: [3035]: notice: log_operation: 
Operation 'reboot' [27936] (call 0 from 
c5982dd8-7639-49be-a9f9-c8488bea6091) for host 'vm-2' with device 
'p_fence_N2' returned: 0
May  7 13:21:13 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: Performing: stonith -t external/libvirt -T off vm-2
May  7 13:21:13 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: success: vm-2 0
May  7 13:21:13 vm-1 stonith-ng: [3035]: notice: remote_op_done: 
Operation reboot of vm-2 by vm-1 for 
vm-1[c5982dd8-7639-49be-a9f9-c8488bea6091]: OK
May  7 13:21:13 vm-1 crmd: [3039]: info: tengine_stonith_callback: 
StonithOp <st-reply st_origin="stonith_construct_async_reply" 
t="stonith-ng" st_op="reboot" 
st_remote_op="9a25d1de-b42d-44b1-9d49-368f36fa57fd" 
st_clientid="c5982dd8-7639-49be-a9f9-c8488bea6091" st_target="vm-2" 
st_device_action="st_fence" st_callid="0" st_callopt="0" st_rc="0" 
st_output="Performing: stonith -t external/libvirt -T off 
vm-2#012success: vm-2 0#012" src="vm-1" seq="16" state="2" />
May  7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting 
xpath: //node_state[@uname='vm-2']/lrm
May  7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting 
xpath: //node_state[@uname='vm-2']/transient_attributes
May  7 13:21:13 vm-1 crmd: [3039]: notice: crmd_peer_update: Status 
update: Client vm-2/crmd now has status [offline] (DC=true)
May  7 13:21:13 vm-1 crmd: [3039]: notice: tengine_stonith_notify: Peer 
vm-2 was terminated (reboot) by vm-1 for vm-1: OK 
(ref=9a25d1de-b42d-44b1-9d49-368f36fa57fd)
May  7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN 
cause=C_FSA_INTERNAL origin=check_join_state ]
May  7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph: 
do_te_invoke:169 - Triggered transition abort (complete=0) : Peer Halt
May  7 13:21:13 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 31 
(Complete=3, Pending=0, Fired=0, Skipped=15, Incomplete=5, 
Source=/var/lib/pengine/pe-warn-661.bz2): Stopped
May  7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph: 
do_te_invoke:169 - Triggered transition abort (complete=1) : Peer Halt
May  7 13:21:13 vm-1 crmd: [3039]: info: join_make_offer: Making join 
offers based on membership 7024
May  7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_offer_all: join-11: 
Waiting on 1 outstanding join acks
May  7 13:21:13 vm-1 crmd: [3039]: info: update_dc: Set DC to vm-1 (3.0.6)
May  7 13:21:13 vm-1 crmd: [3039]: info: cib_fencing_updated: Fencing 
update 329 for vm-2: complete
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_delete for section //node_state[@uname='vm-2']/lrm 
(origin=local/crmd/330, version=0.502.60): ok (rc=0)
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_delete for section 
//node_state[@uname='vm-2']/transient_attributes (origin=local/crmd/331, 
version=0.502.61): ok (rc=0)
May  7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED 
cause=C_FSA_INTERNAL origin=check_join_state ]
May  7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_finalize: join-11: 
Syncing the CIB from vm-1 to the rest of the cluster
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_sync for section 'all' (origin=local/crmd/334, 
version=0.502.62): ok (rc=0)
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/335, 
version=0.502.63): ok (rc=0)
May  7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_ack: join-11: 
Updating node state to member for vm-1
May  7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting 
xpath: //node_state[@uname='vm-1']/lrm
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_delete for section //node_state[@uname='vm-1']/lrm 
(origin=local/crmd/336, version=0.502.64): ok (rc=0)
May  7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED 
cause=C_FSA_INTERNAL origin=check_join_state ]
May  7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph: 
do_te_invoke:162 - Triggered transition abort (complete=1) : Peer Cancelled
May  7 13:21:13 vm-1 attrd: [3037]: notice: attrd_local_callback: 
Sending full refresh (origin=crmd)
May  7 13:21:13 vm-1 attrd: [3037]: notice: attrd_trigger_update: 
Sending flush op to all hosts for: last-failure-p_apache:0 (1430996422)
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/338, 
version=0.502.66): ok (rc=0)
May  7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation 
complete: op cib_modify for section cib (origin=local/crmd/340, 
version=0.502.68): ok (rc=0)
May  7 13:21:13 vm-1 attrd: [3037]: notice: attrd_trigger_update: 
Sending flush op to all hosts for: probe_complete (true)
May  7 13:21:13 vm-1 pengine: [3038]: notice: unpack_config: On loss of 
CCM Quorum: Ignore
May  7 13:21:13 vm-1 pengine: [3038]: notice: LogActions: Restart 
p_apache:0#011(Started vm-1)
May  7 13:21:13 vm-1 pengine: [3038]: notice: LogActions: Start 
IP:1#011(vm-1)
May  7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
May  7 13:21:13 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph 
32 (ref=pe_calc-dc-1430997673-233) derived from 
/var/lib/pengine/pe-input-1907.bz2
May  7 13:21:13 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 14: stop p_apache:0_stop_0 on vm-1 (local)
May  7 13:21:13 vm-1 lrmd: [3036]: info: cancel_op: operation 
monitor[54] on p_apache:0 for client 3039, its parameters: 
CRM_meta_start_delay=[0] CRM_meta_timeout=[20000] 
CRM_meta_name=[monitor] crm_feature_set=[3.0.6] 
CRM_meta_clone_node_max=[1] configfile=[/etc/apache2/apache2.conf] 
CRM_meta_clone=[0] CRM_meta_interval=[60000] CRM_meta_clone_max=[2] 
CRM_meta_notify=[false] statusurl=[http://localhost/server-status] 
CRM_meta_globally_unique=[false]  cancelled
May  7 13:21:13 vm-1 lrmd: [3036]: info: rsc:p_apache:0 stop[55] (pid 27992)
May  7 13:21:13 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 22: start IP:1_start_0 on vm-1 (local)
May  7 13:21:13 vm-1 lrmd: [3036]: info: rsc:IP:1 start[56] (pid 27993)
May  7 13:21:13 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation p_apache:0_monitor_60000 (call=54, status=1, cib-update=0, 
confirmed=true) Cancelled
May  7 13:21:13 vm-1 pengine: [3038]: notice: process_pe_message: 
Transition 32: PEngine Input stored in: /var/lib/pengine/pe-input-1907.bz2
May  7 13:21:13 vm-1 stonith-ng: [3035]: info: stonith_command: 
Processed st_execute from lrmd: rc=-1
May  7 13:21:13 vm-1 external/libvirt[28069]: [28084]: notice: 
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May  7 13:21:14 vm-1 IPaddr2[27993]: INFO: /usr/lib/heartbeat/send_arp 
-i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.122.200 eth0 
192.168.122.200 8bbd5f0b558f not_used not_used
May  7 13:21:14 vm-1 lrmd: [3036]: info: operation start[56] on IP:1 for 
client 3039: pid 27993 exited with return code 0
May  7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation IP:1_start_0 (call=56, rc=0, cib-update=342, confirmed=true) ok
May  7 13:21:14 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 23: monitor IP:1_monitor_10000 on vm-1 (local)
May  7 13:21:14 vm-1 lrmd: [3036]: info: rsc:IP:1 monitor[57] (pid 28104)
May  7 13:21:14 vm-1 lrmd: [3036]: info: operation monitor[57] on IP:1 
for client 3039: pid 28104 exited with return code 0
May  7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation IP:1_monitor_10000 (call=57, rc=0, cib-update=343, 
confirmed=false) ok
May  7 13:21:14 vm-1 lrmd: [3036]: info: RA output: 
(p_apache:0:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/apache: 440: 
kill:
May  7 13:21:14 vm-1 lrmd: [3036]: info: RA output: 
(p_apache:0:stop:stderr) No such process
May  7 13:21:14 vm-1 lrmd: [3036]: info: RA output: 
(p_apache:0:stop:stderr)
May  7 13:21:14 vm-1 apache[27992]: INFO: Killing apache PID 27235
May  7 13:21:14 vm-1 apache[27992]: INFO: apache stopped.
May  7 13:21:14 vm-1 lrmd: [3036]: info: operation stop[55] on 
p_apache:0 for client 3039: pid 27992 exited with return code 0
May  7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation p_apache:0_stop_0 (call=55, rc=0, cib-update=344, 
confirmed=true) ok
May  7 13:21:14 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 15: start p_apache:0_start_0 on vm-1 (local)
May  7 13:21:14 vm-1 lrmd: [3036]: info: rsc:p_apache:0 start[58] (pid 
28166)
May  7 13:21:14 vm-1 stonith: [28062]: info: external/libvirt device OK.
May  7 13:21:14 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: Performing: stonith -t external/libvirt -S
May  7 13:21:14 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: success:  0
May  7 13:21:54 vm-1 lrmd: [3036]: WARN: p_apache:0:start process (PID 
28166) timed out (try 1).  Killing with signal SIGTERM (15).
May  7 13:21:54 vm-1 lrmd: [3036]: WARN: operation start[58] on 
p_apache:0 for client 3039: pid 28166 timed out
May  7 13:21:54 vm-1 crmd: [3039]: ERROR: process_lrm_event: LRM 
operation p_apache:0_start_0 (58) Timed Out (timeout=40000ms)
May  7 13:21:54 vm-1 crmd: [3039]: WARN: status_from_rc: Action 15 
(p_apache:0_start_0) on vm-1 failed (target: 0 vs. rc: -2): Error
May  7 13:21:54 vm-1 crmd: [3039]: WARN: update_failcount: Updating 
failcount for p_apache:0 on vm-1 after failed start: rc=-2 
(update=INFINITY, time=1430997714)
May  7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph: 
match_graph_event:277 - Triggered transition abort (complete=0, 
tag=lrm_rsc_op, id=p_apache:0_last_failure_0, 
magic=2:-2;15:32:0:da4997d2-70ea-45a7-9617-42963ea29e42, cib=0.502.74) : 
Event failed
May  7 13:21:54 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 32 
(Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0, 
Source=/var/lib/pengine/pe-input-1907.bz2): Stopped
May  7 13:21:54 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=notify_crmd ]
May  7 13:21:54 vm-1 attrd: [3037]: notice: attrd_trigger_update: 
Sending flush op to all hosts for: fail-count-p_apache:0 (INFINITY)
May  7 13:21:54 vm-1 attrd: [3037]: notice: attrd_perform_update: Sent 
update 94: fail-count-p_apache:0=INFINITY
May  7 13:21:54 vm-1 pengine: [3038]: notice: unpack_config: On loss of 
CCM Quorum: Ignore
May  7 13:21:54 vm-1 pengine: [3038]: WARN: unpack_rsc_op: Processing 
failed op p_apache:0_last_failure_0 on vm-1: unknown exec error (-2)
May  7 13:21:54 vm-1 attrd: [3037]: notice: attrd_trigger_update: 
Sending flush op to all hosts for: last-failure-p_apache:0 (1430997714)
May  7 13:21:54 vm-1 pengine: [3038]: notice: LogActions: Recover 
p_apache:0#011(Started vm-1)
May  7 13:21:54 vm-1 attrd: [3037]: notice: attrd_perform_update: Sent 
update 96: last-failure-p_apache:0=1430997714
May  7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph: 
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair, 
id=status-vm-1-fail-count-p_apache.0, name=fail-count-p_apache:0, 
value=INFINITY, magic=NA, cib=0.502.75) : Transient attribute: update
May  7 13:21:54 vm-1 crmd: [3039]: info: handle_response: pe_calc 
calculation pe_calc-dc-1430997714-238 is obsolete
May  7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph: 
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair, 
id=status-vm-1-last-failure-p_apache.0, name=last-failure-p_apache:0, 
value=1430997714, magic=NA, cib=0.502.76) : Transient attribute: update
May  7 13:21:54 vm-1 pengine: [3038]: notice: process_pe_message: 
Transition 33: PEngine Input stored in: /var/lib/pengine/pe-input-1908.bz2
May  7 13:21:54 vm-1 pengine: [3038]: notice: unpack_config: On loss of 
CCM Quorum: Ignore
May  7 13:21:54 vm-1 pengine: [3038]: WARN: unpack_rsc_op: Processing 
failed op p_apache:0_last_failure_0 on vm-1: unknown exec error (-2)
May  7 13:21:54 vm-1 pengine: [3038]: WARN: common_apply_stickiness: 
Forcing cl_apache away from vm-1 after 1000000 failures (max=1000000)
May  7 13:21:54 vm-1 pengine: [3038]: WARN: common_apply_stickiness: 
Forcing cl_apache away from vm-1 after 1000000 failures (max=1000000)
May  7 13:21:54 vm-1 pengine: [3038]: notice: LogActions: Stop 
p_apache:0#011(vm-1)
May  7 13:21:54 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
May  7 13:21:54 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph 
34 (ref=pe_calc-dc-1430997714-239) derived from 
/var/lib/pengine/pe-input-1909.bz2
May  7 13:21:54 vm-1 crmd: [3039]: info: te_rsc_command: Initiating 
action 4: stop p_apache:0_stop_0 on vm-1 (local)
May  7 13:21:54 vm-1 lrmd: [3036]: info: rsc:p_apache:0 stop[59] (pid 28650)
May  7 13:21:54 vm-1 pengine: [3038]: notice: process_pe_message: 
Transition 34: PEngine Input stored in: /var/lib/pengine/pe-input-1909.bz2
May  7 13:21:54 vm-1 apache[28650]: INFO: apache is not running.
May  7 13:21:54 vm-1 apache[28650]: INFO: apache children were signalled 
(SIGTERM)
May  7 13:21:56 vm-1 apache[28650]: INFO: apache children were signalled 
(SIGHUP)
May  7 13:21:57 vm-1 apache[28650]: INFO: apache children were signalled 
(SIGKILL)
May  7 13:21:58 vm-1 lrmd: [3036]: info: operation stop[59] on 
p_apache:0 for client 3039: pid 28650 exited with return code 0
May  7 13:21:58 vm-1 crmd: [3039]: info: process_lrm_event: LRM 
operation p_apache:0_stop_0 (call=59, rc=0, cib-update=349, 
confirmed=true) ok
May  7 13:21:58 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 34 
(Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-input-1909.bz2): Complete
May  7 13:21:58 vm-1 crmd: [3039]: notice: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
May  7 13:22:14 vm-1 stonith-ng: [3035]: info: stonith_command: 
Processed st_execute from lrmd: rc=-1
May  7 13:22:15 vm-1 external/libvirt[28943]: [28958]: notice: 
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May  7 13:22:16 vm-1 stonith: [28935]: info: external/libvirt device OK.
May  7 13:22:16 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: Performing: stonith -t external/libvirt -S
May  7 13:22:16 vm-1 stonith-ng: [3035]: info: log_operation: 
p_fence_N2: success:  0

Online: [ vm-1 ]
OFFLINE: [ vm-2 ]

  Clone Set: cl_gluster_mnt [p_gluster_mnt]
      Started: [ vm-1 ]
      Stopped: [ p_gluster_mnt:1 ]
  Clone Set: cl_IP [IP] (unique)
      IP:0       (ocf::heartbeat:IPaddr2):       Started vm-1
      IP:1       (ocf::heartbeat:IPaddr2):       Started vm-1
p_fence_N2      (stonith:external/libvirt):     Started vm-1

Failed actions:
     p_apache:0_start_0 (node=vm-1, call=58, rc=-2, status=Timed Out): 
unknown exec error
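
If it helps with the analysis, the transitions mentioned in the log can be
replayed from the stored PE files (a sketch, assuming crm_simulate from the
Pacemaker CLI tools is available on this 1.1.7 installation):

    # transition 31: vm-2 is fenced and the apache restart is scheduled
    crm_simulate -S -x /var/lib/pengine/pe-warn-661.bz2

    # transition 32: apache is restarted on vm-1 after the fence
    crm_simulate -S -x /var/lib/pengine/pe-input-1907.bz2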



