[Pacemaker] heartbeat stop hangs sometimes
Markus M.
adrock0905 at alice.de
Mon Feb 22 12:00:29 UTC 2010
Hello,
sometimes "heartbeat stop" seems to hang (latest packages from
clusterlabs.org, RHEL5 x86_64, 2-node cluster with only one node running).
The last lines from ha-debug are:
Feb 22 12:52:48 dbprod21 ccm: [24053]: info: client (pid=24058) removed from ccm
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_ha_control: Disconnected from Heartbeat
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_cib_control: Disconnecting CIB
Feb 22 12:52:48 dbprod21 cib: [24054]: info: cib_process_readwrite: We are now in R/O mode
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: send_ipc_message: IPC Channel to 24058 is not connected
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: send_via_callback_channel: Delivery of reply to client 24058/d9c9c281-4f38-46d8-b83e-54135f6c75e9 failed
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_exit: [crmd] stopped (0)
Feb 22 12:52:48 dbprod21 heartbeat: [24040]: info: killing /usr/lib64/heartbeat/attrd process group 24057 with signal 15
# ps -efw | grep heart
root 24040 1 0 12:49 ? 00:00:00 heartbeat: master control process
root 24043 24040 0 12:49 ? 00:00:00 heartbeat: FIFO reader
root 24044 24040 0 12:49 ? 00:00:00 heartbeat: write: ucast eth0
root 24045 24040 0 12:49 ? 00:00:00 heartbeat: read: ucast eth0
root 24046 24040 0 12:49 ? 00:00:00 heartbeat: write: ucast eth0
root 24047 24040 0 12:49 ? 00:00:00 heartbeat: read: ucast eth0
root 24048 24040 0 12:49 ? 00:00:00 heartbeat: write: serial /dev/ttyS0
root 24049 24040 0 12:49 ? 00:00:00 heartbeat: read: serial /dev/ttyS0
101 24053 24040 0 12:50 ? 00:00:00 /usr/lib64/heartbeat/ccm
101 24054 24040 0 12:50 ? 00:00:00 /usr/lib64/heartbeat/cib
root 24055 24040 0 12:50 ? 00:00:00 /usr/lib64/heartbeat/lrmd -r
root 24056 24040 0 12:50 ? 00:00:00 /usr/lib64/heartbeat/stonithd
101 24057 24040 0 12:50 ? 00:00:00 /usr/lib64/heartbeat/attrd
root 24366 22245 0 12:52 pts/2 00:00:00 /bin/sh /etc/init.d/heartbeat stop
root 24377 24366 0 12:52 pts/2 00:00:00 heartbeat
What could be causing this behaviour? Of course it's possible to kill
the processes manually, but I would rather not have to do that.
Regards
Markus
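
For anyone trying to reproduce this, here is a minimal sketch for listing
which children of the heartbeat master control process are still alive
while the stop is hanging. The function name children_of is hypothetical
and not part of heartbeat. It assumes "ps -ef"-style columns
(UID PID PPID C STIME TTY TIME CMD) as in the listing above, and 24040 is
simply the master PID from that listing.

```shell
# Hypothetical helper, not part of heartbeat.
# Reads `ps -ef`-style output on stdin and prints "PID COMMAND" for every
# process whose parent PID (third column) equals the PID given as $1.
children_of() {
    awk -v ppid="$1" '
        $3 == ppid {
            out = $2                      # PID of the child
            for (i = 8; i <= NF; i++)     # command starts at column 8
                out = out " " $i
            print out
        }'
}
```

Usage would be something like "ps -efw | children_of 24040". In the
listing above that would include attrd (24057), the last process the log
reports being sent signal 15.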