[Pacemaker] Failed in restart of Corosync.
renayama19661014 at ybb.ne.jp
Mon Oct 19 02:05:08 UTC 2009
Hi,
I understand that the combination of Corosync and Pacemaker is not yet officially supported.
However, I am posting this because I think it is important to report the problem.
I started Corosync with the following combination (on Red Hat 5.4, x86):
* corosync trunk 2530
* Cluster-Resource-Agents-6d652f7cf9d8
* Reusable-Cluster-Components-4edc8f99701c
* Pacemaker-1-0-de2a3778ace7
Next, I stopped the corosync service.
However, the Pacemaker processes did not stop cleanly, so I killed them with kill -9.
------------------------------------------------------------------------------------
[root@rh54-1 ~]# service corosync stop
Stopping Corosync Cluster Engine (corosync): [ OK ]
Waiting for services to unload: [ OK ]
[root@rh54-1 ~]# ps -ef |grep coro
root 5263 4617 0 10:54 pts/0 00:00:00 grep coro
[root@rh54-1 ~]# ps -ef |grep heartbeat
root 4882 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/stonithd
500 4883 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/cib
root 4884 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/lrmd
500 4885 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/attrd
500 4886 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/pengine
500 4887 1 0 10:52 ? 00:00:00 /usr/lib/heartbeat/crmd
root 5278 4617 0 10:54 pts/0 00:00:00 grep heartbeat
[root@rh54-1 ~]# kill -9 4882 4883 4884 4885 4886 4887
[root@rh54-1 ~]# ps -ef |grep heartbeat
root 5310 4617 0 10:54 pts/0 00:00:00 grep heartbeat
------------------------------------------------------------------------------------
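For reference, the manual cleanup above could be sketched as a small script that tries SIGTERM before falling back to SIGKILL. This is only a sketch: the /usr/lib/heartbeat/ directory is taken from the ps output above, while the function names and the GRACE_SECONDS variable (default 10 seconds) are my own assumptions, and I have not confirmed that it avoids the restart problem.

```shell
#!/bin/sh
# Sketch only. The daemon directory comes from the ps output above;
# the function names and the grace period are assumptions.

# Read "PID COMMAND..." lines on stdin (e.g. from `ps -eo pid=,args=`)
# and print the PIDs whose command starts with /usr/lib/heartbeat/.
leftover_pids() {
    awk '$2 ~ /^\/usr\/lib\/heartbeat\// { print $1 }'
}

# Send SIGTERM first, wait, then SIGKILL only what is still alive.
stop_leftovers() {
    pids=$(ps -eo pid=,args= | leftover_pids)
    [ -z "$pids" ] && return 0
    # $pids is intentionally unquoted so each PID becomes one argument.
    kill -TERM $pids 2>/dev/null
    sleep "${GRACE_SECONDS:-10}"
    for pid in $pids; do
        # kill -0 only tests whether the process still exists.
        kill -0 "$pid" 2>/dev/null && kill -KILL "$pid"
    done
    return 0
}
```

After `service corosync stop`, running `stop_leftovers` as root would replace the manual `kill -9` step.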
I then started Corosync again.
However, the Pacemaker cib process seems unable to communicate with Corosync.
------------------------------------------------------------------------------------
Oct 19 10:55:29 rh54-1 cib: [5354]: info: startCib: CIB Initialization completed successfully
Oct 19 10:55:29 rh54-1 cib: [5354]: info: crm_cluster_connect: Connecting to OpenAIS
Oct 19 10:55:29 rh54-1 cib: [5354]: info: init_ais_connection: Creating connection to our AIS plugin
Oct 19 10:55:30 rh54-1 mgmtd: [5359]: info: login to cib live: 1, ret:-10
Oct 19 10:55:30 rh54-1 crmd: [5358]: info: do_cib_control: Could not connect to the CIB service: connection failed
Oct 19 10:55:30 rh54-1 crmd: [5358]: WARN: do_cib_control: Couldn't complete CIB registration 1 times... pause and retry
Oct 19 10:55:30 rh54-1 crmd: [5358]: info: crmd_init: Starting crmd's mainloop
Oct 19 10:55:31 rh54-1 mgmtd: [5359]: info: login to cib live: 2, ret:-10
Oct 19 10:55:32 rh54-1 mgmtd: [5359]: info: login to cib live: 3, ret:-10
Oct 19 10:55:32 rh54-1 crmd: [5358]: info: crm_timer_popped: Wait Timer (I_NULL) just popped!
Oct 19 10:55:33 rh54-1 mgmtd: [5359]: info: login to cib live: 4, ret:-10
Oct 19 10:55:33 rh54-1 crmd: [5358]: info: do_cib_control: Could not connect to the CIB service: connection failed
Oct 19 10:55:33 rh54-1 crmd: [5358]: WARN: do_cib_control: Couldn't complete CIB registration 2 times... pause and retry
------------------------------------------------------------------------------------
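To see how long crmd keeps retrying, the warnings in the log above can be counted. A minimal sketch follows; the function name is hypothetical, and the log file path in the usage note is an assumption about where syslog writes on this system.

```shell
#!/bin/sh
# Sketch only: count the "Couldn't complete CIB registration" warnings
# that crmd logs while it cannot reach the cib process.
# Reads syslog lines on stdin and prints the number of matching lines.
count_cib_retries() {
    # -F matches the text literally, -c prints the count of matching lines.
    grep -cF "do_cib_control: Couldn't complete CIB registration"
}
```

For example, `count_cib_retries < /var/log/messages` (path assumed). A count that keeps growing after the restart means that crmd is still unable to register with the CIB.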
As a result, Pacemaker never starts, no matter how long I wait.
As for the cause, Corosync seems to be failing in poll(?) for some reason.
However, the cause may also be the unclean first stop.
------------------------------------------------------------------------------------
[root@rh54-1 ~]# ps -ef |grep coro
root 5348 1 0 10:55 ? 00:00:00 /usr/sbin/corosync
root 5400 4617 0 10:56 pts/0 00:00:00 grep coro
[root@rh54-1 ~]# strace -p 5348
Process 5348 attached - interrupt to quit
futex(0x805c8c0, FUTEX_WAIT_PRIVATE, 2, NULL
------------------------------------------------------------------------------------
Is there a way to avoid this problem?
Can I work around it by deleting some file?
* I hope that the combination of Corosync and Pacemaker becomes ready for practical use soon.
Best Regards,
Hideo Yamauchi.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rh54-1-message.zip
Type: application/x-zip-compressed
Size: 9947 bytes
Desc: 3324550729-rh54-1-message.zip
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20091019/43d9cf75/attachment-0003.bin>