[Pacemaker] heartbeat 3.0 start error
Andrew Beekhof
andrew at beekhof.net
Thu Jul 25 08:25:47 CEST 2013
On 25/07/2013, at 2:36 PM, claire huang <claire.huang at utstarcom.cn> wrote:
> Andrew,
> Hi! I have a question I would like your help with.
> 1. My OS is:
> [root@2U_222 cluster]# lsb_release -a
> LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
> Distributor ID: CentOS
> Description: CentOS release 6.2 (Final)
> Release: 6.2
> Codename: Final
> 2. I installed heartbeat 3.0 on my server,
Ok, but you're not using it. You're using corosync instead.
> which is corosync+pacemaker+heartbeat. But when I run the command "corosync",
> I expect the process tree to look like this:
> ps aufx
> root 5001 0.3 0.0 553648 6764 ? Ssl 03:13 0:00 corosync
> root 5006 0.0 0.0 86032 2308 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/stonithd
> 499 5007 0.7 0.0 92444 5228 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/cib
> root 5008 0.0 0.0 74004 2316 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/lrmd
> 499 5009 0.0 0.0 94656 2592 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/attrd
> 499 5010 0.0 0.0 86712 1956 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/pengine
> 499 5011 0.2 0.0 94868 3232 ? S 03:13 0:00 \_ /usr/lib64/heartbeat/crmd
> But in fact it sometimes looks like this:
> ps aufx
> root 6542 0.0 0.0 483772 4504 ? Ssl 19:06 0:00 corosync
> root 6548 0.0 0.0 194256 2236 ? S 19:06 0:00 \_ corosync
This process would have become /usr/lib64/heartbeat/stonithd, but fork() does not play well with threads.
So if you must continue using the plugin, at least use "ver: 1" in combination with the mcp:
http://blog.clusterlabs.org/blog/2010/introducing-the-pacemaker-master-control-process-for-corosync-based-clusters/
But on RHEL6/CentOS6 you are far better off NOT using the plugin:
http://blog.clusterlabs.org/blog/2013/pacemaker-on-rhel6-dot-4/
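For reference, a minimal sketch of the "ver: 1" setup, assuming the plugin is configured through a drop-in file under /etc/corosync/service.d/ and that the stock init scripts are installed. The file name "pcmk" and the helper function name are illustrative only, not taken from your configuration. With "ver: 1" the plugin no longer forks the Pacemaker daemons itself; they are started separately, after corosync, by the pacemaker service:

```shell
#!/bin/sh
# Hypothetical helper: write a Pacemaker plugin stanza with "ver: 1"
# into the given directory (assumed to be /etc/corosync/service.d on a node).
write_pcmk_stanza() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/pcmk" <<'EOF'
service {
        # Load the Pacemaker plugin, but do not let it fork the daemons
        name: pacemaker
        ver:  1
}
EOF
}

# Assumed usage on each node, as root:
#   write_pcmk_stanza /etc/corosync/service.d
#   service corosync start
#   service pacemaker start
```

If an existing service stanza for pacemaker is already present in corosync.conf, its "ver" value would need to be changed there instead of adding a second stanza.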
> 499 6549 0.0 0.0 92300 5032 ? S 19:06 0:00 \_ /usr/lib64/heartbeat/cib
> root 6550 0.0 0.0 74008 2384 ? S 19:06 0:00 \_ /usr/lib64/heartbeat/lrmd
> 499 6551 0.0 0.0 94656 2600 ? S 19:06 0:00 \_ /usr/lib64/heartbeat/attrd
> 499 6552 0.0 0.0 87560 3828 ? S 19:06 0:00 \_ /usr/lib64/heartbeat/pengine
> 499 6553 0.0 0.0 95352 3816 ? S 19:06 0:00 \_ /usr/lib64/heartbeat/crmd
> Then I enabled debug logging and found messages like these:
> vi /var/log/cluster/corosync.log:
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/st_command
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: init_client_ipc_comms_nodispatch: Could not init comms on: /var/run/crm/st_command
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: stonith_api_signon: Connection to command channel failed
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/st_callback
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: init_client_ipc_comms_nodispatch: Could not init comms on: /var/run/crm/st_callback
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: stonith_api_signon: Connection to callback channel failed
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: stonith_api_signon: Connection to STONITH failed: Not connected
> Jul 25 03:17:01 2U_222 crmd: [16867]: debug: stonith_api_signoff: Signing out of the STONITH Service
> Jul 25 03:17:01 2U_222 crmd: [16867]: ERROR: te_connect_stonith: Sign-in failed: triggered a retry
> Jul 25 03:17:01 2U_222 crmd: [16867]: info: te_connect_stonith: Attempting connection to fencing daemon...
> So the st_command socket cannot be initialized correctly, but there is no log message about a socket initialization failure.
> Could you help me with this? Thank you very much.
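The missing socket error is probably because the child that should have become stonithd never got that far, so nothing ever tried to create /var/run/crm/st_command. A rough sketch for checking this on a node follows; the helper names are hypothetical, and it only assumes the paths already shown in your ps output and logs:

```shell
#!/bin/sh
# Hypothetical helper: read a process listing on stdin and succeed only if
# stonithd appears in it. The [s] keeps the pattern from matching the grep
# process's own command line when the listing comes from ps.
has_stonithd() {
    grep -q '/heartbeat/[s]tonithd'
}

# Hypothetical helper: report whether the fencing daemon looks usable.
# $1 is the IPC directory (the logs use /var/run/crm).
check_fencing() {
    ipc_dir="${1:-/var/run/crm}"
    if ! ps aufx | has_stonithd; then
        echo "stonithd is not running"
        return 1
    fi
    if [ ! -e "$ipc_dir/st_command" ]; then
        echo "stonithd is running but $ipc_dir/st_command is missing"
        return 1
    fi
    echo "stonithd is running and $ipc_dir/st_command exists"
}
```

Running check_fencing right after starting corosync should tell the two process trees apart: in the bad case it reports that stonithd is not running, which matches the crmd sign-in retries in the log.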
>
> best wishes,
> claire huang
> Email: claire.huang at utstarcom.cn