[Pacemaker] Corosync + Pacemaker New Install: Corosync Fails Without Error Message

Eliot Gable egable at broadvox.com
Fri Jun 18 16:42:16 UTC 2010


I don’t have an “aisexec” section at all. I simply copied the sample file, which did not have one.

I did figure out why it wasn’t logging. It was set to AMF mode and ‘mode’ was ‘disabled’ in the AMF configuration section. After changing that to ‘enabled’, I now have logging. With logging working, I found that I needed to set rrp_mode to something other than ‘none’, because I have two interfaces to run the totem protocol over. However, with rrp_mode set to ‘passive’ or ‘active’, corosync tries to start, then segfaults:

Jun 18 07:33:23 corosync [MAIN  ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.
Jun 18 07:33:23 corosync [MAIN  ] Corosync built-in features: nss rdma
Jun 18 07:33:23 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Jun 18 07:33:23 corosync [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Jun 18 07:33:23 corosync [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Jun 18 07:33:23 corosync [TOTEM ] join (50 ms) send_join (0 ms) consensus (1200 ms) merge (200 ms)
Jun 18 07:33:23 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
Jun 18 07:33:23 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
Jun 18 07:33:23 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Jun 18 07:33:23 corosync [TOTEM ] send threads (0 threads)
Jun 18 07:33:23 corosync [TOTEM ] RRP token expired timeout (238 ms)
Jun 18 07:33:23 corosync [TOTEM ] RRP token problem counter (2000 ms)
Jun 18 07:33:23 corosync [TOTEM ] RRP threshold (10 problem count)
Jun 18 07:33:23 corosync [TOTEM ] RRP mode set to passive.
Jun 18 07:33:23 corosync [TOTEM ] heartbeat_failures_allowed (0)
Jun 18 07:33:23 corosync [TOTEM ] max_network_delay (50 ms)
Jun 18 07:33:23 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Jun 18 07:33:23 corosync [TOTEM ] Initializing transport (UDP/IP).
Jun 18 07:33:23 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 18 07:33:23 corosync [TOTEM ] Initializing transport (UDP/IP).
Jun 18 07:33:23 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 18 07:33:23 corosync [IPC   ] you are using ipc api v2
Jun 18 07:33:23 corosync [TOTEM ] Receive multicast socket recv buffer size (262142 bytes).
Jun 18 07:33:23 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
Jun 18 07:33:23 corosync [TOTEM ] The network interface is down.
Jun 18 07:33:23 corosync [TOTEM ] Created or loaded sequence id 0.127.0.0.1 for this ring.
Jun 18 07:33:23 corosync [pcmk  ] info: process_ais_conf: Reading configure
Jun 18 07:33:23 corosync [pcmk  ] info: config_find_init: Local handle: 2013064636357672962 for logging
Jun 18 07:33:23 corosync [pcmk  ] info: config_find_next: Processing additional logging options...
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Found 'on' for option: debug
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Defaulting to 'off' for option: to_file
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Found 'yes' for option: to_syslog
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Jun 18 07:33:23 corosync [pcmk  ] info: config_find_init: Local handle: 4730966301143465987 for service
Jun 18 07:33:23 corosync [pcmk  ] info: config_find_next: Processing additional service options...
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
Jun 18 07:33:23 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
Jun 18 07:33:23 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Jun 18 07:33:23 corosync [pcmk  ] Logging: Initialized pcmk_startup
Jun 18 07:33:23 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Segmentation fault

(gdb) where full
#0  0x000000332de797c0 in strlen () from /lib64/libc.so.6
No symbol table info available.
#1  0x00002aaaaacefb9b in logsys_worker_thread (data=<value optimized out>) at logsys.c:760
        rec = 0x2aaaaaef0c28
        dropped = 0
#2  0x000000332e60673d in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x000000332ded3d1d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb)

Downgrading to 1.2.1-1.el5 again seems to resolve the issue, and Corosync runs.
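
For reference, a two-ring totem setup of the kind described above looks roughly like the sketch below. Only rrp_mode and the presence of two interface blocks come from the description above; the bind addresses, multicast addresses, and ports are hypothetical placeholders, not values from the configuration discussed here.

```
totem {
        version: 2
        secauth: on
        # With more than one interface block, rrp_mode must not be 'none'.
        rrp_mode: passive

        interface {
                # First ring (placeholder addresses)
                ringnumber: 0
                bindnetaddr: 192.168.1.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
        interface {
                # Second ring (placeholder addresses)
                ringnumber: 1
                bindnetaddr: 10.0.0.0
                mcastaddr: 226.94.1.2
                mcastport: 5407
        }
}
```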




Eliot Gable
From: Gianluca Cecchi [mailto:gianluca.cecchi at gmail.com]
Sent: Friday, June 18, 2010 11:35 AM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Corosync + Pacemaker New Install: Corosync Fails Without Error Message

On Fri, Jun 18, 2010 at 5:25 PM, Eliot Gable <egable at broadvox.com> wrote:
I am trying to set up Corosync + Pacemaker on a new CentOS 5.5 x86_64 install, but when I try to start corosync, it just says [FAILED] and does not provide any further information. I created the authkey using corosync-keygen and created a corosync.conf file. The log file remains empty and no errors are displayed on the console when it fails to start. I tried downgrading to 1.2.1-1.el5, but that did not resolve the issue either, so I have upgraded back to 1.2.2-1.1.el5.

What are the contents of your /etc/corosync/corosync.conf for the logging section and for the aisexec section?

Do you have, for example, something like this:
aisexec {
        user: root
        group: root
}

When you say "log file", do you mean the one indicated in /etc/corosync/corosync.conf, /var/log/messages, or both?
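
To make the question concrete: in a corosync 1.x corosync.conf, the logging section can send output to a dedicated file, to syslog (which on CentOS typically ends up in /var/log/messages), or to both. A minimal sketch, assuming the standard logging directives; the logfile path is a hypothetical placeholder:

```
logging {
        to_stderr: no
        # Write to the file named by 'logfile' (placeholder path)
        to_logfile: yes
        logfile: /var/log/corosync.log
        # Also send messages to syslog
        to_syslog: yes
        debug: on
        timestamp: on
}
```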

Gianluca
