[Pacemaker] WARN: do_lrm_control: Failed to sign on to the LRM 1 (30 max) times
chajo
srichandu2007 at yahoo.co.in
Fri Apr 9 12:49:28 UTC 2010
Hi,
We have corosync (1.2.1) running with Pacemaker 1.0.6 on RHEL x86_64.
While building the code there were errors related to pointer types
(GPOINTER_TO_INT in pacemaker/lib/common/remote.c:295). I changed the include
references from /usr/lib/glib-2.0/include to /usr/lib64/glib-2.0/include to
get rid of the compilation errors.
After starting corosync, crmd fails and the local node is always shown as
offline in a two-node cluster. The following error is logged repeatedly in
/var/log/messages:
crmd: [3180]: info: do_cib_control: CIB connection established
...
crmd: [3180]: WARN:lrm_signon: can not initiate connection
crmd: [3180]: WARN: do_lrm_control: Failed to sign on to the LRM 3 (30 max) times
...
crmd is restarted after 30 tries.
While debugging crmd I found that the connect() call returns -1 when
connecting to the socket file /usr/var/run/heartbeat/lrm_cmd_sock:
File: ./lib/clplumbing/ipcsocket.c (Reusable-Cluster-Components-6c8645d6a4c2, Cluster Glue)
Line number: 962
connect(<fd>,
        {sun_family = 1,
         sun_path = "/usr/var/run/heartbeat/lrm_cmd_sock", '\0' <repeats 72 times>})
For this call connect() returns -1.
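A return value of -1 on its own does not say why the connection failed; the errno set by connect() does. Below is a minimal sketch in Python (not the Cluster Glue code) that repeats the same connect against the socket path and reports the errno name. The function name probe_unix_socket is made up for illustration, and it assumes the LRM socket is a Unix stream socket.

```python
import errno
import socket


def probe_unix_socket(path):
    """Try to connect to a Unix stream socket.

    Returns None on success, or the symbolic errno name on failure.
    """
    # Assumption: the daemon listens on a SOCK_STREAM Unix socket.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return None
    except OSError as exc:
        # Map the numeric errno to its name, e.g. ENOENT or ECONNREFUSED.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the post; adjust it if the build uses another prefix.
    result = probe_unix_socket("/usr/var/run/heartbeat/lrm_cmd_sock")
    print("connected" if result is None else "connect failed: " + result)
```

As a rough guide, ECONNREFUSED on an existing socket file usually means no process is listening on it any more, ENOENT means the path does not exist, and EACCES points to a permission problem on the socket or on a directory in its path.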
Further information:
# ls -l /usr/var/run/heartbeat/lrm_cmd_sock
srwxrwxrwx 1 root root 0 Apr 9 19:49 /usr/var/run/heartbeat/lrm_cmd_sock
# cat /etc/passwd | grep hacluster
hacluster:x:501:501::/home/hacluster:/bin/bash
[root@IbHost common]# cat /etc/group | grep ha
haldaemon:x:68:
hacluster:x:501:
haclient:x:502:hacluster
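The same checks can be scripted so they are easy to repeat on both nodes. This is only a sketch: the helper names describe_socket and user_groups are invented here, and it reports nothing beyond what ls -l and the group file already show.

```python
import grp
import os
import pwd
import stat


def describe_socket(path):
    """Return type, mode string and owner of a path, like one row of ls -l."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid without a passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid without a group entry
    return {
        "is_socket": stat.S_ISSOCK(st.st_mode),
        "mode": stat.filemode(st.st_mode),
        "owner": owner,
        "group": group,
    }


def user_groups(user):
    """Groups that list the user as a member (the primary group is not included)."""
    return sorted(g.gr_name for g in grp.getgrall() if user in g.gr_mem)


if __name__ == "__main__":
    path = "/usr/var/run/heartbeat/lrm_cmd_sock"  # path from the post
    if os.path.exists(path):
        print(describe_socket(path))
    print(user_groups("hacluster"))
```

Connecting to a Unix socket needs write permission on the socket file and search permission on every directory in its path, so with mode srwxrwxrwx the parent directories would be the remaining thing to check on the permission side.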
Any help in finding out why the local node is shown as offline by the
crm status command would be appreciated.
Thanks,
chajo