[Pacemaker] Solved: [Linux-HA] SLES 11 HAE SP1 Signon to CIB Failed
Darren.Mansell at opengi.co.uk
Thu Feb 3 10:47:32 UTC 2011
On Fri, Jan 28, 2011 at 1:06 PM, <Darren.Mansell at opengi.co.uk> wrote:
> Hi all, this seems like it should be an easy one to fix, I'll raise a
> support call with Novell if required.
>
> Base install of SLES 11 32 bit SP1 with HAE SP1 and crm_mon gives
> 'signon to CIB failed'. Same thing with the CRM shell etc.
Too many open file descriptors?
lsof might show something interesting
-----------
Unfortunately not.
It seems that corosync doesn't spawn anything else, which is causing
this issue:
From a SLES 11 HAE install:
root 7342 5.6 0.2 166048 38924 ? SLl 2010 5685:08 aisexec
root 7349 0.0 0.0 67768 10516 ? SLs 2010 3:02 \_ /usr/lib64/heartbeat/stonithd
90 7350 0.0 0.0 65028 4656 ? S 2010 7:43 \_ /usr/lib64/heartbeat/cib
nobody 7351 0.0 0.0 61600 1832 ? S 2010 8:24 \_ /usr/lib64/heartbeat/lrmd
90 7352 0.0 0.0 66284 2320 ? S 2010 0:00 \_ /usr/lib64/heartbeat/attrd
90 7353 0.0 0.0 67536 3588 ? S 2010 1:24 \_ /usr/lib64/heartbeat/pengine
90 7354 0.0 0.0 72392 3712 ? S 2010 6:01 \_ /usr/lib64/heartbeat/crmd
root 7355 0.0 0.0 75148 2504 ? S 2010 2:25 \_ /usr/lib64/heartbeat/mgmtd
root 4040 0.0 0.0 0 0 ? Z 2010 0:00 \_ [aisexec] <defunct>
root 4059 0.0 0.0 0 0 ? Z 2010 0:00 \_ [aisexec] <defunct>
From a SLES 11 SP1 HAE install:
root 9109 0.0 0.4 13308 2288 tty1 Ss+ Feb02 0:00 \_ -bash
root 8989 0.0 0.1 4344 744 tty2 Ss+ Feb02 0:00 /sbin/mingetty tty2
root 8990 0.0 0.1 4344 752 tty3 Ss+ Feb02 0:00 /sbin/mingetty tty3
root 8991 0.0 0.1 4344 748 tty4 Ss+ Feb02 0:00 /sbin/mingetty tty4
root 8992 0.0 0.1 4344 748 tty5 Ss+ Feb02 0:00 /sbin/mingetty tty5
root 8993 0.0 0.1 4344 744 tty6 Ss+ Feb02 0:00 /sbin/mingetty tty6
root 24883 0.0 0.8 89808 4424 ? Ssl Feb02 0:34 /usr/sbin/corosync
lookup-01:~ #
So I compared /etc/ais/openais.conf from the non-SP1 install with
/etc/corosync/corosync.conf from SP1 and found this section missing
from the SP1 file, which could be quite useful...
service {
        # Load the Pacemaker Cluster Resource Manager
        ver: 0
        name: pacemaker
        use_mgmtd: yes
        use_logd: yes
}
Added it and it works. Doh.
It seems the example corosync.conf that is shipped won't start
Pacemaker. I'm not sure whether that's on purpose, but I found it a
bit confusing after being used to it 'just working' previously.
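
For anyone hitting the same thing, here is a rough sketch of a check for
that section. has_pacemaker_service is just a name I made up for the
helper, and it only looks for a "name: pacemaker" line inside a
"service {" block, so treat it as a quick sanity check rather than a
real config parser.

```shell
# Hypothetical helper (not part of corosync or pacemaker): succeeds if the
# given config file has a "service { ... }" block containing "name: pacemaker".
has_pacemaker_service() {
    awk '
        # Entering a service block
        /^[ \t]*service[ \t]*[{]/ { in_service = 1; next }
        # Leaving it again at the closing brace
        in_service && /[}]/ { in_service = 0; next }
        # Inside the block: is this the pacemaker plugin?
        in_service && /^[ \t]*name:[ \t]*pacemaker[ \t]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example use, with the SP1 path from above:
#   has_pacemaker_service /etc/corosync/corosync.conf \
#       || echo "no pacemaker service block in corosync.conf"
```

If it reports the block as missing, that matches the symptom above:
corosync runs on its own with no cib, crmd, etc. underneath it.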