[Pacemaker] Seeking for advice after cluster freeze

Patrick Zwahlen paz at navixia.com
Thu Feb 18 15:29:43 EST 2010


Hi Dejan, and thanks for the super quick response.

> Please upgrade to 1.0.3. Not sure, but those versions you have
> may have a bad bug.

I didn't want to make too many changes before having some people take a
look at it. I agree there are newer pacemaker/cluster-glue revisions,
though.

> > corosync.x86_64            1.2.0-1.el5
> > corosynclib.x86_64         1.2.0-1.el5
> > heartbeat.x86_64           3.0.1-1.el5
> > heartbeat-libs.x86_64      3.0.1-1.el5
> 
> You don't need both heartbeat and corosync.

I think this comes from the RPM dependencies. If I try to remove
heartbeat using 'yum', it also wants to remove pacemaker. I am making
sure that heartbeat doesn't start, though. Only corosync is configured
to start at system boot.

> Anything in logs? Or is that the log attached?

The attached logs (messages.2) show what happened just before and right
after the freeze. The last log entry before the freeze is at 17:26:59,
and the freeze lasts until 17:41:46. During that time, we should at a
minimum have seen log entries for the drbd monitoring
(crm_attribute...).
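
For anyone who wants to locate such a silent period in their own syslog
file, here is a minimal Python sketch. It is not part of my setup: the
function name find_gaps, the 5-minute threshold and the year argument
are assumptions made for illustration, and it only understands the
classic "Mon dd HH:MM:SS" syslog prefix (which carries no year, hence
the extra argument).

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(minutes=5), year=2010):
    """Return (previous, current) timestamp pairs more than `threshold` apart.

    `lines` is any iterable of syslog lines, e.g. an open file object.
    """
    gaps = []
    prev = None
    for line in lines:
        # Classic syslog lines start with a 15-character "Mon dd HH:MM:SS" stamp.
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps
```

Called as find_gaps(open("messages.2")), it should report one pair for
the period described above, provided nothing else was logged in
between.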

> Feb  4 17:41:54 nfs2a lrmd: [3072]: info: RA output: (res_drbd:1:start:stderr) 0 : Failure: (124) Device is attached to a disk (use detach first)
> Feb  4 17:41:54 nfs2a lrmd: [3072]: info: RA output: (res_drbd:1:start:stderr) Command 'drbdsetup 0 disk /dev/sdb /dev/sdb internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10
> Feb  4 17:41:54 nfs2a drbd[3243]: ERROR: nfs: Called drbdadm -c /etc/drbd.conf --peer nfs2b.test.local up nfs
> Feb  4 17:41:54 nfs2a drbd[3243]: ERROR: nfs: Exit code 1
> 
> That's what I could find in the logs.

This happens after the freeze and the manual reboot. I am not sure why
I get this error, but once the other node came back up, everything
worked fine again.

Thanks again, - Patrick -

