[Pacemaker] known problem with corosync 1.4.1 on CentOS 6.4?
andreas graeper
agraeper at googlemail.com
Fri Jun 21 08:56:29 UTC 2013
hi,
Whenever I do nothing more than remove or add resources, corosync starts to eat up all the CPU.
drbd 8.4.1 (build from source)
corosync 1.4.1
pacemaker 1.1.8
crmsh 1.2.5 (from an extra repo, because crm is missing from pacemaker-cli?!
But that is not the cause of the trouble; I use pcs for everything except crm_mon.)
pcs 0.9.26
When I run

    pcs resource stop xxx
    pcs resource delete xxx

I often need a cleanup afterwards to get rid of the 'failed actions' (the monitor of that resource xxx).
When a resource is stopped, shouldn't its monitor be cancelled, and shouldn't every old failed action be forgotten?
But other resources get stopped and restarted too, and their monitors fail with a timeout or an unknown error, even though crm_mon shows them as running/started.
Now DRBD on the master was stopped, corosync sits at 100% CPU, and the other node does not take over:

    drbd:0 (slave on n1): unmanaged FAILED
    drbd state:
      n2 (corosync stopped): Connected Primary Diskless
      n1 (corosync ok?):     Connected Secondary UpToDate
When DRBD is stopped, I would expect something similar to:
1) primary -> secondary
2) disconnect => cs:StandAlone
3) detach => ds:Diskless
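The three steps above, spelled out with drbdadm ("r0" is a placeholder resource name; the guard makes the snippet do nothing on a machine without DRBD):

```shell
#!/bin/sh
# Expected manual teardown, step by step; "r0" is a placeholder resource name.
if command -v drbdadm >/dev/null 2>&1; then
    drbdadm secondary r0    # 1) role: Primary -> Secondary
    drbdadm disconnect r0   # 2) cs: Connected -> StandAlone
    drbdadm detach r0       # 3) ds: UpToDate -> Diskless
else
    echo "drbdadm not found; DRBD commands skipped"
fi
```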
What went wrong? What can I do?
Is there a chance that CentOS 6.3 with corosync 1.4.1 and pacemaker 1.1.7 runs more stably?
With two nodes, n1 (master) and n2 (slave), corosync was stopped on n2. The CIB still shows
n2.standby="off"
Shouldn't a `corosync stop` report n2 as offline/standby?
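For context, the node entry in my CIB looks roughly like this (the id values are illustrative). My understanding is that standby is a persistent node attribute set by an administrator, while a stopped corosync only makes the node show as OFFLINE in the status section, without touching this attribute:

```xml
<node id="n2" uname="n2" type="normal">
  <instance_attributes id="nodes-n2">
    <nvpair id="nodes-n2-standby" name="standby" value="off"/>
  </instance_attributes>
</node>
```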
What can be the reason that lrmd and pengine are still running after corosync was stopped, while pacemakerd (the parent of lrmd and pengine?) is not running anymore?
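A quick way to see which daemons survived (pgrep is standard procps; on a machine without a cluster stack every line simply reports "not running"):

```shell
#!/bin/sh
# List which cluster daemons are (still) alive after stopping corosync.
for d in corosync pacemakerd lrmd pengine; do
    if pgrep -x "$d" >/dev/null 2>&1; then
        echo "$d: running (pid $(pgrep -x "$d" | head -n1))"
    else
        echo "$d: not running"
    fi
done
```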
Thanks in advance,
andreas