[Pacemaker] problems with shutdown order of master/slave resources
Nikola Ciprich
extmaillist at linuxbox.cz
Sun Oct 31 11:16:57 UTC 2010
Hello Andrew et al.
I'm using pacemaker in a fairly complex setup, and I'm experiencing some
strange behaviour. I guess it's caused by mistakes in my configuration,
but I'm having trouble tracking them down, so I'd like to kindly ask
for help...
The strange thing is that everything seems to work OK on my testing cluster,
but on the production cluster (a fairly large one) I have problems. This makes
me think that maybe I'm hitting a bug..
To describe my setup:
- two nodes
- multiple Primary/Primary DRBDs
- DLM clone
- clustered LVM (therefore clone too)
- O2CB clone for OCFS2 cluster filesystem
- lots of virtual machines as primitive resources
I have order and colocation constraints set so that resources are started in
this order:
1) promote drbd0 drbd1 ... drbdn
2) start dlm
3) start o2cb, clvmd
4) start fs-ocfs
5) start vm1, vm2, ... vmn
and stopped in reverse order.
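To make the expected behaviour concrete, here is a minimal Python sketch of the ordering described above. It is only a model of what I expect from mandatory, symmetrical order constraints, not how Pacemaker computes its transitions. The resource names are placeholders following the list above (two DRBD devices and two VMs are assumed for illustration); the real IDs are in the hb_report.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each pair means "first must be active before then may start".
# Names are hypothetical placeholders, not the IDs from the real CIB.
ORDER_CONSTRAINTS = [
    ("drbd0", "dlm"),
    ("drbd1", "dlm"),
    ("dlm", "o2cb"),
    ("dlm", "clvmd"),
    ("o2cb", "fs-ocfs"),
    ("clvmd", "fs-ocfs"),
    ("fs-ocfs", "vm1"),
    ("fs-ocfs", "vm2"),
]


def start_order(constraints):
    """Return one valid start sequence that honours every (first, then) pair."""
    sorter = TopologicalSorter()
    for first, then in constraints:
        # 'then' depends on 'first', so 'first' is its predecessor.
        sorter.add(then, first)
    return list(sorter.static_order())


def stop_order(constraints):
    """With symmetrical ordering, stopping is the start sequence reversed."""
    return list(reversed(start_order(constraints)))


if __name__ == "__main__":
    print("start:", " -> ".join(start_order(ORDER_CONSTRAINTS)))
    print("stop: ", " -> ".join(stop_order(ORDER_CONSTRAINTS)))
```

In this model the DRBD devices always come last in the stop sequence, after the VMs, the filesystem, o2cb/clvmd and dlm. That is the behaviour I expect on shutdown.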
My first problem is that when I try to shut down a node, crmd immediately starts trying
to demote the drbds, without waiting for the dependent resources to stop first.
I believe I have all the mandatory ordering and colocation constraints set properly, but
of course I might be wrong.. (hb_report is attached).
The interesting thing is that when the drbd demote fails, THEN crmd stops
clvmd and the others. It seems like the order constraints don't work properly for shutdown.
The second problem worries me even more. It's been asked about here by a few other people already,
but none of those cases seems to match mine..
The problem is that when one of the VM resources fails for some reason (startup failure,
migration failure, whatever), crmd immediately tries to shut down (restart) the underlying resources,
including drbd etc., which of course causes a total mess...
I guess it would be good to solve the first problem first, and if that doesn't solve the second
one, then look for a solution to the second one too.
I've uploaded the hb_report to http://nelide.cz/downloads/report.tar.bz2. Could somebody please have a look at it?
I'm using pacemaker 1.0.9.1 (tried today with the latest stable-1.0 tip and 1.1.3 too) on the latest
CentOS 5, x86_64, 2.6.35 kernel.
Thanks a lot in advance!
with best regards
nik
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava
tel.: +420 596 603 142
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz
-------------------------------------