[Pacemaker] [BUG] Clone + group = orphan(s) ?
Thomas Guthmann
tguthmann at iseek.com.au
Mon Jan 4 06:25:53 UTC 2010
Hi,
I have noticed some strange behaviour in Pacemaker when I ask it to clone a group.
Let's say I have a group, tom-DNS, containing 4 primitives (1 named process +
3 IPaddr2 addresses loaded on lo). I want to clone tom-DNS into two instances:
crm configure clone tom-DNS-clone tom-DNS meta clone-max=2
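For reference, a setup like the one described could look roughly like this in crm shell syntax. This is only a sketch: the primitive names, the lsb:named class, the 192.0.2.x addresses, the netmask and the order of the group members are hypothetical placeholders, not the real configuration. Only the group name, the clone name and clone-max=2 come from the command above.

```
# Hypothetical names and addresses, for illustration only
primitive tom-named lsb:named
primitive tom-ip1 ocf:heartbeat:IPaddr2 params ip=192.0.2.1 cidr_netmask=32 nic=lo
primitive tom-ip2 ocf:heartbeat:IPaddr2 params ip=192.0.2.2 cidr_netmask=32 nic=lo
primitive tom-ip3 ocf:heartbeat:IPaddr2 params ip=192.0.2.3 cidr_netmask=32 nic=lo
group tom-DNS tom-named tom-ip1 tom-ip2 tom-ip3
clone tom-DNS-clone tom-DNS meta clone-max=2
```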
If the group is already running and I add the clone on the fly, I very
often end up with tom-DNS:0, tom-DNS:1 and tom-DNS:2, one of them being
an orphan (sometimes there is more than one orphan).
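A quick way to see how many instances exist is to list the distinct instance ids in the status output. The helper below is a minimal sketch: the function name list_clone_instances is made up for illustration, and it only assumes that instances show up in the text as "tom-DNS:N", as in the names above. It relies on grep -o, which GNU grep provides.

```shell
#!/bin/sh
# list_clone_instances BASE (hypothetical helper name)
# Reads status text on stdin and prints each distinct "BASE:N" instance id
# once, sorted numerically by N.
list_clone_instances() {
    base=$1
    # -o prints only the matching part of each line, e.g. "tom-DNS:2"
    grep -o "${base}:[0-9][0-9]*" | sort -t: -k2,2n -u
}
```

It could be fed from a one-shot status dump, for example `crm_mon -1 -r | list_clone_instances tom-DNS`. With clone-max=2, more than two lines of output would mean there are extra instances.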
Then, if I want to get rid of the orphan, I can try a cleanup:
crm_resource -r tom-DNS-clone -C
That "cleanup" usually just makes things worse. It generates new
orphans, and Pacemaker moves (stops, then starts) the 2 running cloned
groups to one of the 4 groups I now have in the clone resource. I don't
really follow the logic, and the log is so verbose that I don't know
where to start or what to look for. In the end I usually have a broken
state: part of the group is running and the rest is not (for example
only 2 IPs, without the third one or named).
You will find a log attached. Tags in the log:
- GO : I have just committed the clone line (above)
- DONE: everything seems to be stable now
- EOF : we have 9 tom-DNS instances without anything having been touched since GO :)
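Since the log is verbose, it may help to cut it down to the window between two tags. The function below is a rough sketch: log_between is a made-up helper name, and it assumes each tag appears somewhere on a line of its own kind in the log, so the patterns may need adjusting to how the tags were actually written.

```shell
#!/bin/sh
# log_between START END (hypothetical helper name)
# Prints the lines of stdin from the first line matching the regex START
# through the next later line matching the regex END, both inclusive.
log_between() {
    awk -v s="$1" -v e="$2" '
        !on && $0 ~ s { on = 1; print; next }   # first START line: begin output
        on            { print; if ($0 ~ e) exit } # stop after the END line
    '
}
```

For example, `zcat clone-commit-after.log.gz | log_between 'GO' 'DONE'` would print only what happened between the commit and the point where things looked stable. A short pattern such as GO can also match unrelated lines, so a more specific pattern is safer.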
So cloned groups are not fun, and the side effects seem random :) I will
do more tests without IPaddr2, which seems a bit fancy and dodgy.
My question is: is this a bug, or could it be a constraints issue?
I also noticed an increased load (from 0.5 to 3) when Pacemaker has
orphans, but I don't know what it is doing. CPU usage is not very high:
the cib process uses 15% of CPU and there is no disk I/O. I didn't
check, but maybe it is using the network a lot... All nodes in the ring
are affected.
Let me know if you need any additional information.
Cheers,
Thomas
--
* pacemaker 1.0.6
* corosync 1.1.2
* centos 5.4
* 4 nodes (2 of them are DNS nodes)
* asymmetrical cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clone-commit-after.log.gz
Type: application/x-gzip
Size: 21471 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20100104/8397f51d/attachment-0001.bin>