[Pacemaker] Pacemaker very often STONITHs other node

Mon Dec 9 11:58:33 UTC 2013

> Hello,
>
> Thank you for your answer. I have two drbd - /dev/drbd1 and 
> /dev/drbd2. And I use them as PVs for LVM which has one Volume Group 
> hosting all the VMs.
>
> So should I have as many DRBDs as VMs and get rid of LVM altogether?
>
> PS. If it is not a secret, what are your recommended timeouts?
>
> Thank you!
>
First, you must eliminate the "Timed Out" errors on the stop operation. 
Increasing the stop timeout may help.
My timeouts aren't a secret, of course. I use a 5-minute stop timeout 
per VM and define a longer one for VMs that can't stop gracefully in 5 minutes.
I also use KVM virtual domains, which are always kicked off by the 
VirtualDomain OCF agent without any errors (apart from info messages in the logs).
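
A minimal sketch of such a stop timeout in crm shell syntax. Only the 
5-minute stop timeout comes from the text above. The resource name vm1, 
the config path, the hypervisor URI and the start/monitor values are 
assumptions for illustration, not a real configuration.

```
# Hypothetical VM resource; adjust the name, path and URI to your setup.
# Only the 300s (5 minutes) stop timeout is taken from the mail above.
primitive vm1 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/vm1.xml" hypervisor="qemu:///system" \
    op start timeout="120s" interval="0" \
    op stop timeout="300s" interval="0" \
    op monitor interval="30s" timeout="30s"
```

A VM that needs longer to shut down would get a larger value in its own 
"op stop timeout".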
IMHO, one DRBD device per VM gives more flexibility in configuration. For 
example, you can spread VMs among the nodes in the cluster, which is 
especially useful in big clusters (6-8 nodes or so).
This setup also helps with DRBD's write-after-write behavior. It allows 
more simultaneous write-through operations and decreases I/O latency in the VMs.
Also, if you have many nodes, you can create a dedicated replication 
network between the nodes and get better network throughput.
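
A rough sketch of what "one DRBD per VM" could look like in crm shell 
syntax, using the older master/slave (ms) notation. All names here are 
hypothetical: it assumes a DRBD resource called "vm1" is already defined 
in the DRBD configuration and that a VirtualDomain primitive named vm1 
exists in the cluster. The monitor intervals are illustrative only.

```
# One DRBD resource dedicated to a single VM (names are assumptions).
primitive drbd_vm1 ocf:linbit:drbd \
    params drbd_resource="vm1" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
ms ms_drbd_vm1 drbd_vm1 \
    meta master-max="1" master-node-max="1" \
    clone-max="2" clone-node-max="1" notify="true"
# Run the VM only where its DRBD device is Primary, and only after promotion.
colocation vm1_on_drbd inf: vm1 ms_drbd_vm1:Master
order drbd_before_vm1 inf: ms_drbd_vm1:promote vm1:start
```

Repeating this group per VM lets each VM be placed and failed over 
independently of the others, which is the flexibility described above.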