[Pacemaker] Pacemaker very often STONITHs other node

Mon Dec 9 11:01:38 UTC 2013

On 09.12.2013 11:34, Nikita Staroverov wrote:
>
> So, what happens? :)
> Rivendell-B tried to stop XEN-acsystemy01, but couldn't because the stop
> operation timed out. A failed stop operation is fatal by default and
> leads to STONITH.
> Rivendell-A caught this and fenced Rivendell-B.
> You also have some other problems, like clone-LVM not running (but
> that isn't fatal).
>
> I think your servers are overloaded because of one DRBD for all VMs. You
> must increase the operation timeouts or change something in the cluster
> configuration.
> As for me, I use a configuration with one DRBD per virtual machine drive,
> moderate timeouts and 802.3ad bonding, without problems.
>

Hello,

Thank you for your answer. I have two DRBD devices, /dev/drbd1 and
/dev/drbd2. I use them as PVs for LVM, with a single Volume Group hosting
all the VMs.

So should I have as many DRBD devices as VMs and get rid of LVM altogether?

PS. If it is not a secret, what timeouts do you recommend?

Thank you!

-- 
Michał Margula, alchemyx at uznam.net.pl, http://alchemyx.uznam.net.pl/
"In life, only the moments are beautiful" [Ryszard Riedel]