[Pacemaker] DRBD monitor time out in high I/O situations
Sebastian Kaps
sebastian.kaps at imail.de
Sat Jul 16 14:55:33 UTC 2011
Hi!
On 12.07.2011, at 12:05, Lars Marowsky-Bree wrote:
>> [unexplained, sporadic monitor timeouts]
> drbd's monitor operation is not that heavy-weight; I can't immediately
> see why the IO load on the file system it hosts should affect it so
> badly.
Contrary to my first assumption, the problem does not seem to be
primarily triggered by high I/O. We've witnessed some STONITH shoot-outs
in the last few days while the active node was mainly idle, and we've had
situations with high I/O that did not show any unexpected behavior.
I noticed that after rebooting a machine, the status of our second
Corosync ring is always displayed as "FAULTY" by corosync-cfgtool,
whereas the first ring is always reported as working. Since the first
ring is a direct connection between both nodes and the second one runs
over a bonded interface utilizing two redundant cables and different
switches, I thought this might be caused by the bonding driver being
configured later in the boot process. I could always issue a
"corosync-cfgtool -r" manually after booting, and both rings' state
switched to "no faults" and stayed that way until the next reboot.
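In case it helps anyone else, the manual workaround after each reboot is
simply the following (exact output wording may differ between corosync
versions, so treat it as an illustration):

    # show the status of both rings on the local node
    corosync-cfgtool -s
    # clear the FAULTY flag and re-enable the redundant ring
    corosync-cfgtool -r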
Further investigation showed that we have been using identical multicast
port numbers for both rings (different IP addresses, though), which might
not be the best idea: I've learned that the multicast port numbers are
supposed to differ by at least 2, and I have corrected this now.
Could this have caused our problem?
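For reference, the totem section now looks roughly like this (addresses
and ports are sanitized placeholders; the point is that the two mcastport
values are at least 2 apart, because corosync also uses mcastport - 1 for
each ring):

    totem {
        version: 2
        rrp_mode: passive   # or active; whichever redundant-ring mode you run
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.10.0   # direct back-to-back link
            mcastaddr: 239.255.10.1
            mcastport: 5405             # ring 0 uses ports 5405/5404
        }
        interface {
            ringnumber: 1
            bindnetaddr: 192.168.20.0   # bonded interface via the switches
            mcastaddr: 239.255.20.1
            mcastport: 5407             # ring 1 uses 5407/5406 -- no overlap
        }
    }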
Is there a way to change the port number for the second ring in a
running cluster, or does it require a complete restart of corosync on
all (2) nodes?
If the second ring is marked faulty (which is the state I currently left
it in), will that prevent corosync from using that ring, or will it
eventually re-enable that ring?
It's probably safer to run everything over a single, working, direct
connection for a while than over a faulty redundant ring-pair.
Other changes we've tried so far that didn't solve the issue (a sketch of
where the first two live in corosync.conf follows below):
- increasing the number of threads used for message encryption/decryption
  from 2 to 16
- disabling time stamps for cluster messages
- increasing various monitor timeouts/intervals
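For completeness, the first two items correspond to something like this
in corosync.conf (values are the ones we used; I'm assuming the
"time stamps" change maps to the logging timestamp directive):

    totem {
        # ... existing totem options ...
        threads: 16       # was 2; worker threads for encrypting/sending messages
    }

    logging {
        # ... existing logging options ...
        timestamp: off    # don't prepend timestamps to log messages
    }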
Thanks again for helping!
BTW: does anyone know if there's a pre-configured $Linux (whatever
flavour) virtual machine image for Pacemaker that could be used to
quickly set up a virtual cluster test environment with two or three
nodes?
--
Sebastian