[Pacemaker] DRBD monitor time out in high I/O situations
Sebastian Kaps
sebastian.kaps at imail.de
Sat Jul 16 15:55:33 CET 2011
Hi!
On 12.07.2011, at 12:05, Lars Marowsky-Bree wrote:
>> [unexplained, sporadic monitor timeouts]
> drbd's monitor operation is not that heavy-weight; I can't immediately
> see why the IO load on the file system it hosts should affect it so badly.
Contrary to my first assumption, the problem does not seem to be triggered
primarily by high I/O. We've witnessed some STONITH shoot-outs over the
last few days while the active node was mostly idle, and we've had
high-I/O situations that did not show any unexpected behavior.
I noticed that after rebooting a machine, the status of our second
Corosync ring is always reported as "FAULTY" by corosync-cfgtool, whereas
the first ring is always reported as working. Since the first ring is a
direct connection between both nodes and the second one runs on a bonded
interface using two redundant cables and different switches, I thought
this might be caused by the bonding driver being configured later in the
boot process. I could always issue a "corosync-cfgtool -r" manually after
booting, and both rings' state then switched to "no faults" and stayed
that way until the next reboot.
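For reference, this is roughly the sequence I run after a reboot; the
addresses are placeholders and the output is abridged and from memory:

  # corosync-cfgtool -s
  Printing ring status.
  Local node ID 1
  RING ID 0
          id      = 192.168.100.1
          status  = ring 0 active with no faults
  RING ID 1
          id      = 10.0.1.1
          status  = Marking ringid 1 interface 10.0.1.1 FAULTY
  # corosync-cfgtool -r        # re-enable redundant ring state
  # corosync-cfgtool -s
  ...
  RING ID 1
          id      = 10.0.1.1
          status  = ring 1 active with no faults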
Further investigation showed that we have been using identical multicast
port numbers for both rings (different IP addresses, though), which is
apparently not the best idea (I've learned that the multicast port
numbers are supposed to differ by at least 2), and I have corrected this
now. Could this have caused our problem?
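For the archives, the interface sections of our corosync.conf now look
roughly like this (the addresses are placeholders; the point is only that
the mcastport values of the two rings are at least 2 apart, since, as far
as I understand, corosync also uses mcastport - 1 for sending):

  totem {
          ...
          interface {
                  ringnumber: 0
                  bindnetaddr: 192.168.100.0
                  mcastaddr: 226.94.1.1
                  mcastport: 5405
          }
          interface {
                  ringnumber: 1
                  bindnetaddr: 10.0.1.0
                  mcastaddr: 226.94.1.2
                  mcastport: 5407    # at least 2 apart from ring 0
          }
  }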
Is there a way to change the port number for the second ring in a running
cluster, or does it require a complete restart of corosync on all (2)
nodes?
If the second ring is marked faulty (which is the state I currently left
it in), will that prevent corosync from using that ring, or will it
eventually re-enable that ring?
It's probably safer to run everything over a single, working, direct
connection for a while than over a faulty redundant ring pair.
Other changes we've tried so far that did not solve the issue (rough
sketches of the corresponding settings below):
- increasing the number of threads used for message en-/decryption from 2 to 16
- disabling time stamps for cluster messages
- increasing various monitor timeouts/intervals
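In corosync.conf terms, the first two items look roughly like this (we
run secauth, which is why the thread count matters at all; the values
shown are the ones we are currently using):

  totem {
          secauth: on
          threads: 16            # was 2; worker threads for message en-/decryption
          ...
  }

  logging {
          timestamp: off         # was on; no time stamps on cluster messages
          ...
  }

The monitor timeouts/intervals were raised in the Pacemaker resource
definitions; purely as an illustration (resource name and values are made
up), via the crm shell something along these lines:

  primitive p_drbd_r0 ocf:linbit:drbd \
          params drbd_resource="r0" \
          op monitor interval="29s" role="Master" timeout="60s" \
          op monitor interval="31s" role="Slave" timeout="60s"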
Thanks again for helping!
BTW: does anyone know if there's a pre-configured $Linux (whatever
flavour) virtual machine image for Pacemaker that could be used to
quickly set up a virtual cluster test environment with two or three
nodes?
--
Sebastian