[Pacemaker] Pacemaker very often STONITHs other node
Michał Margula
alchemyx at uznam.net.pl
Mon Nov 25 17:43:19 UTC 2013
On 25.11.2013 18:25, Digimer wrote:
> I'd like to see the full logs, starting from a little before the issue
> started.
>
Here are the logs from Nov 17 through Nov 24 (they are too large for my
pastebin):
Node A - https://www.dropbox.com/sh/dj08fbckj9zo104/Ew1QpdRq9A/A.log
Node B - https://www.dropbox.com/sh/dj08fbckj9zo104/p9ldlBkGkG/B.log
> It looks though like, for whatever reason, a stop was called, failed, so
> the node was fenced. This would mean that congestion, as you suggested,
> is not the likely cause.
>
> Out of curiosity though; what bonding mode are you using? My testing
> showed that only mode=1 was reliable. Since I tested, corosync added
> support for mode=0 and mode=2, but I've not re-tested them. When I was
> doing my bonding tests, I found all other modes to break communications
> in some manner of use or failure/recovery testing.
>
>
I use 802.3ad mode (so it is mode 4):
auto bond0
iface bond0 inet static
    slaves eth4 eth5
    bond-mode 802.3ad
    bond-lacp_rate fast
    bond-miimon 100
    bond-downdelay 200
    bond-updelay 200
    address 10.0.0.1
    netmask 255.255.255.0
    broadcast 10.0.0.255
Do you think that could be the reason, i.e. that the wrong bonding mode
is causing communication issues?
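For comparison, this is roughly what the same stanza would look like in
active-backup (mode 1), the mode you found reliable. It is only an
untested sketch: "bond-primary eth4" is my assumption (either slave
could be the primary), and bond-lacp_rate is dropped because it only
applies to 802.3ad. The switch ports would presumably also need to be
taken out of their LACP group.

auto bond0
iface bond0 inet static
    slaves eth4 eth5
    bond-mode active-backup
    bond-primary eth4
    bond-miimon 100
    bond-downdelay 200
    bond-updelay 200
    address 10.0.0.1
    netmask 255.255.255.0
    broadcast 10.0.0.255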
Thank you once more!
--
Michał Margula, alchemyx at uznam.net.pl, http://alchemyx.uznam.net.pl/
"In life, only the moments are beautiful" [Ryszard Riedel]