[Pacemaker] sbd fencing race

emmanuel segura emi2fast at gmail.com
Wed Nov 26 09:44:49 EST 2014


I think Pacemaker doesn't care about the sbd resource status when it
needs to make a fencing call. That's what I think, but I hope someone
will give me some more information.

Thanks


2014-11-26 15:11 GMT+01:00 Dejan Muhamedagic <dejanmm at fastmail.fm>:
> On Wed, Nov 26, 2014 at 11:13:41AM +0100, emmanuel segura wrote:
>> But I would like to know if Pacemaker needs to start sbd on the node
>> where the sbd resource isn't running in order to fence the other
>> nodes, because I don't see any start action on the second node:
>
> That's strange. I'd expect that a stonith resource needs to be
> started (enabled) first. Perhaps that changed, as it seems to be
> the case judging by the logs below. I cannot offer any more
> advice here, but would still like to know the circumstances and
> how it happened that the nodes shot each other.
>
> Thanks,
>
> Dejan
>
>
>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>
>> message_2cd.txt:Nov 23 11:43:28 node01 sbd: [69794]: WARN: CIB: We do NOT have quorum!
>> message_2cd.txt:Nov 23 11:43:28 node01 sbd: [69791]: WARN: Pacemaker health check: UNHEALTHY
>> message_2cd.txt:Nov 23 11:43:28 node01 pengine: [69823]: notice: LogActions: Leave   stonith-sbd    (Started node01)
>> message_2ch.txt:Nov 23 11:43:28 s02srv002ch sbd: [97640]: WARN: CIB: We do NOT have quorum!
>>
>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>
>> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [97640]: WARN: CIB: We do NOT have quorum!
>> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [97637]: WARN: Pacemaker health check: UNHEALTHY
>> message_2ch.txt:Nov 23 11:43:28 node02 pengine: [97679]: WARN: custom_action: Action stonith-sbd_stop_0 on node01 is unrunnable (offline)
>> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Delivery process handling /dev/mapper/SBD01B0298700230
>> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Writing reset to node slot node01
>> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Messaging delay: 40
>>
>> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>
>> Thanks
>>
>> 2014-11-26 10:26 GMT+01:00 Dejan Muhamedagic <dejanmm at fastmail.fm>:
>> > Hi,
>> >
>> > On Tue, Nov 25, 2014 at 04:20:32PM +0100, emmanuel segura wrote:
>> >> Hi list,
>> >>
>> >> Last night I had a cluster in a fencing race using sbd as the stonith
>> >
>> > Can you give a bit more detail?
>> >
>> >> device. I would like to know what effect using start-delay in
>> >> my stonith resource would have, configured this way:
>> >>
>> >> primitive stonith-sbd stonith:external/sbd \
>> >>         params sbd_device="/dev/mapper/SBD" \
>> >>         op start interval="0" start-delay="5"
>> >
>> > Yes, that could help with a stonith deathmatch. Normally, you
>> > have a stonith resource running on one node. On split brain, the
>> > other node also starts the resource in order to shoot the first
>> > node. That's where start-delay comes into play.
>> >
>> > Ultimate resource for the issue: http://ourobengr.com/ha/
>> >
>> > Cheers,
>> >
>> > Dejan
>> >
>> >> Thanks
>> >>
>> >> _______________________________________________
>> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >>
>> >> Project Home: http://www.clusterlabs.org
>> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> >> Bugs: http://bugs.clusterlabs.org



-- 
this is my life, and I live it for as long as God wills



