[Pacemaker] STONITH Deathmatch Explained
Tim Serong
tim at wirejunkie.com
Mon Aug 10 10:41:42 EDT 2009
I wrote:
>>> I've written up a brief document entitled "STONITH Deathmatch Explained
>>> (and Some Hints for Resource Agent Authors and Systems Engineers)":
>>>
>>> http://ourobengr.com/ha
>>>
>>> ...
Then Dejan Muhamedagic wrote:
>> ...
>>
>> - in "Causes ..." you forgot to mention split-brain (no
>> communication channels working) and, at the same time, to
>> stress how important it is to have redundant communications :)
>>
>> - even though you mention that in the title, I'd still move the
>> resource agent intricacies into another document; they are all
>> very valuable advice, but of no concern to cluster
>> administrators; it's also good to keep the focus on our little
>> problem; then you'll have to find other "Things You Didn't
>> Think Of" (or just keep the title and leave the section empty:
>> it is important; or insert another illustration)
>>
>> - devote more space/thought to the part on how to avoid a
>> "deathmatch"; there's only a mention of chkconfig within
>> "Debugging ..." (or one can also use the "poweroff" fencing
>> operation); also, note that this occurs only in cases where a
>> reboot doesn't fix the problem (e.g. split-brain)
And Joe Armstrong wrote:
> ...You might want to also add a possibility
> to avoid the situation. Don't allow heartbeat to be started by
> the RC scripts. Once a machine has been STONITH'd you can consider
> that it is untrustworthy until the admin inspects the reason for
> the failure and manually allows the node back into the cluster.
> This same thinking is why I hate auto-failback...
For the record, I've made a couple of minor updates based on the above:
- Split-brain is now listed as a cause of STONITH.
- There's now a small section "Avoiding STONITH Deathmatch", which
mentions ensuring redundant communication paths, not starting the
cluster at boot time, and trying stonith-action=poweroff.
- There's a mention of the document still being applicable if you're
using OpenAIS instead of Heartbeat.
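
To illustrate why not starting the cluster at boot and
stonith-action=poweroff both break the cycle, here is a toy Python model
of two nodes that cannot see each other (split-brain). It is only a
sketch: the function name, the node names "a" and "b", the assumption
that "a" wins the first fencing race, and the max_rounds cap (standing in
for "forever") are all made up for illustration and are not taken from
Pacemaker or Heartbeat.

```python
def simulate(stonith_action="reboot", start_at_boot=True, max_rounds=6):
    """Toy model of a two-node cluster with no working communication.

    Returns the fencing events as a list of (shooter, victim) tuples.
    """
    events = []
    # Assumption: node "a" wins the first fencing race.
    shooter, victim = "a", "b"
    for _ in range(max_rounds):
        events.append((shooter, victim))
        if stonith_action == "poweroff":
            # The victim stays down, so nobody is left to shoot back.
            break
        if not start_at_boot:
            # The victim reboots, but the cluster stack stays stopped
            # until an admin starts it by hand.
            break
        # The victim reboots, starts the cluster stack, still cannot
        # see its peer, and so fences it in turn.
        shooter, victim = victim, shooter
    return events


if __name__ == "__main__":
    print("reboot, start at boot:   ", simulate())
    print("poweroff:                ", simulate(stonith_action="poweroff"))
    print("reboot, no start at boot:", simulate(start_at_boot=False))
```

In this model only the first case keeps trading shots until the round
cap; the other two stop after a single fencing event. Redundant
communication paths attack the problem from the other side, by making
the split-brain itself less likely.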
I haven't moved RA specifics into another document yet. I have a nasty
feeling this might result in something larger that rattles on about the
importance of ensuring correct semantics for all operations (e.g.: the
"start" op shouldn't return success if the resource isn't really, truly,
actually, completely started yet, or you can wind up in one of those
wacky start[ok]->monitor[fail]->stop->start[ok]->monitor[fail]->stop
cycles).
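
As a rough illustration of the "start" semantics described above, here
is a minimal Python sketch in which start only reports success once
monitor agrees the resource is up. Real resource agents are usually
shell scripts; SlowService, its launch() and is_ready() methods, and the
timeout and interval parameters are hypothetical stand-ins. Only the
numeric return codes follow the OCF convention.

```python
import time

# OCF resource agent return codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


class SlowService:
    """Hypothetical daemon that needs a few polls before it is ready."""

    def __init__(self, polls_until_ready):
        self._needed = polls_until_ready
        self._polls_left = None  # None means "not launched yet"

    def launch(self):
        self._polls_left = self._needed

    def is_ready(self):
        if self._polls_left is None:
            return False
        if self._polls_left > 0:
            self._polls_left -= 1
            return False
        return True


def monitor(service):
    """Report whether the resource is really up."""
    return OCF_SUCCESS if service.is_ready() else OCF_NOT_RUNNING


def start(service, timeout=5.0, interval=0.01):
    """Launch the service, then wait until monitor confirms it is up."""
    service.launch()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if monitor(service) == OCF_SUCCESS:
            return OCF_SUCCESS
        time.sleep(interval)
    # Never claim success for a resource that did not come up in time.
    return OCF_ERR_GENERIC


if __name__ == "__main__":
    svc = SlowService(polls_until_ready=3)
    print("start ->", start(svc), "monitor ->", monitor(svc))
```

Because start reuses monitor as its readiness check, a monitor run
straight after a successful start cannot disagree with it, which is what
keeps the resource out of the start/monitor/stop cycle.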
Tim
--
tim at wirejunkie.com
http://www.wirejunkie.com/