[Pacemaker] [Linux-HA] new doc about stonith/fencing
Ryan Steele
ryans at aweber.com
Fri May 29 18:31:37 UTC 2009
Jan Kalcic wrote:
> Really interesting. I would have appreciated some more examples (they are
> always welcome) but still very interesting.
>
> Thanks,
> Jan
>
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> Trying to make it a bit less mysterious, I wrote something about
>> fencing and stonith quite a while ago and then forgot to share
>> the link. Sorry about that.
>>
>> Here it is:
>>
>> http://www.clusterlabs.org/mediawiki/images/f/f2/Crm_fencing.pdf
>>
>> As usual, constructive criticism/suggestions/etc are welcome.
>> I won't be able to read your impressions for the next two weeks,
>> but will surely look forward to seeing them afterwards.
>>
>> Cheers,
>>
>> Dejan
I found this informative as well, Dejan - thanks for taking the time to write it. However, I agree with Jan that some
examples using recommended, non-testing STONITH devices would be great, since SSH, null, and other network-based test
devices are apparently frowned upon in production environments (based on comments by Andrew and the article he
referenced here: http://theclusterguy.clusterlabs.org/post/113230399/highly-available-data-corruption). For example, I
have Raritan 30A PDUs in my cabs, but I didn't see anything in the output of 'stonith -L' except an APC switched rack
PDU.
Now, I know that a document like this can't be expected to cover every type of STONITH device in existence, but some
instructions on writing custom STONITH plugins might be useful, so that folks can write them for their particular
STONITH device (PDU, IPMI card, or what have you) and contribute them back to the community, which will in turn help
others. I've looked at both the clusterlabs.org and linux-ha.org sites but didn't see any documentation on rolling your
own at either, and the Novell docs on this topic were GUI-centric, which unfortunately isn't as helpful to those of us
sticking with the CLI.
The other thing that might be helpful is to know what the goal is in terms of recovering from a STONITH action. If
STONITH powers off a node at the PDU outlet because it has lost networking, and networking is subsequently restored,
how are we to get the node back in action?
Thanks and Regards,
Ryan