[Pacemaker] Starting openais one legged, STONITH activates

Digimer linux at alteeve.com
Mon Feb 28 01:18:56 UTC 2011


On 02/27/2011 07:58 PM, David Morton wrote:
> I'm pretty sure the behavior outlined below is by design (and it does
> make sense logically) but I am wondering if there are additional checks
> that can be put in place to change the behavior.
> 
> Situation:
> - Two node cluster with IPMI STONITH configured
> - Both servers running but with openais / pacemaker shut down
> - Start openais on one server only
> - Server that starts executes a STONITH reset of the other node
> 
> I imagine this is due to an indeterminate state / no comms between
> nodes, where the only way to move to a known state is to bounce the
> other node. Is this correct?
> 
> Is there any way to configure alternate means of confirming that the
> openais / pacemaker service is not started, and so avoid a hard reset of
> the 'other' node? e.g. log in via ssh and query the service state, maybe
> even check key resources, etc.?
> 
> Is the preferred method to always run openais / pacemaker on all nodes
> and manipulate rules to determine where resources run? Typically I
> would just shut down openais to force all resources to one node or the
> other, to simplify config creation and testing.

I can't address openais or pacemaker directly, but in corosync/rhcs
(similar foundation) there is an option called 'post_join_delay'. When
set to -1, the node will never fire a fence (stonith) at startup;
instead it will wait forever for the other node to join.
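
As a rough sketch, assuming a cman-based cluster.conf (the cluster name
and config_version below are placeholders, not taken from any real
setup), the option is an attribute of the fence_daemon element:

```xml
<?xml version="1.0"?>
<!-- name and config_version are placeholder values -->
<cluster name="example" config_version="1">
  <!-- post_join_delay="-1": wait indefinitely for peers to join
       rather than fencing them when this node starts -->
  <fence_daemon post_join_delay="-1"/>
</cluster>
```

This only applies to the rhcs stack; it is not a pacemaker setting.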

Perhaps there is a similar option in openais/pacemaker?

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



