[Pacemaker] Two-Nodes Cluster fencing : Best Practices

Bruno MACADRÉ bruno.macadre at univ-rouen.fr
Fri Jul 26 05:03:13 EDT 2013


Thanks for your answer; it confirms all of my doubts.

I tried this yesterday and both nodes were powered off instantly (which 
caused some trouble on reboot, because the filesystems were unmounted 
uncleanly). Is there a way to do a clean shutdown instead of a power-off?

After some reflection, I've decided to add a third node (a simple 
workstation) as an arbiter, with only fencing primitives on it. Is that a 
good idea? Is this solution reliable?

Regards,
Bruno

On 25/07/2013 16:53, Digimer wrote:
> With two-node clusters, quorum can't be used. This is fine *if* you 
> have good fencing. If the nodes partition (ie: network failure), both 
> will try to fence the other. In theory, the faster node will power off 
> the other node before the slower node can kill the faster node. In 
> practice, this isn't always the case.
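To make the quorum point concrete, here is a minimal sketch of simple-majority arithmetic. It assumes one vote per node and a strict majority rule (floor(n/2) + 1); the function names are made up for illustration, and this is not Pacemaker or Corosync code.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest strict majority, assuming one vote per node."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if a partition seeing `reachable_nodes` members (itself included) holds a majority."""
    return reachable_nodes >= votes_needed(total_nodes)


# Two nodes, partitioned: each side sees only itself, so neither has quorum.
print(has_quorum(2, 1))  # False
# Three nodes (for example with an arbiter): a side with two members keeps quorum.
print(has_quorum(3, 2))  # True
```

Under this rule a two-node cluster loses quorum the moment the nodes stop seeing each other, which is why such a setup has to lean on fencing instead.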
>
> IPMI (and iDRAC, etc) are independent devices. So it is possible for 
> both nodes to initiate a power-down on the other before either dies. 
> To avoid this, you will want to set a delay for the primary/active 
> node's fence primitive.
>
> Say "node1" is your active node and "node2" is your backup. You would 
> set a delay of, say, 15 seconds against "node1". Now if there is a 
> partition, node1 would look up how to fence node2 and immediately 
> initiate power off. Node 2, however, would look up how to fence node1, 
> see a 15 second delay, and start a timer before calling the power-off. 
> Of course, node2 will die before the timer expires.
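The race described above can be sketched as a toy model. This is only an illustration of the timing argument, not how Pacemaker is implemented: `fence_race`, `delay_against` and `power_off_latency` are hypothetical names, and the 4 second latency is an assumed figure.

```python
import math


def fence_race(delay_against=None, power_off_latency=4.0):
    """Return the nodes still powered on after both try to fence each other.

    delay_against maps a node name to the delay (seconds) set on the fence
    primitive that targets it. power_off_latency is an assumed time between
    issuing a power-off and the target actually dying.
    """
    delay_against = delay_against or {}
    nodes = ("node1", "node2")
    peer = {"node1": "node2", "node2": "node1"}
    # The power-off aimed at a node is issued once that node's delay expires.
    issued_at = {n: float(delay_against.get(n, 0.0)) for n in nodes}
    dies_at = {n: math.inf for n in nodes}

    # Handle the earlier power-off first.
    for target in sorted(nodes, key=issued_at.get):
        attacker = peer[target]
        # The management controller is independent of the host, so an issued
        # command completes even if the attacker dies afterwards. It is only
        # skipped if the attacker is already dead when its timer expires.
        if dies_at[attacker] > issued_at[target]:
            dies_at[target] = issued_at[target] + power_off_latency

    return [n for n in nodes if dies_at[n] == math.inf]


print(fence_race())                # [] : both nodes power each other off
print(fence_race({"node1": 15}))   # ['node1'] : node2 dies before its timer expires
```

In this model the delay only helps if it is longer than the time a power-off takes: `fence_race({"node1": 5}, power_off_latency=20)` again leaves no survivor.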
>
> You can also disable acpid on the nodes. With that disabled, 
> "pressing the power button" will result in a near-instant off. If you 
> do this, reducing your delay to 5 seconds would probably be plenty.
>
> There is another issue to be aware of: "fence loops". The problem with 
> two-node clusters that do not use quorum is that a single node can 
> fence the other. So let's continue our example above...
>
> Node 2 will eventually reboot. If you have pacemaker set to start on 
> boot, it will start, wait to connect to node1 (which it can't because 
> the network failure remains), call a fence to put node1 into a known 
> state, pause for 15 seconds and then initiate a power off. Node 1 dies 
> and the services recover on Node 2. Now node1 boots back up, starts 
> its pacemaker... an endless loop of fence -> recover until the network 
> is fixed.
>
> To avoid this, simply do not start pacemaker on boot.
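The fence loop, and the effect of not starting pacemaker on boot, can be sketched with a small simulation. It is a toy model under stated assumptions: a fenced node reboots rather than staying off, the network stays broken for a fixed number of reboots, and `simulate_partition` is a hypothetical name.

```python
def simulate_partition(start_on_boot: bool, boots_while_partitioned: int = 3):
    """Return the list of (fencer, victim) events during one long partition.

    Assumes a fenced node reboots rather than staying off, and that the
    network stays broken for `boots_while_partitioned` such reboots.
    """
    events = [("node1", "node2")]  # node1 wins the initial race
    survivor, rebooting = "node1", "node2"
    for _ in range(boots_while_partitioned):
        if not start_on_boot:
            break  # the rebooted node stays out of the cluster until started by hand
        # The rebooted node starts pacemaker, cannot reach its peer, and fences it.
        events.append((rebooting, survivor))
        survivor, rebooting = rebooting, survivor
    return events


print(len(simulate_partition(start_on_boot=True)))   # 4
print(len(simulate_partition(start_on_boot=False)))  # 1
```

With pacemaker started on boot, every reboot during the partition adds another fence event; without it, the count stops at the first one.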
>
> As to the specifics, you can test fencing configurations easily by 
> directly calling the fence agent at the command line. I do not use 
> DRAC, so I can't speak to specifics. I think you need to set lanplus 
> and possibly define the console prompt to expect.
>
> Using generic IPMI as an example:
>
> fence_ipmilan -a 192.168.100.1 -l ipmiuser -p ipmipwd -o status
> fence_ipmilan -a 192.168.100.2 -l ipmiuser -p ipmipwd -o status
>
> If this returns the power state, then it is simple to convert to a 
> pacemaker config.
>
> configure primitive pStN1 stonith:fence_ipmilan params \
>  ipaddr=192.168.100.1 login=ipmiuser passwd=ipmipwd delay=15 \
>  op monitor interval=60s
> configure primitive pStN2 stonith:fence_ipmilan params \
>  ipaddr=192.168.100.2 login=ipmiuser passwd=ipmipwd \
>  op monitor interval=60s
>
> Again, I *think* you need to set a couple extra options for DRAC. 
> Experiment at the command line before moving to the pacemaker config. 
> Once you have the command line version working, you should be able to 
> set it up in pacemaker. If you have trouble though, share the CLI call 
> and we can help with the pacemaker config.
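For repeated command-line testing, the status call above could be wrapped in a small helper. This is a sketch only: `status_command` and `check_fence_device` are hypothetical names, the address and credentials are the example values from this thread, and the `-P` switch for lanplus is an assumption to confirm against `fence_ipmilan -h` on your system.

```python
import subprocess


def status_command(ipaddr, login, passwd, lanplus=False):
    """Build the fence_ipmilan status call used in the thread."""
    cmd = ["fence_ipmilan", "-a", ipaddr, "-l", login, "-p", passwd, "-o", "status"]
    if lanplus:
        # Assumption: -P is the lanplus switch; confirm with `fence_ipmilan -h`.
        cmd.append("-P")
    return cmd


def check_fence_device(ipaddr, login, passwd, lanplus=False, timeout=30):
    """Run the status call and return (exit code, agent output) for inspection."""
    result = subprocess.run(
        status_command(ipaddr, login, passwd, lanplus),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode, result.stdout.strip()


# Only prints the command line; nothing is executed here.
print(" ".join(status_command("192.168.100.1", "ipmiuser", "ipmipwd")))
```

Calling `check_fence_device(...)` against each node's management address, with and without `lanplus=True`, shows which options the device needs before they are copied into the pacemaker primitive.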
>
> On 25/07/13 05:39, Bruno MACADRÉ wrote:
>> Some corrections to my first mail:
>>
>> After some research, I found that external/ipmi isn't available on my
>> system, so I must use fence-agents.
>>
>> My second question should be modified to reflect this change, like this:
>>
>>      configure primitive pStN1 stonith:fence_ipmilan params \
>>        ipaddr=192.168.100.1 login=ipmiuser passwd=ipmipwd
>>      configure primitive pStN2 stonith:fence_ipmilan params \
>>        ipaddr=192.168.100.2 login=ipmiuser passwd=ipmipwd
>>
>> Regards,
>> Bruno
>>
>> On 25/07/2013 10:39, Bruno MACADRÉ wrote:
>>> Hi,
>>>
>>>     I've just built a two-node Active/Passive cluster to provide an
>>> iSCSI failover SAN.
>>>
>>>     Some details about my configuration :
>>>
>>>         - I have two nodes with 2 bonds: 1 for DRBD replication and 1
>>> for communication
>>>         - The iSCSI Target, iSCSI LUN and VirtualIP are constrained
>>> together to start on the DRBD Master node
>>>
>>>     Everything works fine, but now I need to configure fencing. I have
>>> 2 Dell PowerEdge servers with iDRAC6.
>>>
>>>     First question: is 'external/drac5' compatible with iDRAC6? (I've
>>> read everything I could find and found nothing about this...)
>>>
>>>     Second question: is this configuration sufficient (with IPMI)?
>>>
>>>         configure primitive pStN1 stonith:external/ipmi hostname=node1 \
>>>           ipaddr=192.168.100.1 userid=ipmiuser passwd=ipmipwd interface=lan
>>>         configure primitive pStN2 stonith:external/ipmi hostname=node2 \
>>>           ipaddr=192.168.100.2 userid=ipmiuser passwd=ipmipwd interface=lan
>>>         location lStN1 pStN1 inf: node1
>>>         location lStN2 pStN2 inf: node2
>>>
>>>         And finally:
>>>         configure property stonith-enabled=true
>>>         configure property stonith-action=poweroff
>>>
>>>     Third (and last) question: what about quorum? At the moment I have
>>> 'no-quorum-policy="ignore"', but that's a risk, isn't it?
>>>
>>>     Don't hesitate to ask me for more information if needed.
>>>
>>>     Regards,
>>>     Bruno.
>>>
>>
>
>





