[ClusterLabs] Fencing with a 3-node (1 for quorum only) cluster
Dan Swartzendruber
dswartz at druber.com
Sun Aug 7 00:22:06 UTC 2016
On 2016-08-06 19:46, Digimer wrote:
> On 06/08/16 07:33 PM, Dan Swartzendruber wrote:
>>
>> Okay, I almost have this all working. fence_ipmilan for the supermicro
>> host. Had to specify lanplus for it to work. fence_drac5 for the R905.
>> That was failing to complete due to timeout. Found a couple of helpful
>> posts that recommended increasing the retry count to 3 and the timeout
>> to 60. That worked also. The only problem now is that it takes well
>> over a minute to complete the fencing operation. In that interim, the
>> fenced host shows as UNCLEAN (offline), and because the fencing
>> operation hasn't completed, the other node has to wait to import the
>> pool and share out the filesystem. This causes the vSphere hosts to
>> declare the NFS datastore down. I hadn't gotten exact timing, but I
>> think the fencing operation took a little over a minute. I'm wondering
>> if I could change the timeout to a smaller value, but increase the
>> retries? Like back to the default 20 second timeout, but change
>> retries from 1 to 5?
>
> Did you try the fence_ipmilan against the DRAC? It *should* work. Would
> be interesting to see if it had the same issue. Can you check the
> DRAC's host's power state using ipmitool directly without delay?
Yes, I did try fence_ipmilan, but it timed out waiting for power off
(or whatever). I have to admit, I switched to fence_drac and had the
same issue, but after increasing the timeout and retries, I got it to
work, so it is possible that fence_ipmilan is okay too. They both seemed
to take more than 60 seconds to complete the operation. I have to say
that when I do a power cycle through the DRAC web interface, it takes
a while, so that might be normal. I think I will try again with 20
seconds and 5 retries and see how that goes...
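
For reference, a rough sketch of the arithmetic behind the two settings mentioned above (60 second timeout with 3 retries versus 20 second timeout with 5 retries). The function names are hypothetical and this does not call any fence agent. It assumes that "retries" means the total number of attempts and that each failed attempt runs for its full timeout before the next one starts, which may not match how a given agent actually counts.

```python
# Rough worst-case arithmetic for a fence operation.
# Assumption: "retries" is the total number of attempts, and every failed
# attempt runs for the full timeout before the next one starts.

def worst_case_seconds(timeout_s: float, attempts: int) -> float:
    """Upper bound on the wait if every attempt times out."""
    if timeout_s <= 0 or attempts < 1:
        raise ValueError("timeout must be positive and attempts at least 1")
    return timeout_s * attempts


def single_attempt_fits(operation_s: float, timeout_s: float) -> bool:
    """True if one attempt can finish before its timeout expires."""
    return operation_s <= timeout_s


if __name__ == "__main__":
    # (timeout in seconds, attempts) for the two settings discussed above.
    settings = {"60s x 3": (60, 3), "20s x 5": (20, 5)}
    for label, (timeout_s, attempts) in settings.items():
        print(label, "worst case:", worst_case_seconds(timeout_s, attempts), "s")
```

Under those assumptions the 20 second setting has the smaller worst case, but if a power cycle on the DRAC really needs more than 60 seconds, a single 20 second attempt cannot finish in time. Whether a later attempt can pick up a power-off that is already in progress is something I have not verified.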