[Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

Digimer lists at alteeve.ca
Fri Jun 28 12:35:10 EDT 2013


On 06/28/2013 11:45 AM, Lars Marowsky-Bree wrote:
> On 2013-06-28T11:20:32, Digimer <lists at alteeve.ca> wrote:
> 
>> Yes, a failed "on" action would then fail the method. This is
>> sub-optimal as FenceAgentAPI says that only the "off" portion of
>> "reboot" needs to succeed. However, I don't consider this a show stopper
>> because "on" action of PDUs simply means "re-energize the outlet". If
>> the node blew up, it won't boot, but the "on" will still succeed, so the
>> overall method would succeed.
> 
> Wait. If the "failed on" would fail this method, the fence will still be
> considered "failed", right? Hence the cluster would block?

Yes, but this is exceedingly unlikely to happen. If the PDU has lost
control of the relay for an outlet, the "off" will fail as well. And in
such a case, the IPMI interface should still be working, because the
other PDU will still be powering the node.

The only realistic way for this to fail is if there are two simultaneous
failures.
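
To make the rule concrete, here is a minimal Python sketch of "reboot"
across two PDUs as I read FenceAgentAPI: every "off" must succeed, and
"on" is best effort. This is not what the current setup does (there, a
failed "on" fails the method), and it is not code from any existing
agent. The function name, the outlet strings and the power_off/power_on
callables are all hypothetical stand-ins for whatever actually talks to
the PDUs.

```python
import sys
from typing import Callable, Iterable

# Hypothetical callable: switches one outlet, returns True on success.
OutletAction = Callable[[str], bool]


def reboot_via_pdus(outlets: Iterable[str],
                    power_off: OutletAction,
                    power_on: OutletAction) -> bool:
    """Return True if the node can be considered fenced."""
    outlets = list(outlets)

    # Phase 1: every outlet must be switched off. If any "off" fails,
    # the other PSU may still be powering the node, so fencing failed.
    for outlet in outlets:
        if not power_off(outlet):
            return False

    # Phase 2: re-energize the outlets. A failure here is only logged;
    # the node is already known to be off, so the fence still counts.
    for outlet in outlets:
        if not power_on(outlet):
            print(f"warning: could not re-energize {outlet}",
                  file=sys.stderr)

    return True
```

Called with something like reboot_via_pdus(["pdu1:3", "pdu2:3"], off, on),
this returns False as soon as one "off" fails, which is the point where
the cluster would move on to the next method (IPMI in my case).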

>> Again though, with all this said, I will be happy with just keeping this
>> existing functionality. It works.
> 
> Sure, given enough thrust, pigs achieve amazing things ;-)
> 
>> In fact, I'll write 'fence_apc_multi' as a proof of concept. Give me the
>> weekend to do this.
> 
> But why? Why not spend that time on fixing it properly, if you're going
> to fix it? It's a horrible hack!
> 
> I'll shut up now.

I am not a C programmer, so I can't work on Pacemaker itself. I can
write fence agents though, and have written a few already. So this is
the only way I can contribute.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



