[Pacemaker] Problem with dual-PDU fencing node with redundant PSUs

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Jun 27 07:02:42 EDT 2013


Hi,

On Wed, Jun 26, 2013 at 03:52:00PM -0400, Digimer wrote:
> This question appears to be the same issue asked here:
> 
> http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018650.html
> 
> In my case, I have two fence methods per node; IPMI first with
> action="reboot" and, if that fails, two PDUs (one backing each side of
> the node's redundant PSUs).
> 
> Initially I set up the PDUs with action "reboot", figuring that the
> fencing_topology tied them together, so pacemaker would call "pdu1:port1;
> off -> pdu2:port1; off; (verify both are off) -> pdu1:port1; on ->
> pdu2:port1; on".
> 
> This didn't happen though. It called "pdu1:port1; reboot" then
> "pdu2:port1; reboot", so the first PSU in the node had its power back
> before the second PSU lost power, meaning the node never powered off.

I'm not sure if that's supported.
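
To make the timing problem described above concrete, here is a small
sketch (a toy model, not anything Pacemaker itself runs). The function
name node_loses_power and the step lists are made up for illustration.
It assumes each "reboot" call finishes its off/on cycle on one PDU
before the next PDU is touched, which is what the report above
describes.

```python
def node_loses_power(steps, pdus=("pdu1", "pdu2")):
    """Return True if, at some point, every PDU outlet feeding the node is off.

    steps is a list of (pdu, action) pairs, with action being one of
    "off", "on" or "reboot". This is a hypothetical model for illustration.
    """
    powered = {pdu: True for pdu in pdus}
    for pdu, action in steps:
        if action == "reboot":
            # Assumption: a single reboot call turns the outlet off and
            # back on before the next device is called.
            transitions = [False, True]
        elif action == "off":
            transitions = [False]
        elif action == "on":
            transitions = [True]
        else:
            raise ValueError("unknown action: %s" % action)
        for state in transitions:
            powered[pdu] = state
            if not any(powered.values()):
                # Both PSUs are without power, so the node goes down.
                return True
    return False


# What was observed: one reboot per PDU, one after the other.
observed = [("pdu1", "reboot"), ("pdu2", "reboot")]

# What was hoped for: both outlets off first, then both on again.
expected = [("pdu1", "off"), ("pdu2", "off"), ("pdu1", "on"), ("pdu2", "on")]

print(node_loses_power(observed))
print(node_loses_power(expected))
```

With the observed sequence one PSU always has power, so the model
never reports the node as down; only the off/off/on/on ordering does.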

> So next I tried:
> 
> pdu1:port1; off -> pdu2:port1; off -> pdu1:port1; on -> pdu2:port1; on
> 
> However, this seems to have actually done:
> 
> pdu1:port1; reboot -> pdu2:port1; reboot -> pdu1:port1; reboot ->
> pdu1:port1; reboot
> 
> So again, the node never lost power to both PSUs at the same time, so
> the node didn't power off.
> 
> This makes PDU fencing unreliable. I know beekhof said:
> 
>   "My point would be that action=off is not the correct way to configure
> what you're trying to do."
> 
> in the other thread, but he did not elaborate on what *is* the right
> way. So if neither approach works, what is the proper way to configure
> PDU fencing when you have two different PDUs, one backing each PSU?

The fence action needs to be defined in the cluster properties
(crm_config/cluster_property_set in XML):

# crm configure property stonith-action=off

See the output of:

$ crm ra info pengine

for the PE (policy engine) metadata and an explanation of the properties.
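
For reference, a rough sketch of what such a property looks like as
an nvpair inside a cluster_property_set. The set id
"cib-bootstrap-options" and the nvpair id derived from it are
assumptions for illustration (your CIB may use other ids), and the
helper name stonith_action_fragment is hypothetical. It only builds
and prints the XML; it does not touch a cluster.

```python
import xml.etree.ElementTree as ET


def stonith_action_fragment(value="off", set_id="cib-bootstrap-options"):
    """Build a cluster_property_set element holding stonith-action.

    The ids used here are assumptions for illustration only.
    """
    props = ET.Element("cluster_property_set", {"id": set_id})
    ET.SubElement(
        props,
        "nvpair",
        {
            "id": set_id + "-stonith-action",
            "name": "stonith-action",
            "value": value,
        },
    )
    return props


fragment = stonith_action_fragment()
# Print the fragment as text so it can be compared with the live CIB.
print(ET.tostring(fragment, encoding="unicode"))
```

Comparing the printed fragment with the crm_config section of your
own CIB is a quick way to confirm that the property was set.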

>   I don't want to disable "reboot" globally because I still want the
> IPMI based fencing to do action="reboot".

I don't think it is possible to define a per-resource fencing
action.

> If I just do "off", then the
> node will not power back on after a successful fence. This is better
> than nothing, but still quite sub-optimal.

Yes, that is sub-optimal if you want the cluster stack to start
automatically once the node comes back up. Still, I think it is
preferable to let a human check why the node got fenced before
letting it rejoin the cluster. In that case, one just needs to boot
the host manually.

Thanks,

Dejan

> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
> 