[Pacemaker] Problem with dual-PDU fencing node with redundant PSUs
Dejan Muhamedagic
dejanmm at fastmail.fm
Thu Jun 27 14:52:02 UTC 2013
On Thu, Jun 27, 2013 at 09:54:13AM -0400, Digimer wrote:
> On 06/27/2013 07:02 AM, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Wed, Jun 26, 2013 at 03:52:00PM -0400, Digimer wrote:
> >> This question appears to be the same issue asked here:
> >>
> >> http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018650.html
> >>
> >> In my case, I have two fence methods per node; IPMI first with
> >> action="reboot" and, if that fails, two PDUs (one backing each side of
> >> the node's redundant PSUs).
> >>
> >> Initially I set up the PDUs with action "reboot", figuring that the
> >> fence_topology tied them together, so pacemaker would call "pdu1:port1;
> >> off -> pdu2:port1; off; (verify both are off) -> pdu1:port1; on ->
> >> pdu2:port1; on".
> >>
> >> This didn't happen though. It called "pdu1:port1; reboot" then
> >> "pdu2:port1; reboot", so the first PSU in the node had its power back
> >> before the second PSU lost power, meaning the node never powered off.
> >
> > I'm not sure if that's supported.
>
> Unless I misunderstood, beekhof indicated that it is/should be.

I'm pretty sure that it's not, but perhaps things changed in the
meantime. At least it wasn't when we discussed the
implementation.

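
To make the failure mode described above concrete, here is a toy
model in plain bash. It is only a sketch: it assumes the node stays up
as long as at least one outlet is powered, and it models "reboot" as
an off immediately followed by an on of the same outlet. The function
name and the "pdu:action" step format are made up for this
illustration; nothing here talks to a real PDU.

```shell
#!/bin/bash
# Toy model of a node with two PSUs, one outlet on each PDU.
# Prints "yes" if both outlets were ever off at the same time
# (i.e. the node really lost power), otherwise "no".
simulate() {
    local pdu1=on pdu2=on node_died=no
    local step pdu action states state
    for step in "$@"; do
        pdu=${step%%:*}
        action=${step##*:}
        # A reboot is an "off" followed directly by an "on".
        if [ "$action" = reboot ]; then
            states="off on"
        else
            states="$action"
        fi
        for state in $states; do
            case "$pdu" in
                pdu1) pdu1=$state ;;
                pdu2) pdu2=$state ;;
            esac
            if [ "$pdu1" = off ] && [ "$pdu2" = off ]; then
                node_died=yes
            fi
        done
    done
    echo "$node_died"
}

# Sequential reboots: one PSU always has power.
simulate pdu1:reboot pdu2:reboot
# Both off first, then both on: the node loses power.
simulate pdu1:off pdu2:off pdu1:on pdu2:on
```

Under these assumptions the first call prints "no" and the second
prints "yes", which matches what was observed: two back-to-back
reboots never take both PSUs down together.
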
> >> So next I tried;
> >>
> >> pdu1:port1; off -> pdu2:port1; off -> pdu1:port1; on -> pdu2:port1; on
> >>
> >> However, this seemed to have actually done;
> >>
> >> pdu1:port1; reboot -> pdu2:port1; reboot -> pdu1:port1; reboot ->
> >> pdu2:port1; reboot
> >>
> >> So again, the node never lost power to both PSUs at the same time, so
> >> the node didn't power off.
> >>
> >> This makes PDU fencing unreliable. I know beekhof said:
> >>
> >> "My point would be that action=off is not the correct way to configure
> >> what you're trying to do."
> >>
> >> in the other thread, but there was no elaborating on what *is* the right
> >> way. So if neither approach works, what is the proper way to configure
> >> PDU fencing when you have two different PDUs backing either PSU?
> >
> > The fence action needs to be defined in the cluster properties
> > (crm_config/cluster_property_set in XML):
> >
> > # crm configure property stonith-action=off
> >
> > See the output of:
> >
> > $ crm ra info pengine
> >
> > for the PE metadata and explanation of properties.
>
> On IRC last night, beekhof mentioned that action="..." is ignored and
> replaced. However, it would appear that pcmk_reboot_action="..." should
> force the issue. I'm planning to test this today.

Yes, true, though it's a bit of a kludge
(pcmk_reboot_action="off" if I got that right).

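
For illustration, the kludge could look roughly like the crm shell
configuration printed by the script below. This is an untested sketch
and several things in it are assumptions that do not come from this
thread: that the PDUs are APC units driven by fence_apc_snmp, the PDU
host names, the resource names, and an already defined IPMI device
called fence_<node>_ipmi. The script only prints text; it does not
change the cluster.

```shell
#!/bin/bash
# Prints a crm shell configuration sketch; it does not touch the cluster.
# Assumptions (hypothetical): fence_apc_snmp PDUs, pduN.example.com,
# and an existing IPMI stonith resource named fence_<node>_ipmi.
pdu_fence_config() {
    local node=$1 port=$2 n
    for n in 1 2; do
        # pcmk_reboot_action="off" maps a requested reboot to "off"
        # for this device only.
        echo "primitive fence_${node}_pdu${n} stonith:fence_apc_snmp" \
             "params ipaddr=\"pdu${n}.example.com\" port=\"${port}\"" \
             "pcmk_host_list=\"${node}\" pcmk_reboot_action=\"off\""
    done
    # Level 1: IPMI alone. Level 2: both PDU outlets together.
    echo "fencing_topology ${node}: fence_${node}_ipmi" \
         "fence_${node}_pdu1,fence_${node}_pdu2"
}

pdu_fence_config node1 1
```

The output can be reviewed and then pasted into "crm configure edit".
Note that with this sketch a fence through the PDU level leaves the
node powered off; IPMI at the first level is unaffected and still
reboots.
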
> >> I don't want to disable "reboot" globally because I still want the
> >> IPMI based fencing to do action="reboot".
> >
> > I don't think it is possible to define a per-resource fencing
> > action.
> >
> >> If I just do "off", then the
> >> node will not power back on after a successful fence. This is better
> >> than nothing, but still quite sub-optimal.
> >
> > Yes, if you want to start the cluster stack automatically on
> > reboot. Anyway, I think that it would be preferred to let a human
> > check why the node got fenced before letting it join the cluster
> > again. In that case, one just needs to boot the host manually.
> >
> > Thanks,
> >
> > Dejan
>
> I don't want the cluster stack to start on boot, so I disable
> pacemaker/corosync. However, I do want the node to power back on so that
> I can log into it when the alarms go off. Yes, I could log into the good
> node, manually unfence/boot it and then log in, but this adds minutes to
> the MTTR that I would realllly like to avoid.

Certainly it adds a bit of time, but only to the node's MTTR,
not the cluster's MTTR. Anyway, if pacemaker can turn off the
node, then a short script can also turn it on.

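
Such a script could be as small as the sketch below. It assumes APC
PDUs reachable through the fence_apc_snmp agent with its default SNMP
settings; the PDU host names, the function name and the DRY_RUN switch
are made up for this example. By default it only prints the commands
it would run.

```shell
#!/bin/bash
# Power both PDU outlets of a fenced node back on.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN=${DRY_RUN:-1}
PDUS="pdu1.example.com pdu2.example.com"   # hypothetical PDU addresses

power_on_node() {
    local port=$1 pdu
    for pdu in $PDUS; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "fence_apc_snmp -a $pdu -n $port -o on"
        else
            # -a: PDU address, -n: outlet number, -o: action
            fence_apc_snmp -a "$pdu" -n "$port" -o on || return 1
        fi
    done
}

power_on_node 1
```

Run it with DRY_RUN=0 from the surviving node once you have decided
that the fenced node may come back.
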
Cheers,

Dejan

> cheers
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?