[Pacemaker] some questions about STONITH
Lars Marowsky-Bree
lmb at suse.com
Tue Nov 19 18:19:49 UTC 2013
On 2013-11-19T22:10:29, Andrey Groshev <greenx at yandex.ru> wrote:
First, like digimer wrote, stonith-by-ssh is clearly useless for
production, since ssh cannot reach a node that is hung or off the
network, and those are exactly the nodes that need fencing. But for
testing, it's worth a try.
Note that cluster-glue actually does include an external/ssh script.
You're reinventing the wheel ;-)
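For a test cluster, the shipped external/ssh agent could be configured through the crm shell roughly as below. This is only a sketch under assumptions: the resource names st-ssh and fencing are made up, dev-cluster2-node1 is a guessed peer name (only node2 appears in this thread), and hostlist is the agent parameter naming the nodes it may fence. By default the script just prints the configuration; set APPLY=1 to feed it to crm configure.

```shell
#!/bin/sh
# Test-only fencing with the external/ssh agent from cluster-glue.
# Resource names and the first node name are assumptions, not from the thread.
config='primitive st-ssh stonith:external/ssh params hostlist="dev-cluster2-node1 dev-cluster2-node2" op monitor interval=60m
clone fencing st-ssh
property stonith-enabled=true
commit'

if [ "${APPLY:-0}" = 1 ] && command -v crm >/dev/null 2>&1; then
    # Apply the configuration through the crm shell.
    printf '%s\n' "$config" | crm configure
else
    # Dry run: show what would be loaded.
    printf '%s\n' "$config"
fi
```

Cloning the primitive lets the device run on every node, so either node can fence the other.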
> I ran the next test:
> # stonith_admin --reboot=dev-cluster2-node2
> The node reboots, but the resources don't start.
> In crm_mon status - Node dev-cluster2-node2 (172793105): pending.
> And it stays hung like that.
That is *probably* a race - the node reboots too fast, or still
communicates for a bit after the fence has supposedly completed (if it's
not a reboot -nf, but a mere reboot). We have had problems with this in
the past.
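To illustrate the race: a home-grown ssh fence script should not report success while the target can still talk. A minimal sketch, assuming a simple ping check is good enough for a test setup; probe_up and wait_until_down are hypothetical helper names, and the ping flags are those of Linux iputils. This is not how external/ssh itself is implemented.

```shell
#!/bin/sh
# probe_up: hypothetical liveness check, one ICMP echo with a 1 s timeout.
probe_up() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# wait_until_down HOST TRIES
# Returns 0 as soon as HOST stops answering, 1 if it still answers
# after TRIES probes (one probe per second).
wait_until_down() {
    host=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        probe_up "$host" || return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

A fence script could then run something like ssh root@dev-cluster2-node2 'reboot -nf' followed by wait_until_down dev-cluster2-node2 30, and only report success if the second command returns 0. A node that stops answering ping is not proven dead, which is one more reason to keep this to test clusters.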
You may want to file a proper bug report with crm_report included, and
preferably corosync/pacemaker debugging enabled.
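Collecting the report could look like the sketch below. The time window and the destination name are placeholders, not values taken from the actual logs; pick a window that brackets the stonith_admin test. The debug settings in the comments are the usual places to turn on debugging, but file locations vary by distribution.

```shell
#!/bin/sh
# Placeholder window around the failed test; adjust to your own logs.
FROM="2013-11-19 21:30:00"
TO="2013-11-19 22:30:00"
DEST="/tmp/stonith-reboot-pending"   # hypothetical report name

# Debug logging is switched on in config files before reproducing:
#   /etc/corosync/corosync.conf :  logging { debug: on }
#   /etc/sysconfig/pacemaker    :  PCMK_debug=yes

if command -v crm_report >/dev/null 2>&1; then
    # Gathers logs and cluster state for the window into an archive.
    crm_report -f "$FROM" -t "$TO" "$DEST"
else
    echo "crm_report not found; would run: crm_report -f '$FROM' -t '$TO' $DEST"
fi
```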
> 2.
> There is a slight discrepancy between Pacemaker Explained and stonith_admin --help.
> stonith_admin --reboot nodename.
> One uses an equals sign, the other does not.
> Not very important, because both work.
Yeah, like you said, both work. So it's not actually a problem.
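Both spellings are accepted under the usual GNU long-option convention, where an option's argument may follow either an equals sign or a space. The function below is a hypothetical stand-in that mimics that convention for illustration, assuming stonith_admin parses its options in the getopt_long style; it is not stonith_admin's real parser.

```shell
#!/bin/sh
# parse_reboot: hypothetical helper that extracts the --reboot argument
# from either "--reboot=NODE" or "--reboot NODE".
parse_reboot() {
    while [ $# -gt 0 ]; do
        case "$1" in
            --reboot=*)
                printf '%s\n' "${1#--reboot=}"
                return 0 ;;
            --reboot)
                # The argument must be the next word.
                [ $# -ge 2 ] || return 1
                printf '%s\n' "$2"
                return 0 ;;
        esac
        shift
    done
    return 1
}

# Both calls print the same node name.
parse_reboot --reboot=dev-cluster2-node2
parse_reboot --reboot dev-cluster2-node2
```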
Regards,
Lars
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde