[Pacemaker] Fencing in Pacemaker with Cyclades
Martin Steigerwald
ms at teamix.de
Wed Aug 18 14:41:32 UTC 2010
Hi,
I have a working fencing setup with heartbeat-1:
somehost1:~# grep ^stonith /etc/ha.d/ha.cf
stonith_host * cyclades 172.21.101.79 root 10
So that's the cyclades stonith plugin, the IP address of the Cyclades AlterPath, the login name for the SSH login, and the serial port of the IPDU that should power-cycle the node to be fenced.
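To make the field order explicit, the stonith_host line can be split into named fields. This is only an illustrative sketch: the variable names (node, plugin, ipaddr, login, serialport) are labels chosen here for readability, not confirmed parameter names of heartbeat or of the cyclades plugin.

```shell
#!/bin/sh
# Split the heartbeat-1 stonith_host line into named fields.
# The variable names are illustrative labels, not plugin parameters.
# The here-document is quoted so the "*" is taken literally.
read -r directive node plugin ipaddr login serialport <<'EOF'
stonith_host * cyclades 172.21.101.79 root 10
EOF

# Print the fields that describe the fencing device itself.
printf 'plugin=%s ipaddr=%s login=%s serialport=%s\n' \
    "$plugin" "$ipaddr" "$login" "$serialport"
```

The last field, 10, is the value that has no obvious counterpart in the pacemaker configuration.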
Now when I want to configure a stonith plugin in corosync/pacemaker, I can't set the serial port.
There is simply no such parameter in that resource agent shown in pacemaker:
-------------------------------------------------
crm(live)# ra info cyclades stonith
<!-- no value --> (stonith:cyclades)
Cyclades AlterPath PM series power switches (via TS/ACS/KVM).
Parameters (* denotes required, [] the default):
ipaddr* (string): IP Address
The IP address of the STONITH device
login* (string): Login
The username used for logging in to the STONITH device
stonith-timeout (time, [60s]):
How long to wait for the STONITH action to complete. Overrides the stonith-timeout cluster property
priority (integer, [0]):
The priority of the stonith resource. The lower the number, the higher the priority.
Operations' defaults (advisory minimum):
start timeout=60
stop timeout=15
status timeout=60
monitor_0 interval=3600 timeout=60
-------------------------------------------------
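The parameter list above is what the crm shell reports from the agent metadata. A possible cross-check, assuming the stonith command-line tool that ships with heartbeat / cluster-glue is installed on the node, is to ask the plugin itself for its configuration names. This is a sketch of the check only; what it prints depends on the installed plugin version.

```shell
# List the available stonith plugin types (cyclades should be among them).
stonith -L

# Ask the cyclades plugin which configuration names it expects.
stonith -t cyclades -n
```

If the plugin reports a name that crm does not show, the difference would point at the metadata rather than at the plugin.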
So I tried with just:
primitive fencing stonith:cyclades \
params ipaddr="172.21.101.79" login="root" \
op monitor interval="15s" timeout="60s"
But it doesn't work:
Failed actions:
fencing:0_start_0 (node=somenode2, call=6, rc=1, status=complete): unknown error
fencing:1_start_0 (node=somenode1, call=7, rc=1, status=complete): unknown error
I find no hint beyond that "unknown error", either in syslog or in the crm shell. Pacemaker also raised the fail count to infinity after, as far as I remember, the second attempt. I did not find a single hint via Google on how to configure stonith with cyclades for a corosync/pacemaker setup either.
How should it know which serial port I connected the IPDU to?
Related: Why does pacemaker raise the fail count to infinity so quickly? In our old heartbeat setup, heartbeat kept trying to stonith the other host for many attempts and didn't give up that quickly. Usually fencing should work on the first attempt, but with shared storage it can be very dangerous if a cluster partner takes over resources when fencing the unresponsive node did not work.
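For reference, the fail count can be inspected and reset with the standard tools. A minimal sketch, assuming the resource id fencing from the primitive above and the node name somenode1 from the failed actions; it needs a running cluster. The failed actions list clone instances (fencing:0, fencing:1), so the id to pass may differ for a cloned resource.

```shell
# One-shot cluster status including fail counts.
crm_mon -1 -f

# Show the fail count of the fencing resource on one node.
crm resource failcount fencing show somenode1

# Clear the failed actions and the fail count so the start is retried.
crm resource cleanup fencing
```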
I am using corosync 1.2.1-1~bpo50+1 and pacemaker 1.0.9.1-2~bpo50+1 from lenny-backports.[1]
[1] http://www.backports.org
Ciao,
--
Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90