[Pacemaker] Howto write a STONITH agent
Christoph Herrmann
C.Herrmann at science-computing.de
Fri Jan 14 16:10:17 UTC 2011
-----Original Message-----
From: Dejan Muhamedagic <dejanmm at fastmail.fm>
Sent: Fri 14.01.2011 12:31
To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>;
Subject: Re: [Pacemaker] Howto write a STONITH agent
> Hi,
>
> On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
> > Hi,
> >
> > I have some brand new HP Blades with ILO Boards (iLO 2 Standard Blade Edition
> 1.81 ...)
> > But I'm not able to connect with them via the external/riloe agent.
> > When I try:
> >
> > stonith -t external/riloe -p "hostlist=node1 ilo_hostname=ilo1
> ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
> ilo_powerdown_method=power" -S
>
> Try this:
>
> stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser
> ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
> ilo_powerdown_method=power -S
That's much better (looks like PEBKAC ;-), thanks! But it is not reliable: I've tested it about 10 times
and it hung 5 times. That's not what I want.
In the end I will use my own ssh-ilo agent. It's very simple (KISS) and reliable. The external/riloe agent does not
look too simple.
So my questions still remain. Is there a HOWTO for writing STONITH agents?
Is it useful to write (and run) a STONITH agent as a cloned resource?
What should the status check do with a cloned STONITH resource? Is it useful in any way? (Given that I have 4 different nodes with 4 different iLO boards.)
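
For reference, here is a rough sketch of what such an ssh-ilo agent could look like as an external STONITH plugin. It is not the agent mentioned above, only an illustration under assumptions: the plugin is called with the action as its first argument, its parameters arrive as environment variables, and it returns 0 on success and non-zero on failure. The parameter name ilo_sshkey, the ILO_SSH override (used only to stub out ssh) and the "version" command used for the status check are hypothetical. A complete plugin would most likely also have to answer the getinfo-* metadata actions, which are left out here.

```shell
#!/bin/sh
# Sketch of an external STONITH plugin that resets a node via ssh to its iLO.
# Assumed environment variables: hostlist, ilo_hostname, ilo_user, ilo_sshkey.
# ILO_SSH is a hypothetical override so the ssh call can be replaced in tests.

ssh_ilo_agent() {
    action=${1:-}
    ssh_cmd=${ILO_SSH:-ssh}
    target="${ilo_user:-}@${ilo_hostname:-}"

    case "$action" in
        gethosts)
            # Print the nodes this device can fence, one per line.
            for h in ${hostlist:-}; do
                echo "$h"
            done
            ;;
        reset)
            # Same command as the manual reset shown in the thread.
            $ssh_cmd -i "${ilo_sshkey:-}" "$target" reset system1
            ;;
        status)
            # Assumption: a harmless login is enough to show the iLO is reachable.
            $ssh_cmd -i "${ilo_sshkey:-}" "$target" version >/dev/null 2>&1
            ;;
        getconfignames)
            # Names of the parameters this sketch expects.
            echo "hostlist ilo_hostname ilo_user ilo_sshkey"
            ;;
        *)
            # Unknown or unsupported action: report failure.
            return 1
            ;;
    esac
}

# Run the requested action only when the script is called with arguments.
[ $# -eq 0 ] || ssh_ilo_agent "$@"
```

In this sketch, status only checks whether the one iLO named in ilo_hostname answers, so with 4 nodes and 4 iLO boards each board would need its own resource with its own parameters.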
Cheers,
Christoph &:-)
> Thanks,
>
> Dejan
>
> >
> > I get the following answer:
> >
> > external/riloe[14317]: ERROR: unknown power method %s, setting to "power"
> > external/riloe[14317]: ERROR: [Errno -2] Name or service not known, while
> talking to ilo_hostname=ilo1
> >
> > ** (process:14315): CRITICAL **: external_run_cmd: Calling
> '/usr/lib64/stonith/plugins/external/riloe status' returned 1
> >
> > ** (process:14315): CRITICAL **: external_status: 'riloe status' failed with
> rc 1
> > stonith: external/riloe device not accessible.
> >
> >
> > But I can access ilo1 with http, https and ssh. The easiest way to reset a
> node is to run:
> >
> > ssh -i ilo-sshkey ilouser@ilo1 reset system1
> >
> > I thought it would be easier to write a new ssh-ilo agent (I'm almost done :-) than
> to debug the existing one. But I'm looking for a short howto. I've read some
> STONITH agents, but they are not completely self-explanatory and I have some
> questions. Is there a short "how to write a STONITH agent" manual which Google and
> I were not able to find?
> > Or should I post all questions to the list?
> > here we go:
> >
> > 1. (and most important): What does the status check do, if you have an agent
> which runs as cloned resource (my ssh-ilo agent should run as a cloned
> resource). Does it check all nodes? Is it possible to check the status of a
> single node?
> > 2. What are the expected return codes?
> >
> > more to follow ;-)
> >
> >
> >
> >
> > regards
> >
> >
> > Christoph &:-)