[Pacemaker] Howto write a STONITH agent
Dejan Muhamedagic
dejanmm at fastmail.fm
Fri Jan 14 16:18:06 UTC 2011
On Fri, Jan 14, 2011 at 05:10:17PM +0100, Christoph Herrmann wrote:
> -----Original Message-----
> From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> Sent: Fri 14.01.2011 12:31
> To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>;
> Subject: Re: [Pacemaker] Howto write a STONITH agent
>
> > Hi,
> >
> > On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
> > > Hi,
> > >
> > > I have some brand new HP Blades with iLO boards (iLO 2 Standard Blade Edition 1.81 ...),
> > > but I'm not able to connect to them via the external/riloe agent.
> > > When I try:
> > >
> > > stonith -t external/riloe -p "hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power" -S
> >
> > Try this:
> >
> > stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power -S
>
> That's much better (looks like PEBKAC ;-), thanks! But it is not reliable: I tested it about 10 times
> and it hung 5 times. That's not what I want.
Did you try to find out why it hung?
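
One way to start narrowing it down is to put a hard time limit on the status call and count how often it trips. A minimal sketch, assuming GNU coreutils `timeout` is installed; the function name `count_hangs` is made up for this example:

```shell
# Run a command repeatedly under a time limit and count how often it hangs.
# Usage: count_hangs RUNS SECONDS COMMAND [ARGS...]
# Sketch only: relies on GNU timeout, which exits with 124 when the limit is hit.
count_hangs() {
    runs=$1
    limit=$2
    shift 2
    hung=0
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        i=$((i + 1))
        rc=0
        timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 124 ]; then
            hung=$((hung + 1))        # killed by timeout: counts as a hang
        elif [ "$rc" -ne 0 ]; then
            failed=$((failed + 1))    # finished in time, but with an error
        fi
    done
    echo "runs=$runs hung=$hung failed=$failed"
}
```

For example: `count_hangs 10 60 stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power -S`. This only tells you how often it hangs, not why, but it separates hangs from plain failures and gives a reproducible number to work with.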
> Finally I will use my own ssh-ilo agent. It's very simple (KISS) and reliable. The external/riloe agent did not
> look too simple.
Right. Let's all roll our own ;->
> So my questions still remain. Is there a HOWTO for writing stonith agents?
No.
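
In short, an external plugin is an executable that receives the action as its first argument and its parameters as environment variables, and it signals success with exit code 0 and failure with anything else (the "returned 1" in the log quoted further down is such a failure). A rough skeleton under those assumptions follows. The names `ilo_sshkey`, `ilo_run`, `agent_main` and `SSH_BIN` are invented for this sketch, and the `on`, `off` and `getinfo-*` actions are left out:

```shell
# Skeleton of an external STONITH plugin (a sketch, not a tested agent).
# Invented for this sketch: ilo_sshkey, ilo_run, agent_main, SSH_BIN.
# hostlist, ilo_hostname and ilo_user are the names used in this thread.
SSH_BIN=${SSH_BIN:-ssh}   # override with a stub for dry runs

ilo_run() {
    # Run one command on the iLO board over ssh, never prompting for input.
    "$SSH_BIN" -i "$ilo_sshkey" -o BatchMode=yes "$ilo_user@$ilo_hostname" "$@"
}

agent_main() {
    case "$1" in
    gethosts)
        # Print the nodes this device can fence, one per line.
        for h in $hostlist; do echo "$h"; done
        ;;
    getconfignames)
        # Print the parameter names the agent expects in its environment.
        for n in hostlist ilo_hostname ilo_user ilo_sshkey; do echo "$n"; done
        ;;
    status)
        # Placeholder: only checks that a device is configured.
        [ -n "$ilo_hostname" ]
        ;;
    reset)
        # $2 is the node to fence. With one board per node, the board's own
        # "reset system1" (the command from this thread) is all that is needed.
        [ -n "$2" ] || return 1
        ilo_run reset system1
        ;;
    *)
        # Unsupported action: report failure.
        return 1
        ;;
    esac
}

# A real script would end with:  agent_main "$@"; exit $?
```

Keeping the dispatch in a function makes it easy to exercise each action by hand, for example with `SSH_BIN=true`, before letting the cluster call it.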
> Is it useful to write (to run) a stonith agent as a cloned resource?
Sometimes. There are quite a few resources. You can take a look
at clusterlabs.org.
> What should the status check do with a cloned stonith resource? Is it useful in any way (as long as I have 4 different nodes with 4 different iLO boards)?
The status action should check the status of the device, not of the nodes.
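
For an ssh-based iLO agent that could mean: log in to the board, run something harmless, and report whether that worked. A minimal sketch, where the parameter name `ilo_sshkey`, the `SSH_BIN` override and the probe command are assumptions (the default probe `version` in particular is a guess and should be replaced with whatever harmless command the board accepts):

```shell
# Sketch of a "status" action that checks the fencing device, not the node.
# Assumptions: ilo_sshkey, SSH_BIN and ILO_PROBE_CMD are invented names;
# the default probe "version" is a guess, not a verified iLO command.
SSH_BIN=${SSH_BIN:-ssh}

ilo_status() {
    # Fail early when a required parameter is missing from the environment.
    [ -n "$ilo_hostname" ] && [ -n "$ilo_user" ] && [ -n "$ilo_sshkey" ] || return 1
    # Probe the board itself. BatchMode avoids password prompts and
    # ConnectTimeout bounds the connection attempt, so an unreachable
    # board produces a failure instead of a hang.
    "$SSH_BIN" -i "$ilo_sshkey" -o BatchMode=yes -o ConnectTimeout=10 \
        "$ilo_user@$ilo_hostname" "${ILO_PROBE_CMD:-version}" >/dev/null 2>&1
}
```

The function returns 0 when the board answered and non-zero otherwise, regardless of whether the node behind it is up or down.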
Thanks,
Dejan
>
>
> Cheers,
>
>
> Christoph &:-)
>
>
> > Thanks,
> >
> > Dejan
> >
> > >
> > > I get the following answer:
> > >
> > > external/riloe[14317]: ERROR: unknown power method %s, setting to "power"
> > > external/riloe[14317]: ERROR: [Errno -2] Name or service not known, while talking to ilo_hostname=ilo1
> > >
> > > ** (process:14315): CRITICAL **: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/riloe status' returned 1
> > >
> > > ** (process:14315): CRITICAL **: external_status: 'riloe status' failed with rc 1
> > > stonith: external/riloe device not accessible.
> > >
> > >
> > > But I can access ilo1 with http, https and ssh. The easiest way to reset a node is to run:
> > >
> > > ssh -i ilo-sshkey ilouser@ilo1 reset system1
> > >
> > > I thought it would be easier to write a new ssh-ilo agent (I'm almost done :-) than
> > > to debug the existing one. But I'm looking for a short howto. I've read some
> > > STONITH agents, but they are not completely self-explanatory and I have some
> > > questions. Is there a short "how to write a stonith agent" manual which Google and
> > > I were not able to find?
> > > Or should I post all questions to the list?
> > > Here we go:
> > >
> > > 1. (and most important): What does the status check do if you have an agent
> > > which runs as a cloned resource (my ssh-ilo agent should run as a cloned
> > > resource)? Does it check all nodes? Is it possible to check the status of a
> > > single node?
> > > 2. What are the expected return codes?
> > >
> > > more to follow ;-)
> > >
> > >
> > >
> > >
> > > regards
> > >
> > >
> > > Christoph &:-)