[Pacemaker] crm_simulate a resource failure

Andrew Beekhof andrew at beekhof.net
Wed Oct 24 21:53:42 UTC 2012


On Thu, Oct 25, 2012 at 1:47 AM, Jake Smith <jsmith at argotec.com> wrote:
>
> ----- Original Message -----
>
>> From: "Cal Heldenbrand" <cal at fbsdata.com>
>> To: "The Pacemaker cluster resource manager"
>> <pacemaker at oss.clusterlabs.org>
>> Sent: Wednesday, October 24, 2012 10:37:27 AM
>> Subject: Re: [Pacemaker] crm_simulate a resource failure
>
>> Thanks Andrew! My first few attempts at playing around with the
>> failure states are working as expected.
>
>> A few follow-ups below:
>
>> > --op-fail isn't the command you want though.
>>
>> > From the man page:
>>
>
>> > -i, --op-inject=value
>> >     $rsc_$task_$interval@$node=$rc - Inject the specified
>> >     task before running the simulation
>
>> > -F, --op-fail=value
>> >     $rsc_$task_$interval@$node=$rc - Fail the specified task
>> >     while running the simulation
>>
>>
>
>> > Note the difference between the two descriptions: before vs. while.
>>
>> > --op-inject is the one you want. It is mostly useful for pretending a
>> > recurring monitor failed and seeing what the cluster would do about it.
>
>> > --op-fail, on the other hand, is used for pretending that part of the
>> > recovery process failed.
>>
>
>> Your follow-up description here is great, and makes more sense. I was
>> reading "Fail the specified task" literally, as "here's my task,
>> fail it and show me the results." I'd suggest adding a little
>> paragraph to the man page to elaborate on these points too. Also, can
>> you tell me what all of the return codes are? Do I have to use
>> integers, or do strings like "error" work?
>
> I second this (and to answer your question Andrew) I think what you wrote would be a great addition to the man page and would help make those commands much clearer.
>
>> While we're on the subject of documentation / usability, I would also
>> suggest to split out these two features into more parameters. (What
>> would happen if I named my resource with an underscore?) Maybe
>> something like:
>
> I have the same question about underscores since ALL of my resources/constraints etc. have them ;-)

Ooops.  Missed this question the first time.
Underscores in resource names already work correctly.

We use parse_op_key(), which works backwards from the end of the key to
find the interval and then the action name.
Everything else, including underscores, is the resource name.
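
To illustrate the idea, here is a minimal Python sketch of that backwards parsing. It is not Pacemaker's actual parse_op_key() (which is C code); the function names, the resource name "my_web_server" and the node name "node1" are made up for the example, and it assumes the task name itself contains no underscore, which may not hold for every action.

```python
def parse_op_key(key):
    """Split '<rsc>_<task>_<interval>' by working from the right.

    Illustrative only: assumes the task name has no underscore.
    """
    # rsplit with maxsplit=2 peels off the last two fields, so any
    # underscores further left stay inside the resource name.
    parts = key.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError("expected <rsc>_<task>_<interval>: %r" % key)
    rsc, task, interval = parts
    if not rsc or not task or not interval.isdigit():
        raise ValueError("malformed operation key: %r" % key)
    return rsc, task, int(interval)


def parse_op_spec(spec):
    """Split '<rsc>_<task>_<interval>@<node>=<rc>' into its five fields."""
    # Work from the right here too: first the return code, then the node.
    head, eq, rc = spec.rpartition("=")
    key, at, node = head.rpartition("@")
    if not eq or not at or not node:
        raise ValueError("expected <key>@<node>=<rc>: %r" % spec)
    rsc, task, interval = parse_op_key(key)
    return rsc, task, interval, node, int(rc)


if __name__ == "__main__":
    # Hypothetical resource and node names; 7 is just an example code.
    print(parse_op_spec("my_web_server_monitor_10000@node1=7"))
```

With a spec such as "my_web_server_monitor_10000@node1=7", the sketch treats "my_web_server" as the resource name, because only the last two underscore-separated fields are consumed as task and interval.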

>
>> --op-pre-resource=[primitive name]
>> --op-pre-task=[monitor|start|stop]
>> --op-pre-interval=[integer]
>> --op-pre-node=[hostname]
>> --op-pre-rc=[error|timeout|other stuff]
>
>> Then have similar --op-post-* parameters. Or whatever verbs make the
>> most sense in the spirit of Pacemaker vocabulary. (pre/post,
>> before/after, inject/fail, input/output, etc) And, examples are
>> always awesome in man pages too.
>
>> Of course, this is all great future version stuff, but that doesn't
>> help all of the RedHat 6 people that will be using pacemaker 1.1
>> packages for the next ~10 years until RedHat 7 comes out. So I
>> suppose documenting the old code in the online docs is a Good Thing.
>> :-)
>
>> Thanks again!
>
>> --Cal
>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>> Project Home: http://www.clusterlabs.org
>> Getting started:
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>



