[Pacemaker] crm_simulate a resource failure

Cal Heldenbrand cal at fbsdata.com
Wed Oct 24 10:37:27 EDT 2012


Thanks Andrew!  My first few attempts at playing around with the failure
states are working as expected.

A few follow-ups below:


> --op-fail isn't the command you want though.
> From the man page:
>
>        -i, --op-inject=value
>               $rsc_$task_$interval@$node=$rc - Inject the specified
> task before running the simulation
>
>        -F, --op-fail=value
>               $rsc_$task_$interval@$node=$rc - Fail the specified task
> while running the simulation
>
> Note the difference between the two descriptions: before vs. while.
> --op-inject is the one you want.  It is mostly useful for pretending a
> recurring monitor failed and seeing what the cluster would do about
> it.
>
> --op-fail on the other hand, is used for pretending that part of the
> recovery process failed.
>

Your follow-up description here is great and makes more sense.  I was
reading "Fail the specified task" literally, as "here's my task, fail it
and show me the results."  I'd suggest adding a short paragraph to the man
page to elaborate on these points.  Also, can you tell me what all of the
return codes are?  Do I have to use integers, or do strings like "error"
work?
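
For reference while that question is open, here is a minimal sketch of how
one might build the $rsc_$task_$interval@$node=$rc string from the standard
OCF return code names.  It assumes $rc must be an integer and that $interval
is in milliseconds, as in Pacemaker operation keys; neither is confirmed in
this thread.  The helper op_spec and the resource/node names are hypothetical,
and the command is only printed, not run.

```python
# Standard OCF resource agent return codes (name -> integer).
OCF_RC = {
    "OCF_SUCCESS": 0,
    "OCF_ERR_GENERIC": 1,
    "OCF_ERR_ARGS": 2,
    "OCF_ERR_UNIMPLEMENTED": 3,
    "OCF_ERR_PERM": 4,
    "OCF_ERR_INSTALLED": 5,
    "OCF_ERR_CONFIGURED": 6,
    "OCF_NOT_RUNNING": 7,
    "OCF_RUNNING_MASTER": 8,
    "OCF_FAILED_MASTER": 9,
}


def op_spec(rsc, task, interval, node, rc):
    """Build a '$rsc_$task_$interval@$node=$rc' string (hypothetical helper).

    rc may be an OCF name from the table above or a plain integer.
    Assumption: crm_simulate wants the numeric form.
    """
    code = OCF_RC[rc] if isinstance(rc, str) else int(rc)
    return f"{rsc}_{task}_{interval}@{node}={code}"


# Hypothetical names: pretend the 10s recurring monitor of "webserver"
# reported "not running" on node1 before the simulation starts.
spec = op_spec("webserver", "monitor", 10000, "node1", "OCF_NOT_RUNNING")
cmd = ["crm_simulate", "--live-check", "--simulate", "--op-inject", spec]
print(" ".join(cmd))
```

Swapping --op-inject for --op-fail in the list would express the "while
running the simulation" case instead; check the option names against the
man page of the installed version before relying on them.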

While we're on the subject of documentation / usability, I would also
suggest splitting these two options out into separate parameters.  (What
would happen if I named my resource with an underscore?)  Maybe something like:

--op-pre-resource=[primitive name]
--op-pre-task=[monitor|start|stop]
--op-pre-interval=[integer]
--op-pre-node=[hostname]
--op-pre-rc=[error|timeout|other stuff]

Then have similar --op-post-* parameters, or whatever verbs make the most
sense in the spirit of Pacemaker vocabulary (pre/post, before/after,
inject/fail, input/output, etc.).  Examples are always awesome in man
pages too.

Of course, this is all great future-version stuff, but that doesn't help
all of the Red Hat 6 people who will be using Pacemaker 1.1 packages for
the next ~10 years until Red Hat 7 comes out.  So I suppose documenting the
old code in the online docs is a Good Thing.  :-)

Thanks again!

--Cal