[Pacemaker] wrong device in stonith_admin -l

Andrew Beekhof andrew at beekhof.net
Tue Dec 18 20:46:50 UTC 2012


On Wed, Dec 19, 2012 at 4:38 AM,  <laurent+pacemaker at u-picardie.fr> wrote:
> laurent+pacemaker at u-picardie.fr writes:
>
>> David Vossel <dvossel at redhat.com> writes:
>>
>>>> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]:   notice:
>>>> dynamic_list_search_cb: Disabling port list queries for
>>>> stonith-xen-eddu (1): failed:  255
>>>
>>> We discover what hosts an agent can fence by running this command internally in stonith.
>>>
>>> # agent -o list
>>>
>>> From there we expect an exit code of 0 and the list of nodes to be in the output.
>>> https://fedorahosted.org/cluster/wiki/FenceAgentAPI
>>>
>>> Looking at your logs, stonith-xen-eddu is returning -1 (255) as the return code when we issue the 'list' action.  That means we don't try to get the dynamic list again; we assume the 'list' action isn't supported. From there we fall back to using the 'status' action to dynamically determine whether the agent can fence a particular host.  I'm guessing the 'status' action is returning true (return codes 0 or 2) for hosts you wouldn't expect the agent to be able to fence for some reason.
>>
>> Hi,
>>
>> Ok it makes sense.
>> The FenceAgentAPI doc gives extra information on top of this one:
>> http://hg.linux-ha.org/glue/file/67224d37df80/doc/stonith/README.external
>>
>> Returning 1 when the hostlist is empty does the trick (gethosts action),
>> and so does returning 1 from the status action.
>>
>> So I guess that's the explanation for both of my issues:
>> - after the timeout issue, the port list queries were disabled,
>>   falling back to the status action that was always returning rc=0
>> - gethosts returning rc=0 with an empty hostlist also disables the
>>   port list queries
>>
>> so I guess there's no need to file a new ticket :)
>> Thanks,
>
> Hmm, it still feels like there's something funny with this issue.
> Is the FenceAgentAPI relevant to Pacemaker?
>
> I don't see why the fencing agent should return 1 when called with
> "gethosts", it's reachable and working properly. It's just returning
> an empty hostlist.

Agreed.

>
> As for the status action, it also feels like it should return 0 (or 2
> if Pacemaker supports it) as the device is reachable.
>
> In the end I'm going to file a bug.
>
> --
> Laurent
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
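
For readers following the thread, here is a minimal sketch of the fallback
David describes: try the 'list' action first, and if it does not exit 0,
fall back to the 'status' action and treat return codes 0 or 2 as "can
fence". This is not Pacemaker's actual code. The names can_fence,
parse_host_list and the run_agent callback are hypothetical, and the
assumed 'list' output format (one host per line, optionally followed by a
comma and an alias) is an assumption for illustration only.

```python
from typing import Callable, List, Tuple

# Hypothetical callback: runs an agent action (optionally for one host) and
# returns (exit_code, stdout). It stands in for however the agent is invoked.
AgentRunner = Callable[[str, str], Tuple[int, str]]


def parse_host_list(output: str) -> List[str]:
    """Parse 'list' output, assuming one host per line, optional ',alias'."""
    hosts = []
    for line in output.splitlines():
        line = line.strip()
        if line:
            hosts.append(line.split(",")[0].strip())
    return hosts


def can_fence(run_agent: AgentRunner, host: str) -> bool:
    """Decide whether a device may fence `host`, per the thread's description."""
    rc, output = run_agent("list", "")
    if rc == 0:
        # 'list' worked: the host must appear in the reported list.
        return host in parse_host_list(output)
    # 'list' failed (e.g. 255): assume it is unsupported and ask 'status'.
    rc, _ = run_agent("status", host)
    return rc in (0, 2)


def broken_agent(action: str, host: str) -> Tuple[int, str]:
    """Simulated agent: 'list' fails with 255, 'status' always returns 0."""
    if action == "list":
        return 255, ""
    return 0, ""


def working_agent(action: str, host: str) -> Tuple[int, str]:
    """Simulated agent whose 'list' action reports two hosts."""
    if action == "list":
        return 0, "node-a\nnode-b,alias-b\n"
    return 1, ""


if __name__ == "__main__":
    # With the broken agent, every host looks fenceable by this device.
    print(can_fence(broken_agent, "some-unrelated-host"))
    print(can_fence(working_agent, "node-b"))
    print(can_fence(working_agent, "some-unrelated-host"))
```

With the simulated broken_agent, any host name is reported as fenceable,
which matches the symptom discussed above: once 'list' fails, a 'status'
action that always returns 0 makes the device appear able to fence every
host.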



