[Pacemaker] stonith_admin does not work as expected

andreas graeper agraeper at googlemail.com
Wed Nov 13 07:52:36 EST 2013


I stopped and started the resource, and now stonith_admin can see it again.

 pcs resource stop fence_1
 pcs resource start fence_1
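
Listing the registered devices confirms it is back (output format as seen
earlier in this thread):

 stonith_admin -L
  fence_1
  1 devices found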

But how could it get lost in the first place?

thanks
andreas

2013/11/13, andreas graeper <agraeper at googlemail.com>:
> hi,
> pacemaker version is 1.1.7
>
> the fence agent (which I thought was one of the standard ones) calls
>  snmpget -a <ipaddr>:<udpport> -c <community> <oid>
>  snmpset -a <ipaddr>:<udpport> -c <community> <oid> i 0|1
>
> therefore it needs/uses the command-line arguments
>  -o action
>  -n port (slot index)
>  -a ipaddr
>  -c community
>  (udpport is not necessary, since it is fixed at 161)
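>
> A manual check against the device itself, in standard net-snmp syntax,
> would look roughly like this (<oid>, <community>, <ipaddr> are the same
> placeholders as above, 0|1 = off|on):
>
>  snmpget -v 1 -c <community> <ipaddr>:161 <oid>
>  snmpset -v 1 -c <community> <ipaddr>:161 <oid> i 0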
>
> Alternatively (as the logs tell me), the fence agent gets its parameters via stdin:
>  fence_ifmib <<EOF
>   action=
>   port=
>   ipaddr=
>   community=
>  EOF
> In addition, an invalid 'nodename=xyz' is passed.
> The fence agent was written for another device, and because our device
> does not support one function (OID_PORT, used to look up the port index
> from the port name) we have to use port numbers. Apart from a few other
> tiny limitations it works great.
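>
> The agent itself can also be tested by hand through the same stdin
> interface, e.g. a status query with the values from the config below
> (just a sketch; the agent name is the type configured in the CIB):
>
>  fence_ifmib_epc8212 <<EOF
>   action=status
>   ipaddr=172.27.51.33
>   community=xxx
>   port=1
>  EOF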
>
>
>       <primitive class="stonith" id="fence_1" type="fence_ifmib_epc8212">
>         <instance_attributes id="fence_1-instance_attributes">
>           <nvpair id="fence_1-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
>           <nvpair id="fence_1-instance_attributes-community" name="community" value="xxx"/>
>           <nvpair id="fence_1-instance_attributes-port" name="port" value="1"/>
>           <nvpair id="fence_1-instance_attributes-action" name="action" value="off"/>
>           <nvpair id="fence_1-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
>           <nvpair id="fence_1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel1"/>
>           <nvpair id="fence_1-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
>           <nvpair id="fence_1-instance_attributes-verbose" name="verbose" value="true"/>
>
>       <primitive class="stonith" id="fence_2" type="fence_ifmib_epc8212">
>         <instance_attributes id="fence_2-instance_attributes">
>           <nvpair id="fence_2-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
>           <nvpair id="fence_2-instance_attributes-community" name="community" value="xxx"/>
>           <nvpair id="fence_2-instance_attributes-port" name="port" value="2"/>
>           <nvpair id="fence_2-instance_attributes-action" name="action" value="off"/>
>           <nvpair id="fence_2-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
>           <nvpair id="fence_2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel2"/>
>           <nvpair id="fence_2-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
>           <nvpair id="fence_2-instance_attributes-verbose" name="verbose" value="true"/>
>
>       <rsc_location id="location-fence_1-lisel1--INFINITY" node="lisel1" rsc="fence_1" score="-INFINITY"/>
>       <rsc_location id="location-fence_2-lisel2--INFINITY" node="lisel2" rsc="fence_2" score="-INFINITY"/>
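>
> A configuration like this could be recreated with pcs roughly as follows
> (just a sketch; option handling may differ between pcs versions):
>
>  pcs stonith create fence_1 fence_ifmib_epc8212 \
>    ipaddr=172.27.51.33 community=xxx port=1 action=off \
>    pcmk_poweroff_action=off pcmk_host_list=lisel1 \
>    pcmk_host_check=static-list verbose=true
>  pcs constraint location fence_1 avoids lisel1
>
> and the same for fence_2 with port=2 and lisel2.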
>
>
> The old master is back now as a slave.
> But on the (new) master, stonith_admin does not see the device/fence agent
> (see my last message).
>
> How can I repair this?
>
> thanks
> andreas
>
>
>
>
>
> 2013/11/11, Andrew Beekhof <andrew at beekhof.net>:
>> Impossible to comment without knowing the pacemaker version, full config,
>> and how fence_ifmib works (I assume it's a custom agent?)
>>
>> On 12 Nov 2013, at 1:21 am, andreas graeper <agraeper at googlemail.com>
>> wrote:
>>
>>> hi,
>>> two nodes.
>>> n1 (slave) fence_2:stonith:fence_ifmib
>>> n2 (master) fence_1:stonith:fence_ifmib
>>>
>>> n1 was fenced because it suddenly became unreachable (reason still unknown).
>>>
>>> n2 > stonith_admin -L -> 'fence_1'
>>> n2 > stonith_admin -U fence_1       timed out
>>> n2 > stonith_admin -L -> 'no devices found'
>>>
>>> crm_mon shows fence_1 is running
>>>
>>> After manually unfencing n1 with snmpset, the slave n1 is up again, but
>>> stonith_admin -L still reports 'no devices found' on n2.
>>> The same command on n1 returns 'fence_2' / '1 devices found'.
>>>
>>> What went wrong with stonith_admin?
>>>
>>> When calling crm_mon -rA1, the 'Node Attributes' listed at the end are:
>>>
>>> * Node lisel1:
>>>    + master-p_drbd_r0:0              	: 5
>>> * Node lisel2:
>>>    + master-p_drbd_r0:0              	: 5
>>>    + master-p_drbd_r0:1              	: 5
>>>
>>> Does that look strange? The resources are
>>>  ms_drbd_r0 on the primary
>>>  p_drbd_r0 on the secondary
>>> Or how is this to be interpreted?
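>>>
>>> (Side note: these master-* entries should be the master preference
>>> scores that the drbd resource agent sets via crm_master; a sketch of
>>> how to query one of them per node, attribute name taken from the
>>> output above:
>>>
>>>  crm_attribute -N lisel2 -n master-p_drbd_r0:1 -l reboot --query
>>> )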
>>>
>>> thanks in advance
>>> andreas