[Pacemaker] stonith_admin does not work as expected

andreas graeper agraeper at googlemail.com
Wed Nov 13 07:33:09 EST 2013


hi,
pacemaker version is 1.1.7

the fence agent (which i thought was one of the standard ones) calls
 snmpget -a <ipaddr>:<udpport> -c <community> oid
 snmpset -a <ipaddr>:<udpport> -c <community> oid i 0|1

therefore it takes these command-line arguments
 -o action
 -n port (slot-index)
 -a ipaddr
 -c community
 (udpport is not needed because it is fixed at 161)

or (as the logs tell me) the fence agent gets its parameters from stdin
 fence_ifmib <<EOF
  action=
  port=
  ipaddr=
  community=
 EOF
in addition, an invalid 'nodename=xyz' parameter is passed.
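
For reference, the stdin form can be reproduced by hand roughly as follows. This is only a sketch: it assumes the agent reads the same key=value names that the primitives below use (action, port, ipaddr, community), and the helper name build_fence_input is made up for illustration, it is not part of any fence agent.

```shell
#!/bin/sh
# Hypothetical helper (not part of any fence agent): print the key=value
# lines that the agent is assumed to read from stdin.
# Arguments: action, port (outlet index), ipaddr, community.
build_fence_input() {
    printf 'action=%s\n' "$1"
    printf 'port=%s\n' "$2"
    printf 'ipaddr=%s\n' "$3"
    printf 'community=%s\n' "$4"
}

# Values taken from the fence_1 primitive in this post.
build_fence_input off 1 172.27.51.33 xxx
```

Piping that output into the agent (build_fence_input off 1 172.27.51.33 xxx | fence_ifmib_epc8212) should behave like the stdin call seen in the logs, provided the agent accepts these names. Note that with action=off this would really switch the outlet.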
the fence agent was written for another device, and because our device
does not support one function (OID_PORT, used to get the port index from
the port name) we have to use port numbers. apart from a few other small
limitations it works great.


      <primitive class="stonith" id="fence_1" type="fence_ifmib_epc8212">
        <instance_attributes id="fence_1-instance_attributes">
          <nvpair id="fence_1-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
          <nvpair id="fence_1-instance_attributes-community" name="community" value="xxx"/>
          <nvpair id="fence_1-instance_attributes-port" name="port" value="1"/>
          <nvpair id="fence_1-instance_attributes-action" name="action" value="off"/>
          <nvpair id="fence_1-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
          <nvpair id="fence_1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel1"/>
          <nvpair id="fence_1-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
          <nvpair id="fence_1-instance_attributes-verbose" name="verbose" value="true"/>
        </instance_attributes>
      </primitive>

      <primitive class="stonith" id="fence_2" type="fence_ifmib_epc8212">
        <instance_attributes id="fence_2-instance_attributes">
          <nvpair id="fence_2-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
          <nvpair id="fence_2-instance_attributes-community" name="community" value="xxx"/>
          <nvpair id="fence_2-instance_attributes-port" name="port" value="2"/>
          <nvpair id="fence_2-instance_attributes-action" name="action" value="off"/>
          <nvpair id="fence_2-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
          <nvpair id="fence_2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel2"/>
          <nvpair id="fence_2-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
          <nvpair id="fence_2-instance_attributes-verbose" name="verbose" value="true"/>
        </instance_attributes>
      </primitive>

      <rsc_location id="location-fence_1-lisel1--INFINITY" node="lisel1" rsc="fence_1" score="-INFINITY"/>
      <rsc_location id="location-fence_2-lisel2--INFINITY" node="lisel2" rsc="fence_2" score="-INFINITY"/>


the old master is now back as slave.
on the (new) master, stonith_admin still does not see the device/fence agent
(see my previous message, quoted below).

how can i repair this?

thanks
andreas





2013/11/11, Andrew Beekhof <andrew at beekhof.net>:
> Impossible to comment without knowing the pacemaker version, full config,
> and how fence_ifmib works (I assume it's a custom agent?)
>
> On 12 Nov 2013, at 1:21 am, andreas graeper <agraeper at googlemail.com>
> wrote:
>
>> hi,
>> two nodes.
>> n1 (slave) fence_2:stonith:fence_ifmib
>> n2 (master) fence_1:stonith:fence_ifmib
>>
>> n1 was fenced because it was suddenly not reachable. (reason still unknown)
>>
>> n2 > stonith_admin -L -> 'fence_1'
>> n2 > stonith_admin -U fence_1       timed out
>> n2 > stonith_admin -L -> 'no devices found'
>>
>> crm_mon shows fence_1 is running
>>
>> after manually unfencing n1 with snmpset, the slave n1 is up again, but
>> stonith_admin -L still reports 'no devices found' on n2.
>> the same command on n1 prints: 'fence_2 \n 1 devices found'
>>
>> what went wrong with stonith_admin ?
>>
>> when calling crm_mon -rA1 at the end 'Node Attributes' are listed :
>>
>> * Node lisel1:
>>    + master-p_drbd_r0:0              	: 5
>> * Node lisel2:
>>    + master-p_drbd_r0:0              	: 5
>>    + master-p_drbd_r0:1              	: 5
>>
>> this looks strange. the resources are
>> ms_drbd_r0 on primary
>>  p_drbd_r0 on secondary
>> or how should this be interpreted?
>>
>> thanks in advance
>> andreas
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org



