[Pacemaker] stonith
Andreas Kurz
andreas.kurz at gmail.com
Sun Apr 19 12:23:27 UTC 2015
On 2015-04-17 12:36, Thomas Manninger wrote:
> Hi list,
>
> I have a pacemaker/corosync2 setup with 4 nodes, with stonith configured
> over the IPMI interface.
>
> My problem is that sometimes the wrong node is stonithed.
> For example:
> I have 4 servers: node1, node2, node3, node4
>
> I trigger a hardware reset on node1, but both node1 and node3 get
> stonithed.
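You have to tell Pacemaker explicitly which stonith resource can fence which
node if the stonith agent you are using does not support the "list" action.
Otherwise Pacemaker may assume that any device can fence any node, which
would explain why p_stonith_node3 was used to reboot node1 in the log
quoted below.
Do this by adding "pcmk_host_check=static-list" and "pcmk_host_list" to
every stonith resource, for example: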
primitive p_stonith_node3 stonith:external/ipmi \
    op monitor interval=3s timeout=20s \
    params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node3"
... see "man stonithd".
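
Applied to all four nodes, the configuration might look like the sketch
below. It reuses the addresses and parameters from your posted config and
assumes that each IPMI device can fence only the node it is named after;
the meta attributes are carried over unchanged. Treat it as a starting
point, not a tested configuration.

```
# Assumption: one IPMI device per node, so each pcmk_host_list
# names exactly one host.
primitive p_stonith_node1 stonith:external/ipmi \
    params hostname=node1 ipaddr=10.100.0.2 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node1" \
    op monitor interval=3s timeout=20s \
    meta target-role=Started failure-timeout=30s
primitive p_stonith_node2 stonith:external/ipmi \
    params hostname=node2 ipaddr=10.100.0.4 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node2" \
    op monitor interval=3s timeout=20s \
    meta target-role=Started failure-timeout=30s
primitive p_stonith_node3 stonith:external/ipmi \
    params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node3" \
    op monitor interval=3s timeout=20s \
    meta target-role=Started failure-timeout=30s
primitive p_stonith_node4 stonith:external/ipmi \
    params hostname=node4 ipaddr=10.100.0.8 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node4" \
    op monitor interval=3s timeout=20s \
    meta target-role=Started failure-timeout=30s
```

Once this is loaded, "stonith_admin --list node1" should report only
p_stonith_node1 as a device able to fence node1.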
Best regards,
Andreas
>
> In the cluster.log, I found the following entry:
> Apr 17 11:02:41 [20473] node2 stonithd: debug:
> stonith_action_create: Initiating action reboot for agent
> fence_legacy (target=node1)
> Apr 17 11:02:41 [20473] node2 stonithd: debug: make_args:
> Performing reboot action for node 'node1' as 'port=node1'
> Apr 17 11:02:41 [20473] node2 stonithd: debug:
> internal_stonith_action_execute: forking
> Apr 17 11:02:41 [20473] node2 stonithd: debug:
> internal_stonith_action_execute: sending args
> Apr 17 11:02:41 [20473] node2 stonithd: debug:
> stonith_device_execute: Operation reboot for node node1 on
> p_stonith_node3 now running with pid=113092, timeout=60s
>
> node1 is reset through the stonith primitive of node3. Why?
>
> My stonith config:
> primitive p_stonith_node1 stonith:external/ipmi \
>     params hostname=node1 ipaddr=10.100.0.2 passwd_method=file \
>         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
>         priv=OPERATOR \
>     op monitor interval=3s timeout=20s \
>     meta target-role=Started failure-timeout=30s
> primitive p_stonith_node2 stonith:external/ipmi \
>     op monitor interval=3s timeout=20s \
>     params hostname=node2 ipaddr=10.100.0.4 passwd_method=file \
>         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
>         priv=OPERATOR \
>     meta target-role=Started failure-timeout=30s
> primitive p_stonith_node3 stonith:external/ipmi \
>     op monitor interval=3s timeout=20s \
>     params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
>         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
>         priv=OPERATOR \
>     meta target-role=Started failure-timeout=30s
> primitive p_stonith_node4 stonith:external/ipmi \
>     op monitor interval=3s timeout=20s \
>     params hostname=node4 ipaddr=10.100.0.8 passwd_method=file \
>         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
>         priv=OPERATOR \
>     meta target-role=Started failure-timeout=30s
>
> Can somebody help me?
> Thanks!
>
> Regards,
> Thomas