[Pacemaker] Configuration for fence_kdump

Junko IKEDA tsukishima.ha at gmail.com
Sun Aug 5 23:07:02 EDT 2012


Hi,

Thank you for your kind explanation!
I tried the latest fence-agents-3.1.9.

# rpm -e fence-agents-3.1.5-10.el6.x86_64
# wget https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-3.1.9.tar.gz
# tar zxf fence-agents-3.1.9.tar.gz
# cd fence-agents-3.1.9
# ./configure --prefix=/usr --libdir=/usr/lib64 --sysconfdir=/etc --localstatedir=/var
# make install

# echo "option=metadata" > foo
# cat foo | fence_kdump
[error]: action 'off' requires nodename

# echo "action=metadata" > foo
# cat foo | fence_kdump
<?xml version="1.0" ?>
<resource-agent name="fence_kdump" shortdesc="Fence agent for use with kdump">
<longdesc>The fence_kdump agent is intended to be used with with kdump service.</longdesc>
....
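
The two runs above differ only in the key that is piped in. A minimal sketch of why "option=metadata" ends up as an 'off' request: it assumes the agent reads key=value lines from stdin, ignores keys it does not know, and falls back to "off" when no "action" key is given. The function name parse_action is hypothetical, and the default is inferred only from the error message above, not taken from the fence_kdump source.

```shell
# Hypothetical helper: read key=value lines from stdin and print the action.
# Falls back to "off" when no "action=" line is present (assumed default,
# inferred from "[error]: action 'off' requires nodename").
parse_action() {
    action="off"
    while IFS='=' read -r key value; do
        case "$key" in
            action) action="$value" ;;
        esac
    done
    printf '%s\n' "$action"
}

printf 'option=metadata\n' | parse_action   # "option" is not recognised
printf 'action=metadata\n' | parse_action   # "action" selects the operation
```

With this model the first call prints "off" and the second prints "metadata", which matches the behaviour seen with fence-agents-3.1.9.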

fence_baytech, which you mentioned on Bugzilla, also supports "action" now.

# echo "action=metadata" > foo
# cat foo | fence_baytech
<?xml version="1.0" ?>
<resource-agent name="fence_baytech" shortdesc="I/O Fencing agent for Baytech RPC switches in combination with a Cyclades Terminal Server" >
<longdesc>
...


On the Pacemaker side, I manually changed the value of STONITH_ATTR_ACTION_OP to "action" for now.
I think it works well :)

# cd ../beekhof/
# git pull
# git show

commit ca505c05b11e2931764653bf675ce948feccce5e
Author: Andrew Beekhof <andrew at beekhof.net>
Date:   Fri Aug 3 12:34:16 2012 +1000

    Low: PE: Supress 'multi active' error for fencing devices on unclean nodes

# vim ./include/crm/fencing/internal.h

//#define STONITH_ATTR_ACTION_OP   "option" /* To be replaced by 'action' at some point */
#define STONITH_ATTR_ACTION_OP   "action" /* To be replaced by 'action' at some point */
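
The same one-line change could also be made without an editor. This is only a sketch: it works on a scratch file that stands in for include/crm/fencing/internal.h and assumes the define sits on a single line, so check that the pattern matches your checkout before pointing it at the real header.

```shell
# Scratch file standing in for include/crm/fencing/internal.h (assumption:
# the define is on one line, as in the tree shown above).
hdr=$(mktemp)
cat > "$hdr" <<'EOF'
#define STONITH_ATTR_ACTION_OP   "option" /* To be replaced by 'action' at some point */
EOF

# Replace the quoted value "option" with "action" on the define line only;
# the trailing comment is left untouched.
sed 's/^\(#define STONITH_ATTR_ACTION_OP[[:space:]]*\)"option"/\1"action"/' "$hdr" > "$hdr.new"

grep STONITH_ATTR_ACTION_OP "$hdr.new"
```

Unlike the vim edit, this rewrites the line in place instead of keeping the old define as a comment.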

# make install

# rm -f /var/lib/pacemaker/cib/*
# rm -f /var/lib/pacemaker/pengine/*
# logrotate -f /etc/logrotate.conf
# service corosync start
# service pacemaker start

# cat /home/crm/trac2051-kdump.crm

property no-quorum-policy="ignore" \
        stonith-enabled="true" \
        startup-fencing="false" \
        stonith-timeout="120s" \
        crmd-transition-delay="2s"

rsc_defaults \
        resource-stickiness="INFINITY" \
        migration-threshold="1"

primitive stonith-1 stonith:fence_kdump \
        params \
        pcmk_host_check="static-list" \
        pcmk_host_list="bl460g6c" \
        pcmk_reboot_action="off" \
        pcmk_monitor_action="metadata" \
        nodename=bl460g6c \
        timeout=180

primitive stonith-2 stonith:fence_kdump \
        params \
        pcmk_host_check="static-list" \
        pcmk_host_list="bl460g6d" \
        pcmk_reboot_action="off" \
        pcmk_monitor_action="metadata" \
        nodename=bl460g6d \
        timeout=180

location location-1 stonith-1 \
        rule -INFINITY: #uname eq bl460g6c
location location-2 stonith-2 \
        rule -INFINITY: #uname eq bl460g6d
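
The two primitives and their location rules follow one pattern: one fence_kdump device per node, kept off the node it is meant to fence. A hedged sketch that prints the same stanzas from a list of node names; the function name gen_kdump_stonith is hypothetical and the parameter values are copied from the configuration above.

```shell
# Hypothetical generator: emit one fence_kdump primitive and one location
# constraint per node, mirroring the hand-written stanzas above.
gen_kdump_stonith() {
    i=1
    for node in "$@"; do
        cat <<EOF
primitive stonith-$i stonith:fence_kdump \\
        params \\
        pcmk_host_check="static-list" \\
        pcmk_host_list="$node" \\
        pcmk_reboot_action="off" \\
        pcmk_monitor_action="metadata" \\
        nodename=$node \\
        timeout=180

location location-$i stonith-$i \\
        rule -INFINITY: #uname eq $node

EOF
        i=$((i + 1))
    done
}

gen_kdump_stonith bl460g6c bl460g6d
```

The property and rsc_defaults sections are not generated here and would still need to be written by hand.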





# crm configure load update trac2051-kdump.crm

# crm_mon -1
============
Last updated: Mon Aug  6 11:14:18 2012
Last change: Mon Aug  6 11:13:18 2012 via cibadmin on bl460g6c
Stack: corosync
Current DC: bl460g6d (2) - partition with quorum
Version: 1.1.7-e986274
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ bl460g6c bl460g6d ]

 stonith-1      (stonith:fence_kdump):  Started bl460g6d
 stonith-2      (stonith:fence_kdump):  Started bl460g6c





# ls -l /var/crash/; date
total 0
Mon Aug  6 11:13:57 JST 2012

# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger

# tail -f /var/log/ha-log
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: pe_fence_node: Node bl460g6c will be fenced because the node is no longer part of the cluster
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: determine_online_status: Node bl460g6c is unclean
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: custom_action: Action stonith-2_stop_0 on bl460g6c is unrunnable (offline)
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: custom_action: Action stonith-2_stop_0 on bl460g6c is unrunnable (offline)
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: stage6: Scheduling Node bl460g6c for STONITH
Aug  6 11:14:50 bl460g6d pengine[3605]:   notice: LogActions: Stop stonith-2 (bl460g6c)
Aug  6 11:14:50 bl460g6d pengine[3605]:  warning: process_pe_message: Transition 2: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pacemaker/pengine/pe-warn-0.bz2
Aug  6 11:14:50 bl460g6d crmd[3606]:   notice: te_fence_node: Executing reboot fencing operation (9) on bl460g6c (timeout=120000)
Aug  6 11:16:20 bl460g6d stonith-ng[3602]:   notice: log_operation: Operation 'reboot' [3644] (call 0 from ebe2612f-0451-4d6a-bf29-9f8323005b2b) for host 'bl460g6c' with device 'stonith-1' returned: 0
Aug  6 11:16:20 bl460g6d stonith-ng[3602]:   notice: remote_op_done: Operation reboot of bl460g6c by bl460g6d for bl460g6d[ebe2612f-0451-4d6a-bf29-9f8323005b2b]: OK
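
The log timestamps give the duration of the fencing operation: it was issued at 11:14:50 and reported OK at 11:16:20. A small sketch of that arithmetic; the helper to_seconds is hypothetical, and it only handles timestamps from the same day.

```shell
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
to_seconds() {
    IFS=: read -r h m s <<EOF
$1
EOF
    # Strip one leading zero so values like "08" are not parsed as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

start=$(to_seconds 11:14:50)   # te_fence_node: Executing reboot fencing operation
end=$(to_seconds 11:16:20)     # remote_op_done: ... OK
elapsed=$((end - start))
echo "fencing took ${elapsed}s"
```

The result is 90 seconds, which is below the timeout=120000 shown in the te_fence_node line (the same as stonith-timeout="120s" if that value is in milliseconds).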

# ls -l /var/crash/; date
total 4
drwxr-xr-x 2 root root 4096 Aug  6 11:16 127.0.0.1-2012-08-06-11:16:19
Mon Aug  6 11:20:08 JST 2012
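
Comparing the two "ls -l /var/crash/" listings by eye works for a single test. For repeated tests, one option is to list only the entries that are newer than a marker file. This is a sketch on a scratch directory rather than /var/crash; the helper name new_dumps is hypothetical, and -mindepth/-maxdepth are assumed to be available in the local find.

```shell
# Hypothetical helper: print entries directly under $1 that were modified
# after the marker file $2.
new_dumps() {
    find "$1" -mindepth 1 -maxdepth 1 -newer "$2"
}

# Demonstration on a scratch directory instead of /var/crash.
crash_dir=$(mktemp -d)
marker=$(mktemp)
touch -t 201208061113.57 "$marker"                  # time of the "before" listing
mkdir "$crash_dir/127.0.0.1-2012-08-06-11:16:19"    # created now, so newer than the marker
new_dumps "$crash_dir" "$marker"
```

On a real node the marker would be touched just before "echo c > /proc/sysrq-trigger" and the check run after the node comes back.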



Thanks,
Junko



