[Pacemaker] Configuration for fence_kdump
Andrew Beekhof
andrew at beekhof.net
Fri Aug 3 06:56:21 UTC 2012
On Thu, Aug 2, 2012 at 4:17 PM, Junko IKEDA <tsukishima.ha at gmail.com> wrote:
> Hi,
>
> I'm trying to run fence_kdump with Pacemaker 1.1.7.
> There are only two actions, off/metadata, for fence_kdump,
> so I set pcmk_monitor_action="metadata" to substitute metadata for monitor.
>
> # fence_kdump -o metadata
There are certainly some things about the RHCS fencing agents that are
not ideal.
One of those problems is inconsistency in how the action is specified.
Humans set it with -o, but fenced (and stonithd) specify it with
name/value pairs passed via stdin,
i.e. action=metadata

Except that some agents only support 'action=' and some only support
the older 'option='.
I found this out the hard way recently (
https://bugzilla.redhat.com/show_bug.cgi?id=837174 ) and hopefully the
fix will make its way into a release soon.

Unfortunately for you, Pacemaker tries to use 'option=' (because my
understanding was that all agents supported this), which fence_kdump
doesn't support.

You can see the problem by running it the way Pacemaker does:

    echo "option=metadata" > foo
    cat foo | fence_kdump

If you want to teach Pacemaker to use action=, change the value of
STONITH_ATTR_ACTION_OP to "action".
I'll make the same change for 1.1.8.
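
To make the 'action=' vs 'option=' mismatch concrete, here is a minimal Python sketch of how an agent might read name/value pairs from stdin. It is not the actual fence_kdump or fence-agents code: the function names are hypothetical, and the fallback to "off" is an assumption based on the behaviour reported in the quoted message below.

```python
import io


def parse_agent_stdin(stream):
    """Collect name=value lines, the form fenced/stonithd pass on stdin."""
    options = {}
    for raw in stream:
        line = raw.strip()
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and anything without '='
        options[name.strip()] = value.strip()
    return options


def tolerant_action(options, default="off"):
    """Accept the newer 'action' key, falling back to the older 'option'."""
    for key in ("action", "option"):
        if key in options:
            return options[key]
    return default


def strict_action(options, default="off"):
    """Model an agent that only understands 'action='.

    'option=' is silently ignored, so the default action is used instead
    (assumed here to be 'off').
    """
    return options.get("action", default)


if __name__ == "__main__":
    sent = parse_agent_stdin(io.StringIO("option=metadata\nnodename=bl460g6c\n"))
    print("tolerant agent runs:", tolerant_action(sent))
    print("strict agent runs:  ", strict_action(sent))
```

With 'option=metadata' on stdin, the strict variant never sees the requested action and falls back to its default, which would match a monitor that ends up waiting for a kdump message and timing out.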
> <?xml version="1.0" ?>
> <resource-agent name="fence_kdump" shortdesc="Fence agent for use with kdump">
> <longdesc>The fence_kdump agent is intended to be used with with kdump
> service.</longdesc>
> <parameters>
> <parameter name="nodename" unique="1" required="0">
> <getopt mixed="-n, --nodename" />
> <content type="string" />
> <shortdesc lang="en">Name or IP address of node to be
> fenced</shortdesc>
> </parameter>
>
> <snip>
>
> <parameter name="usage" unique="1" required="0">
> <getopt mixed="-h, --help" />
> <content type="boolean" />
> <shortdesc lang="en">Print usage</shortdesc>
> </parameter>
> </parameters>
> <actions>
> <action name="off" />
> <action name="metadata" />
> </actions>
> </resource-agent>
>
>
> Here is my configuration;
>
> # cat fence_kdump.crm
> property no-quorum-policy="ignore" \
> stonith-enabled="true" \
> startup-fencing="false" \
> stonith-timeout="120s" \
> crmd-transition-delay="2s"
>
> rsc_defaults \
> resource-stickiness="INFINITY" \
> migration-threshold="1"
>
> primitive stonith-1 stonith:fence_kdump \
> params \
> pcmk_host_check="dinamic-list" \
> pcmk_monitor_action="metadata" \
> nodename=bl460g6c \
> timeout=10
>
> primitive stonith-2 stonith:fence_kdump \
> params \
> pcmk_host_check="dinamic-list" \
> pcmk_monitor_action="metadata" \
> nodename=bl460g6d \
> timeout=10
>
> location location-1 stonith-1 \
> rule -INFINITY: #uname eq bl460g6c
> location location-2 stonith-2 \
> rule -INFINITY: #uname eq bl460g6d
>
>
>
> Unfortunately, fence_kdump has failed at its start procedure.
>
> # crm_mon -1
> ============
> Last updated: Thu Aug 2 14:52:30 2012
> Last change: Thu Aug 2 14:50:27 2012 via cibadmin on bl460g6c
> Stack: corosync
> Current DC: bl460g6d (2) - partition with quorum
> Version: 1.1.7-e986274
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ bl460g6c bl460g6d ]
>
>
> Failed actions:
> stonith-2_start_0 (node=bl460g6c, call=12, rc=1, status=Error):
> unknown error
> stonith-1_start_0 (node=bl460g6d, call=12, rc=1, status=Error):
> unknown error
>
>
>
> # grep stonith-ng /var/log/ha-log
> Aug 2 14:49:45 bl460g6d stonith-ng[26177]: notice: crm_log_args:
> crm_log_args: Invoked: /usr/libexec/pacemaker/stonithd
> Aug 2 14:49:45 bl460g6d stonith-ng[26177]: info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Aug 2 14:49:45 bl460g6d stonith-ng[26177]: notice:
> crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Aug 2 14:49:46 bl460g6d stonith-ng[26177]: notice: setup_cib:
> Watching for stonith topology changes
> Aug 2 14:50:30 bl460g6d stonith-ng[26177]: notice:
> stonith_device_register: Added 'stonith-1' to the device list (1
> active devices)
> Aug 2 14:50:40 bl460g6d stonith-ng[26177]: notice: log_operation:
> Operation 'monitor' [26201] for device 'stonith-1' returned: -1001
> Aug 2 14:50:40 bl460g6d stonith-ng[26177]: warning: log_operation:
> stonith-1: [debug]: waiting for message from '192.168.133.11'
> Aug 2 14:50:40 bl460g6d stonith-ng[26177]: warning: log_operation:
> stonith-1: [debug]: timeout after 10 seconds
>
>
> It seems that the default "off" action is called during the start (monitor_0) operation.
> Have I misunderstood something in my configuration, especially around
> "pcmk_monitor_action"?
> I was wondering if you could give me some advice.
>
>
> By the way, I created cluster.conf manually.
>
> # cat /etc/cluster/cluster.conf
> <?xml version="1.0" ?>
> <cluster name="ossvert" config_version="1" >
> <clusternodes>
> <clusternode name="bl460g6c" nodeid="1">
> <fence>
> </fence>
> </clusternode>
> <clusternode name="bl460g6d" nodeid="2">
> <fence>
> </fence>
> </clusternode>
> </clusternodes>
> <fencedevices>
> <fencedevice name="kdump" agent="fence_kdump" />
> </fencedevices>
> <rm>
> </rm>
> </cluster>
>
> # rpm -qa | grep fence-agents
> fence-agents-3.1.5-10.el6.x86_64
>
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.2 (Santiago)
>
> Regard,
> Junko IKEDA
>
> NTT DATA INTELLILINK CORPORATION
>