[Pacemaker] Problem with configuring stonith rcd_serial

Eberhard Kuemmerle E.Kuemmerle at fz-juelich.de
Thu Oct 28 13:19:07 UTC 2010

On 27 Oct 2010 11:52, Dejan Muhamedagic wrote:
> Hi,
> On Tue, Oct 26, 2010 at 09:33:17AM +0200, Eberhard Kuemmerle wrote:
>> Hi,
>> I try to configure stonith and get an error message that I don't understand:
>> crm(live)# configure primitive stonith-P stonith::rcd_serial params
>> hostlist="node1 node2" ttydev="/dev/ttyS0" msduration="2000"
>> dtr|rts="rts"  op monitor interval="60s"
>> element nvpair: Relax-NG validity error : Type ID doesn't allow value
>> 'stonith-P-instance_attributes-dtr|rts'
>> Relax-NG validity error : Element nvpair failed to validate attributes
>> element nvpair: Relax-NG validity error : Invalid attribute id for
>> element nvpair
>> Relax-NG validity error : Extra element nvpair in interleave
>> element nvpair: Relax-NG validity error : Element instance_attributes
>> failed to validate content
>> element cib: Relax-NG validity error : Element cib failed to validate
>> content
>> crm_verify[5810]: 2010/10/26_09:20:31 ERROR: main: CIB did not pass
>> DTD/schema validation
>> Errors found during check: config not valid
>> If I remove the parameter dtr|rts="rts", the error is:
>> crm(live)# configure primitive stonith-P stonith::rcd_serial params
>> hostlist="node1 node2" ttydev="/dev/ttyS0" msduration="2000"
>> ERROR: stonith-P: required parameter dtr|rts not defined
>> so the parameter name dtr|rts seems to be ok.
> The shell builds ids (which you see up there) appending instance
> attribute names. The name dtr|rts contains an invalid character
> (|) for the XML ID attribute type. Need to fix that.
> In the meantime, you can either use cibadmin to define this
> primitive, or edit the xml after defining the stonith resource:
> crm(live)configure# primitive stonith-P stonith::rcd_serial params ...
> crm(live)configure# edit xml stonith-P
> Find the dtr|rts nvpair and replace "|" with "_" in the id
> attribute. The shell may ask you if you wanted to edit again,
> just answer no.
> Thanks,
> Dejan
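
As an illustration of the id problem described in the quoted reply, here is a minimal Python sketch. It assumes the shell forms an nvpair id as "<resource>-instance_attributes-<name>" (as the error message suggests) and that the ID type only accepts XML NCName characters. The helper names `is_valid_xml_id` and `make_nvpair_id` are hypothetical and are not part of the crm shell.

```python
import re

# ASCII subset of the XML NCName production (an assumption of this sketch;
# the real production also allows many non-ASCII characters).
NCNAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_xml_id(value):
    """Return True if value is usable as an XML ID (ASCII NCName only)."""
    return NCNAME_RE.fullmatch(value) is not None


def make_nvpair_id(resource, attr_name):
    """Hypothetical helper: build an nvpair id and replace every character
    that is not allowed in an ID with '_'."""
    raw = f"{resource}-instance_attributes-{attr_name}"
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", raw)
    # An ID must not start with a digit, '.' or '-'.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


if __name__ == "__main__":
    print(is_valid_xml_id("stonith-P-instance_attributes-dtr|rts"))
    print(make_nvpair_id("stonith-P", "dtr|rts"))
```

This only covers the id attribute; the name attribute of the nvpair would still carry "dtr|rts" unchanged.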
Hi Dejan,

Thank you for your answer. Configuring with cibadmin worked; the XML is:

      <clone id="stonith">
        <meta_attributes id="stonith-meta_attributes">
          <nvpair id="stonith-meta_attributes-globally-unique" name="globally-unique" value="false"/>
          <nvpair id="stonith-meta_attributes-clone-max" name="clone-max" value="2"/>
          <nvpair id="stonith-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
          <nvpair id="stonith-meta_attributes-target-role" name="target-role" value="Stopped"/>
        </meta_attributes>
        <primitive class="stonith" id="stonith-P" type="rcd_serial">
          <instance_attributes id="stonith-P-instance_attributes">
            <nvpair id="stonith-P-instance_attributes-hostlist" name="hostlist" value="node1 node2"/>
            <nvpair id="stonith-P-instance_attributes-ttydev" name="ttydev" value="/dev/ttyS0"/>
            <nvpair id="stonith-P-instance_attributes-dtr_rts" name="dtr|rts" value="rts"/>
            <nvpair id="stonith-P-instance_attributes-msduration" name="msduration" value="2000"/>
          </instance_attributes>
          <operations>
            <op id="stonith-P-monitor-60s" interval="60s" name="monitor"/>
          </operations>
        </primitive>
      </clone>

But when I start the resource stonith, I get the following errors in
the log:

Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error:
Entity: line 1: parsererror : Specification mandate value for attribute dtr
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error: se"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2" dtr
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error:
Entity: line 1: parsererror : attributes construct error
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error: se"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2" dtr
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error:
Entity: line 1: parsererror : Couldn't find end of Start Tag attributes
line 1
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML Error: se"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2" dtr
Oct 28 13:20:01 node1 crmd: [5229]: ERROR: crm_xml_err: XML
Oct 28 13:12:48 node1 crmd: [5229]: WARN: string2xml: Parsing failed
(domain=1, level=3, code=73): Couldn't find end of Start Tag attributes
line 1
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: string2xml: Couldn't fully
parse 3961 chars: <crm_xml><transition_graph cluster-delay="60s"
stonith-timeout="60s" failed-stop-offset="
INFINITY" failed-start-offset="INFINITY" batch-limit="30"
transition_id="39"><synapse id="0"><action_set><rsc_op id="136"
operation="start" operation_key="stonith-P:0_start_0" on
_node="node2" on_node_uuid="node2"><primitive id="stonith-P:0"
long-id="stonith:stonith-P:0" class="stonith"
type="rcd_serial"/><attributes CRM_meta_clone="0" CRM_meta_clone_ma
x="2" CRM_meta_clone_node_max="1" CRM_meta_globally_unique="false"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2"
dtr|rts="rts" hostlist="node1 node2"
ttydev="/dev/ttyS0"/></rsc_op></action_set><inputs><trigger><pseudo_event id="140"
operation="start" operation_key="stonith_start_0"/></trigger></inputs></syna
pse><synapse id="1"><action_set><rsc_op id="137" operation="monitor"
operation_key="stonith-P:0_monitor_60000" on_node="node2"
on_node_uuid="node2"><primitive id="stonith-P:0"
long-id="stonith:stonith-P:0" class="stonith"
type="rcd_serial"/><attributes CRM_meta_clone="0" CRM_meta_clone_max="2"
CRM_meta_clone_node_max="1" CRM_meta_globally_unique="false
" CRM_meta_interval="60000" CRM_meta_name="monitor"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2"
dtr|rts="rts" hostlist="node1 node2" msduration="20
00" ttydev="/dev/ttyS0"/></rsc_op></action_set><inputs><trigger><rsc_op
id="136" operation="start" operation_key="stonith-P:0_start_0"
on_node="node2" on_node_uuid="node2"/></t
rigger></inputs></synapse><synapse id="2"><action_set><rsc_op id="138"
operation="start" operation_key="stonith-P:1_start_0" on_node="node1"
on_node_uuid="node1"><primitive id=
"stonith-P:1" long-id="stonith:stonith-P:1" class="stonith"
type="rcd_serial"/><attributes CRM_meta_clone="1" CRM_meta_clone_max="2"
CRM_meta_clone_node_max="1" CRM_meta_globally
_unique="false" CRM_meta_notify="false" CRM_meta_timeout="20000"
crm_feature_set="3.0.2" dtr|
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial <crm_xml >
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial   <transition_graph cluster-delay="60s" stonith-timeout="60s"
failed-stop-offset="INFINITY" failed-start-offset="INFINITY"
batch-limit="30" transition_id="39" >
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial     <synapse id="0" >
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial       <action_set >
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial         <rsc_op id="136" operation="start"
operation_key="stonith-P:0_start_0" on_node="node2" on_node_uuid="node2" >
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial           <primitive id="stonith-P:0"
long-id="stonith:stonith-P:0" class="stonith" type="rcd_serial" />
Oct 28 13:12:48 node1 lrmd: [5226]: ERROR: crm_abort: crm_strdup_fn:
Triggered assert at utils.c:964 : src != NULL
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial           <attributes CRM_meta_clone="0" CRM_meta_clone_max="2"
CRM_meta_clone_node_max="1" CRM_meta_globally_unique="false"
CRM_meta_notify="false" CRM_meta_timeout="20000" crm_feature_set="3.0.2" />
Oct 28 13:12:48 node1 lrmd: [5226]: ERROR: crm_strdup_fn: Could not
perform copy at st_client.c:514 (stonith_api_device_metadata)
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial         </rsc_op>
Oct 28 13:12:48 node1 lrmd: [5226]: WARN: stonith_api_device_metadata:
no short description in rcd_serial's metadata.
Oct 28 13:12:48 node1 crmd: [5229]: ERROR: log_data_element: string2xml:
Partial       </action_set>
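
The parser errors above look consistent with the instance attribute name being copied into the transition graph as an XML attribute name, where "|" is not allowed. A small Python sketch of that difference, using the standard library parser rather than the libxml2 parser crmd uses; the helper `parses` and the shortened element strings are illustrative assumptions:

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# In the CIB, "dtr|rts" is only an attribute *value*, which is well-formed.
cib_form = '<nvpair id="x" name="dtr|rts" value="rts"/>'

# In the transition graph it becomes an attribute *name*, which is not.
graph_form = '<attributes crm_feature_set="3.0.2" dtr|rts="rts"/>'

# With the renamed parameter the same element is well-formed again.
renamed_form = '<attributes crm_feature_set="3.0.2" dtr_rts="rts"/>'

if __name__ == "__main__":
    for label, text in [("cib", cib_form), ("graph", graph_form),
                        ("renamed", renamed_form)]:
        print(label, parses(text))
```

This would explain why cibadmin accepted the configuration while starting the resource failed.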

I also tried editing /usr/lib64/stonith/plugins/stonith2/rcd_serial.so
directly, replacing 'dtr|rts' with 'dtr_rts' there.
After that, I could configure AND start the resource with the following
config (with crm configure):

primitive stonith-P stonith:rcd_serial \
        params hostlist="node1 node2" ttydev="/dev/ttyS0" dtr_rts="rts" msduration="2000" \
        op monitor interval="60s"
clone stonith stonith-P \
        meta globally-unique="false" clone-max="2" clone-node-max="1"

In XML, the dtr/rts parameter is now:

            <nvpair id="stonith-P-instance_attributes-dtr_rts" name="dtr_rts" value="rts"/>

With that, I got fewer errors, but still the following:

Oct 28 10:54:23 node1 pengine: [5228]: notice: LogActions: Start
Oct 28 10:54:23 node1 pengine: [5228]: notice: LogActions: Start
Oct 28 10:54:23 node1 lrmd: [5226]: notice: lrmd_rsc_new(): No
lrm_rprovider field in message
Oct 28 10:54:23 node1 lrmd: [5226]: info: rsc:stonith-P:1:89: probe
Oct 28 10:54:23 node1 stonith-ng: [5224]: notice: stonith_device_action:
Device stonith-P:1 not found
Oct 28 10:54:23 node1 lrmd: [5226]: info: rsc:stonith-P:1:90: start
Oct 28 10:54:23 node1 lrmd: [5226]: ERROR: crm_abort: crm_strdup_fn:
Triggered assert at utils.c:964 : src != NULL
Oct 28 10:54:23 node1 lrmd: [5226]: ERROR: crm_strdup_fn: Could not
perform copy at st_client.c:514 (stonith_api_device_metadata)
Oct 28 10:54:23 node1 lrmd: [5226]: WARN: stonith_api_device_metadata:
no short description in rcd_serial's metadata.
Oct 28 10:54:23 node1 lrmd: [5226]: info: stonithRA plugin: got
metadata: <?xml version="1.0"?>#012<!DOCTYPE resource-agent SYSTEM
"ra-api-1.dtd">#012<resource-agent name="rcd_serial">#012
<version>1.0</version>#012  <longdesc lang="en">#012RC Delayed Serial
STONITH Device#012This device can be constructed cheaply from readily
available components,#012with sufficient expertise and testing.#012See
README.rcd_serial for circuit diagram.#012#012  </longdesc>#012
<shortdesc lang="en"><!-- no value
--></shortdesc>#012<parameters><parameter name="hostlist" unique="1"
required="1"><content type="string" />#012<shortdesc
lang="en">#012Hostlist</shortdesc>#012<longdesc lang="en">#012The list
of hosts that the STONITH device
controls</longdesc>#012</parameter>#012<parameter name="ttydev"
unique="1" required="1"><content type="string" />#012<shortdesc
lang="en">#012TTY Device</shortdesc>#012<longdesc lang="en">#012The TTY
device used for connecting to the STONITH
device</longdesc>#012</parameter>#012<parameter name="dtr_rts"
unique="1" required="1"><content type="string" />#012<shortdesc
lang="en">#012dtr_rts</shortdesc>#012<longdesc lang="en">#012The
hardware handshaking technique to use with ttydev("dtr" or
"rts")</longdesc>#012</parameter>#012<parameter name="msduration"
unique="1" required="1"><content type="string" />#012<shortdesc
lang="en">#012msduration</shortdesc>#012<longdesc lang="en">#012The
delay duration (in milliseconds) between the assertion of the control
signal on ttydev and the closing of the reset
switch</longdesc>#012</parameter>#012</parameters>#012  <actions>#012
<action name="start"   timeout="15" />#012    <action name="stop"
timeout="15" />#012    <action name="status"  timeout="15" />#012
<action name="monitor" timeout="15" interval="15" start-delay="15"
/>#012    <action name="meta-data"  timeout="15" />#012  </actions>#012
<special tag="heartbeat">#012    <version>2.0</version>#012
Oct 28 10:54:23 node1 lrmd: [5226]: info: rsc:stonith-P:1:91: monitor
Oct 28 10:54:23 node1 stonith: rcd_serial device OK.
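
The metadata in the log above is hard to read because of the "#012" sequences. Assuming these are the syslog daemon's octal escapes for control characters (012 octal being a newline), a minimal Python sketch to undo them could look like this; `decode_syslog_escapes` is a hypothetical helper name:

```python
import re


def decode_syslog_escapes(text):
    """Turn '#NNN' octal escapes (e.g. '#012' for newline) back into the
    characters they stand for.

    Caveat: a literal '#' followed by three octal digits in the original
    message would be decoded as well.
    """
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


if __name__ == "__main__":
    sample = ('<resource-agent name="rcd_serial">#012'
              '<version>1.0</version>#012  <longdesc lang="en">')
    print(decode_syslog_escapes(sample))
```

Note that the logged metadata is cut off after the <special> section begins, so the decoded text is not a complete XML document.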

Despite that, I tried a stonith reset with that config and the modified
rcd_serial.so, but it failed:

Oct 28 10:58:56 node1 stonith-ng: [5224]: WARN: parse_host_line: Could
not parse (0 2): ** (process:24232): DEBUG: rcd_serial_set_config:called
Oct 28 10:58:56 node1 stonith-ng: [5224]: WARN: parse_host_line: Could
not parse (3 19): (process:24232): DEBUG: rcd_serial_set_config:called
Oct 28 10:58:56 node1 stonith-ng: [5224]: WARN: parse_host_line: Could
not parse (0 0):
Oct 28 10:58:56 node1 stonith-ng: [5224]: ERROR: log_operation:
Operation 'reboot' [24240] for host 'node2' with device 'stonith-P:1'
returned: 1 (call 0 from (null))
Oct 28 10:58:56 node1 pengine: [5228]: WARN: process_pe_message:
Transition 14: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-0.bz2

What can I do?

