[Pacemaker] 'crm configure edit' failed with "Timer expired"

Lonni J Friedman netllama at gmail.com
Thu Oct 18 13:25:25 EDT 2012


Both nodes can ssh to each other, SELinux is disabled, and there are
currently no iptables rules in effect, so I'm not sure why the systems
can't communicate.
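
In case it's useful, here's a rough way to double-check the cluster
transport itself (this assumes the corosync traffic is on eth1 and that
it's using the default UDP port, 5405):
##########
# netstat -ulpn | grep corosync
# tcpdump -ni eth1 udp port 5405
##########
The first should show corosync bound to its UDP port on each node; the
second should show traffic arriving from the peer.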

I looked at /var/log/cluster/corosync.log but didn't see any obvious
problems.  Would you like to see the logs from both nodes?
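
For reference, something like this should show whether a DC election was
ever attempted (the grep patterns are just my guess at what would be
logged):
##########
# grep -iE 'election|current dc' /var/log/cluster/corosync.log | tail
##########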

What else can I do to investigate & debug this?

thanks


On Wed, Oct 17, 2012 at 8:24 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> This is your problem:
>    Current DC: NONE
>
> All the "crm configure" commands try to talk to the DC, and you don't have one.
> Normally one would be elected quite quickly; you may have a
> network/firewall issue.
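>
> A quick way to check the membership layer (just a sketch; both tools
> ship with corosync/pacemaker):
>
> ##########
> # corosync-cfgtool -s
> # crm_node -l
> ##########
>
> If a ring is marked FAULTY, or only the local node is listed, the
> network is the place to look.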
>
> On Thu, Oct 18, 2012 at 10:37 AM, Lonni J Friedman <netllama at gmail.com> wrote:
>> I'm running Fedora 17 with pacemaker-1.1.8.  I just tried to make a
>> configuration change with crmsh, and it failed as follows:
>> ##########
>> # crm configure edit
>> Call cib_replace failed (-62): Timer expired
>> <null>
>> ERROR: could not replace cib
>> INFO: offending xml: <configuration>
>>         <crm_config>
>>                 <cluster_property_set id="cib-bootstrap-options">
>>                         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.8-2.fc16-394e906"/>
>>                         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="openais"/>
>>                         <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
>>                         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>>                         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>>                 </cluster_property_set>
>>         </crm_config>
>>         <nodes>
>>                 <node id="farm-ljf0" uname="farm-ljf0">
>>                         <instance_attributes id="nodes-farm-ljf0">
>>                                 <nvpair id="nodes-farm-ljf0-standby" name="standby" value="off"/>
>>                         </instance_attributes>
>>                 </node>
>>                 <node id="farm-ljf1" uname="farm-ljf1"/>
>>         </nodes>
>>         <resources>
>>                 <master id="FS0_Clone">
>>                         <meta_attributes id="FS0_Clone-meta_attributes">
>>                                 <nvpair id="FS0_Clone-meta_attributes-master-max" name="master-max" value="1"/>
>>                                 <nvpair id="FS0_Clone-meta_attributes-master-node-max" name="master-node-max" value="1"/>
>>                                 <nvpair id="FS0_Clone-meta_attributes-clone-max" name="clone-max" value="2"/>
>>                                 <nvpair id="FS0_Clone-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
>>                                 <nvpair id="FS0_Clone-meta_attributes-notify" name="notify" value="true"/>
>>                         </meta_attributes>
>>                         <primitive class="ocf" id="FS0" provider="linbit" type="drbd">
>>                                 <instance_attributes id="FS0-instance_attributes">
>>                                         <nvpair id="FS0-instance_attributes-drbd_resource" name="drbd_resource" value="r0"/>
>>                                 </instance_attributes>
>>                                 <operations>
>>                                         <op id="FS0-monitor-10s" interval="10s" name="monitor" role="Master"/>
>>                                         <op id="FS0-monitor-30s" interval="30s" name="monitor" role="Slave"/>
>>                                 </operations>
>>                         </primitive>
>>                 </master>
>>                 <group id="g_services">
>>                         <primitive class="ocf" id="ClusterIP" provider="heartbeat" type="IPaddr2">
>>                                 <instance_attributes id="ClusterIP-instance_attributes">
>>                                         <nvpair id="ClusterIP-instance_attributes-ip" name="ip" value="10.31.99.8"/>
>>                                         <nvpair id="ClusterIP-instance_attributes-cidr_netmask" name="cidr_netmask" value="22"/>
>>                                         <nvpair id="ClusterIP-instance_attributes-nic" name="nic" value="eth1"/>
>>                                 </instance_attributes>
>>                                 <operations>
>>                                         <op id="ClusterIP-monitor-10s" interval="10s" name="monitor"/>
>>                                 </operations>
>>                                 <meta_attributes id="ClusterIP-meta_attributes">
>>                                         <nvpair id="ClusterIP-meta_attributes-target-role" name="target-role" value="Started"/>
>>                                 </meta_attributes>
>>                         </primitive>
>>                         <primitive class="ocf" id="FS0_drbd" provider="heartbeat" type="Filesystem">
>>                                 <instance_attributes id="FS0_drbd-instance_attributes">
>>                                         <nvpair id="FS0_drbd-instance_attributes-device" name="device" value="/dev/drbd0"/>
>>                                         <nvpair id="FS0_drbd-instance_attributes-directory" name="directory" value="/mnt/sdb1"/>
>>                                         <nvpair id="FS0_drbd-instance_attributes-fstype" name="fstype" value="xfs"/>
>>                                 </instance_attributes>
>>                                 <meta_attributes id="FS0_drbd-meta_attributes">
>>                                         <nvpair id="FS0_drbd-meta_attributes-target-role" name="target-role" value="Started"/>
>>                                 </meta_attributes>
>>                         </primitive>
>>                         <primitive class="systemd" id="FS0_nfs" type="nfs-server">
>>                                 <operations>
>>                                         <op id="FS0_nfs-monitor-10s" interval="10s" name="monitor"/>
>>                                 </operations>
>>                                 <meta_attributes id="FS0_nfs-meta_attributes">
>>                                         <nvpair id="FS0_nfs-meta_attributes-target-role" name="target-role" value="Started"/>
>>                                 </meta_attributes>
>>                         </primitive>
>>                 </group>
>>         </resources>
>>         <constraints>
>>                 <rsc_colocation id="fs0_on_drbd" rsc="g_services" score="INFINITY" with-rsc="FS0_Clone" with-rsc-role="Master"/>
>>                 <rsc_order first="FS0_Clone" first-action="promote" id="FS0_drbd-after-FS0" score="INFINITY" then="g_services" then-action="start"/>
>>         </constraints>
>> </configuration>
>> ##########
>>
>> I'm confused why this failed.  In fact, no matter what I try to
>> change, it always fails in the same fashion when I attempt to save the
>> changes.
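>>
>> One thing I can try, to separate crmsh from the cib daemon itself
>> (cibadmin -Q is just a read-only query of the CIB):
>> ##########
>> # cibadmin -Q | head
>> ##########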
>>
>> "crm status" currently shows the following:
>> #######
>> Last updated: Wed Oct 17 16:36:45 2012
>> Last change: Tue Oct 16 14:23:18 2012 via cibadmin on farm-ljf1
>> Stack: openais
>> Current DC: NONE
>> 2 Nodes configured, 2 expected votes
>> 5 Resources configured.
>>
>>
>> OFFLINE: [ farm-ljf0 farm-ljf1 ]
>> ######
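>>
>> Since both nodes show OFFLINE, it's probably also worth confirming that
>> the daemons are running at all (assuming the stock systemd units on
>> Fedora 17):
>> ##########
>> # systemctl status corosync pacemaker
>> ##########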
>>
>> help?



