[Pacemaker] Pacemaker/DRBD troubles
David Parker
dparker at utica.edu
Tue Sep 24 15:30:18 UTC 2013
Thanks, Emmanuel. After some trial and error, I ended up changing the
order constraints, and that seemed to solve the problem. In hindsight,
the original "ip-before-drbd" ordering appears to have created a
deadlock: the colocation only allowed the virtual IP to run on the DRBD
master, while that ordering made the promotion wait for the IP, so
neither could ever happen. This new configuration works:
<constraints>
  <rsc_colocation id="drbd-nfs-ha" rsc="ms-drbd_r0" rsc-role="Master"
                  score="INFINITY" with-rsc="nfs_resources"/>
  <rsc_order first="ms-drbd_r0" first-action="promote" id="drbd-before-nfs"
             score="INFINITY" then="nfs_fs" then-action="start"/>
  <rsc_order first="nfs_fs" first-action="start" id="fs-before-ip"
             score="INFINITY" then="nfs_ip" then-action="start"/>
  <rsc_order first="nfs_ip" first-action="start" id="ip-before-nfs"
             score="INFINITY" then="nfs" then-action="start"/>
</constraints>
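For the record, the failure mode can be modeled outside the cluster. The snippet below is purely illustrative, not a Pacemaker tool: it collects the (first, then) pairs from the old rsc_order constraints, adds one extra edge of my own devising to stand in for the colocation (the NFS group could only start on the promoted DRBD instance), and runs a depth-first search that finds the resulting cycle.

```python
import xml.etree.ElementTree as ET

# Order constraints from the original (broken) configuration. The
# ip-before-drbd rule made the promote wait for nfs_ip to start.
BROKEN = """
<constraints>
  <rsc_order first="nfs_ip" first-action="start" id="ip-before-drbd"
             score="INFINITY" then="ms-drbd_r0" then-action="promote"/>
  <rsc_order first="ms-drbd_r0" first-action="promote"
             id="drbd-before-nfs" score="INFINITY" then="nfs_fs"
             then-action="start"/>
  <rsc_order first="nfs_fs" first-action="start" id="fs-before-nfs"
             score="INFINITY" then="nfs" then-action="start"/>
</constraints>
"""

def order_edges(xml_text, extra_edges=()):
    """Collect (first, then) pairs from rsc_order elements, plus any
    manually supplied edges (e.g. ones implied by a colocation)."""
    root = ET.fromstring(xml_text)
    edges = [(o.get("first"), o.get("then")) for o in root.iter("rsc_order")]
    return edges + list(extra_edges)

def has_cycle(edges):
    """Plain depth-first search for a cycle in the dependency graph."""
    graph = {}
    for first, then in edges:
        graph.setdefault(first, []).append(then)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY or (state == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False
    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

# Modeling assumption: colocating nfs_resources with the Master role means
# the group members effectively depend on the promote, hence this edge.
implied = [("ms-drbd_r0", "nfs_ip")]
print(has_cycle(order_edges(BROKEN, implied)))   # -> True
```

Dropping the old ip-before-drbd constraint, as in the configuration above, removes the back edge, and the same search no longer finds a cycle.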
On Mon, Sep 23, 2013 at 6:06 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> I'm not sure if this is the problem, but I think you only need one order
> constraint, like this:
>
> <rsc_order first="ms-drbd_r0" first-action="promote"
> id="drbd-before-nfsgroup" score="INFINITY" then="nfs_resources"
> then-action="start"/>
>
>
> 2013/9/23 David Parker <dparker at utica.edu>
>
>> Hello,
>>
>> I'm attempting to set up a simple NFS failover test using Pacemaker and
>> DRBD on 2 nodes. The goal is to have one host be the DRBD master, and have
>> the volume mounted, the NFS server running, and a virtual IP address up.
>> The other node is the DRBD slave with no NFS services or virtual IP
>> running. The DRBD resource is configured in master-slave (not dual-master)
>> mode and seems to work fine when it's not being controlled by Pacemaker.
>>
>> The problem is that both nodes start out as DRBD slaves, and neither node
>> gets promoted:
>>
>> # crm_mon -1
>> ============
>> Last updated: Mon Sep 23 14:39:12 2013
>> Last change: Mon Sep 23 14:26:15 2013 via cibadmin on test-vm-1
>> Stack: openais
>> Current DC: test-vm-1 - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 2 Nodes configured, 2 expected votes
>> 5 Resources configured.
>> ============
>>
>> Online: [ test-vm-1 test-vm-2 ]
>>
>> Master/Slave Set: ms-drbd_r0 [drbd_r0]
>> Slaves: [ test-vm-1 test-vm-2 ]
>>
>> If I try to force a promotion with "crm resource promote ms-drbd_r0" I
>> get no output, and I see this line in the log:
>>
>> cib: [27320]: info: cib_process_request: Operation complete: op
>> cib_modify for section resources (origin=local/crm_resource/4,
>> version=0.65.43): ok (rc=0)
>>
>> However, "crm_mon -1" still shows that both nodes are slaves. I have a
>> constraint such that the NFS resources will only run on the DRBD master,
>> and a node will only get promoted to master once the virtual IP is started
>> on it. I suspect that the IP is not starting and that's holding up the
>> promotion, but I can't figure out why the IP wouldn't start. Looking in
>> the log, I see a bunch of pending actions to start the IP, but they're not
>> actually firing:
>>
>> # grep 'nfs_ip' /var/log/cluster/corosync.log
>> Sep 23 14:28:24 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1 - blocked)
>> Sep 23 14:28:24 test-vm-1 crmd: [27325]: info: te_rsc_command: Initiating
>> action 6: monitor nfs_ip_monitor_0 on test-vm-1 (local)
>> Sep 23 14:28:24 test-vm-1 lrmd: [27322]: info: rsc:nfs_ip probe[4] (pid
>> 27398)
>> Sep 23 14:28:25 test-vm-1 lrmd: [27322]: info: operation monitor[4] on
>> nfs_ip for client 27325: pid 27398 exited with return code 7
>> Sep 23 14:28:25 test-vm-1 crmd: [27325]: info: process_lrm_event: LRM
>> operation nfs_ip_monitor_0 (call=4, rc=7, cib-update=28, confirmed=true)
>> not running
>> Sep 23 14:28:27 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:33 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:33 test-vm-1 crmd: [27325]: info: te_rsc_command: Initiating
>> action 7: monitor nfs_ip_monitor_0 on test-vm-2
>> Sep 23 14:28:36 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:37 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 9]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 9]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>>
>> Any help will be greatly appreciated.
>>
>> The relevant portion of my CIB is below:
>>
>> <configuration>
>> <crm_config>
>> <cluster_property_set id="cib-bootstrap-options">
>> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>> value="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff"/>
>> <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>> name="cluster-infrastructure" value="openais"/>
>> <nvpair id="cib-bootstrap-options-expected-quorum-votes"
>> name="expected-quorum-votes" value="2"/>
>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>> name="stonith-enabled" value="false"/>
>> <nvpair id="cib-bootstrap-options-maintenance-mode"
>> name="maintenance-mode" value="false"/>
>> <nvpair id="cib-bootstrap-options-no-quorum-policy"
>> name="no-quorum-policy" value="ignore"/>
>> </cluster_property_set>
>> </crm_config>
>> <nodes>
>> <node id="test-vm-1" type="normal" uname="test-vm-1"/>
>> <node id="test-vm-2" type="normal" uname="test-vm-2"/>
>> </nodes>
>> <resources>
>> <group id="nfs_resources">
>> <meta_attributes id="nfs_resources-meta_attributes">
>> <nvpair id="nfs_resources-meta_attributes-target-role"
>> name="target-role" value="Started"/>
>> </meta_attributes>
>> <primitive class="ocf" id="nfs_fs" provider="heartbeat"
>> type="Filesystem">
>> <instance_attributes id="nfs_fs-instance_attributes">
>> <nvpair id="nfs_fs-instance_attributes-device" name="device"
>> value="/dev/drbd1"/>
>> <nvpair id="nfs_fs-instance_attributes-directory"
>> name="directory" value="/mnt/data/"/>
>> <nvpair id="nfs_fs-instance_attributes-fstype" name="fstype"
>> value="ext3"/>
>> <nvpair id="nfs_fs-instance_attributes-options"
>> name="options" value="noatime,nodiratime"/>
>> </instance_attributes>
>> <operations>
>> <op id="nfs_fs-start-0" interval="0" name="start"
>> timeout="60"/>
>> <op id="nfs_fs-stop-0" interval="0" name="stop"
>> timeout="120"/>
>> </operations>
>> </primitive>
>> <primitive class="lsb" id="nfs" type="nfs-kernel-server">
>> <operations>
>> <op id="nfs-monitor-5s" interval="5s" name="monitor"/>
>> </operations>
>> </primitive>
>> <primitive class="ocf" id="nfs_ip" provider="heartbeat"
>> type="IPaddr2">
>> <instance_attributes id="nfs_ip-instance_attributes">
>> <nvpair id="nfs_ip-instance_attributes-ip" name="ip"
>> value="192.168.25.205"/>
>> <nvpair id="nfs_ip-instance_attributes-cidr_netmask"
>> name="cidr_netmask" value="32"/>
>> </instance_attributes>
>> <operations>
>> <op id="nfs_ip-monitor-10s" interval="10s" name="monitor"/>
>> </operations>
>> <meta_attributes id="nfs_ip-meta_attributes">
>> <nvpair id="nfs_ip-meta_attributes-is-managed"
>> name="is-managed" value="true"/>
>> </meta_attributes>
>> </primitive>
>> </group>
>> <master id="ms-drbd_r0">
>> <meta_attributes id="ms-drbd_r0-meta_attributes">
>> <nvpair id="ms-drbd_r0-meta_attributes-clone-max"
>> name="clone-max" value="2"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-notify" name="notify"
>> value="true"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-globally-unique"
>> name="globally-unique" value="false"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-target-role"
>> name="target-role" value="Master"/>
>> </meta_attributes>
>> <primitive class="ocf" id="drbd_r0" provider="heartbeat"
>> type="drbd">
>> <instance_attributes id="drbd_r0-instance_attributes">
>> <nvpair id="drbd_r0-instance_attributes-drbd_resource"
>> name="drbd_resource" value="r0"/>
>> </instance_attributes>
>> <operations>
>> <op id="drbd_r0-monitor-59s" interval="59s" name="monitor"
>> role="Master" timeout="30s"/>
>> <op id="drbd_r0-monitor-60s" interval="60s" name="monitor"
>> role="Slave" timeout="30s"/>
>> </operations>
>> </primitive>
>> </master>
>> </resources>
>> <constraints>
>> <rsc_colocation id="drbd-nfs-ha" rsc="ms-drbd_r0" rsc-role="Master"
>> score="INFINITY" with-rsc="nfs_resources"/>
>> <rsc_order first="nfs_ip" first-action="start" id="ip-before-drbd"
>> score="INFINITY" then="ms-drbd_r0" then-action="promote"/>
>> <rsc_order first="ms-drbd_r0" first-action="promote"
>> id="drbd-before-nfs" score="INFINITY" then="nfs_fs" then-action="start"/>
>> <rsc_order first="nfs_fs" first-action="start" id="fs-before-nfs"
>> score="INFINITY" then="nfs" then-action="start"/>
>> </constraints>
>> <rsc_defaults>
>> <meta_attributes id="rsc-options">
>> <nvpair id="rsc-options-resource-stickiness"
>> name="resource-stickiness" value="100"/>
>> </meta_attributes>
>> </rsc_defaults>
>> <op_defaults/>
>> </configuration>
>>
>> --
>> Dave Parker
>> Systems Administrator
>> Utica College
>> Integrated Information Technology Services
>> (315) 792-3229
>> Registered Linux User #408177
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
>
> --
> this is my life and I live it for as long as God wills
>
--
Dave Parker
Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177