[Pacemaker] Pacemaker/DRBD troubles
David Parker
dparker at utica.edu
Tue Sep 24 15:30:18 UTC 2013
Thanks, Emmanuel. After some trial and error, I ended up changing the
order constraints, and that seemed to solve the problem. In hindsight,
the original "ip-before-drbd" ordering appears to have created a
deadlock: the colocation only allowed the virtual IP to run on the DRBD
master, while that ordering made the promotion wait for the IP, so
neither could ever happen. This new configuration works:
<constraints>
  <rsc_colocation id="drbd-nfs-ha" rsc="ms-drbd_r0" rsc-role="Master"
                  score="INFINITY" with-rsc="nfs_resources"/>
  <rsc_order first="ms-drbd_r0" first-action="promote" id="drbd-before-nfs"
             score="INFINITY" then="nfs_fs" then-action="start"/>
  <rsc_order first="nfs_fs" first-action="start" id="fs-before-ip"
             score="INFINITY" then="nfs_ip" then-action="start"/>
  <rsc_order first="nfs_ip" first-action="start" id="ip-before-nfs"
             score="INFINITY" then="nfs" then-action="start"/>
</constraints>
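For the record, the failure mode can be modeled outside the cluster. The snippet below is purely illustrative, not a Pacemaker tool: it collects the (first, then) pairs from the old rsc_order constraints, adds one extra edge of my own devising to stand in for the colocation (the NFS group could only start on the promoted DRBD instance), and runs a depth-first search that finds the resulting cycle.

```python
import xml.etree.ElementTree as ET

# Order constraints from the original (broken) configuration. The
# ip-before-drbd rule made the promote wait for nfs_ip to start.
BROKEN = """
<constraints>
  <rsc_order first="nfs_ip" first-action="start" id="ip-before-drbd"
             score="INFINITY" then="ms-drbd_r0" then-action="promote"/>
  <rsc_order first="ms-drbd_r0" first-action="promote"
             id="drbd-before-nfs" score="INFINITY" then="nfs_fs"
             then-action="start"/>
  <rsc_order first="nfs_fs" first-action="start" id="fs-before-nfs"
             score="INFINITY" then="nfs" then-action="start"/>
</constraints>
"""

def order_edges(xml_text, extra_edges=()):
    """Collect (first, then) pairs from rsc_order elements, plus any
    manually supplied edges (e.g. ones implied by a colocation)."""
    root = ET.fromstring(xml_text)
    edges = [(o.get("first"), o.get("then")) for o in root.iter("rsc_order")]
    return edges + list(extra_edges)

def has_cycle(edges):
    """Plain depth-first search for a cycle in the dependency graph."""
    graph = {}
    for first, then in edges:
        graph.setdefault(first, []).append(then)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY or (state == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False
    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

# Modeling assumption: colocating nfs_resources with the Master role means
# the group members effectively depend on the promote, hence this edge.
implied = [("ms-drbd_r0", "nfs_ip")]
print(has_cycle(order_edges(BROKEN, implied)))   # -> True
```

Dropping the old ip-before-drbd constraint, as in the configuration above, removes the back edge, and the same search no longer finds a cycle.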
On Mon, Sep 23, 2013 at 6:06 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> I'm not sure if this is the problem, but I think you only need one order
> constraint, like this:
>
> <rsc_order first="ms-drbd_r0" first-action="promote"
> id="drbd-before-nfsgroup" score="INFINITY" then="nfs_resources"
> then-action="start"/>
>
>
> 2013/9/23 David Parker <dparker at utica.edu>
>
>> Hello,
>>
>> I'm attempting to set up a simple NFS failover test using Pacemaker and
>> DRBD on 2 nodes. The goal is to have one host be the DRBD master, and have
>> the volume mounted, the NFS server running, and a virtual IP address up.
>> The other node is the DRBD slave with no NFS services or virtual IP
>> running. The DRBD resource is configured in master-slave (not dual-master)
>> mode and seems to work fine when it's not being controlled by Pacemaker.
>>
>> The problem is that both nodes start out as DRBD slaves, and neither node
>> gets promoted:
>>
>> # crm_mon -1
>> ============
>> Last updated: Mon Sep 23 14:39:12 2013
>> Last change: Mon Sep 23 14:26:15 2013 via cibadmin on test-vm-1
>> Stack: openais
>> Current DC: test-vm-1 - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 2 Nodes configured, 2 expected votes
>> 5 Resources configured.
>> ============
>>
>> Online: [ test-vm-1 test-vm-2 ]
>>
>> Master/Slave Set: ms-drbd_r0 [drbd_r0]
>> Slaves: [ test-vm-1 test-vm-2 ]
>>
>> If I try to force a promotion with "crm resource promote ms-drbd_r0" I
>> get no output, and I see this line in the log:
>>
>> cib: [27320]: info: cib_process_request: Operation complete: op
>> cib_modify for section resources (origin=local/crm_resource/4,
>> version=0.65.43): ok (rc=0)
>>
>> However, "crm_mon -1" still shows that both nodes are slaves. I have a
>> constraint such that the NFS resources will only run on the DRBD master,
>> and a node will only get promoted to master once the virtual IP is started
>> on it. I suspect that the IP is not starting and that's holding up the
>> promotion, but I can't figure out why the IP wouldn't start. Looking in
>> the log, I see a bunch of pending actions to start the IP, but they're not
>> actually firing:
>>
>> # grep 'nfs_ip' /var/log/cluster/corosync.log
>> Sep 23 14:28:24 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1 - blocked)
>> Sep 23 14:28:24 test-vm-1 crmd: [27325]: info: te_rsc_command: Initiating
>> action 6: monitor nfs_ip_monitor_0 on test-vm-1 (local)
>> Sep 23 14:28:24 test-vm-1 lrmd: [27322]: info: rsc:nfs_ip probe[4] (pid
>> 27398)
>> Sep 23 14:28:25 test-vm-1 lrmd: [27322]: info: operation monitor[4] on
>> nfs_ip for client 27325: pid 27398 exited with return code 7
>> Sep 23 14:28:25 test-vm-1 crmd: [27325]: info: process_lrm_event: LRM
>> operation nfs_ip_monitor_0 (call=4, rc=7, cib-update=28, confirmed=true)
>> not running
>> Sep 23 14:28:27 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:27 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 7]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:33 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:33 test-vm-1 crmd: [27325]: info: te_rsc_command: Initiating
>> action 7: monitor nfs_ip_monitor_0 on test-vm-2
>> Sep 23 14:28:36 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:37 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 9]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:28:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 pengine: [27324]: notice: LogActions: Start
>> nfs_ip (test-vm-1)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 9]: Pending (id: nfs_ip_monitor_10000, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: [Action
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>> Sep 23 14:43:37 test-vm-1 crmd: [27325]: WARN: print_elem: * [Input
>> 8]: Pending (id: nfs_ip_start_0, loc: test-vm-1, priority: 0)
>>
>> Any help will be greatly appreciated.
>>
>> The relevant portion of my CIB is below:
>>
>> <configuration>
>> <crm_config>
>> <cluster_property_set id="cib-bootstrap-options">
>> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>> value="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff"/>
>> <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>> name="cluster-infrastructure" value="openais"/>
>> <nvpair id="cib-bootstrap-options-expected-quorum-votes"
>> name="expected-quorum-votes" value="2"/>
>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>> name="stonith-enabled" value="false"/>
>> <nvpair id="cib-bootstrap-options-maintenance-mode"
>> name="maintenance-mode" value="false"/>
>> <nvpair id="cib-bootstrap-options-no-quorum-policy"
>> name="no-quorum-policy" value="ignore"/>
>> </cluster_property_set>
>> </crm_config>
>> <nodes>
>> <node id="test-vm-1" type="normal" uname="test-vm-1"/>
>> <node id="test-vm-2" type="normal" uname="test-vm-2"/>
>> </nodes>
>> <resources>
>> <group id="nfs_resources">
>> <meta_attributes id="nfs_resources-meta_attributes">
>> <nvpair id="nfs_resources-meta_attributes-target-role"
>> name="target-role" value="Started"/>
>> </meta_attributes>
>> <primitive class="ocf" id="nfs_fs" provider="heartbeat"
>> type="Filesystem">
>> <instance_attributes id="nfs_fs-instance_attributes">
>> <nvpair id="nfs_fs-instance_attributes-device" name="device"
>> value="/dev/drbd1"/>
>> <nvpair id="nfs_fs-instance_attributes-directory"
>> name="directory" value="/mnt/data/"/>
>> <nvpair id="nfs_fs-instance_attributes-fstype" name="fstype"
>> value="ext3"/>
>> <nvpair id="nfs_fs-instance_attributes-options"
>> name="options" value="noatime,nodiratime"/>
>> </instance_attributes>
>> <operations>
>> <op id="nfs_fs-start-0" interval="0" name="start"
>> timeout="60"/>
>> <op id="nfs_fs-stop-0" interval="0" name="stop"
>> timeout="120"/>
>> </operations>
>> </primitive>
>> <primitive class="lsb" id="nfs" type="nfs-kernel-server">
>> <operations>
>> <op id="nfs-monitor-5s" interval="5s" name="monitor"/>
>> </operations>
>> </primitive>
>> <primitive class="ocf" id="nfs_ip" provider="heartbeat"
>> type="IPaddr2">
>> <instance_attributes id="nfs_ip-instance_attributes">
>> <nvpair id="nfs_ip-instance_attributes-ip" name="ip"
>> value="192.168.25.205"/>
>> <nvpair id="nfs_ip-instance_attributes-cidr_netmask"
>> name="cidr_netmask" value="32"/>
>> </instance_attributes>
>> <operations>
>> <op id="nfs_ip-monitor-10s" interval="10s" name="monitor"/>
>> </operations>
>> <meta_attributes id="nfs_ip-meta_attributes">
>> <nvpair id="nfs_ip-meta_attributes-is-managed"
>> name="is-managed" value="true"/>
>> </meta_attributes>
>> </primitive>
>> </group>
>> <master id="ms-drbd_r0">
>> <meta_attributes id="ms-drbd_r0-meta_attributes">
>> <nvpair id="ms-drbd_r0-meta_attributes-clone-max"
>> name="clone-max" value="2"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-notify" name="notify"
>> value="true"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-globally-unique"
>> name="globally-unique" value="false"/>
>> <nvpair id="ms-drbd_r0-meta_attributes-target-role"
>> name="target-role" value="Master"/>
>> </meta_attributes>
>> <primitive class="ocf" id="drbd_r0" provider="heartbeat"
>> type="drbd">
>> <instance_attributes id="drbd_r0-instance_attributes">
>> <nvpair id="drbd_r0-instance_attributes-drbd_resource"
>> name="drbd_resource" value="r0"/>
>> </instance_attributes>
>> <operations>
>> <op id="drbd_r0-monitor-59s" interval="59s" name="monitor"
>> role="Master" timeout="30s"/>
>> <op id="drbd_r0-monitor-60s" interval="60s" name="monitor"
>> role="Slave" timeout="30s"/>
>> </operations>
>> </primitive>
>> </master>
>> </resources>
>> <constraints>
>> <rsc_colocation id="drbd-nfs-ha" rsc="ms-drbd_r0" rsc-role="Master"
>> score="INFINITY" with-rsc="nfs_resources"/>
>> <rsc_order first="nfs_ip" first-action="start" id="ip-before-drbd"
>> score="INFINITY" then="ms-drbd_r0" then-action="promote"/>
>> <rsc_order first="ms-drbd_r0" first-action="promote"
>> id="drbd-before-nfs" score="INFINITY" then="nfs_fs" then-action="start"/>
>> <rsc_order first="nfs_fs" first-action="start" id="fs-before-nfs"
>> score="INFINITY" then="nfs" then-action="start"/>
>> </constraints>
>> <rsc_defaults>
>> <meta_attributes id="rsc-options">
>> <nvpair id="rsc-options-resource-stickiness"
>> name="resource-stickiness" value="100"/>
>> </meta_attributes>
>> </rsc_defaults>
>> <op_defaults/>
>> </configuration>
>>
>> --
>> Dave Parker
>> Systems Administrator
>> Utica College
>> Integrated Information Technology Services
>> (315) 792-3229
>> Registered Linux User #408177
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
>
> --
> this is my life and I live it for as long as God wills
>
--
Dave Parker
Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177