[Pacemaker] Ordered resource is not restarting after migration if it's already started on new host

Mon Dec 17 20:07:09 EST 2012

On Tue, Dec 18, 2012 at 6:28 AM, Neal Peters <nealppeters at gmail.com> wrote:
>
> On Dec 16, 2012, at 7:29 PM, pacemaker-request at oss.clusterlabs.org wrote:
>
> Message: 5
> Date: Mon, 17 Dec 2012 14:23:15 +1100
> From: Andrew Beekhof <andrew at beekhof.net>
> To: The Pacemaker cluster resource manager
> <pacemaker at oss.clusterlabs.org>
> Subject: Re: [Pacemaker] Ordered resource is not restarting after
> migration if it's already started on new host
> Message-ID:
> <CAEDLWG35TfnGhMM_FuSSxedryAMSS5OwFxRdLG5Ytcmj7yxaWw at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
>
> On Sat, Dec 15, 2012 at 10:58 AM, Neal Peters <nealppeters at gmail.com> wrote:
>
> Hello-
>
>
> I'm running Pacemaker v. 1.1 (pacemaker-1.1.7-6.el6.x86_64) on CentOS 6.3
> and am observing behavior on my systems that differs from the behavior
> described in the manual.
>
>
> Basically, the desired behavior (and the behavior described in Pacemaker
> Explained Section 6.3.1) is that when a "first" resource in an ordered set
> is moved to a host where the "then" resource is already running, the "then"
> resource will be restarted.
>
>
> From Pacemaker Explained 6.3.1 Mandatory Ordering:
>
> -If the first resource is (re)started while the then resource is running,
> the then resource will be stopped and restarted.
>
>
> However, I am not seeing this behavior: the "then" resource is left
> running.
>
>
>
> I have 2 servers running a basic setup that is close to the one
> described in the Clusters from Scratch document. Config follows:
>
>
> node host2
> node host1
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>        params ip="192.168.0.225" cidr_netmask="32" \
>        op monitor interval="1s" \
>        meta target-role="Started"
> primitive DNSserver lsb:named \
>        op monitor interval="1s"
> colocation ip-with-DNSserver inf: DNSserver ClusterIP
> order DNS-server-after-ip inf: ClusterIP DNSserver
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore" \
>        last-lrm-refresh="1355268791"
> rsc_defaults $id="rsc-options" \
>        resource-stickiness="102"
>
>
> When the DNSserver resource is migrated from one node to the other and named
> is already started on the other node (for whatever reason), named is not
> restarted.
>
>
> 1) Ordering constraints are behaving as expected, DNSserver is started
> after ClusterIP
> 2) Starting something (DNSserver) that is already started is a no-op
> 3) Don't start cluster services outside of the cluster
>
> 3 is the root problem in your case
>
>
> Thank you for your prompt reply.  It sounds as though Pacemaker is operating
> in the way that you expect in this situation.
>
> Your description of Pacemaker behavior
>
> 2) Starting something (DNSserver) that is already started is a no-op
>
>
> differs from behavior described in the documentation

No, it doesn't.
The cluster _is_ trying to start the resource (we stopped it on the
old host and are trying to start it on the new one), however the named
init script is simply ignoring the request because named is already
running.

This behaviour of the named script is also mandated by the LSB standard,
which is why I said #3 was the problem you need to fix.
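
To illustrate point 2, here is a minimal sketch of how an LSB-style init
script typically treats "start" when the service is already running. It is
not the actual named init script; the demo-service name, the PIDFILE path,
and the sleep placeholder are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical sketch of LSB-style "start" semantics.
# NOT the real named init script; names and paths are illustrative only.
PIDFILE="${PIDFILE:-/tmp/demo-service.pid}"   # hypothetical PID file location

# True if the PID file exists and points at a live process.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

start() {
    if is_running; then
        # LSB treats "start" on an already running service as success,
        # so the caller (here, the cluster) gets rc=0 and nothing restarts.
        echo "demo-service: already running"
        return 0
    fi
    # Placeholder for launching the real daemon.
    sleep 300 >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    echo "demo-service: started"
    return 0
}
```

Calling start while the PID file points at a live process prints "already
running" and returns 0, so the cluster sees a successful start and has no
reason to restart anything. On CentOS 6, keeping named out of the boot
sequence (for example with "chkconfig named off") leaves starting it to the
cluster alone.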

> -If the first resource is (re)started while the then resource is running,
> the then resource will be stopped and restarted.
>
> (Section 6.3.1:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained/s-resource-ordering.html#_mandatory_ordering)
>
> Is there a place that I can/should report this discrepancy between actual
> behavior and behavior described in the documentation?
>
> Thank you.
>
>
>
>
> Dec 14 15:32:28 host1 snmpd[5296]: Connection from UDP: [192.168.0.129]:51000->[192.168.0.93]
> Dec 14 15:32:40 host1 lrmd: [8733]: info: rsc:ClusterIP:5: start
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: ip -f inet addr add 192.168.0.225/32 brd 192.168.0.225 dev eth1
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: ip link set eth1 up
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: /usr/lib64/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp-192.168.0.225 eth1 192.168.0.225 auto not_used not_used
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation ClusterIP_start_0 (call=5, rc=0, cib-update=10, confirmed=true) ok
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:ClusterIP:6: monitor
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:DNSserver:7: start
> Dec 14 15:32:41 host1 lrmd: [9601]: WARN: For LSB init script, no additional parameters are needed.
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) Starting named:
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) named: already running
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) [  OK
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) ]#015
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout)
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation DNSserver_start_0 (call=7, rc=0, cib-update=11, confirmed=true) ok
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:DNSserver:8: monitor
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation ClusterIP_monitor_1000 (call=6, rc=0, cib-update=12, confirmed=false) ok
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation DNSserver_monitor_1000 (call=8, rc=0, cib-update=13, confirmed=false) ok
>
>
>
> Are there errors in my config that are keeping the restart from happening?
>
>
> Thanks in advance.
>
>
>
> -Neal