[Pacemaker] Ordered resource is not restarting after migration if it's already started on new host

Mon Dec 17 03:23:15 UTC 2012

On Sat, Dec 15, 2012 at 10:58 AM, Neal Peters <nealppeters at gmail.com> wrote:
> Hello-
>
> I'm running Pacemaker v. 1.1 (pacemaker-1.1.7-6.el6.x86_64) on CentOS 6.3 and am observing behavior on my systems that differs from the behavior described in the manual.
>
> Basically, the desired behavior (and the behavior described in Pacemaker Explained Section 6.3.1) is that when a "first" resource in an ordered set is moved to a host where the "then" resource is already running, the "then" resource will be restarted.
>
> From Pacemaker Explained 6.3.1 Mandatory Ordering:
> -If the first resource is (re)started while the then resource is running, the then resource will be stopped and restarted.
>
> I am not seeing this behavior however.  I am seeing that the "then" resource is left running.
>
>
> I have 2 servers running a fairly basic setup, close to the one described in the Clusters from Scratch document.  Config follows:
>
> node host2
> node host1
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>         params ip="192.168.0.225" cidr_netmask="32" \
>         op monitor interval="1s" \
>         meta target-role="Started"
> primitive DNSserver lsb:named \
>         op monitor interval="1s"
> colocation ip-with-DNSserver inf: DNSserver ClusterIP
> order DNS-server-after-ip inf: ClusterIP DNSserver
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1355268791"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="102"
>
> When the DNSserver resource is migrated from one node to the other and named is already started on the other node (for whatever reason), named is not restarted.

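1) The ordering constraint is behaving as expected: DNSserver is started
after ClusterIP.
2) Starting something (DNSserver) that is already started is a no-op.
As far as the cluster knew, DNSserver was stopped on host1, so it issued
a plain start rather than a stop and restart.
3) Don't start cluster services outside of the cluster.

Point 3 is the root problem in your case.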
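
To illustrate point 2, here is a minimal sketch of how an LSB-style "start" action behaves when the service is already up. It is a toy model, not the real /etc/init.d/named script: the function names (toy_start, toy_status, toy_stop) and the marker file are made up for illustration, and the marker file merely stands in for "the daemon is running".

```shell
#!/bin/sh
# Toy model of an LSB-style init script. All names are hypothetical;
# this is NOT the real named init script.

# A marker file stands in for "the daemon is running".
STATE_FILE="${TMPDIR:-/tmp}/toy_named.$$.running"

toy_status() {
    # LSB convention: status exits 0 when running, 3 when not running.
    [ -e "$STATE_FILE" ] && return 0
    return 3
}

toy_start() {
    if toy_status; then
        # LSB convention: "start" on a running service still succeeds.
        echo "already running"
        return 0
    fi
    : > "$STATE_FILE"
    echo "started"
    return 0
}

toy_stop() {
    rm -f "$STATE_FILE"
    echo "stopped"
    return 0
}
```

Calling toy_start twice prints "started" and then "already running", and both calls exit 0. That matches what your log shows: "named: already running" followed by DNSserver_start_0 with rc=0, so the cluster sees a successful start and has no reason to do anything more. On CentOS 6, "chkconfig named off" keeps init from starting named at boot, so that only the cluster starts it.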

>
> Dec 14 15:32:28 host1 snmpd[5296]: Connection from UDP: [192.168.0.129]:51000->[192.168.0.93]
> Dec 14 15:32:40 host1 lrmd: [8733]: info: rsc:ClusterIP:5: start
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: ip -f inet addr add 192.168.0.225/32 brd 192.168.0.225 dev eth1
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: ip link set eth1 up
> Dec 14 15:32:40 host1 IPaddr2(ClusterIP)[9542]: INFO: /usr/lib64/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp-192.168.0.225 eth1 192.168.0.225 auto not_used not_used
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation ClusterIP_start_0 (call=5, rc=0, cib-update=10, confirmed=true) ok
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:ClusterIP:6: monitor
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:DNSserver:7: start
> Dec 14 15:32:41 host1 lrmd: [9601]: WARN: For LSB init script, no additional parameters are needed.
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) Starting named:
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) named: already running
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) [  OK
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout) ]#015
> Dec 14 15:32:41 host1 lrmd: [8733]: info: RA output: (DNSserver:start:stdout)
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation DNSserver_start_0 (call=7, rc=0, cib-update=11, confirmed=true) ok
> Dec 14 15:32:41 host1 lrmd: [8733]: info: rsc:DNSserver:8: monitor
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation ClusterIP_monitor_1000 (call=6, rc=0, cib-update=12, confirmed=false) ok
> Dec 14 15:32:41 host1 crmd[8736]:     info: process_lrm_event: LRM operation DNSserver_monitor_1000 (call=8, rc=0, cib-update=13, confirmed=false) ok
>
>
> Are there errors in my config that are keeping the restart from happening?
>
> Thanks in advance.
>
>
> -Neal