[Pacemaker] Trouble with ordering
Serge Dubrouski
sergeyfd at gmail.com
Sun Oct 2 01:18:09 UTC 2011
On Sat, Oct 1, 2011 at 2:49 PM, Gerald Vogt <vogt at spamcop.net> wrote:
> On 01.10.11 04:53, Serge Dubrouski wrote:
> > > Technically, I don't want the cluster to control the service in the
> > > sense of starting and stopping. The cluster controls the IP addresses
> > > and moves them between nodes. The dns service resource is supposed to
> > > provide a check that the dns service is working on the node and migrate
> > > the service and, most important, the IP address if it becomes
> > > unresponsive.
> > >
> > > I didn't look at the concept of clones, yet. Maybe I took a completely
> > > wrong approach to what I am trying to do.
> >
> >
> > I think that clones are a really good solution for this situation. You can
> > configure BIND as a clone service with different configurations, though.
> > One node will be master, another slave. You can also have a floating VIP
> > tied to any of the nodes but colocated with the running BIND. If BIND
> > dies for some reason, pacemaker will move your IP to the surviving node.
> > You can also add sending of additional alarms.
>
> Thanks a lot! Just learned a couple of things.
>
I'm glad it helped.
>
> I have removed my own script. Installed yours and set it up. Configured
> a clone.
>
> primitive bind ocf:heartbeat:named ...
> clone bind-clone bind
>
> Then bind is kept running on all nodes and is only shut down if it fails.
> If necessary, named is restarted. Great.
>
> Then I colocate my ip resources with the clone:
>
> colocation ns1-ip-bind inf: nsi1-ip bind-clone
> colocation ns2-ip-bind inf: nsi2-ip bind-clone
>
> Thus the service IP addresses only run on nodes where bind is active. If
> bind fails on a node the ip address is moved.
>
> Two notes (regarding the latest version on github):
>
> 1. You expect rndc and host to be in $PATH. At the same time the path to
> named can be configured. For consistency, I think the same should apply
> to rndc and host, as they are bind utils.
>
> On our CentOS servers we run the latest version of bind, compiled from
> source and installed in a custom path which is added in /etc/profile.
> For some reason /etc/profile doesn't seem to apply to the ocf scripts
> thus the script doesn't find rndc or host unless I extend PATH manually
> at the beginning of the script.
>
We had some discussion around this and finally decided to leave it up to
the sysadmin to make sure that both tools are available in PATH. One
can always create a couple of symlinks to cover it.
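
The symlink approach could look roughly like the sketch below. It is only an illustration: the helper names link_tools and check_tools are made up for this example, and the paths in the usage note are placeholders for wherever a custom BIND build actually lives.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the resource agent.

# Symlink each given executable into dest_dir, which is assumed to be
# a directory already on the PATH that resource agents run with.
link_tools() {
    dest_dir=$1
    shift
    for tool_path in "$@"; do
        if [ ! -x "$tool_path" ]; then
            echo "not executable: $tool_path" >&2
            return 1
        fi
        ln -sf "$tool_path" "$dest_dir/$(basename "$tool_path")"
    done
}

# Verify that every named tool resolves through the current PATH.
check_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found in PATH" >&2
            return 1
        fi
    done
}
```

Usage would be along the lines of `link_tools /usr/local/bin /opt/bind/sbin/rndc /opt/bind/bin/host` followed by `check_tools rndc host`, where the /opt/bind paths are assumptions about the local install.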
>
> 2. In the stop function you call "rndc stop" to stop the daemon.
> However, if the daemon hangs, rndc will hang. Thus pacemaker runs into a
> timeout and kills the ocf script, leading to a failed stop.
>
You didn't read the code carefully again. Yes, it does exactly what you want,
or at least it's supposed to:

    if ! $RNDC stop >/dev/null; then
        kill `cat ${OCF_RESKEY_named_pidfile}`
    fi

    if [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
        # Allow 2/3 of the action timeout for the orderly shutdown
        # (The origin unit is ms, hence the conversion)
        timeout=$((OCF_RESKEY_CRM_meta_timeout/1500))
    else
        timeout=20
    fi

    while named_status ; do
        if [ $timeout -ge ${OCF_RESKEY_named_stop_timeout} ]; then
            break
        else
            sleep 1
            timeout=$((timeout + 1))
        fi
    done

    # If still up
    if named_status 2>&1; then
        ocf_log err "named is still up! Killing";
        kill -9 `cat ${OCF_RESKEY_named_pidfile}`
    fi

> I think the ocf script should have its own timeout and abort the rndc
> call if it takes too long and then try to kill the server.
>
See above.
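
For readers following the thread, the idea of bounding the stop call itself and then escalating could be sketched as below. This is not the resource agent's code: the function name stop_with_deadline and its arguments are invented for illustration, and it assumes the GNU coreutils `timeout` utility is installed.

```shell
#!/bin/sh
# Sketch only: stop_with_deadline and its arguments are hypothetical.
# Usage: stop_with_deadline PID GRACE_SECONDS STOP_COMMAND [ARGS...]
stop_with_deadline() {
    pid=$1          # PID of the daemon, e.g. read from its pidfile
    grace=$2        # seconds allowed for an orderly shutdown
    shift 2         # the rest is the graceful stop command

    # Bound the stop command so that a hung control channel cannot eat
    # the whole action timeout; if it fails or times out, send SIGTERM.
    if ! timeout "$grace" "$@" >/dev/null 2>&1; then
        kill "$pid" 2>/dev/null
    fi

    # Poll once per second until the process is gone or time runs out.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            # Still up after the grace period: force it down.
            kill -9 "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call might look like `stop_with_deadline "$(cat "$pidfile")" 20 rndc stop`, with `$pidfile` standing in for the configured pidfile. A daemon frozen with `kill -STOP`, as suggested for testing, ignores the SIGTERM until it is resumed, so in that case the sketch ends up on the `kill -9` path.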
>
> To test send a STOP signal to named and wait...
>
>
> But otherwise, great script.
>
> Thanks!
>
> Gerald
>
--
Serge Dubrouski.