[Pacemaker] node status does not change even if pacemakerd dies
Kazunori INOUE
inouekazu at intellilink.co.jp
Wed Feb 13 09:14:04 UTC 2013
Hi Andrew,
Yes, please see the attached pacemaker.conf. It controls only pacemakerd.
I am also examining pacemaker-corosync.conf (a prototype), which
controls corosync as well.
This job starts the corosync service before pacemakerd starts, and stops
the corosync service after pacemakerd has stopped. [1]
- pacemaker-corosync.conf
17
18 pre-start script
19     modprobe softdog soft_margin=60
20     service corosync start    [1]
21 end script
22
23 post-start script
24     touch $LOCK_FILE
25     pidof $prog > /var/run/$prog.pid
26 end script
27
28 post-stop script
29     rm -f $LOCK_FILE
30     rm -f /var/run/$prog.pid
31
32     pidof crmd && killall -q -9 corosync
33     pidof crmd || service corosync stop    [1]
34 end script
Line 32 is a somewhat tricky design.
If only pacemakerd has disappeared (that is, crmd is still running),
corosync is killed immediately.
As a result, the machine is rebooted by the corosync watchdog (we want
to be *certain* that the machine is powered off or reset in this case).
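
To make the intent of lines 32 and 33 easier to read, here is a rough
sketch of the same decision written as an explicit if/else. It is only an
illustration, not the attached job: the helper names crmd_running, run and
post_stop and the DRY_RUN variable are made up for this example, and with
DRY_RUN=1 (the default here) the commands are only printed. Note one
difference from the prototype: the two original lines each call pidof
separately, whereas this sketch checks once and takes exactly one branch.

```shell
#!/bin/sh
# Sketch only: the post-stop decision from lines 32/33 as an if/else.
# crmd_running, run, post_stop and DRY_RUN are hypothetical names.

DRY_RUN=${DRY_RUN:-1}

# Succeeds (returns 0) if a crmd process is still present.
crmd_running() {
    pidof crmd >/dev/null 2>&1
}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

post_stop() {
    if crmd_running; then
        # pacemakerd is gone but crmd survived: kill corosync hard so
        # that the watchdog is no longer serviced and the node resets.
        run killall -q -9 corosync
    else
        # Normal shutdown: crmd has already exited, stop corosync cleanly.
        run service corosync stop
    fi
}
```

Calling post_stop with DRY_RUN=1 prints which of the two commands would be
chosen, which is a safe way to check the logic on a test node.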
Best Regards,
Kazunori INOUE
(13.02.08 10:03), Andrew Beekhof wrote:
> On Tue, Jan 22, 2013 at 9:09 PM, Kazunori INOUE
> <inouekazu at intellilink.co.jp> wrote:
>>
>> Hi Andrew,
>>
>> I understand that pacemakerd was not killed by the OOM Killer.
>> However, because a process failure may occur under unexpected
>> circumstances, we let Upstart manage pacemakerd.
>
> This is an excellent idea.
> Do you have an upstart job for pacemaker that we can include in the source?
-------------- next part --------------
# pacemaker - High-Availability cluster resource manager
#
# Starts pacemakerd

stop on runlevel [016]
kill timeout 3600
respawn

env prog=pacemakerd
env LOCK_FILE=/var/lock/subsys/pacemaker

script
    [ -f /etc/sysconfig/pacemaker ] && {
        . /etc/sysconfig/pacemaker
    }
    exec $prog
end script

post-start script
    touch $LOCK_FILE
    pidof $prog > /var/run/$prog.pid
end script

post-stop script
    rm -f $LOCK_FILE
    rm -f /var/run/$prog.pid
end script
-------------- next part --------------
# pacemaker - High-Availability cluster resource manager
#
# Starts pacemakerd

stop on runlevel [016]
kill timeout 3600

env prog=pacemakerd
env LOCK_FILE=/var/lock/subsys/pacemaker

script
    [ -f /etc/sysconfig/pacemaker ] && {
        . /etc/sysconfig/pacemaker
    }
    exec $prog
end script

pre-start script
    modprobe softdog soft_margin=60
    service corosync start
end script

post-start script
    touch $LOCK_FILE
    pidof $prog > /var/run/$prog.pid
end script

post-stop script
    rm -f $LOCK_FILE
    rm -f /var/run/$prog.pid

    pidof crmd && killall -q -9 corosync
    pidof crmd || service corosync stop
end script