[Pacemaker] node status does not change even if pacemakerd dies
Kazunori INOUE
inouekazu at intellilink.co.jp
Thu Apr 11 09:24:11 UTC 2013
Hi Andrew,
(13.03.01 11:10), Andrew Beekhof wrote:
> On Wed, Feb 13, 2013 at 8:14 PM, Kazunori INOUE
> <inouekazu at intellilink.co.jp> wrote:
>> Hi Andrew,
>>
>> Yes, please see attached pacemaker.conf. It controls only pacemakerd.
>
> I've pushed up the basic one in
> https://github.com/beekhof/pacemaker/commit/4bd8ac3
>
> Once you're happy with the pacemaker-corosync.conf version, let me
> know and we can update it.
>
I have attached two upstart job files for pacemaker.
- pacemaker.conf.in
  This is the basic job. I reviewed its settings.
  Please replace mcp/pacemaker.upstart with it.
- pacemaker-corosync.conf.in
  Since upstart jobs were added to Corosync (*), this job uses them.
  * https://github.com/corosync/corosync/commit/ca389c3c598105223f30e2e760f92aa105e1c9b3
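
For reference, the post-stop handling in pacemaker-corosync.conf.in can be
sketched as a dry run like the one below. This is only an illustration of
the idea, not the job itself: is_running and post_stop_plan are hypothetical
names, and the commands are printed rather than executed. It assumes that
crmd is already gone after a normal stop of pacemakerd, so a surviving crmd
means that only pacemakerd disappeared.

```shell
#!/bin/sh
# Dry-run sketch of the post-stop decision (hypothetical helper names).

# Return 0 if a process with the given name exists.
is_running() {
    pidof "$1" >/dev/null 2>&1
}

# Print the commands post-stop would run, one per line.
# $1: command used to test for a process (assumption: defaults to is_running,
#     overridable so the logic can be exercised without a cluster).
post_stop_plan() {
    crmd_check=${1:-is_running}
    if "$crmd_check" crmd; then
        # crmd outlived pacemakerd: kill corosync hard so that the
        # watchdog is no longer fed and the machine gets reset.
        echo "killall -q -9 corosync"
    fi
    # In every case, ask upstart to stop the corosync job afterwards.
    echo "stop corosync"
}
```

Calling "post_stop_plan true" shows the path taken when crmd is still alive,
and "post_stop_plan false" shows the normal shutdown path.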
----
Best regards,
Kazunori INOUE
>>
>> Furthermore, I'm now examining pacemaker-corosync.conf (a prototype), which
>> also controls corosync.
>> This job starts the corosync service before pacemakerd starts, and stops
>> the corosync service after pacemakerd stops. [1]
>>
>> - pacemaker-corosync.conf
>> 17
>> 18 pre-start script
>> 19 modprobe softdog soft_margin=60
>> 20 service corosync start [1]
>> 21 end script
>> 22
>> 23 post-start script
>> 24 touch $LOCK_FILE
>> 25 pidof $prog > /var/run/$prog.pid
>> 26 end script
>> 27
>> 28 post-stop script
>> 29 rm -f $LOCK_FILE
>> 30 rm -f /var/run/$prog.pid
>> 31
>> 32 pidof crmd && killall -q -9 corosync
>> 33 pidof crmd || service corosync stop [1]
>> 34 end script
>>
>> Line 32 is a somewhat tricky design.
>> When only pacemakerd has disappeared, corosync is terminated immediately.
>> The machine is then rebooted by the watchdog of corosync (since we
>> want to power off/reset the machine *reliably* in this case).
>>
>> Best Regards,
>> Kazunori INOUE
>>
>>
>> (13.02.08 10:03), Andrew Beekhof wrote:
>>> On Tue, Jan 22, 2013 at 9:09 PM, Kazunori INOUE
>>> <inouekazu at intellilink.co.jp> wrote:
>>>>
>>>> Hi Andrew,
>>>>
>>>> I understand that pacemakerd was not killed by the OOM Killer.
>>>> However, because a process failure may occur under unexpected
>>>> circumstances, we let Upstart manage pacemakerd.
>>>
>>> This is an excellent idea.
>>> Do you have an upstart job for pacemaker that we can include in the source?
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
-------------- next part --------------
# pacemaker - High-Availability cluster resource manager
#
# Starts pacemakerd

stop on runlevel [0123456]
kill timeout 3600
respawn

env prog=pacemakerd
env rpm_sysconf=@sysconfdir@/sysconfig/pacemaker
env rpm_lockfile=@localstatedir@/lock/subsys/pacemaker
env deb_sysconf=@sysconfdir@/default/pacemaker
env deb_lockfile=@localstatedir@/lock/pacemaker

script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    exec $prog
end script

post-start script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/sysconfig ] && LOCK_FILE="$rpm_lockfile"
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/default ] && LOCK_FILE="$deb_lockfile"
    touch $LOCK_FILE
    pidof $prog > @localstatedir@/run/$prog.pid
end script

post-stop script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/sysconfig ] && LOCK_FILE="$rpm_lockfile"
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/default ] && LOCK_FILE="$deb_lockfile"
    rm -f $LOCK_FILE
    rm -f @localstatedir@/run/$prog.pid
end script
-------------- next part --------------
# pacemaker-corosync - High-Availability cluster
#
# Starts Corosync cluster engine and Pacemaker cluster manager.

kill timeout 3600

env prog=pacemakerd
env rpm_sysconf=@sysconfdir@/sysconfig/pacemaker
env rpm_lockfile=@localstatedir@/lock/subsys/pacemaker
env deb_sysconf=@sysconfdir@/default/pacemaker
env deb_lockfile=@localstatedir@/lock/pacemaker

script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    exec $prog
end script

pre-start script
    # setup the software watchdog which corosync uses in post-stop.
    # rewrite according to environment.
    modprobe softdog soft_margin=60
    start corosync

    # if you use corosync-notifyd, uncomment the line below.
    #start corosync-notifyd

    # give it time to fail.
    sleep 2
    pidof corosync || { exit 1; }
end script

post-start script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/sysconfig ] && LOCK_FILE="$rpm_lockfile"
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/default ] && LOCK_FILE="$deb_lockfile"
    touch $LOCK_FILE
    pidof $prog > @localstatedir@/run/$prog.pid
end script

post-stop script
    [ -f "$rpm_sysconf" ] && . $rpm_sysconf
    [ -f "$deb_sysconf" ] && . $deb_sysconf
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/sysconfig ] && LOCK_FILE="$rpm_lockfile"
    [ -z "$LOCK_FILE" -a -d @sysconfdir@/default ] && LOCK_FILE="$deb_lockfile"
    rm -f $LOCK_FILE
    rm -f @localstatedir@/run/$prog.pid

    # when pacemakerd disappeared unexpectedly, a machine is rebooted
    # by the watchdog of corosync.
    pidof crmd && killall -q -9 corosync

    stop corosync || true

    # if you use corosync-notifyd, uncomment the line below.
    #stop corosync-notifyd || true
end script
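
Both jobs resolve the lock file in the same way in post-start and post-stop:
a LOCK_FILE set by the sourced sysconfig/default file wins, otherwise the
RPM-style path is used if the sysconfig directory exists, otherwise the
Debian-style path if the default directory exists. A minimal sketch of that
selection follows. pick_lock_file is a hypothetical name, and passing the
@sysconfdir@ and @localstatedir@ values as arguments is an assumption made
here so that the logic can be tried outside of the build.

```shell
#!/bin/sh
# Sketch of the lock file selection used by the jobs (hypothetical helper).
# $1: value substituted for @sysconfdir@    (e.g. /etc)
# $2: value substituted for @localstatedir@ (e.g. /var)
pick_lock_file() {
    sysconfdir=$1
    localstatedir=$2
    # A LOCK_FILE already set by the sourced configuration is kept as-is.
    lock=${LOCK_FILE:-}
    if [ -z "$lock" ] && [ -d "$sysconfdir/sysconfig" ]; then
        # RPM-style layout
        lock="$localstatedir/lock/subsys/pacemaker"
    fi
    if [ -z "$lock" ] && [ -d "$sysconfdir/default" ]; then
        # Debian-style layout
        lock="$localstatedir/lock/pacemaker"
    fi
    # Empty output means that neither directory was found.
    echo "$lock"
}
```

For example, "pick_lock_file /etc /var" prints the path that the job would
touch on start and remove on stop for the local layout.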