[Pacemaker] node status does not change even if pacemakerd dies

Mon Mar 4 04:13:16 EST 2013

Hi Andrew,

> -----Original Message-----
> From: Andrew Beekhof [mailto:andrew at beekhof.net]
> Sent: Friday, March 01, 2013 11:11 AM
> To: The Pacemaker cluster resource manager
> Cc: shimazakik at intellilink.co.jp
> Subject: Re: [Pacemaker] node status does not change even if pacemakerd dies
> 
> On Wed, Feb 13, 2013 at 8:14 PM, Kazunori INOUE
> <inouekazu at intellilink.co.jp> wrote:
> > Hi Andrew,
> >
> > Yes, please see attached pacemaker.conf. It controls only pacemakerd.
> 
> I've pushed up the basic one in
> https://github.com/beekhof/pacemaker/commit/4bd8ac3
> 
> Once you're happy with the pacemaker-corosync.conf version, let me
> know and we can update it.
> 

That's great. I'll do that.

Best Regards,
Kazunori INOUE

> >
> > Furthermore, I'm working on pacemaker-corosync.conf (a prototype), which
> > also controls corosync.
> > This job starts the corosync service before pacemakerd starts, and stops
> > the corosync service after pacemakerd stops. [1]
> >
> > - pacemaker-corosync.conf
> >   17
> >   18  pre-start script
> >   19      modprobe softdog soft_margin=60
> >   20      service corosync start               [1]
> >   21  end script
> >   22
> >   23  post-start script
> >   24      touch $LOCK_FILE
> >   25      pidof $prog > /var/run/$prog.pid
> >   26  end script
> >   27
> >   28  post-stop script
> >   29      rm -f $LOCK_FILE
> >   30      rm -f /var/run/$prog.pid
> >   31
> >   32      pidof crmd && killall -q -9 corosync
> >   33      pidof crmd || service corosync stop  [1]
> >   34  end script
> >
> > Line 32 is a deliberately tricky piece of design.
> > If only pacemakerd has disappeared (crmd is still running), corosync is
> > killed immediately. The corosync watchdog then reboots the machine
> > (since we want to power off or reset the machine *for certain* in this
> > case).
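The decision made on lines 32 and 33 of the prototype can be read as a single branch: if crmd is still alive after pacemakerd has stopped, pacemakerd must have died abnormally, so corosync is killed hard; otherwise corosync is stopped cleanly. The following is only a minimal sketch of that logic, not part of the posted job. The function name `corosync_action` is hypothetical, and the check command is passed in as an argument so that the branch can be exercised without touching a real cluster (in the job itself it would be `pidof crmd`).

```shell
#!/bin/sh
# Hypothetical helper (not in the original job): decide how corosync should
# be handled in post-stop. The arguments form a command that succeeds when
# crmd is still running; in the real job this would be "pidof crmd".
corosync_action() {
    if "$@" >/dev/null 2>&1; then
        # crmd outlived pacemakerd: treat this as an abnormal failure and
        # kill corosync so that its watchdog resets the machine.
        echo kill
    else
        # crmd is gone too: pacemakerd stopped normally, so stop corosync
        # through its init script.
        echo stop
    fi
}

# Dry run with stand-in commands instead of "pidof crmd".
echo "crmd running:     $(corosync_action true)"
echo "crmd not running: $(corosync_action false)"
```

Evaluating the check once, as in this sketch, differs slightly from the prototype, which calls `pidof crmd` twice. If crmd happens to exit between the two calls, the prototype would also run `service corosync stop` after the kill.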
> >
> > Best Regards,
> > Kazunori INOUE
> >
> >
> > (13.02.08 10:03), Andrew Beekhof wrote:
> >> On Tue, Jan 22, 2013 at 9:09 PM, Kazunori INOUE
> >> <inouekazu at intellilink.co.jp> wrote:
> >>>
> >>> Hi Andrew,
> >>>
> >>> I understand that pacemakerd was not killed by the OOM killer.
> >>> However, because a process failure can still occur under unexpected
> >>> circumstances, we let Upstart manage pacemakerd.
> >>
> >> This is an excellent idea.
> >> Do you have an upstart job for pacemaker that we can include in the source?
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org