[Pacemaker] OCF exit code 8 triggers WARN message
Dejan Muhamedagic
dejanmm at fastmail.fm
Wed Oct 5 20:35:52 UTC 2011
On Fri, Sep 16, 2011 at 05:33:48PM +0200, Lars Ellenberg wrote:
> On Fri, Sep 16, 2011 at 05:02:52PM +0200, Dejan Muhamedagic wrote:
> > Hi Thilo,
> >
> > On Fri, Sep 16, 2011 at 04:41:59PM +0200, Thilo Uttendorfer wrote:
> > > Hi,
> > >
> > > I experience a lot of "WARN" log entries in several pacemaker cluster setups:
> > >
> > > Sep 16 11:53:21 server01 lrmd: [23946]: WARN: Managed res1:0:monitor process 26489 exited with return code 8.
> > >
> > > That's because multi state resources like DRBD have some special return
> > > codes. "8" means OCF_RUNNING_MASTER which should not trigger a warning. The
> > > following patch in cluster-glue solved this issue:
> > >
> > > -------------
> > > diff -u lib/clplumbing/proctrack.c lib/clplumbing/proctrack.c.patched
> > >
> > > --- lib/clplumbing/proctrack.c 2011-09-16 15:48:25.000000000 +0200
> > > +++ lib/clplumbing/proctrack.c.patched 2011-09-16 15:51:43.000000000 +0200
> > > @@ -271,7 +271,7 @@
> > >
> > > if (doreport) {
> > > if (deathbyexit) {
> > > - cl_log((exitcode == 0 ? LOG_INFO : LOG_WARNING)
> > > + cl_log(((exitcode == 0 || exitcode == 8) ? LOG_INFO : LOG_WARNING)
> > > , "Managed %s process %d exited with return code %d."
> > > , type, pid, exitcode);
> > > }else if (deathbysig) {
> > > -------------
> >
> > I did consider this before but was worried that a process
> > different from OCF RA instance could exit with such a code. Code
> > 7 (not running) also belongs to this category. Anyway, we should
> > probably add this patch.
>
> Hm...
> As lrmd is not the sole user of that proctrack interface,
> and not everything lrmd does is a monitor operation,

True, but I'd say that exit codes 7 and 8 are a rarity. And even
if occasionally we log a message with info severity instead of
warning, that wouldn't be such a big deal, IMO.

> can we add another loglevel flag there, e.g. PT_LOG_OCF_MONITOR,
> and base "degradation" of log level for "expected exit codes" on that?

Nothing against it, of course, if it doesn't complicate things
too much :)

Cheers,

Dejan

P.S. Moving to a more proper list (lest we lose this again).
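
For illustration only, here is a rough sketch of how the flag-based
variant suggested above might look. It is not taken from cluster-glue:
the helper exit_log_priority() is hypothetical, the bit value given to
PT_LOG_OCF_MONITOR is made up, and how the flag would reach proctrack
is left open. Only the exit codes 7 (not running) and 8 (running
master) come from the thread.

```c
#include <syslog.h>	/* LOG_INFO, LOG_WARNING */

/* OCF exit codes discussed in this thread. */
#define OCF_NOT_RUNNING		7
#define OCF_RUNNING_MASTER	8

/* Flag name as proposed in the thread; the value is an assumption. */
#define PT_LOG_OCF_MONITOR	0x1

/*
 * Hypothetical helper: choose the syslog priority for a
 * "process exited" message. Exit code 0 is always info. Codes 7 and 8
 * are downgraded to info only when the caller has marked the process
 * as an OCF monitor operation; everything else stays a warning.
 */
int
exit_log_priority(int logflags, int exitcode)
{
	if (exitcode == 0) {
		return LOG_INFO;
	}
	if ((logflags & PT_LOG_OCF_MONITOR) != 0
	&&  (exitcode == OCF_NOT_RUNNING
	  || exitcode == OCF_RUNNING_MASTER)) {
		return LOG_INFO;
	}
	return LOG_WARNING;
}
```

With something like this, the cl_log() call in the patch would take
exit_log_priority(flags, exitcode) as its first argument, and callers
other than lrmd monitor operations would keep the warning for codes 7
and 8.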
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com