[ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

Tue Apr 3 20:54:12 EDT 2018

On Tue, Apr 3, 2018 at 10:18 PM, Jehan-Guillaume de Rorthais <
jgdr at dalibo.com> wrote:

> On Tue, 3 Apr 2018 09:58:50 +1000
> Andrew Beekhof <abeekhof at redhat.com> wrote:
> > On Fri, Mar 30, 2018 at 8:36 PM, Jehan-Guillaume de Rorthais <
> > jgdr at dalibo.com> wrote:
> > > On Thu, 29 Mar 2018 09:32:41 +1100
> > > Andrew Beekhof <abeekhof at redhat.com> wrote:
> > > > On Thu, Mar 29, 2018 at 8:07 AM, Jehan-Guillaume de Rorthais <
> > > > jgdr at dalibo.com> wrote:
> [...]
> > > > Though by now there is surely a decent library for logging to files
> > > > with sub-second timestamps - if we could incorporate that into libqb
> > > > and have corosync use it too...
> > >
> > > In my opinion, this is neither the role of libqb
> >
> >
> > libqb has the logging library that pacemaker and corosync use.
> > it is absolutely where this change should happen
>
> I meant that this could be handled 100% by some external dedicated daemon,
> e.g. syslog or journald.
>
> I was thinking about code simplification.
>
> [...]
>
> > > > then we could consider 1 log per daemon.
> > > > In which case, the outcome of the PREFIX-SUFFIX discussion above
> > > > could instead be used for /var/log/pacemaker/SUFFIX
> > >
> > > I think the best would be to have one log for Corosync and one log for
> > > Pacemaker (and all its sub-processes/children).
> > >
> > > Another good path toward an understandable log file would be to hide
> > > which process is speaking. Experienced users will still know that
> > > "LOG: setting failcount to 3" comes from CRMd and
> > > "DEBUG1: failcount set to 3" comes from attrd.
> > >
> > > However, this would probably be a mess...because again, the cause might
> > > be logged AFTER the effects/reaction :/
> >
> > why?  i've never seen that be the case
>
> Please find attached a demonstration of such behavior that I found last
> week. Note that this comes from a SLES 12 SP1 system using Pacemaker
> 1.1.13... People there were not able to upgrade the servers before we built
> the PoC together.
>
> First column is the order in the log file. Second column is how I would
> expect the messages to appear in the log.
>
> E.g. I would expect L.11
>
>   "pengine: notice: process_pe_message: Calculated Transition 29: [...]"
>
> before CRMd begins to process it at L.6-10.
>
> Another example: I would expect LRMd L.35
>
>   "lrmd:  notice: log_finished:  finished - rsc:pgsqld action:notify"
>
> before the CRMd receives the result at L.26...
>

No, none of these are out of order.

>
> Maybe this is something fixed in 1.1.18 or 2.0.0, I just couldn't find
> commit messages related to this when searching through them quickly.
>
> > > Maybe the solution is to log only messages from CRMD, where all the
> > > orchestration comes from. Everything else might go to some debug level
> > > if needed.
> >
> > sorry, that is a terrible idea
>
> I was throwing out random ideas as I'm not familiar with the internal
> architecture. Maybe it should be pacemakerd that gathers messages from all
> the other daemons and writes them to stderr so they are captured by journald
> or redirected to a file...
>
> Regards,
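
A rough illustration of the sub-second timestamp point discussed above, as a sketch only: it is not Pacemaker, corosync, or libqb code, and the timestamps, messages, and the `merge_logs` helper are all hypothetical. It assumes each daemon writes its own log, and shows that entries recorded within the same second can only be put back into a single timeline if the timestamps carry sub-second precision.

```python
from datetime import datetime

# Hypothetical per-daemon log entries: (timestamp, daemon, message).
# Daemon names follow the thread; timestamps and messages are made up.
pengine_log = [
    (datetime(2018, 4, 3, 10, 0, 0, 120000), "pengine", "calculated transition"),
]
crmd_log = [
    (datetime(2018, 4, 3, 10, 0, 0, 450000), "crmd", "processing transition"),
    (datetime(2018, 4, 3, 10, 0, 0, 900000), "crmd", "received action result"),
]
lrmd_log = [
    (datetime(2018, 4, 3, 10, 0, 0, 700000), "lrmd", "action finished"),
]


def merge_logs(*logs, truncate_to_seconds=False):
    """Merge per-daemon logs into one timeline; return the daemon order."""
    def sort_key(entry):
        timestamp = entry[0]
        if truncate_to_seconds:
            # Simulate a log format that only records whole seconds.
            return timestamp.replace(microsecond=0)
        return timestamp

    combined = [entry for log in logs for entry in log]
    # sorted() is stable: entries with equal keys keep their input order,
    # so with whole-second keys the result just mirrors the file order.
    return [daemon for _, daemon, _ in sorted(combined, key=sort_key)]


precise = merge_logs(crmd_log, pengine_log, lrmd_log)
coarse = merge_logs(crmd_log, pengine_log, lrmd_log, truncate_to_seconds=True)
print(precise)  # ['pengine', 'crmd', 'lrmd', 'crmd']
print(coarse)   # ['crmd', 'crmd', 'pengine', 'lrmd']
```

With whole-second timestamps every key is equal, so the merged order depends only on the order in which the files were read. This says nothing about the ordering in the attached log, which comes from a single file.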