[Pacemaker] [Problem] The attrd does not sometimes stop.
Andrew Beekhof
andrew at beekhof.net
Mon Jan 16 12:42:32 UTC 2012
On Mon, Jan 16, 2012 at 11:30 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> On Mon, Jan 16, 2012 at 11:27 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>> I know I could just apply the patch and be done, but I'd like to
>> understand this so it works for the right reason.
>>
>> On Mon, Jan 16, 2012 at 7:30 PM, Lars Ellenberg
>> <lars.ellenberg at linbit.com> wrote:
>>> On Mon, Jan 16, 2012 at 04:46:58PM +1100, Andrew Beekhof wrote:
>>>> > Now we proceed to the next mainloop poll:
>>>> >
>>>> >> poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, events=POLLIN|POLLPRI}], 3, -1
>>>> >
>>>> > Note the -1 (infinity timeout!)
>>>> >
>>>> > So even though the trigger was (presumably) set,
>>>> > and the ->prepare() should have returned true,
>>>> > the mainloop waits forever for "something" to happen on those file descriptors.
>>>> >
>>>> >
>>>> > I suggest this:
>>>> >
>>>> > crm_trigger_prepare should set *timeout = 0, if trigger is set.
>>>> >
>>>> > Also think about this race: crm_trigger_prepare was already
>>>> > called, only then the signal came in...
>>>> >
>>>> > diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
>>>> > index 2e8b1d0..fd17b87 100644
>>>> > --- a/lib/common/mainloop.c
>>>> > +++ b/lib/common/mainloop.c
>>>> > @@ -33,6 +33,13 @@ static gboolean
>>>> > crm_trigger_prepare(GSource * source, gint * timeout)
>>>> > {
>>>> > crm_trigger_t *trig = (crm_trigger_t *) source;
>>>> > + /* Do not delay signal processing by the mainloop poll stage */
>>>> > + if (trig->trigger)
>>>> > + *timeout = 0;
>>>> > + /* To avoid races between signal delivery and the mainloop poll stage,
>>>> > + * make sure we always have a finite timeout. Unit: milliseconds. */
>>>> > + else
>>>> > + *timeout = 5000; /* arbitrary */
>>>> >
>>>> > return trig->trigger;
>>>> > }
>>>> >
>>>> >
>>>> > This scenario does not let the blocked IPC off the hook, though.
>>>> > That is still possible, both for blocking send and blocking receive,
>>>> > so that should probably be fixed as well, somehow.
>>>> > I'm not sure how likely this "stuck in blocking IPC" is, though.
>>>>
>>>> Interesting, are you sure you're in the right function though?
>>>> trigger and signal events don't have a file descriptor... wouldn't
>>>> these polls be for the IPC related sources and wouldn't they be
>>>> setting their own timeout?
>>>
>>> http://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#GSourceFuncs
>>>
>>> iiuc, mainloop does something similar to (oversimplified):
>>> timeout = -1; /* infinity */
>>> for s in all GSource
>>> tmp_timeout = -1;
>>> s->prepare(s, &tmp_timeout)
>>> if (tmp_timeout >= 0 && tmp_timeout < timeout)
>>> timeout = tmp_timeout;
>>>
>>> poll(GSource fd set, n, timeout);
>>
>> I'm looking at the glib code again now, and it still looks to me like
>> the trigger and signal sources do not appear in this fd set.
>> Their setup functions would have to have called g_source_add_poll()
>> somewhere, which they don't.
>>
>> So I'm still not seeing why its a trigger or signal sources' fault
>> that glib is doing a never ending call to poll().
>> poll() is going to get called regardless of whether our prepare
>> function returns true or not.
>>
>> Looking closer, crm_trigger_prepare() returning TRUE results in:
>> ready_source->flags |= G_SOURCE_READY;
>>
>> which in turn causes:
>> context->timeout = 0;
>>
>> which is essentially what adding
>> if (trig->trigger)
>> *timeout = 0;
>>
>> to crm_trigger_prepare() was intended to achieve.
>>
>> Shouldn't the fd, ipc or wait sources (who do call g_source_add_poll()
>> and could therefor cause poll() to block forever) have a sane timeout
>> in their prepare functions?
>> Or is it because the signal itself is interrupting some essential part
>> of G_CH_prepare_int() and friends?
>>
>>>
>>> for s in all GSource
>>> if s->check(s)
>>> s->dispatch(s, ...)
>>>
>>> And at some stage it also orders by priority, of course.
>>>
>>> Also compare with the comment above /* Sigh... */ in glue G_SIG_prepare().
>>>
>>> BTW, the mentioned race between signal delivery and mainloop already
>>> doing the poll stage could potentially be solved by using
>>
>> Again, since nothing related to the signal source ever appears in the
>> call to poll(), I'm not seeing where the race comes from.
>> Or am I missing something obvious?
>>
>>> cl_signal_set_interrupt(SIGTERM, 1),
>>
>> This, combined with
>>
>> /*
>> * If we don't set this on, then the mainloop poll(2) call
>> * will never be interrupted by this signal - which sort of
>> * defeats the whole purpose of a signal handler in a
>> * mainloop program
>> */
>> cl_signal_set_interrupt(signal, TRUE);
>>
>> looks more relevant.
>> But I can't escape the feeling that calling this just masks the
>> underlying "why is there a never-ending call to poll() in the first
>> place" issue.
>> G_CH_prepare_int() and friends /should/ be setting timeouts so that
>> poll() can return and any sources created by g_idle_source_new() can
>> execute.
>
> Actually, thinking further, I'm pretty convinced that poll() with an
> infinite timeout is the default mode of operation for mainloops with
> cluster-glue's IPC and FD sources.
> And that this is not a good thing :)
Far too late, brain shutting down.
...not a good thing, because it breaks the idle stuff, but most of all
because it requires /all/ external events to come out of that poll()
call.
Thats why its important that "the mainloop poll(2) call will never be
interrupted by this signal" behaviour had to be fudged around by the
call to siginterrupt().
>>> which would mean we can condense the prepare to
>>> if (trig->trigger)
>>> *timeout = 0;
>>> return trig->trigger;
>>>
>>> Glue (and heartbeat) code base is not that, let's say, involved,
>>> because someone had been paranoid.
>>> But because someone had been paranoid for a reason ;-)
Not sure this is an excellent example of that to be honest.
>>>
>>> Cheers,
>>>
>>> --
>>> : Lars Ellenberg
>>> : LINBIT | Your Way to High Availability
>>> : DRBD/HA support and consulting http://www.linbit.com
>>>
>>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
More information about the Pacemaker
mailing list