[Pacemaker] booth is in the "started" state on Pacemaker before booth writes ticket info to the CIB.

Jiaju Zhang jjzhang at suse.de
Mon Jan 21 02:30:34 EST 2013


Hi Yuichi,

On Fri, 2013-01-18 at 17:02 +0900, Yuichi SEINO wrote:
> Hi Jiaju,
> 
> I tried to fix this issue by reverting a commit. What do you think about it?
> https://github.com/jjzhang/booth/pull/48

Moving the whole setup stage before daemonizing does not seem to be a
sane solution. setup_ticket() needs to get the latest ticket information
by communicating with other nodes. Currently it is done there, using TCP,
but the sane long-term solution would be to move it into the main poll()
loop and wait asynchronously for the catch-up result. Until the catch-up
has finished, booth can still respond: it can participate in Paxos as a
non-voting member.
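
To make the idea a bit more concrete, here is a rough sketch of one
iteration of such a loop. It is not booth code: struct node_state,
poll_once(), may_vote() and the single catchup_fd are names made up for
illustration, and a real loop would watch the other sockets in the same
poll() call.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical per-daemon state; booth's real structures differ. */
struct node_state {
    bool caught_up; /* false until the catch-up reply has arrived */
};

/* One iteration of the main loop: wait up to timeout_ms for the catch-up
 * reply on catchup_fd (an assumed descriptor) and record its arrival.
 * Returns the poll() result: 0 on timeout, >0 if an event was seen. */
static int poll_once(struct node_state *st, int catchup_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = catchup_fd, .events = POLLIN };
    int rv = poll(&pfd, 1, timeout_ms);

    if (rv > 0 && (pfd.revents & POLLIN)) {
        char reply;

        /* Any one-byte reply stands in for the real catch-up message. */
        if (read(catchup_fd, &reply, 1) == 1)
            st->caught_up = true;
    }
    return rv;
}

/* Until the catch-up is done the node only listens; it does not vote. */
static bool may_vote(const struct node_state *st)
{
    return st->caught_up;
}
```

The point is only that the daemon never blocks on the catch-up: it keeps
answering requests and simply refuses to vote until may_vote() is true.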

To fix this issue, what do you think about removing the stale ticket
information from the CIB when booth starts? We already have APIs in
pacemaker.c that can clear the ticket information in the CIB. This step
is reasonable because the tickets at that moment are really stale data.

As for the implementation, I have not thought it through in detail, but
one idea that came to mind is that we could extend lockfile() (or add a
wrapper around lockfile()) so that it records not only the daemon pid
but also the daemon's startup status, such as "starting" or "started".
The controld RA could then read that status and return a more precise
result.
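
Roughly what I have in mind, as an untested sketch: the helper names
lockfile_set_status() and lockfile_get_status() and the "<pid> <status>"
file layout are assumptions for illustration, not what lockfile() does
today. The writer expects the descriptor that lockfile() already holds
open and locked.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: rewrite the lock file so that it contains
 * "<pid> <status>\n", e.g. "1234 starting". */
static int lockfile_set_status(int fd, pid_t pid, const char *status)
{
    char buf[64];
    int len = snprintf(buf, sizeof(buf), "%ld %s\n", (long)pid, status);

    if (len < 0 || (size_t)len >= sizeof(buf))
        return -1;
    /* Drop the previous contents before writing the new state. */
    if (ftruncate(fd, 0) < 0 || lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}

/* What the monitor side could do: read the pid and status back.
 * Returns 0 on success, -1 if the file is missing or malformed. */
static int lockfile_get_status(const char *path, long *pid,
                               char *status, size_t size)
{
    char word[32];
    FILE *f = fopen(path, "r");
    int n;

    if (f == NULL)
        return -1;
    n = fscanf(f, "%ld %31s", pid, word);
    fclose(f);
    if (n != 2)
        return -1;
    snprintf(status, size, "%s", word);
    return 0;
}
```

The daemon would write "starting" right after taking the lock and switch
to "started" once the ticket setup has finished, so the RA can tell the
two cases apart.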

I'll have Xia look into this problem in more detail.

Thanks,
Jiaju





