[Pacemaker] Behavior of booth when the fail-over in nodes and in sites is caused at the same time

Jiaju Zhang jjzhang at suse.de
Thu Jun 21 04:22:09 EDT 2012


On Thu, 2012-06-21 at 16:40 +0900, Yuichi SEINO wrote:
> Hi Jiaju,
> 
> I have a question about booth.
> I set up 2 sites and 1 arbitrator. Each site consists of 2 nodes
> (an ACT node and an STB node).
> First, I killed corosync on one node. This node was running booth and
> held a ticket.
> Then booth failed over to the STB node. The site that includes this
> node still holds the ticket.
> However, another site was granted the ticket at the same time.
> As a result, 2 sites were granted.
> 
> I guess that the fail-over between sites happened because the ticket
> expired at the same moment.
> However, I do not see any reason for 2 sites to be granted.  Is this
> the correct behavior?
> 
> The following is the ticket information on each site after this
> happened.
> 
> siteA:<ticket_state id="ticketA" owner="2" expires="1340255934"
> ballot="2" granted="true" last-granted="1340255142"/>
> siteB:<ticket_state id="ticketA" owner="2" expires="1340255933"
> ballot="2" granted="true" last-granted="1340255441"/>
> siteC(arbitrator):<ticket_state id="ticketA" owner="2"
> expires="1340255934" ballot="2" granted="false"/>

I don't think I have quite understood what happened there. According to
this ticket information, both siteA and siteB think the ticket owner is
"2", which is consistent, so why do you say the ticket was granted on two
sites?
As for siteC, it has not received the last-granted value yet. It should
get this information shortly. Even if for some reason siteC cannot get
it, that is harmless and allowable, because the algorithm is
majority-based ;)
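
To make the comparison concrete, here is a minimal Python sketch that
parses the three ticket_state entries quoted above and reports which owner
each site records, which sites show granted="true", and whether a majority
of sites agree on one owner. This is not booth code: summarize() is a
hypothetical helper written only for illustration, and the strict-majority
check is an assumption about what "majority-based" means here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Ticket states copied from the report above, keyed by site name.
TICKET_STATES = {
    "siteA": '<ticket_state id="ticketA" owner="2" expires="1340255934" '
             'ballot="2" granted="true" last-granted="1340255142"/>',
    "siteB": '<ticket_state id="ticketA" owner="2" expires="1340255933" '
             'ballot="2" granted="true" last-granted="1340255441"/>',
    "siteC": '<ticket_state id="ticketA" owner="2" expires="1340255934" '
             'ballot="2" granted="false"/>',
}


def summarize(states):
    """Hypothetical helper: compare the ticket view held by each site."""
    parsed = {site: ET.fromstring(xml).attrib for site, xml in states.items()}

    # Owner recorded by each site.
    owners = {site: attrs["owner"] for site, attrs in parsed.items()}

    # Sites whose local state says granted="true".
    granted = sorted(
        site for site, attrs in parsed.items() if attrs.get("granted") == "true"
    )

    # Assumption: an owner counts as agreed when more than half of the
    # sites record it.
    owner, votes = Counter(owners.values()).most_common(1)[0]
    majority_owner = owner if votes > len(states) // 2 else None

    return owners, granted, majority_owner


if __name__ == "__main__":
    owners, granted, majority_owner = summarize(TICKET_STATES)
    print("owner per site:", owners)
    print("sites with granted=true:", granted)
    print("majority owner:", majority_owner)
```

For the data above, the sketch reports owner "2" on every site and
granted="true" on siteA and siteB.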

Or have I not gotten the full picture of this problem? If so, you can
report a bug at bugzilla.novell.com and attach the full logs there, which
will make it easier for us to investigate ;)

Thanks,
Jiaju




