[Pacemaker] [Question] About the recovery procedure from a state in which the nodes were divided.

Mon Nov 15 08:30:46 UTC 2010

Hi Andrew,

> > If Step 3 is omitted, I think the bug that I reported before can occur easily.
> >  * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2508
> >
> > I think this bug is what makes Step 3 necessary.
> 
> Hopefully we'll get that bug fixed soon :-)
> 
> >
> >
> >> Hope that answers your question.
> >
> > Thanks.
> > If Step 3 is not necessary, I think that would be splendid.
> 
> That is my goal.  If there are any bugs that require this, let's make
> sure we get them fixed.

All right.
I will wait for the bug to be fixed.
I will report back if I find any problem that stands in the way of your goal.
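For reference, the first method quoted below (stop one side, repair the cause, clean, restart) can be written out as a dry-run plan. This is only a minimal sketch under assumptions: the node names node3 and node4 are hypothetical, and the init script path and the rm command are assumptions about a typical Heartbeat installation, not commands taken from this thread. Only the directory "/var/lib/heartbeat/crm/" comes from the procedure itself. The script prints the plan and executes nothing. Passing clean_cib=False gives the variant without Step 3.

```python
from dataclasses import dataclass
from typing import List

# Directory named in Step 3 of the quoted procedure.
CIB_DIR = "/var/lib/heartbeat/crm/"

# Assumption: Heartbeat is controlled through this init script.
INIT_SCRIPT = "/etc/init.d/heartbeat"


@dataclass
class Step:
    node: str      # node on which the action is performed ("operator" = manual work)
    command: str   # command line, or a description for manual work


def plan_first_method(stopped_side: List[str], clean_cib: bool = True) -> List[Step]:
    """Build the ordered action list for the first method.

    Only the nodes on one side of the split are touched, so the
    other side keeps running its resources.
    """
    plan: List[Step] = []
    # Step 1: stop all nodes on one side of the split.
    for node in stopped_side:
        plan.append(Step(node, f"{INIT_SCRIPT} stop"))
    # Step 2: manual repair, recorded as a description only.
    plan.append(Step("operator", "remove the cause of the split (e.g. replace a network card)"))
    # Step 3: clean the CIB directory on the stopped nodes (optional).
    if clean_cib:
        for node in stopped_side:
            plan.append(Step(node, f"rm -f {CIB_DIR}*"))  # assumption: plain file removal
    # Step 4: start the stopped nodes again; Step 5 (rebuild) then happens in the cluster.
    for node in stopped_side:
        plan.append(Step(node, f"{INIT_SCRIPT} start"))
    return plan


# Print the plan for two hypothetical nodes; nothing is executed.
for step in plan_first_method(["node3", "node4"]):
    print(f"{step.node}: {step.command}")
```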

Best Regards,
Hideo Yamauchi.

--- Andrew Beekhof <andrew at beekhof.net> wrote:

> On Mon, Nov 15, 2010 at 9:09 AM,  <renayama19661014 at ybb.ne.jp> wrote:
> > Hi Andrew,
> >
> > Thank you for comment.
> >
> >> >  Step3) Clean out "/var/lib/heartbeat/crm/".
> >> >         Clean it on all nodes.
> >> >  Step4) Start all four nodes.
> >> >  Step5) Send the cib information to the cluster.
> >> >  Step6) The cluster is rebuilt.
> >> >
> >> >
> >> > We do not want to use the second method,
> >> > because all resources stop when we use it.
> >> >
> >> > Is there any problem with the first method that we used?
> >>
> >> Step 3 should not be necessary, but otherwise there is nothing wrong
> >> with the first method.
> >> That usage is essentially what it was designed for.
> >
> > Really?
> >
> > If Step 3 is omitted, I think the bug that I reported before can occur easily.
> >  * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2508
> >
> > I think this bug is what makes Step 3 necessary.
> 
> Hopefully we'll get that bug fixed soon :-)
> 
> >
> >
> >> Hope that answers your question.
> >
> > Thanks.
> > If Step 3 is not necessary, I think that would be splendid.
> 
> That is my goal.  If there are any bugs that require this, let's make
> sure we get them fixed.
> 
> > I will examine the problem a little more and report back.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > --- Andrew Beekhof <andrew at beekhof.net> wrote:
> >
> >> On Thu, Nov 4, 2010 at 2:44 AM,  <renayama19661014 at ybb.ne.jp> wrote:
> >> > Hi All,
> >> >
> >> > We tested the recovery procedure from a state in which the nodes were divided.
> >> > (Of the four nodes, three are active and one is a standby.)
> >> >
> >> > This is recovery from a split into two sides of two nodes each, in a cluster where we set no-quorum-policy="freeze".
> >> >
> >> > With the freeze setting, the resources keep their state as it was after the split.
> >> > (We tested with a special RA, to work around ccm in Heartbeat being slow to recognize
> >> > the split of the nodes.)
> >> >
> >> >
> >> > We confirmed several recovery patterns,
> >> > and we thought that the following method was desirable.
> >> >
> >> > * The first method. (With this method, not all resources stop.)
> >> >  Step1) Stop all the nodes on one side of the split.
> >> >  Step2) Remove the cause of the split. (For example, replace a network card.)
> >> >  Step3) Clean out "/var/lib/heartbeat/crm/".
> >> >         Clean it on the nodes that were stopped.
> >> >  Step4) Start the two nodes that were stopped.
> >> >  Step5) The cluster is rebuilt.
> >> >
> >> > * The second method. (However, all resources stop with this method.)
> >> >  Step1) Stop all four nodes.
> >> >  Step2) Remove the cause of the split. (For example, replace a network card.)
> >> >  Step3) Clean out "/var/lib/heartbeat/crm/".
> >> >         Clean it on all nodes.
> >> >  Step4) Start all four nodes.
> >> >  Step5) Send the cib information to the cluster.
> >> >  Step6) The cluster is rebuilt.
> >> >
> >> >
> >> > We do not want to use the second method,
> >> > because all resources stop when we use it.
> >> >
> >> > Is there any problem with the first method that we used?
> >>
> >> Step 3 should not be necessary, but otherwise there is nothing wrong
> >> with the first method.
> >> That usage is essentially what it was designed for.
> >>
> >> Hope that answers your question.
> >>
> >> >
> >> > Is there a recovery method for a split under the freeze setting that the community recommends?
> >> >
> >> > Best Regards,
> >> > Hideo Yamauchi.
> >> >
> >> >
> >> > _______________________________________________
> >> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >> >
> >> > Project Home: http://www.clusterlabs.org
> >> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> >> >