[Pacemaker] Cluster split brain on vmware VSphere

Torresani, Roberto roberto.torresani at unitn.it
Wed Jun 9 10:11:09 UTC 2010


Well... it seems to be SOLVED!!!
Thank you Dejan.
In the next few days I will load the cluster and then see how it behaves.

I simply raised the token value to 10000 msec and left all the other settings at their defaults.
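
A minimal sketch of what that change could look like in the totem section. It assumes the other timing overrides from the quoted file below (hold, token_retransmits_before_loss_const, join, consensus) are simply dropped so that the corosync defaults apply; the interface values are copied from the quoted config, and the exact file actually in use may differ.

```
totem {
        version: 2
        secauth: off
        threads: 0
        # Raised from 1000 to 10000 ms.
        token:          10000
        # Assumption: hold, join, consensus, etc. are omitted here so
        # that corosync falls back to its built-in defaults.
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.206.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
```

Leaving consensus unset matters here: as far as I read corosync.conf(5), it is then derived as 1.2 * token, whereas an explicit consensus of 4800 would be below that minimum once token is 10000.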

Thank you again.
Regards,
Roberto

 

> -----Original Message-----
> From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm] 
> Sent: Tuesday, June 08, 2010 6:42 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> 
> Hi,
> 
> On Mon, Jun 07, 2010 at 02:57:57PM +0200, Torresani, Roberto wrote:
> > Sorry for having chosen the wrong ml...
> 
> That's no problem. There's just a better chance of getting help on
> the other list.
> 
> > Here is the corosync.conf used by one cluster; the other one is
> > just the same, as provided by the epel repository packages.
> > 
> > I will try to raise the token value to 10000 as you suggest. Is
> > there a theoretical basis or a best practice for setting this value?
> 
> No, but 5000 should be OK for most. Ultimately, it depends on
> your network. I forgot what exactly the case was here, but it
> seems like you had some heavy processing (backup?) which used
> most of the resources. That may be really hard to predict. You can
> use sar or similar to monitor the load.
> 
> Thanks,
> 
> Dejan
> 
> > I will keep you informed as it goes, and open a thread on the
> > corosync ml if necessary.
> > 
> > Thank you.
> > 
> > 
> > # Please read the corosync.conf.5 manual page
> > compatibility: whitetank
> > 
> > totem {
> >         version: 2
> >         secauth: off
> >         threads: 0
> >         token:          1000
> >         hold: 180
> >         token_retransmits_before_loss_const: 20
> >         join:           60
> >         consensus:      4800
> >         vsftype:        none
> >         max_messages:   20
> >         interface {
> >                 ringnumber: 0
> >                 bindnetaddr: 192.168.206.0
> >                 mcastaddr: 226.94.1.1
> >                 mcastport: 5405
> >         }
> > }
> > 
> > logging {
> >         fileline: off
> >         to_stderr: yes
> >         to_logfile: yes
> >         to_syslog: yes
> >         logfile: /tmp/corosync.log
> >         debug: off
> >         timestamp: on
> >         logger_subsys {
> >                 subsys: AMF
> >                 debug: off
> >         }
> > }
> > 
> > amf {
> >         mode: disabled
> > }
> > 
> > aisexec {
> >     user:  root
> >     group: root
> > }
> > 
> > service {
> >     name: pacemaker
> >     ver: 0
> > }
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> 


