[Pacemaker] Cluster split brain on vmware VSphere
Koch, Sebastian
Sebastian.Koch at netzwerk.de
Fri Jun 11 14:30:21 UTC 2010
Hi,
I read your entry, and you wrote that you used some adapted Xen STONITH scripts to get it up and running under VMware. Could you share your experiences and how you solved that issue?
Thanks in advance.
Sebastian Koch
-----Original Message-----
From: Torresani, Roberto [mailto:roberto.torresani at unitn.it]
Sent: Friday, June 11, 2010 3:22 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> -----Original Message-----
> From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm]
> Sent: Wednesday, June 09, 2010 2:23 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
>
> Hi,
>
> On Wed, Jun 09, 2010 at 12:11:09PM +0200, Torresani, Roberto wrote:
> > Well... it seems to be SOLVED!!!
> > Thank you, Dejan.
> > In the next few days I will put the cluster under load and then see
> > how it behaves.
> >
> > I simply raised the token value to 10000 msec and left all the
> > others at the defaults.
>
> You should also raise the consensus value to 12000. corosync
> would even refuse to start in this case.
Yes, I simply let corosync determine the value itself, as 1.2 * token = 12000.
Thank you again
Best regards
Roberto
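
For reference, a minimal sketch of the totem settings described above. It assumes the rest of the originally posted corosync.conf stays unchanged; the final file was not posted to the list, so treat this as an illustration rather than the exact configuration in use.

```
# Sketch only: the values come from this thread, everything else in the
# totem section is assumed to stay as in the originally posted file.
totem {
        version: 2
        # Raised from 1000 ms so that load-induced delays are less
        # likely to be treated as a lost token.
        token: 10000
        # consensus is intentionally left unset: corosync then derives
        # it as 1.2 * token, which is 12000 ms here. The old explicit
        # "consensus: 4800" line is removed because it would be lower
        # than the new token timeout.
}
```

Leaving consensus out, rather than hard-coding 12000, means the two values cannot drift apart if token is changed again later.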
>
> Thanks,
>
> Dejan
>
> >
> > Thank you again.
> > Regards,
> > Roberto
> >
> >
> >
> > > -----Original Message-----
> > > From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm]
> > > Sent: Tuesday, June 08, 2010 6:42 PM
> > > To: The Pacemaker cluster resource manager
> > > Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> > >
> > > Hi,
> > >
> > > On Mon, Jun 07, 2010 at 02:57:57PM +0200, Torresani, Roberto wrote:
> > > > Sorry for having chosen the wrong ml...
> > >
> > > That's no problem. There's just a better chance of getting help on
> > > the other list.
> > >
> > > > Here is the corosync.conf used by one cluster; the other one is
> > > > just the same, as provided by the EPEL repository packages.
> > > >
> > > > I will try to raise the token value to 10000 as you suggest. Is
> > > > there a theoretical basis or a best practice for setting this value?
> > >
> > > No, but 5000 should be OK for most. Ultimately, it depends on
> > > your network. I forgot exactly what the case was here, but it
> > > seems like you had some heavy processing (backup?) which used
> > > most of the resources. That may be really hard to predict. You can
> > > use sar or similar to monitor the load.
> > >
> > > Thanks,
> > >
> > > Dejan
> > >
> > > > I will keep you informed as it goes, and open a thread on the
> > > > corosync ml if necessary.
> > > >
> > > > Thank you.
> > > >
> > > >
> > > > # Please read the corosync.conf.5 manual page
> > > > compatibility: whitetank
> > > >
> > > > totem {
> > > >         version: 2
> > > >         secauth: off
> > > >         threads: 0
> > > >         token: 1000
> > > >         hold: 180
> > > >         token_retransmits_before_loss_const: 20
> > > >         join: 60
> > > >         consensus: 4800
> > > >         vsftype: none
> > > >         max_messages: 20
> > > >         interface {
> > > >                 ringnumber: 0
> > > >                 bindnetaddr: 192.168.206.0
> > > >                 mcastaddr: 226.94.1.1
> > > >                 mcastport: 5405
> > > >         }
> > > > }
> > > >
> > > > logging {
> > > >         fileline: off
> > > >         to_stderr: yes
> > > >         to_logfile: yes
> > > >         to_syslog: yes
> > > >         logfile: /tmp/corosync.log
> > > >         debug: off
> > > >         timestamp: on
> > > >         logger_subsys {
> > > >                 subsys: AMF
> > > >                 debug: off
> > > >         }
> > > > }
> > > >
> > > > amf {
> > > >         mode: disabled
> > > > }
> > > >
> > > > aisexec {
> > > >         user: root
> > > >         group: root
> > > > }
> > > >
> > > > service {
> > > >         name: pacemaker
> > > >         ver: 0
> > > > }
> > > > _______________________________________________
> > > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > >
> > > > Project Home: http://www.clusterlabs.org
> > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker