[Pacemaker] change sbd watchdog timeout in a running cluster

Tue Mar 26 17:22:07 UTC 2013

Hello Lars

What timeout would you recommend?

Thanks a lot

2013/3/26 Lars Marowsky-Bree <lmb at suse.com>

> On 2013-03-26T17:13:34, emmanuel segura <emi2fast at gmail.com> wrote:
>
> > Hello Lars
> >
> > We run a VM (SUSE 11) cluster on top of an ESX cluster, with a
> > clustered NetApp as the datastore. Last night we had a NetApp
> > failover. The other VM servers had no problem, but every VM in a
> > cluster with pacemaker+sbd was rebooted.
> >
> > This is because the watchdog timeout is 5 seconds.
>
> To protect against that, you should use multiple disks. As long as the
> majority of them remains within the latency limits, you will not
> experience a fail-over.
>
> Admittedly, 5s is on the short side for these. But 90s for watchdog
> means you'll end up with 120+ seconds for msgwait, meaning all
> fail-overs will be delayed accordingly. That's not going to be helpful.
>
> And yes, you need to increase stonith-timeout to be approx. 50% larger
> than msgwait, at least.
>
>
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix
> Imendörffer, HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
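
To make the arithmetic in Lars's reply concrete, here is a rough Python sketch. It only encodes the rule quoted above (stonith-timeout at least about 50% larger than msgwait). The function name and the `margin` parameter are made up for this illustration, and the 120 s value is simply the msgwait figure Lars mentions for a 90 s watchdog, not a recommendation.

```python
import math


def stonith_timeout_for(msgwait_s, margin=0.5):
    """Return a minimum stonith-timeout (in seconds) for a given msgwait.

    Assumption: 'margin' is the extra fraction on top of msgwait; 0.5
    corresponds to the "approx. 50% larger, at least" rule of thumb
    from the quoted reply. The result is rounded up to a whole second.
    """
    if msgwait_s <= 0:
        raise ValueError("msgwait must be positive")
    return math.ceil(msgwait_s * (1 + margin))


# With a 90 s watchdog, Lars estimates msgwait at 120 s or more,
# so stonith-timeout would need to be at least:
print(stonith_timeout_for(120))  # 180
```

The point of the sketch is the trade-off Lars describes: every increase of the watchdog timeout pushes up msgwait and stonith-timeout as well, so fail-overs take correspondingly longer.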


-- 
this is my life and I will live it for as long as God wills