[Pacemaker] failcount always resets at 15m mark regardless of cluster-recheck-interval

Thu May 22 20:30:15 EDT 2014

Thanks for the reply. I figured it out. I was setting those options
like this:

pcs resource defaults cluster-recheck-interval=15s

But that wasn't getting applied to existing resources. Setting it
explicitly for my pre-existing resource like this fixed the problem:

pcs resource update my_resource cluster-recheck-interval=15s
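
For anyone reading this in the archive, a note on the log excerpt quoted
below: the two crmd lines are exactly 15 minutes apart, and the second
transition is timer-driven (cause=C_TIMER_POPPED). As far as I understand,
that is consistent with the expired failcount only being re-evaluated when
the recheck timer fires. The arithmetic can be sketched in plain POSIX
shell; to_secs is a hypothetical helper written for this illustration, not
part of Pacemaker or pcs:

```shell
# Convert an HH:MM:SS syslog time into seconds since midnight.
# (Hypothetical helper for illustration only.)
to_secs() {
    # Split the argument on ':' into $1=HH, $2=MM, $3=SS.
    old_ifs=$IFS; IFS=:; set -- $1; IFS=$old_ifs
    # Strip one leading zero so values like "09" are not read as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# Timestamps taken from the two crmd log lines in the quoted message.
idle_at=$(to_secs 00:09:22)      # S_TRANSITION_ENGINE -> S_IDLE
recheck_at=$(to_secs 00:24:22)   # S_IDLE -> S_POLICY_ENGINE (timer popped)

gap=$(( recheck_at - idle_at ))
echo "$gap seconds = $(( gap / 60 )) minutes"
```

If a shorter recheck interval were actually in effect, the gap between
going idle and the next timer-driven S_POLICY_ENGINE transition would be
expected to shrink accordingly.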

On Thu, May 22, 2014 at 3:27 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> On 22 May 2014, at 5:36 pm, David Nguyen <d_k_nguyen at yahoo.com> wrote:
>
> > Hi all,
> >
> > I'm having the following problem.  I have the following settings for
> > testing purposes:
> >
> > migration-threshold=1
> > failure-timeout=15s
> > cluster-recheck-interval=30s
> >
> > and verified those are in the running config via cibadmin --query
>
> can we see that output?
>
> >
> > The issue is that even with failure-timeout and cluster-recheck-interval
> > set, I've noticed that failcount resets at the default value of 15 minutes.
> >
> > The way I tested this was to force a resource failure on both nodes (2
> > node cluster), then watch syslog, and sure enough, the service rights
> > itself after the 15-minute mark.
> >
> > May 22 00:09:22 sac-prod1-ops-web-09 crmd[16843]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> >
> > May 22 00:24:22 sac-prod1-ops-web-09 crmd[16843]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> >
> >
> > Any ideas what I'm doing wrong here?  I would like failcount to reset
> > much faster.
> >
> >
> > My setup:
> >
> > 2-node CentOS 6.5
> > pacemaker-1.1.10-14.el6_5.3.x86_64
> > corosync-1.4.1-17.el6_5.1.x86_64
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>
>