<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jun 25, 2014 at 1:28 AM, Andrew Beekhof <span dir="ltr"><<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><br>
> So it seems that at midnight the resource already had a failcount of 2 (perhaps caused by problems that happened weeks ago?) and then at 03:38 it got a timeout while monitoring its state and was relocated...<br>
><br>
> pacemaker is at 1.1.6-1.27.26<br>
<br>
</div>I don't think the automatic reset was part of 1.1.6.<br>
The documentation you're referring to is probably SLES12 specific.<br>
<div class=""><br>
> and I see this list message that seems related:<br>
> <a href="http://oss.clusterlabs.org/pipermail/pacemaker/2012-August/015076.html" target="_blank">http://oss.clusterlabs.org/pipermail/pacemaker/2012-August/015076.html</a><br>
><br>
> Is it perhaps only a matter of setting meta parameter<br>
> failure-timeout<br>
> as explained in the High Availability Guide:<br>
> <a href="https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha/book_sleha.html#sec.ha.config.hawk.rsc" target="_blank">https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha/book_sleha.html#sec.ha.config.hawk.rsc</a><br>
><br>
> in particular<br>
> 5.3.6. Specifying Resource Failover Nodes<br>
> ...<br>
> 4. If you want to automatically expire the failcount for a resource, add the failure-timeout meta attribute to the resource as described in Procedure 5.4: Adding Primitive Resources, Step 7 and enter a Value for the failure-timeout.<br>
> ..<br>
> ?<br></div></blockquote></div></div><div class="gmail_extra"><br></div><div class="gmail_extra">Yes, you are right. It seems that starting from here:</div><div class="gmail_extra"><a href="https://www.suse.com/it-it/documentation/sles11/">https://www.suse.com/it-it/documentation/sles11/</a><br>
</div><div class="gmail_extra">or here</div><div class="gmail_extra"><a href="https://www.suse.com/documentation/sles11/">https://www.suse.com/documentation/sles11/</a><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">
the SLES 11 HTML links for "SUSE Linux Enterprise High Availability Extension Guide" erroneously point to the SLES 12 version anyway...</div><div class="gmail_extra">I tried the "feedback" button at the bottom of the page, but it doesn't work (at least in my Chrome browser on Fedora 20) on either the Italian or the English page...</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">Going through the PDF documents I had already downloaded, I still have this one for SLES 11 SP2, which is the version on the system in question:</div><div class="gmail_extra"><br></div><div class="gmail_extra">
"</div><div class="gmail_extra">5.3.5 Specifying Resource Failover Nodes<br></div><div class="gmail_extra">...</div><div class="gmail_extra"><div class="gmail_extra">A resource will be automatically restarted if it fails. If that cannot be achieved on the</div>
<div class="gmail_extra">current node, or it fails N times on the current node, it will try to fail over to another</div><div class="gmail_extra">node. You can define a number of failures for resources (a migration-threshold),</div>
<div class="gmail_extra">after which they will migrate to a new node. If you have more than two nodes in your</div><div class="gmail_extra">cluster, the node a particular resource fails over to is chosen by the High Availability</div>
<div class="gmail_extra">software.</div><div class="gmail_extra">However, you can specify the node a resource will fail over to by proceeding as follows:</div><div class="gmail_extra"><div class="gmail_extra">1 Configure a location constraint for that resource as described in Procedure 5.6,</div>
<div class="gmail_extra">“Adding or Modifying Locational Constraints” (page 86).</div><div class="gmail_extra">2 Add the migration-threshold meta attribute to that resource as described in</div><div class="gmail_extra">Procedure 5.3, “Adding or Modifying Meta and Instance Attributes” (page 82) and</div>
<div class="gmail_extra">enter a Value for the migration-threshold. The value should be positive and less than</div><div class="gmail_extra">INFINITY.</div><div class="gmail_extra">3 If you want to automatically expire the failcount for a resource, add the</div>
<div class="gmail_extra">failure-timeout meta attribute to that resource as described in Procedure 5.3,</div><div class="gmail_extra">“Adding or Modifying Meta and Instance Attributes” (page 82) and enter a Value</div><div class="gmail_extra">
for the failure-timeout.</div><div class="gmail_extra">4 If you want to specify additional failover nodes with preferences for a resource,</div><div class="gmail_extra">create additional location constraints.</div><div class="gmail_extra">
"</div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">So the question remains about "failure-timeout" parameter and/or other methods to solve/mitigate what I described in my first message.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">Thanks,</div><div class="gmail_extra">Gianluca</div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div></div></div></div>