<div dir="ltr">Hi,<div><br></div><div>Thanks for the help, Andrew! It turns out that I had mistakenly started the f5 agent&#39;s Unix service on all three nodes before adding its resource to Pacemaker, and this was causing the errors I reported. Once I made sure that the service was brought up on only one node (the node on which I added it as a resource to Pacemaker), the errors went away and failover worked fine as well.</div>
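<div><br></div><div>A minimal sketch of a per-node pre-check that could be run before adding the resource. It assumes the init script follows the LSB convention that <code>status</code> exits with 0 when the service is running and 3 when it is stopped. The helper name <code>lsb_is_stopped</code> and the path in the usage comment are made up for illustration.</div>

```shell
#!/bin/sh
# Hypothetical helper (not part of pacemaker): succeeds only when the given
# LSB init script reports "not running", i.e. its status action exits with 3.
lsb_is_stopped() {
    rc=0
    "$1" status || rc=$?
    [ "$rc" -eq 3 ]
}

# Example use on one node (the path is an assumption based on the resource name):
# lsb_is_stopped /etc/init.d/f5-lbaas-agent-10.6.143.121 || echo "service is not stopped on this node"
```

<div>Running a check like this on every node first would show whether the service is already up outside the cluster manager&#39;s control.</div>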

<div><br></div><div>Regards,</div><div>Vijay</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jul 2, 2014 at 12:57 AM, Andrew Beekhof <span dir="ltr">&lt;<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">1.1.6 is really too old<br>

In any case, rc=5 &#39;not installed&#39; means we can&#39;t find an init script of that name in /etc/init.d<br>
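<br>
A quick way to confirm this on a node is sketched below. It assumes the name after <code>lsb:</code> maps directly to a file name under /etc/init.d and that the script must be executable; the helper name <code>lsb_script_present</code> is hypothetical.<br>

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an executable init script with the given
# name exists in the given directory (normally /etc/init.d).
lsb_script_present() {
    dir=$1
    name=$2
    [ -x "$dir/$name" ]
}

# Example use with one of the services from this thread:
# lsb_script_present /etc/init.d f5-lbaas-agent-10.6.143.121 || echo "init script missing on this node"
```

<br>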

<div><div class="h5"><br>

On 2 Jul 2014, at 2:07 pm, Vijay B &lt;<a href="mailto:os.vbvs@gmail.com">os.vbvs@gmail.com</a>&gt; wrote:<br>

<br>

&gt; Hi,<br>

&gt;<br>

&gt; I&#39;m puppetizing resource deployment for pacemaker and corosync, and as part of it, am creating a resource on one of three nodes of a cluster. The problem is that I&#39;m seeing RecurringOp errors during resource creation, which are probably preventing the resource from failing over. Resource creation itself seems to go through fine, but these RecurringOp errors always appear afterwards (I&#39;m pasting the output of two different commands below):<br>


&gt;<br>

&gt;<br>

&gt; ***************************<br>

&gt; vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm status<br>

&gt; ============<br>

&gt; Last updated: Wed Jul  2 03:52:30 2014<br>

&gt; Last change: Wed Jul  2 03:38:20 2014 via cibadmin on precise64b<br>

&gt; Stack: cman<br>

&gt; Current DC: precise64b - partition with quorum<br>

&gt; Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c<br>

&gt; 3 Nodes configured, unknown expected votes<br>

&gt; 3 Resources configured.<br>

&gt; ============<br>

&gt;<br>

&gt; Online: [ precise64b precise64c precise64a ]<br>

&gt;<br>

&gt;  f5-lbaas-agent-10.6.143.121_resource (lsb:f5-lbaas-agent-10.6.143.121):      Started precise64c<br>

&gt;  f5-lbaas-agent-10.6.143.122_resource (lsb:f5-lbaas-agent-10.6.143.122):      Started precise64b<br>

&gt;  f5-lbaas-agent-10.6.143.123_resource (lsb:f5-lbaas-agent-10.6.143.123):      Started precise64b<br>

&gt;<br>

&gt; Failed actions:<br>

&gt;     f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64b, call=2, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64b, call=3, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64c, call=7, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64c, call=8, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64a, call=2, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64a, call=3, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64a, call=4, rc=5, status=complete): not installed<br>

&gt;     f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64a, call=5, rc=5, status=complete): not installed<br>

&gt; vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$<br>

&gt;<br>

&gt;<br>

&gt; ***************************<br>

&gt;<br>

&gt; vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm_verify -L -V<br>

&gt; crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-start-10 wth name: &#39;start&#39;<br>

&gt; crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-stop-10 wth name: &#39;stop&#39;<br>

&gt; crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-start-10 wth name: &#39;start&#39;<br>

&gt; crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-stop-10 wth name: &#39;stop&#39;<br>

&gt; Errors found during check: config not valid<br>

&gt; vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$<br>

&gt; ***************************<br>

&gt;<br>

&gt;<br>

&gt; What do these errors signify? I found an email exchange on a pacemaker ML suggesting that we shouldn&#39;t set intervals and timeouts on start, or on stop, since that would mean pacemaker attempts to restart the resource every x seconds, times out after y seconds, and then repeats the cycle. (Link: <a href="http://lists.linbit.com/pipermail/drbd-user/2011-September/016938.html" target="_blank">http://lists.linbit.com/pipermail/drbd-user/2011-September/016938.html</a>)<br>


&gt;<br>

&gt; My understanding was that the start interval would apply to restart attempts once a resource is detected as being down. Nevertheless, I removed these parameters and created a third resource (I had created the first two with these parameters), and I still see the same monitor-related errors for the third resource (f5-lbaas-agent-10.6.143.123_resource_monitor_0) in the sudo crm status output. However, I don&#39;t understand why this resource doesn&#39;t show up in the crm_verify -L -V output.<br>


&gt;<br>

&gt; Here are the two CLIs I use to create the resources:<br>

&gt;<br>

&gt; sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op monitor interval=&quot;$mon_interval&quot; timeout=&quot;$mon_timeout&quot; op start interval=&quot;$start_interval&quot; timeout=&quot;$start_timeout&quot; op stop interval=&quot;$stop_interval&quot; timeout=&quot;$stop_timeout&quot;<br>


&gt;<br>

&gt;<br>

&gt; sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op monitor interval=&quot;$mon_interval&quot; timeout=&quot;$mon_timeout&quot;<br>
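<div>For reference, a sketch of the first form with the start and stop operations kept but made non-recurring. It assumes that the RecurringOp check in crm_verify rejects start and stop operations only when their interval is non-zero, so the interval is fixed at &quot;0&quot; and only the timeouts vary. The helper only prints the command and does not run it; the function name and the example values are hypothetical.</div>

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a crm command that defines an
# lsb: primitive whose start and stop operations are non-recurring.
print_primitive_cmd() {
    res_name=$1
    service=$2
    mon_interval=$3
    mon_timeout=$4
    op_timeout=$5
    printf 'crm configure primitive %s lsb:%s' "$res_name" "$service"
    printf ' op monitor interval="%s" timeout="%s"' "$mon_interval" "$mon_timeout"
    # interval="0" declares start and stop as one-shot actions, not recurring ones.
    printf ' op start interval="0" timeout="%s"' "$op_timeout"
    printf ' op stop interval="0" timeout="%s"\n' "$op_timeout"
}

# Example values below are made up:
print_primitive_cmd f5-lbaas-agent-10.6.143.121_resource f5-lbaas-agent-10.6.143.121 30s 20s 60s
```

<div>Only the monitor operation carries a non-zero interval here, which matches the second command above while still allowing explicit start and stop timeouts.</div>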

&gt;<br>

&gt;<br>

&gt; The bottom line is that if I halt the VM running any of these resources, the resource doesn&#39;t fail over to another VM. I&#39;m not sure what the exact cause is - any help would be greatly appreciated!<br>

&gt;<br>

&gt;<br>

&gt; Thanks,<br>

&gt; Regards,<br>

&gt; Vijay<br>

</div></div>&gt; _______________________________________________<br>

&gt; Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>

&gt; <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>

&gt;<br>

&gt; Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>

&gt; Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>

&gt; Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>

<br>


<br></blockquote></div><br></div>