[Pacemaker] process not getting started after a failure

Wed Jul 23 01:27:07 UTC 2014

On 22 Jul 2014, at 7:01 pm, ESWAR RAO <eswar7028 at gmail.com> wrote:

> 
> Hi All,
> 
> I have a 3-node cluster (node1, node2, node3).
> 
> The oc_pluginhandler resource is running in clone mode on 2 nodes as:
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> primitive oc_pluginhandler lsb:pluginhandler \
>         meta allow-migrate="true" migration-threshold="10" failure-timeout="300s" \
>         op monitor interval="5s" timeout="120s"
> clone oc_nvp_app_clone_pluginhandler oc_pluginhandler \
>         meta clone-max="2" globally-unique="false" interleave="true"
> location nvp_prefer_node_pluginhandler oc_nvp_app_clone_pluginhandler -inf: node3
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> Jul 17 16:32:12 node2 lrmd: [12219]: info: rsc:oc_pluginhandler:1 monitor[53] (pid 32144)
> Jul 17 16:32:12 node2 lrmd: [12219]: info: operation monitor[53] on oc_pluginhandler:1 for client 12222: pid 32144 exited with return code 7 (mapped from 3)
> 
> Jul 17 17:32:17 node2 lrmd: [12219]: info: rsc:oc_pluginhandler:1 monitor[53] (pid 12412)
> Jul 17 17:32:17 node2 lrmd: [12219]: info: operation monitor[53] on oc_pluginhandler:1 for client 12222: pid 12412 exited with return code 7 (mapped from 3)
> 
> Jul 17 18:32:19 node2 lrmd: [12219]: info: rsc:oc_pluginhandler:1 monitor[53] (pid 25174)
> Jul 17 18:32:19 node2 lrmd: [12219]: info: operation monitor[53] on oc_pluginhandler:1 for client 12222: pid 25174 exited with return code 7 (mapped from 3)
> 
> 
> For some reason, even though pluginhandler is not running on node2, node2 didn't intimate the DC node (which was node1).

I don't understand what you mean here.

> So the process didn't get started on node2.
> crm status shows that the process is running.

But we can't see that.
Can you show us the commands you ran, their outputs, the cluster status and what you find unexpected please?
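
For reference, here is a minimal sketch of how that information could be gathered on node2. It assumes the lsb:pluginhandler resource resolves to /etc/init.d/pluginhandler (the usual LSB location, not confirmed in this thread). The helper names lsb_status_check and collect_cluster_state are made up for illustration. In the quoted logs, "return code 7 (mapped from 3)" means the init script's status action exited with 3 (LSB: program is not running), which is reported as OCF return code 7 (not running).

```shell
#!/bin/sh
# Hypothetical helper: run an init script's "status" action and describe
# the exit code in LSB terms. Only codes 0 and 3 are interpreted here.
lsb_status_check() {
    rc=0
    "$@" status >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "rc=0: LSB status says the service is running" ;;
        3) echo "rc=3: LSB status says the service is not running" ;;
        *) echo "rc=$rc: other LSB status code, check the init script" ;;
    esac
    return "$rc"
}

# Hypothetical helper: capture the cluster's view at the same moment.
collect_cluster_state() {
    crm_mon -1 -r -f      # one-shot status, inactive resources, fail counts
    crm configure show    # full cluster configuration
}
```

On node2 this would be invoked as `lsb_status_check /etc/init.d/pluginhandler` followed by `collect_cluster_state`. Posting both outputs together shows whether the init script and the cluster agree about the state of the resource.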

> 
> If I kill other processes on node2, they get restarted correctly.
> 
> Can someone please help me debug why that resource wasn't getting restarted?
> 
> Thanks
> Eswar
