<div dir="ltr"><div><div>Every 15-18 minutes, one of my resources is stopped on one node and then restarted shortly afterwards. <br><br></div>In the DC log I can see the following lines, including one ERROR. <br><br>Dec 28 15:04:09 app01 pengine: [8618]: debug: clone_rsc_colocation_rh: Pairing resOCFS:1 with groupOcfs2Mgmt:0<br>
Dec 28 15:04:09 app01 pengine: [8618]: debug: native_assign_node: Assigning app02 to resOCFS:1<br>Dec 28 15:04:09 app01 pengine: [8618]: ERROR: color_instance: Pre-allocation failed: got app02 instead of app01<br>Dec 28 15:04:09 app01 pengine: [8618]: info: native_deallocate: Deallocating resOCFS:1 from app02<br>
Dec 28 15:04:09 app01 pengine: [8618]: debug: clone_rsc_colocation_rh: Pairing resOCFS:0 with groupOcfs2Mgmt:0<br>Dec 28 15:04:09 app01 pengine: [8618]: debug: native_assign_node: Assigning app02 to resOCFS:0<br>Dec 28 15:04:09 app01 pengine: [8618]: debug: clone_rsc_colocation_rh: Pairing resOCFS:1 with groupOcfs2Mgmt:1<br>
Dec 28 15:04:09 app01 pengine: [8618]: debug: clone_rsc_colocation_rh: Pairing resOCFS:1 with groupOcfs2Mgmt:1<br>Dec 28 15:04:09 app01 pengine: [8618]: debug: native_assign_node: All nodes for resource resOCFS:1 are unavailable, unclean or shutting down (app01: 1, -1000000)<br>
Dec 28 15:04:09 app01 pengine: [8618]: debug: native_assign_node: Could not allocate a node for resOCFS:1<br>Dec 28 15:04:09 app01 pengine: [8618]: info: native_color: Resource resOCFS:1 cannot run anywhere<br><br></div><div>
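To check whether the pre-allocation errors really recur on the same 15-18 minute cadence as the stops, something like the following could be run over the DC log. This is only a minimal sketch: the function names error_times and gaps_in_minutes are hypothetical, the year 2012 is an assumption (syslog timestamps do not carry one), and parsing the month abbreviation assumes an English locale. <br><br>

```python
import re
from datetime import datetime

# Matches syslog-style pengine lines shaped like the excerpt above and
# captures the leading timestamp, e.g. "Dec 28 15:04:09".
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) "
    r".*pengine.*ERROR: color_instance: Pre-allocation failed"
)


def error_times(lines, year=2012):
    """Return the timestamps of all pre-allocation errors found in lines."""
    times = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            # Syslog timestamps omit the year, so the caller supplies one
            # (assumption: 2012).
            stamp = "%d %s" % (year, match.group(1))
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_minutes(times):
    """Return the gap in minutes between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Passing the open log file on the DC to error_times and the result to gaps_in_minutes gives the spacing between errors, which can then be compared with the times at which resOCFS is stopped. <br><br>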
The pengine log sequence above appears before every stop of the OCFS resource. <br></div><div><br></div><div>Here is the CIB configuration. <br><br>primitive VirtualIP0 ocf:heartbeat:IPaddr2 \<br>        params ip=&quot;10.121.12.30&quot; \<br>        op monitor interval=&quot;10s&quot; \<br>
        meta target-role=&quot;Started&quot;<br>primitive resDLM ocf:pacemaker:controld<br>primitive resDrbdShared0 ocf:linbit:drbd \<br>        params drbd_resource=&quot;shared0&quot; \<br>        operations $id=&quot;resDrbd-operations&quot; \<br>
        op monitor interval=&quot;20&quot; role=&quot;Master&quot; timeout=&quot;20&quot; notify=&quot;true&quot; \<br>        op monitor interval=&quot;30&quot; role=&quot;Slave&quot; timeout=&quot;20&quot; notify=&quot;true&quot;<br>
primitive resJboss lsb:jboss4 \<br>        op monitor interval=&quot;120s&quot; timeout=&quot;150s&quot; \<br>        op start interval=&quot;0&quot; timeout=&quot;150s&quot; \<br>        op stop interval=&quot;0&quot; timeout=&quot;150s&quot;<br>
primitive resO2CB ocf:pacemaker:o2cb<br>primitive resOCFS ocf:heartbeat:Filesystem \<br>        params device=&quot;/dev/drbd/by-res/shared0&quot; directory=&quot;/data&quot; fstype=&quot;ocfs2&quot; \<br>        op monitor interval=&quot;120s&quot; timeout=&quot;40&quot; \<br>
        op start interval=&quot;0&quot; timeout=&quot;60&quot; \<br>        op stop interval=&quot;0&quot; timeout=&quot;60&quot;<br>group groupOcfs2Mgmt resDLM resO2CB<br>ms msDrbdShared0 resDrbdShared0 \<br>        meta resource-stickines=&quot;100&quot; notify=&quot;true&quot; interleave=&quot;true&quot; master-max=&quot;2&quot; target-role=&quot;Started&quot;<br>
clone cloneJboss resJboss \<br>        meta interleave=&quot;true&quot; ordered=&quot;true&quot; is-managed=&quot;false&quot; target-role=&quot;Started&quot;<br>clone cloneOCFS resOCFS \<br>        meta interleave=&quot;true&quot; ordered=&quot;true&quot; target-role=&quot;Started&quot; is-managed=&quot;true&quot;<br>
clone cloneOcfs2Mgmt groupOcfs2Mgmt \<br>        meta interleave=&quot;true&quot; target-role=&quot;Started&quot;<br>location locVirtualIP0 VirtualIP0 9001: app01<br>colocation colDRBD inf: cloneOcfs2Mgmt msDrbdShared0:Master<br>
colocation colOcfs2 inf: cloneOCFS cloneOcfs2Mgmt<br>order ordDRBD inf: msDrbdShared0:promote cloneOcfs2Mgmt:start<br>order ordOcfs2 inf: cloneOcfs2Mgmt:start cloneOCFS:start<br>property $id=&quot;cib-bootstrap-options&quot; \<br>
        dc-version=&quot;1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff&quot; \<br>        cluster-infrastructure=&quot;openais&quot; \<br>        expected-quorum-votes=&quot;2&quot; \<br>        stonith-enabled=&quot;false&quot; \<br>
        no-quorum-policy=&quot;ignore&quot; \<br>        last-lrm-refresh=&quot;1356702541&quot;<br>rsc_defaults $id=&quot;rsc-options&quot; \<br>        resource-stickiness=&quot;0&quot;<br>op_defaults $id=&quot;op-options&quot; \<br>
        timeout=&quot;20s&quot;<br><br></div><div>I first suspected incorrect hostname resolution, but /etc/hosts is correct and contains no duplicate names. <br clear="all"></div><div><div><div><div><br>-- <br>Hälsningar / Greetings<br>
<br>Stefan Midjich<br>[De omnibus dubitandum]
</div></div></div></div></div>